Compare performance of CWL implementations #103
Comments
It's also worth considering how many of the open issues here are specifically related to Toil. |
WIP on this right now. Reconfiguring dev to use cwltool and will rerun a job that succeeded with cwltoil |
cwl-airflow summary:
|
Rabix Bunny summary:
|
@johnbradley what about performance for rabix bunny? Does it run scatter tasks concurrently? |
Since I already had an idle VM for airflow testing with the data on it, I installed rabix bunny. Initially it failed due to relative file paths in the input job order, but a quick
|
There is also an issue on rabix bunny with scatter and TES/funnel: rabix/bunny#382. |
I haven't had much luck finding logs from the failed rabix jobs, but I was able to get docker logs from the failed job:
picard's BedToIntervalList expects a secondary |
The bunny bug with secondaryFiles as inputs is discussed in rabix/bunny#211, but the state described there (working locally but not with TES) isn't consistent with what I'm seeing |
Quote from that rabix/bunny issue above:
|
Looking into Arvados. However, when running this on Ubuntu 16.04, it gets stuck starting up and keeps printing this:
I let it run for an hour and it didn't get past this step. There are separate manual installation instructions that I will look into: https://doc.arvados.org/install/index.html |
Some promising initial results with cwl-tes and funnel
Funnel does appear to support some clusters/schedulers so there may be a way to limit this. |
Actually it does appear that cwl-tes extracts ResourceRequirements (CPU/RAM/Disk) out of the CWL and provides them to the TES server |
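For anyone following along, here is a rough sketch of what that kind of translation could look like. This is not cwl-tes's actual code. The function name `tes_resources` and the rounding are my own assumptions. The CWL field names (`coresMin`, `ramMin`, `outdirMin`, `tmpdirMin`, with sizes in mebibytes) come from the CWL spec, and `cpu_cores`, `ram_gb`, `disk_gb` are the resource fields defined by the TES schema.

```python
import math

BYTES_PER_MIB = 2 ** 20
BYTES_PER_GB = 1e9


def mib_to_gb(mib):
    """Convert mebibytes (the CWL unit) to decimal gigabytes (assumed TES unit)."""
    return round(mib * BYTES_PER_MIB / BYTES_PER_GB, 3)


def tes_resources(requirement):
    """Hypothetical helper: map a CWL ResourceRequirement dict to TES resources.

    Only the minimum values are used; fields missing from the
    requirement are simply left out of the result.
    """
    resources = {}
    if "coresMin" in requirement:
        # TES expects a whole number of cores, so round fractional requests up.
        resources["cpu_cores"] = int(math.ceil(requirement["coresMin"]))
    if "ramMin" in requirement:
        resources["ram_gb"] = mib_to_gb(requirement["ramMin"])
    if "outdirMin" in requirement or "tmpdirMin" in requirement:
        # Assumption: output and temp space are requested from the same disk.
        disk_mib = requirement.get("outdirMin", 0) + requirement.get("tmpdirMin", 0)
        resources["disk_gb"] = mib_to_gb(disk_mib)
    return resources


if __name__ == "__main__":
    step_requirement = {
        "class": "ResourceRequirement",
        "coresMin": 4,
        "ramMin": 8192,
        "outdirMin": 1024,
    }
    print(tes_resources(step_requirement))
```

The point is only that each step's requirements can travel with the task, so the TES server has what it needs to schedule against.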
Funnel's built-in web interface is pretty handy, and the tasks are named based on the CWL step name: Not a lot of info about an These tasks do have the memory and CPU requirements from the workflow annotated on them, and I do believe Funnel is doing its best to schedule them. |
Hi everyone! I am the author of cwl-tes and a lead developer of Funnel. Regarding Funnel's chattiness: that is a configurable option in the worker and can be turned off. We just cut a new Funnel release today with breaking changes (release notes). Please let me know if you have any questions or comments about either project. My colleagues and I would be happy to help. |
Currently running a workflow under |
Happy with performance in a single VM using |
Reopening since this will be useful for future enhancements/scalability. |
Hey, https://github.com/duke-gcb/calrissian looks pretty good 😆 |
Toil adds a lot of complexity to running the workflows and makes them harder to debug. We assume it is the best CWL implementation to use because of its parallelism, but we don't have data to back that up. Even if it is faster, it's not clear whether the performance improvement over cwl-runner, for example, outweighs the added complexity at this point.
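One way to start collecting that data is a small timing harness along these lines. It is only a sketch: `time_command` and `compare` are hypothetical helper names, and the workflow and job file names in the comment are placeholders, not files from this project. It measures wall-clock time only and takes the best of a few runs per runner.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True makes a failed run raise instead of being timed as a success.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def compare(runners, repeats=3):
    """Map each runner name to its best (minimum) wall-clock time."""
    return {name: min(time_command(argv, repeats)) for name, argv in runners.items()}


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. For a real comparison,
    # swap in something like ["cwltool", "workflow.cwl", "job.yml"] and the
    # equivalent Toil invocation (file names here are placeholders).
    runners = {"noop": [sys.executable, "-c", "pass"]}
    for name, seconds in compare(runners).items():
        print(f"{name}: {seconds:.3f}s")
```

Running the same workflow and job order through each implementation on the same VM would at least give a baseline to weigh against the debugging cost.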