Wait method for jobs / higher level job API #240
Comments
Hi @JonaOtto
For the Job API, I'm currently working in #224 to rework the whole API structure a bit, to support more features and hopefully make it easier to interact with the job interface, i.e. supporting more methods like cancel, update, suspend, hold and so on. There are still some things to do until it's done :)
Anyway, for your specific problem with the current codebase: this could easily be replicated in pyslurm I guess, by making a function as you said. I could take a look at this when I have the time; otherwise, if you want to give it a try and do a PR afterwards, go ahead :)
Hi @tazend,
* Fix introduced typo in partition information dictionary key. (#241)
* Added wait_finished method to job class (#240).
* Added test method for wait_finished method of the job class.
* Added _load_single_job method to the job class to extract the slurm_load_job functionality.
* Updated find_id and wait_finished to use _load_single_job.
Co-authored-by: Jonathan Goodson <jonathan.goodson@gmail.com>
Hello pyslurm developers,
I work on an HPC performance tool for my university. We want to enable the tool to dispatch measurement executions of a target code to our cluster, which uses SLURM. Ideally, we want to use pyslurm for this.
What we need is a way to submit a job (job.submit_batch_job) and then wait for it to finish. Something like a job.wait(job_id) would be nice, which you could call to wait for a job (referenced by the job_id) to finish.
I'm a pyslurm newbie, but as far as I understand, there is no such thing in pyslurm at the moment. There seem to be several ways to build such behavior from combinations of the find, find_id and get methods of the job class.
What do you think would be the best approach? Would it make sense to build such behavior into pyslurm, or is this something our tool should take care of itself?
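To make the idea concrete, here is a minimal sketch of such a polling wait. It is not pyslurm's implementation: the name wait_for_job, the injected load_job callable, the "job_state" key and the set of active state names are all assumptions made for illustration. The loader is passed in so the loop can be tried without a cluster.

```python
import time
from typing import Any, Callable, Iterable, List, Mapping, Optional

# Assumption: state names under which a job is still queued or active.
# Any other state is treated as "finished" by this sketch.
ACTIVE_STATES = frozenset(
    {"PENDING", "CONFIGURING", "RUNNING", "SUSPENDED", "COMPLETING"}
)


def wait_for_job(
    job_id: int,
    load_job: Callable[[int], Iterable[Mapping[str, Any]]],
    poll_interval: float = 5.0,
    timeout: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Mapping[str, Any]]:
    """Poll load_job(job_id) until no returned record is in an active state.

    load_job is a hypothetical hook: it should return one record per job
    (several for a job array), each with a "job_state" entry. With pyslurm
    it might wrap something like pyslurm.job().find_id(job_id), assuming
    that call returns such records.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        records = list(load_job(job_id))
        if not records:
            raise LookupError(f"no record found for job {job_id}")
        # Finished only when every record has left the active states.
        if all(rec["job_state"] not in ACTIVE_STATES for rec in records):
            return records
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still active after {timeout}s")
        sleep(poll_interval)
```

The caller gets the final records back, so it can inspect the end state or exit code itself. The poll interval trades responsiveness against load on the Slurm controller, so a few seconds is probably a sensible floor.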
I still have to dive deeper into the code, but if there is anything on this topic I can help with, I would be happy to do so. Generally, we would like to contribute back whatever knowledge we gain during the process, whether as code or not. Another possibility would be to first see how it turns out on our side, and then contribute back the code/interface we developed, or even just some comments for others on how we did it.
Thanks for doing this great project, I'm excited to hear your thoughts!
Best,
Jonathan