Add scheduler overhead testing #48
base: main
Conversation
- Put sleep_placeholder back to actually sleeping for #36
- Create core.data_placeholder for doing random compute on dask.array objects
- Use prefect.serve to create a sleep placeholder flow to run concurrently for scaling tests
- Split scheduler_deploy off from scheduler_overhead
- Update overhead test script to create subflows using the correct list range
- Update .gitignore
- Change serve call to set a `limit` equal to the number of available CPUs
- Decrease sleep minimum and maximum
- Enable loop over range of task sizes
- Differentiate between task -> flow (via deployment) and subflow -> flow (via ThreadPoolExecutor) testing
- Preserve both tests in the same file
- Add exception handling to skip the subflow -> flow scaling tests if the deployment script hasn't been run
…ome have temporary placeholder values.
- Patch core.sleep_placeholder to accept optional min_sleep argument
- Define some "populate via argument" properties of the results file
… understanding how to get a result from run_deploy.
…ardcode of 0 for T_sum_flow_times accidentally left over from merge, and clean up unneeded comments.
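Taken together, the commits above describe serving a sleep placeholder flow with a concurrency limit tied to the CPU count. A minimal sketch of that setup, assuming a recent Prefect 2/3 `serve` API; the flow body, its `min_sleep`/`max_sleep` defaults, and the deployment name are illustrative stand-ins for `core.sleep_placeholder`, not the repo's actual code:

```python
import os
import random
import time

from prefect import flow, serve


@flow
def sleep_placeholder(min_sleep: float = 0.1, max_sleep: float = 1.0):
    # Stand-in for core.sleep_placeholder: sleep for a random interval
    # between the given bounds to emulate task work.
    time.sleep(random.uniform(min_sleep, max_sleep))


if __name__ == "__main__":
    # Serve the flow so the scaling tests can trigger it as a deployment,
    # capping concurrent runs at the number of available CPUs.
    deployment = sleep_placeholder.to_deployment(name="sleep-placeholder")
    serve(deployment, limit=os.cpu_count())
```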
@krlberry can you please split out the CSV writing functionality to a separate module so that we can reuse it for the Dask and later Airflow tests? @amcnicho @krlberry @taktsutsumi Do you want to add the DaskTaskRunner to this request or another one? If I understand correctly, the DaskTaskRunner requires setting up a Dask cluster.
Creating a DaskTaskRunner that allows deployment to both a local process and a k3d cluster is fairly straightforward, so I think it makes sense to include it in this PR. We can probably reuse/adapt prefect_workflow/resource_management.py.
- Add a dask task runner version of the task_scaling_test
- Use resource_management.connect_to_scheduler to set the dask task runner
- Update the task sizes used in the different scaling tests
When I try running the latest version of the code I get the following error:
Have any of you encountered this?
Part of the problem is that `tr = connect_to_scheduler()` should not be outside `if __name__ == "__main__":`.
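A minimal sketch of that fix; `connect_to_scheduler` is the helper from prefect_workflow/resource_management.py mentioned above, and how the returned runner is used afterwards is only indicated in a comment:

```python
from prefect_workflow.resource_management import connect_to_scheduler  # repo helper


def main():
    # Connect to the Dask scheduler only when the script is executed
    # directly, so importing this module (e.g. from worker processes
    # spawned by Dask/Prefect) does not open another connection.
    tr = connect_to_scheduler()
    # ... run the scaling tests with `tr` as the task runner ...


if __name__ == "__main__":
    main()
```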
I was also getting an error. To fix it I had to create the Dask cluster directly instead of using `resource_management.connect_to_scheduler`. Code:
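The original snippet is not preserved in this thread; a sketch of one way to do it, creating a `dask.distributed.LocalCluster` and pointing `prefect_dask.DaskTaskRunner` at its scheduler (the worker counts are placeholders):

```python
from dask.distributed import LocalCluster
from prefect_dask import DaskTaskRunner

if __name__ == "__main__":
    # Create the Dask cluster explicitly instead of going through
    # resource_management.connect_to_scheduler.
    cluster = LocalCluster(n_workers=4, threads_per_worker=1)
    task_runner = DaskTaskRunner(address=cluster.scheduler_address)
    # ... pass `task_runner` to the flow(s) used by the scaling test ...
```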
The Dask tests were taking a long time due to log printouts. After adding:
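The exact lines added are not shown in the thread; as an assumption, one common way to cut the log volume is to raise the Prefect and Dask log levels before running the tests:

```python
import logging
import os

# Assumed reconstruction -- the snippet referred to above is not preserved.
# Raise the log levels so per-task printouts stop dominating the runtime.
os.environ["PREFECT_LOGGING_LEVEL"] = "WARNING"
logging.getLogger("distributed").setLevel(logging.WARNING)
```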
It is completing a lot more quickly now.
<WARN: puts task tests into a broken state>
- Try to condense task submission tests into re-using a single flow
- FutureWarning: Artifact creation outside of a flow or task run is deprecated and will be removed in a later version.
- TypeError: 'fn' must be callable
- Fix dynamic flow creation
- Clean up reporting of runner types
- Fix RuntimeWarning: Enable tracemalloc to get the object allocation traceback
I think this is ready to merge into main and then run tests in all the dev environments.
I'd agree about the merge to main. All the tests ran successfully in my environment earlier today.
Adds the following overhead tests for #36: task_scaling_test (with both the default task runner and a DaskTaskRunner version) and flow_scaling_test.
The flow_scaling_test is run conditionally if an existing deployment is detected.
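A rough sketch of how that conditional check could be implemented with the Prefect client; the deployment name below is a placeholder, not necessarily what the scheduler_deploy script registers:

```python
import asyncio

from prefect.client.orchestration import get_client
from prefect.exceptions import ObjectNotFound


async def deployment_exists(name: str) -> bool:
    # Look up the deployment created by the deploy script; the flow
    # scaling test is skipped when it is not found.
    async with get_client() as client:
        try:
            await client.read_deployment_by_name(name)
            return True
        except ObjectNotFound:
            return False


if __name__ == "__main__":
    if asyncio.run(deployment_exists("sleep-placeholder/sleep-placeholder")):
        pass  # run the flow_scaling_test here
    else:
        print("No deployment found; skipping flow_scaling_test")
```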