[WIP] Moving closer to Dask #87
base: master
Very interesting! I don't yet have a good sense of what implementing this would involve, but I think you pointed out some good reasons to move closer to Dask.
I share your concern: what do these optimizations and the parallel performance actually buy us when our chunks are small and the data request sizes are limited by our web application?
I find this very hard to wrap my head around too. I am not sure if it is worth the effort, or if we should wait for the dask improvements to be released. In the meantime, we could focus on the infrastructure part rather than rewriting dask-geomodeling.
For the web application, we effectively do not parallelize across multiple chunks. This is not necessary because the chunks are so small. We may parallelize operations within a single chunk if the graph allows it (e.g. reading the same chunk from multiple sources concurrently instead of one after the other).
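To make that last point concrete, here is a minimal sketch of reading one chunk from several sources concurrently. It uses only the standard library's `ThreadPoolExecutor` as a stand-in for a threaded scheduler; it is not dask-geomodeling code. The names `read_chunk`, `combine`, `compute_single_chunk`, and the source names are hypothetical and exist only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_chunk(source, bbox):
    """Hypothetical reader: pretend to fetch one small chunk from `source`."""
    time.sleep(0.05)  # stands in for I/O latency
    return {"source": source, "bbox": bbox, "value": len(source)}


def combine(chunks):
    """Hypothetical downstream operation that needs every source's chunk."""
    return sum(chunk["value"] for chunk in chunks)


def compute_single_chunk(sources, bbox):
    # One request covers one chunk, so there is nothing to split spatially.
    # Only the independent reads of that chunk run concurrently.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        chunks = list(pool.map(lambda s: read_chunk(s, bbox), sources))
    return combine(chunks)


result = compute_single_chunk(["elevation", "landuse", "rain"], (0, 0, 256, 256))
```

With sequential reads the latency would add up per source; here the reads overlap, so the request takes roughly as long as the slowest read. Whether that gain matters for our request sizes is exactly the open question.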
I am trying to get some ideas written down to move this project closer to dask. The current state of the document is a comparison of relevant parts of dask and dask-geomodeling.