Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate prefect_aws.s3 async tasks to async_dispatch #16096

Merged
merged 11 commits into from
Nov 26, 2024

Conversation

zzstoatzz
Copy link
Collaborator

@zzstoatzz zzstoatzz commented Nov 22, 2024

related to #15008

adds explicit async methods to all public tasks and methods in prefect_aws.s3 and removes use of sync_compatable

old notes

i intentionally did not do the S3Bucket methods here, since there's an open question I want to discuss with @desertaxle

question

on S3Bucket

async def get_directory()

is not actually async at all, not even a sending sync stuff to a worker thread

the established pattern would say we do

async def aget_directory(self):
   # send to worker threads

@async_dispatch(aget_directory)
def get_directory(self):
   # original impl

that seems like it was duplicate a lot for not a very compelling reason

but because stopped supporting async in sync we might need to do this

also renamed tasks in backword compat fashion per this comment

@github-actions github-actions bot added the enhancement An improvement of an existing feature label Nov 23, 2024
@zzstoatzz zzstoatzz force-pushed the async-dispatch-aws-s3 branch from 6fa4f6b to 42ac696 Compare November 25, 2024 15:12
@zzstoatzz zzstoatzz marked this pull request as ready for review November 25, 2024 15:13
@zzstoatzz zzstoatzz requested a review from cicdw as a code owner November 25, 2024 15:13
@zzstoatzz zzstoatzz requested a review from desertaxle November 25, 2024 15:52
Copy link
Member

@desertaxle desertaxle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of nits on docstring examples. Also, it'd be good to have some tests covering the new async functions and methods.

src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
src/integrations/prefect-aws/prefect_aws/s3.py Outdated Show resolved Hide resolved
@zzstoatzz zzstoatzz changed the title migrate prefect_aws.s3 async tasks to async_dispatch pt1 migrate prefect_aws.s3 async tasks to async_dispatch Nov 25, 2024
@zzstoatzz
Copy link
Collaborator Author

updated the docstrings and added some coverage for the new async tasks

@zzstoatzz zzstoatzz requested a review from desertaxle November 25, 2024 21:47
Copy link
Contributor

@discdiver discdiver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I added a note for each of the new/changed methods so that future SDK docs will reflect when a method was added/changed.

I assume a minor version bump is coming, but please change the versions noted if not.

key: Optional[str] = None,
) -> str:
"""
Uploads data to an S3 bucket.
Copy link
Contributor

@discdiver discdiver Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Uploads data to an S3 bucket.
Uploads data to an S3 bucket. Added in version 0.5.3.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@discdiver do you think we should have a minor version bump because of the new async versions of tasks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We reserve minor version bumps in integration packages for breaking changes, so we should not release a new minor version for these changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Then just 0.5.3 for each note. I can edit the suggestions.

aws_client_parameters: AwsClientParameters = AwsClientParameters(),
) -> bytes:
"""
Downloads an object with a given key from a given S3 bucket.
Copy link
Contributor

@discdiver discdiver Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Downloads an object with a given key from a given S3 bucket.
Downloads an object with a given key from a given S3 bucket.
Added in version 0.5.3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or whichever is the next version if not a minor bump.

the credentials must have permission to read and delete the source object and write
to the target object. If the credentials do not have those permissions, this method
will raise an error. If the credentials have permission to read the source object
but not delete it, the object will be copied but not deleted.
Copy link
Contributor

@discdiver discdiver Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
but not delete it, the object will be copied but not deleted.
but not delete it, the object will be copied but not deleted.
Added in version 0.5.3.


def _list_objects_sync(page_iterator: PageIterator):
"""
Synchronous method to collect S3 objects into a list
Copy link
Contributor

@discdiver discdiver Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Synchronous method to collect S3 objects into a list
Synchronous method to collect S3 objects into a list.
Added in version 0.5.3.

self,
from_folder: str,
to_folder: Optional[Union[str, Path]] = None,
**download_kwargs: Dict[str, Any],
) -> Path:
"""
Downloads objects *within* a folder (excluding the folder itself)
Asynchronously downloads objects *within* a folder (excluding the folder itself)
from the S3 bucket to a folder.
Copy link
Contributor

@discdiver discdiver Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from the S3 bucket to a folder.
from the S3 bucket to a folder. Changed in version 0.5.3.

@zzstoatzz
Copy link
Collaborator Author

@discdiver I'm not sure I like the idea of hardcoding versions where things in appeared in docstrings. also, nothing here is "now asynchronous", there are new tasks which are asynchronous, and the sync tasks will work the same

cc @desertaxle on the docstring question

@discdiver
Copy link
Contributor

@discdiver I'm not sure I like the idea of hardcoding versions where things in appeared in docstrings. also, nothing here is "now asynchronous", there are new tasks which are asynchronous, and the sync tasks will work the same

cc @desertaxle on the docstring question

Thanks @zzstoatzz. Removed note on async. The official Python docs use an .rst directive and then the version change note only shows up in the docs and not the inline docstring. I've asked the Mintlify team if they could add a keyword or symbol that did something similar.

I can see arguments both ways. It's nice to keep the docstrings uncluttered. And the note feels pretty small and could be helpful in some cases.

@zzstoatzz
Copy link
Collaborator Author

zzstoatzz commented Nov 26, 2024

And the note feels pretty small and could be helpful in some cases.

fair point! I think its a fine compromise to have them in there 👍

thank you for adding them!

@zzstoatzz zzstoatzz force-pushed the async-dispatch-aws-s3 branch from 62d0ee8 to 5df582e Compare November 26, 2024 16:12
@zzstoatzz zzstoatzz merged commit f58d064 into main Nov 26, 2024
12 checks passed
@zzstoatzz zzstoatzz deleted the async-dispatch-aws-s3 branch November 26, 2024 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants