Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DIG-1671: Adding date format to manifest #66

Merged
merged 16 commits into from
Jun 7, 2024
Merged

Conversation

yavyx
Copy link
Contributor

@yavyx yavyx commented May 24, 2024

What's new

  • Added a required date_format field to the manifest. It specifies the order of day, month and year to parse ambiguous dates properly.
  • Updated test data and manifest with date formats.
  • Improved mapping exceptions to be more readable and specific

How to test

  • Add date_format to the manifest
  • Run CSVConvert.py using a dataset with ambiguous dates for that field and try different patterns in the template to confirm they are parsed as indicated.
  • Test removing date resolutions and references to see the exceptions.

Previously:
Added a date_format parameter to the date_interval mapping function to specify the order of day, month and year to parse ambiguous dates properly.

How to test
- Add date_format to any date field in the template
DONOR.INDEX.date_of_birth, {date_interval(Donor.date_of_birth, date_format="DMY")}
- Run CSVConvert.py using a dataset with ambiguous dates for that field and try different patterns in the template to confirm they are parsed as indicated.

@yavyx yavyx requested review from mshadbolt and daisieh May 24, 2024 00:36
Copy link
Member

@daisieh daisieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I change the template line to

DONOR.INDEX.date_of_birth, {date_interval(Donor.date_of_birth, date_format="MDY")}

I don't think the date_format gets passed in? If I add a print statement to the top of date_interval in mappings.py, like

def date_interval(data_values, date_format=None):
    print(f"hello {date_format}")

I get a lot of "hello None" in the output.

I don't think that you can pass in extra arguments like this into the template file, at least not as I've written it.

I appreciate the idea behind making it more flexible for date formats, but I'm not sure that it's possible to do it this way. I think you might have to write a wrapper method in your own .py file that parses the date with a format.

@yavyx
Copy link
Contributor Author

yavyx commented Jun 4, 2024

Re-submitting PR, DIG-1671

What's new

  • Added a required date_format field to the manifest. It specifies the order of day, month and year to parse ambiguous dates properly.
  • Updated test data and manifest with date formats.
  • Improved mapping exceptions to be more readable and specific

How to test

  • Add date_format to the manifest
  • Run CSVConvert.py using a dataset with ambiguous dates for that field and try different patterns in the template to confirm they are parsed as indicated.
  • Test removing date resolutions and references to see the exceptions.

@yavyx yavyx requested a review from daisieh June 4, 2024 00:26
@mshadbolt mshadbolt changed the title Adding date format to date intervals DIG-1671: Adding date format to manifest Jun 4, 2024
@mshadbolt
Copy link
Contributor

I added some try catch blocks to the part that is reading the manifest so that if the fields are blank, the script exits as well as if the manifest doesn't have those specified at all, also added a catch to ensure the date format is some form of 'DMY/MDY/YDM'

Copy link
Contributor

@mshadbolt mshadbolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and works as expected

Copy link
Member

@daisieh daisieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny changes

src/clinical_etl/mappings.py Show resolved Hide resolved
src/clinical_etl/mappings.py Outdated Show resolved Hide resolved
@yavyx
Copy link
Contributor Author

yavyx commented Jun 7, 2024

@daisieh I addressed the changes and Marion's fix is working

@yavyx yavyx requested a review from daisieh June 7, 2024 00:10
Copy link
Member

@daisieh daisieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks to both @yavyx and @mshadbolt for taking over so much of the coding in this project!

@yavyx yavyx merged commit a584ca5 into develop Jun 7, 2024
2 checks passed
@yavyx yavyx deleted the yavyx/date-formats branch June 7, 2024 00:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants