Stable candidate v3.1.1 #91

Merged · 82 commits · Nov 22, 2024

Commits
3197a90
adding date format to date intervals
yavyx May 24, 2024
fac45cc
addded date format
yavyx Jun 3, 2024
bd2c200
updated docs
yavyx Jun 3, 2024
4bdf2c3
updated test date formats
yavyx Jun 3, 2024
720ce67
make exceptions more readable
yavyx Jun 4, 2024
13a4b72
make exception specific
yavyx Jun 4, 2024
ac74741
cleanup
yavyx Jun 4, 2024
82026ea
cleanup
yavyx Jun 4, 2024
7ad091b
Update src/clinical_etl/CSVConvert.py
yavyx Jun 4, 2024
7378507
exception when parser fails
yavyx Jun 4, 2024
9497029
update reference date docs
yavyx Jun 4, 2024
fef07c0
improved error catching in manifest
mshadbolt Jun 4, 2024
9e09957
better error msg
mshadbolt Jun 4, 2024
2ae7583
fix month intervals
mshadbolt Jun 6, 2024
89036d6
remove import
mshadbolt Jun 6, 2024
a646208
minor fixes
yavyx Jun 6, 2024
a584ca5
Merge pull request #66 from CanDIG/yavyx/date-formats
yavyx Jun 7, 2024
75046bf
add sample redcap files (#67)
kcranston Jul 5, 2024
24cf834
create mohschemav3 class
yavyx Jul 24, 2024
259c0e1
validate v3 donors
yavyx Jul 24, 2024
2a6404d
validate primary diagnoses
yavyx Jul 24, 2024
1397b69
validate specimens
yavyx Jul 24, 2024
5413077
validation logic fix
yavyx Jul 25, 2024
a3690a8
validate treatments
yavyx Jul 25, 2024
3688cd9
treatment validation fix
yavyx Jul 25, 2024
5391ff9
validate systemic & radiation therapies
yavyx Jul 25, 2024
742856c
validate followups
yavyx Jul 25, 2024
5a0eda6
replace match with if, less indentation
yavyx Jul 26, 2024
c3f6574
validate exposures & comorbidities
yavyx Jul 29, 2024
697fa3c
rename v3 to lowercase
yavyx Jul 30, 2024
a78085a
sample registration validation (required)
yavyx Jul 30, 2024
5681aff
validate surgeries
yavyx Jul 31, 2024
7aaccfd
added missing argument
yavyx Jul 31, 2024
e7b4dfd
fix nested schemas
yavyx Jul 31, 2024
56d900b
update references to v3
yavyx Jul 31, 2024
f3e7c26
fix test
yavyx Jul 31, 2024
5883699
fix test, using v2 by default
yavyx Jul 31, 2024
4c16e19
update default templates
yavyx Jul 31, 2024
f113fdb
update test data files
mshadbolt Aug 2, 2024
4631e59
manual mapping function changes to csv templates
mshadbolt Aug 2, 2024
a3343b1
update test yamls
mshadbolt Aug 2, 2024
694c784
update tests
mshadbolt Aug 2, 2024
1a24ce9
add biomarker validation, edit resolution to work
mshadbolt Aug 2, 2024
74c3f64
update templates
mshadbolt Aug 6, 2024
ad06530
fix typo
mshadbolt Aug 6, 2024
a643f76
fix missing comma
mshadbolt Aug 6, 2024
38f3ec0
remove extra staging validation
mshadbolt Aug 7, 2024
bce78b8
rename v2 template
yavyx Aug 7, 2024
ec5b414
added multisheet line for testing
yavyx Aug 7, 2024
22c32fb
stage group validation fix
yavyx Aug 8, 2024
8118a43
validate systemic therapy dates
yavyx Aug 9, 2024
4e1c46e
test systemic therapy date validation
yavyx Aug 9, 2024
66b5ddc
add sample redcap files (#67)
kcranston Jul 5, 2024
59c0cfe
Merge branch 'develop' into yavyx/moh-v3
yavyx Aug 9, 2024
b849e87
Merge pull request #68 from CanDIG/yavyx/moh-v3
yavyx Aug 9, 2024
e57ca01
Fix up some of the validations (#69)
mshadbolt Aug 10, 2024
c532299
update to version 3 (#70)
mshadbolt Aug 15, 2024
94434d2
Add RedCap export splitting script (#78)
mshadbolt Sep 5, 2024
1214313
Update validation and test data for model 3.1 (#80)
mshadbolt Sep 18, 2024
32fe6a8
update schema urls
daisieh Sep 19, 2024
aa8793d
more schema changes in v3 templates
yavyx Sep 21, 2024
8b02362
Merge pull request #82 from CanDIG/daisieh/update-schema-url
daisieh Sep 21, 2024
89cfb28
DIG-1772 & DIG-1782: Handle -99 and 'Not available' as missing (#83)
mshadbolt Sep 24, 2024
52487a6
Merge branch 'stable' into develop
mshadbolt Sep 30, 2024
ced45ae
fix templates rm dists (#85)
mshadbolt Oct 2, 2024
3ae6c6f
Allows to be run as a stand alone. (#87)
DavidBrownlee Oct 25, 2024
077b617
Trying to fix .github/workflow Compare to moh_v3_template.csv test. …
DavidBrownlee Oct 25, 2024
4da6290
github workflow corrected.
DavidBrownlee Oct 25, 2024
b57146f
DIG-1819: Warn instead of error when reference date missing & add val…
mshadbolt Nov 8, 2024
cda272d
move missing cases to validation file
yavyx Nov 14, 2024
bbd1433
remove redundant map file save
yavyx Nov 14, 2024
bee7427
update readme
yavyx Nov 14, 2024
c404630
simplify required fields validation
yavyx Nov 15, 2024
4ecf928
update test
yavyx Nov 15, 2024
7d9796b
restore missing warning
yavyx Nov 18, 2024
e295780
Merge pull request #89 from CanDIG/yavyx/move-missing-cases
yavyx Nov 18, 2024
d1e9ddd
update pkg-info
yavyx Nov 19, 2024
f9c585d
update readme
yavyx Nov 19, 2024
975c974
Merge branch 'david/githubWorkflowTestFix' into develop
DavidBrownlee Nov 19, 2024
96297d1
Update version in pyproject.toml
mshadbolt Nov 19, 2024
79447d8
remove dist files
yavyx Nov 22, 2024
8397452
Merge pull request #90 from CanDIG/yavyx/update-docs
yavyx Nov 22, 2024
10 changes: 5 additions & 5 deletions .github/workflows/test.yml
@@ -19,18 +19,18 @@ jobs:
- name: Install dependencies
run: |
pip install -r requirements.txt
python -m pip install -e .
- name: Test with pytest
run: |
pytest
- name: Compare moh_template.csv
- name: Compare to moh_v3_template.csv
shell: bash {0}
run: |
python generate_schema.py
diff template.csv moh_template.csv > curr_diff.txt
# Script based largely on update_moh_template.sh
python src/clinical_etl/generate_schema.py --out moh_template
diff moh_template.csv moh_v3_template.csv > curr_diff.txt
bytes=$(head -5 curr_diff.txt | wc -c)
dd if=curr_diff.txt bs="$bytes" skip=1 conv=notrunc of=new_diff.txt
diff new_diff.txt test_data/moh_diffs.txt
diff new_diff.txt tests/moh_diffs.txt
if [[ $? == 1 ]]; then echo MoH template checking needs to be updated! See https://github.com/CanDIG/clinical_ETL_code#mapping-template for information.
exit 1
fi
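
In plain terms, this job regenerates the mapping template from the current schema, diffs it against the committed `moh_v3_template.csv`, strips the first five lines of that diff, and requires the remainder to match the accepted differences recorded in `tests/moh_diffs.txt`. A commented sketch of the same logic (the commands are as in the workflow; the reason for the five-line skip — a diff header that always differs — is an assumption):

```sh
python src/clinical_etl/generate_schema.py --out moh_template  # regenerate moh_template.csv
diff moh_template.csv moh_v3_template.csv > curr_diff.txt      # compare to the committed template
bytes=$(head -5 curr_diff.txt | wc -c)                         # byte length of the first 5 lines
dd if=curr_diff.txt bs="$bytes" skip=1 conv=notrunc of=new_diff.txt  # drop exactly those bytes
diff new_diff.txt tests/moh_diffs.txt                          # remainder must match the known diffs
```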
19 changes: 2 additions & 17 deletions README.md
@@ -45,12 +45,6 @@ Install the repo's requirements in your virtual environment
pip install -r requirements.txt
```

>[!NOTE]
> If Python can't find the `clinical_etl` module when running `CSVConvert`, install the depencency manually:
> ```
> pip install -e clinical_ETL_code/
> ```

Before running the script, you will need to have your input files, this will be clinical data in a tabular format (`xlsx`/`csv`) that can be read into program and a cohort directory containing the files that define the schema and mapping configurations.

### Input file/s format
@@ -65,7 +59,7 @@ If you are working with exports from RedCap, the sample files in the [`sample_in

### Setting up a cohort directory

For each dataset (cohort) that you want to convert, create a directory outside of this repository. For CanDIG devs, this will be in the private `data` repository. This cohort directory should contain the same files as shown in the [`sample_inputs/generic_example`](sample_inputs/generic_example) directory, which are:
For each dataset (cohort) that you want to convert, create a directory outside of this repository. For CanDIG devs, this will be in the private `clinical_ETL_data` repository. This cohort directory should contain the same files as shown in the [`sample_inputs/generic_example`](sample_inputs/generic_example) directory, which are:

* a [`manifest.yml`](#Manifest-file) file with configuration settings for the mapping and schema validation
* a [mapping template](#Mapping-template) csv that lists custom mappings for each field (based on `moh_template.csv`)
@@ -96,7 +90,7 @@ You'll need to create a mapping template that defines the mapping between the fi

Each line in the mapping template is composed of comma separated values with two components. The first value is an `element` or field from the target schema and the second value contains a suggested `mapping method` or function to map a field from an input sheet to a valid value for the identified `element`. Each `element`, shows the full object linking path to each field required by the model. These values should not be edited.
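
For example, two lines from [`moh_v3_template.csv`](moh_v3_template.csv) (the same file edited later in this diff) pair an `element` path with its `mapping method`:

```csv
DONOR.INDEX.followups.INDEX.submitter_follow_up_id, {single_val(FOLLOWUPS_SHEET.submitter_follow_up_id)}
DONOR.INDEX.followups.INDEX.date_of_followup, {date_interval(FOLLOWUPS_SHEET.date_of_followup)}
```

Here `FOLLOWUPS_SHEET` is a generic sheet name to be replaced with the name of your own input sheet, as described below.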

If you are generating a mapping for the current CanDIG MoH model, you can use the pre-generated [`moh_template.csv`](moh_template.csv) file. This file is modified from the auto-generated template to update a few fields that require specific handling.
If you are generating a mapping for the current CanDIG MoH model, you can use the pre-generated [`moh_v3_template.csv`](moh_v3_template.csv) file. This file is modified from the auto-generated template to update a few fields that require specific handling.

You will need to edit the `mapping method` values in each line in the following ways:
1. Replace the generic sheet names (e.g. `DONOR_SHEET`, `SAMPLE_REGISTRATIONS_SHEET`) with the sheet/csv names you are using as your input to `CSVConvert.py`
@@ -158,12 +152,6 @@ The main output `<INPUT_DIR>_map.json` and optional output`<INPUT_DIR>_indexed.j

Validation will automatically be run after the conversion is complete. Any validation errors or warnings will be reported both on the command line and as part of the `<INPUT_DIR>_map.json` file.

>[!NOTE]
> If Python can't find the `clinical_etl` module when running `CSVConvert`, install the depencency manually:
> ```
> pip install -e clinical_ETL_code/
> ```

#### Format of the output files

`<INPUT_DIR>_map.json` is the main output and contains the results of the mapping, conversion and validation as well as summary statistics.
@@ -187,9 +175,6 @@ A summarised example of the output is below:
"schemas_used": [
"donors"
],
"cases_missing_data": [
"DONOR_5"
],
"schemas_not_used": [
"exposures",
"biomarkers"
Binary file removed dist/clinical_ETL-2.2.1-py3-none-any.whl
Binary file removed dist/clinical_ETL-3.0.0-py3-none-any.whl
Binary file removed dist/clinical_ETL-3.1.0-py3-none-any.whl
Binary file removed dist/clinical_etl-2.2.1.tar.gz
Binary file removed dist/clinical_etl-3.0.0.tar.gz
Binary file removed dist/clinical_etl-3.1.0.tar.gz
2 changes: 1 addition & 1 deletion moh_v3_template.csv
@@ -158,7 +158,7 @@ DONOR.INDEX.biomarkers.INDEX.her2_ish_status, {single_val(BIOMARKERS_SHEET.her2_
DONOR.INDEX.biomarkers.INDEX.hpv_ihc_status, {single_val(BIOMARKERS_SHEET.hpv_ihc_status)}
DONOR.INDEX.biomarkers.INDEX.hpv_pcr_status, {single_val(BIOMARKERS_SHEET.hpv_pcr_status)}
DONOR.INDEX.biomarkers.INDEX.hpv_strain, {pipe_delim(BIOMARKERS_SHEET.hpv_strain)}
DONOR.INDEX.followups.INDEX, {indexed_on(FOLLOWUPS_SHEET.submitter_donor_id)}
DONOR.INDEX.followups.INDEX, {moh_indexed_on_donor_if_others_absent(FOLLOWUPS_SHEET.submitter_donor_id)}
DONOR.INDEX.followups.INDEX.submitter_follow_up_id, {single_val(FOLLOWUPS_SHEET.submitter_follow_up_id)}
DONOR.INDEX.followups.INDEX.date_of_followup, {date_interval(FOLLOWUPS_SHEET.date_of_followup)}
DONOR.INDEX.followups.INDEX.disease_status_at_followup, {single_val(FOLLOWUPS_SHEET.disease_status_at_followup)}
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -3,7 +3,7 @@ requires = ["setuptools >= 61.0"]
build-backend = "setuptools.build_meta"

[project]
version = "3.1.0"
version = "3.1.1"
name = "clinical_ETL"
dependencies = [
"pandas>=2.1.0",
@@ -25,4 +25,4 @@ readme = "README.md"
CSVConvert = "clinical_etl.CSVConvert:main"

[project.urls]
Repository = "https://github.com/CanDIG/clinical_ETL_code"
Repository = "https://github.com/CanDIG/clinical_ETL_code"
20 changes: 3 additions & 17 deletions src/clinical_ETL.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: clinical_ETL
Version: 3.1.0
Version: 3.1.1
Summary: ETL module for transforming clinical CSV data into properly-formatted packets for ingest into Katsu
Project-URL: Repository, https://github.com/CanDIG/clinical_ETL_code
Requires-Python: >=3.10
@@ -64,12 +64,6 @@ Install the repo's requirements in your virtual environment
pip install -r requirements.txt
```

>[!NOTE]
> If Python can't find the `clinical_etl` module when running `CSVConvert`, install the depencency manually:
> ```
> pip install -e clinical_ETL_code/
> ```

Before running the script, you will need to have your input files, this will be clinical data in a tabular format (`xlsx`/`csv`) that can be read into program and a cohort directory containing the files that define the schema and mapping configurations.

### Input file/s format
@@ -84,7 +78,7 @@ If you are working with exports from RedCap, the sample files in the [`sample_in

### Setting up a cohort directory

For each dataset (cohort) that you want to convert, create a directory outside of this repository. For CanDIG devs, this will be in the private `data` repository. This cohort directory should contain the same files as shown in the [`sample_inputs/generic_example`](sample_inputs/generic_example) directory, which are:
For each dataset (cohort) that you want to convert, create a directory outside of this repository. For CanDIG devs, this will be in the private `clinical_ETL_data` repository. This cohort directory should contain the same files as shown in the [`sample_inputs/generic_example`](sample_inputs/generic_example) directory, which are:

* a [`manifest.yml`](#Manifest-file) file with configuration settings for the mapping and schema validation
* a [mapping template](#Mapping-template) csv that lists custom mappings for each field (based on `moh_template.csv`)
@@ -177,12 +171,6 @@ The main output `<INPUT_DIR>_map.json` and optional output`<INPUT_DIR>_indexed.j

Validation will automatically be run after the conversion is complete. Any validation errors or warnings will be reported both on the command line and as part of the `<INPUT_DIR>_map.json` file.

>[!NOTE]
> If Python can't find the `clinical_etl` module when running `CSVConvert`, install the depencency manually:
> ```
> pip install -e clinical_ETL_code/
> ```

#### Format of the output files

`<INPUT_DIR>_map.json` is the main output and contains the results of the mapping, conversion and validation as well as summary statistics.
@@ -206,9 +194,6 @@ A summarised example of the output is below:
"schemas_used": [
"donors"
],
"cases_missing_data": [
"DONOR_5"
],
"schemas_not_used": [
"exposures",
"biomarkers"
@@ -220,6 +205,7 @@ A summarised example of the output is below:
}
}
```
`<INPUT_DIR>_validation_results.json` contains all validation warnings and errors.
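
Judging from the structure assembled in `CSVConvert.py` (see its diff below), the file is a JSON object with three keys; a minimal sketch with illustrative (empty) contents:

```json
{
    "validation_errors": [],
    "validation_warnings": [],
    "cases_missing_data": []
}
```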

The mapping and transformation result is found in the `"donors"` key.

37 changes: 17 additions & 20 deletions src/clinical_etl/CSVConvert.py
@@ -12,11 +12,7 @@
import yaml
import argparse
from tqdm import tqdm
from clinical_etl import mappings
# Include clinical_etl parent directory in the module search path.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(parent_dir)
import mappings


def verbose_print(message):
@@ -277,7 +273,7 @@ def eval_mapping(node_name, rownum):
"""
verbose_print(f" Evaluating {mappings.IDENTIFIER}: {node_name}")
if "mappings" not in mappings.MODULES:
mappings.MODULES["mappings"] = importlib.import_module("clinical_etl.mappings")
mappings.MODULES["mappings"] = importlib.import_module("mappings")
modulename = "mappings"

method, parameters = parse_mapping_function(node_name)
@@ -596,7 +592,7 @@ def load_manifest(manifest_file):

# programatically load schema class based on manifest value:
# schema class definition will be in a file named schema_class.lower()
schema_mod = importlib.import_module(f"clinical_etl.{schema_class.lower()}")
schema_mod = importlib.import_module(f"{schema_class.lower()}")
schema = getattr(schema_mod, schema_class)(manifest["schema"])
if schema.json_schema is None:
sys.exit(f"Could not read an openapi schema at {manifest['schema']};\n"
@@ -633,7 +629,7 @@ def load_manifest(manifest_file):
f"{manifest_dir} and has the correct name.\n---")
sys.exit(e)
# mappings is a standard module: add it
mappings.MODULES["mappings"] = importlib.import_module("clinical_etl.mappings")
mappings.MODULES["mappings"] = importlib.import_module("mappings")
return result


@@ -743,36 +739,37 @@ def csv_convert(input_path, manifest_file, minify=False, index_output=False, ver
json.dump(mappings.INDEXED_DATA, f, indent=4)

result_key = list(schema.validation_schema.keys()).pop(0)

result = {
"openapi_url": schema.openapi_url,
"schema_class": type(schema).__name__,
result_key: packets
}
if schema.katsu_sha is not None:
result["katsu_sha"] = schema.katsu_sha
print(f"{Bcolors.OKGREEN}Saving packets to file.{Bcolors.ENDC}")
with open(f"{mappings.OUTPUT_FILE}_map.json", 'w') as f: # write to json file for ingestion
if minify:
json.dump(result, f)
else:
json.dump(result, f, indent=4)

# add validation data:
print(f"\n{Bcolors.OKGREEN}Starting validation...{Bcolors.ENDC}")
schema.validate_ingest_map(result)
validation_results = {"validation_errors": schema.validation_errors,
"validation_warnings": schema.validation_warnings}
"validation_warnings": schema.validation_warnings,
"cases_missing_data": schema.statistics["cases_missing_data"]}
result["statistics"] = schema.statistics
with open(f"{mappings.OUTPUT_FILE}_map.json", 'w') as f: # write to json file for ingestion
result["statistics"].pop("cases_missing_data") # remove donor IDs from _map.json file

# write ingestion and validation json files
print(f"{Bcolors.OKGREEN}Saving packets to file.{Bcolors.ENDC}")
with open(f"{mappings.OUTPUT_FILE}_map.json", 'w') as f:
if minify:
json.dump(result, f)
else:
json.dump(result, f, indent=4)
errors_present = False
with open(f"{input_path}_validation_results.json", 'w') as f:
json.dump(validation_results, f, indent=4)
print(f"Warnings written to {input_path}_validation_results.json.")
if len(validation_results["validation_errors"]) == 0 and len(validation_results["validation_warnings"]) == 0:
print(f"{Bcolors.OKGREEN}Validation passed!{Bcolors.ENDC}")
else:
with open(f"{input_path}_validation_results.json", 'w') as f:
json.dump(validation_results, f, indent=4)
print(f"Warnings written to {input_path}_validation_results.json.")
if len(validation_results["validation_warnings"]) > 0:
if len(validation_results["validation_warnings"]) > 20:
print(f"\n{Bcolors.WARNING}WARNING: There are {len(validation_results['validation_warnings'])} validation "
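
For reference, a sketch of invoking the converter from Python, using only the parameters visible in the `csv_convert` signature above; the paths are placeholders, and the same entry point is exposed as the `CSVConvert` console script in `pyproject.toml`:

```python
# Sketch only: the input directory and manifest paths are hypothetical.
from clinical_etl.CSVConvert import csv_convert

csv_convert(
    input_path="my_cohort_data",            # directory of input csv/xlsx sheets
    manifest_file="my_cohort/manifest.yml", # cohort manifest with schema + mapping config
    minify=False,                           # pretty-print <INPUT_DIR>_map.json
    index_output=False,                     # skip the optional <INPUT_DIR>_indexed.json
)
# Writes <INPUT_DIR>_map.json and <INPUT_DIR>_validation_results.json as described above.
```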
3 changes: 3 additions & 0 deletions src/clinical_etl/__init__.py
@@ -0,0 +1,3 @@
# Allows relative imports from current directory to work.
import os, sys
sys.path.append(os.path.dirname(os.path.realpath(__file__)))
6 changes: 6 additions & 0 deletions src/clinical_etl/generate_mapping_docs.py
@@ -1,3 +1,9 @@
# Updates the ../../mapping_functions.md
# Prior to running, set the PYTHONPATH for use by the subprocess with:
# export PYTHONPATH="$PWD"
# Then run:
# python generate_mapping_docs.py

import subprocess


4 changes: 2 additions & 2 deletions src/clinical_etl/generate_schema.py
@@ -21,8 +21,8 @@ def parse_args():
default="https://raw.githubusercontent.com/CanDIG/katsu/develop/chord_metadata_service/mohpackets/docs/schemas/schema.json")
parser.add_argument('--schema', type=str, help="Name of schema class", default="MoHSchemaV3")
parser.add_argument('--out', type=str,
help="name of output file; csv extension will be added. Default is template",
default="template")
help="name of output file; csv extension will be added. Default is moh_template",
default="moh_template")
args = parser.parse_args()
return args

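
Usage matches the CI workflow above; with the new default the invocation is simply:

```sh
# Regenerate the template from the default katsu schema URL:
python src/clinical_etl/generate_schema.py --schema MoHSchemaV3 --out moh_template
# -> writes moh_template.csv
```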
2 changes: 1 addition & 1 deletion src/clinical_etl/genomicschema.py
@@ -1,6 +1,6 @@
import json
import dateparser
from clinical_etl.schema import BaseSchema, ValidationError
from schema import BaseSchema, ValidationError


"""
30 changes: 20 additions & 10 deletions src/clinical_etl/mappings.py
@@ -4,6 +4,7 @@
import datetime
import math
from dateutil import relativedelta
import copy

VERBOSE = False
MODULES = {}
@@ -70,21 +71,27 @@ def earliest_date(data_values):
"""
fields = list(data_values.keys())
date_resolution = list(data_values[fields[0]].values())[0]
dates = list(data_values[fields[1]].values())[0]
dates = copy.deepcopy(list(data_values[fields[1]].values())[0])
earliest = DEFAULT_DATE_PARSER.get_date_data(str(datetime.date.today()))
# Ensure dates is a list, not a string, to allow non-indexed, single value entries.
if type(dates) is not list:
dates_list = [dates]
else:
dates_list = dates
for date in dates_list:
d = DEFAULT_DATE_PARSER.get_date_data(date)
if d['date_obj'] < earliest['date_obj']:
earliest = d
return {
"offset": earliest['date_obj'].strftime("%Y-%m-%d"),
"period": date_resolution
}
# If there's a None value, ignore it
if None in dates_list:
dates_list = [x for x in dates_list if x is not None]
if len(dates_list) > 0:
for date in dates_list:
d = DEFAULT_DATE_PARSER.get_date_data(date)
if d['date_obj'] < earliest['date_obj']:
earliest = d
return {
"offset": earliest['date_obj'].strftime("%Y-%m-%d"),
"period": date_resolution
}
else:
return None


@@ -100,7 +107,9 @@ def date_interval(data_values):
try:
reference = INDEXED_DATA["data"]["CALCULATED"][IDENTIFIER]["REFERENCE_DATE"][0]
except KeyError:
raise MappingError("No reference date found to calculate date_interval: is there a reference_date specified in the manifest?", field_level=1)
_warn(message="No reference date found to calculate date_interval: check the reference_date is specified in the manifest or if it is missing for this donor",
input_values=data_values)
return None
DEFAULT_DATE_PARSER = dateparser.DateDataParser(
settings={"PREFER_DAY_OF_MONTH": "first", "DATE_ORDER": DATE_FORMAT}
)
@@ -578,3 +587,4 @@ def _parse_date(date_string):
except Exception as e:
raise MappingError(f"error in date({date_string}): {type(e)} {e}", field_level=2)
return date_string
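
The `earliest_date` change can be sketched as follows; the input shape (`{field: {sheet: value}}`) and the sheet names are inferred from the function body, so treat them as assumptions. In the same spirit, `date_interval` now warns and returns `None` when no reference date is found, instead of raising a `MappingError`.

```python
# Sketch of the new None handling in earliest_date (names are hypothetical):
from clinical_etl import mappings

data_values = {
    "date_resolution": {"DONOR_SHEET": "month"},
    "dates": {"FOLLOWUPS_SHEET": ["2023-06-10", None, "2022-11-03"]},
}
# The None entry is filtered out before parsing, so this should return
# {"offset": "2022-11-03", "period": "month"}; a list containing only
# None values now yields None instead of raising an exception.
print(mappings.earliest_date(data_values))
```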

2 changes: 1 addition & 1 deletion src/clinical_etl/mohschemav2.py
@@ -1,6 +1,6 @@
import json
import dateparser
from clinical_etl.schema import BaseSchema, ValidationError
from schema import BaseSchema, ValidationError


"""