Module path changes to allow running standalone (not as an included m… #50

Merged
merged 4 commits into from
Feb 23, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -225,7 +225,7 @@ You can validate the generated json mapping file against the MoH data model. The

```
$ python src/clinical_etl/validate_coverage.py -h
validate_coverage.py [-h] [--input map.json] [--manifest MAPPING]
usage: validate_coverage.py [-h] --json JSON [--verbose]

options:
-h, --help show this help message and exit
16 changes: 12 additions & 4 deletions mapping_functions.md
@@ -66,15 +66,23 @@ A detailed index of all standard functions can be viewed below in the [Standard

## Writing your own custom functions

If the data cannot be transformed with one of the standard functions, you can define your own. In your data directory (the one that contains `manifest.yml`) create a python file (let's assume you called it `new_cohort.py`) and add the name of that file as the `mapping` entry in the manifest.
If the data cannot be transformed with one of the standard functions, you can define your own.

Following the format in the generic `mappings.py`, write your own functions in your python file for how to translate the data. To specify a custom mapping function in the template:
In your data directory (the one that contains `manifest.yml`), create a Python file (say, `new_cohort.py`) and add its name, without the `.py` extension, to the YAML list under `functions` in the manifest. For example:
```
functions:
- new_cohort
```
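
As an illustration of what a `functions` entry implies, here is a minimal sketch that resolves each listed name to a `.py` file next to the manifest and loads it with `importlib.util`. The helper name `load_function_modules` and the dictionary standing in for the parsed manifest are hypothetical; this is not necessarily how `CSVConvert.py` loads them.

```python
import importlib.util
import os
import tempfile


def load_function_modules(manifest, manifest_dir):
    """Load each module named under the manifest's `functions` key.

    Hypothetical helper: assumes each entry is a file name without `.py`
    located in the same directory as the manifest.
    """
    modules = {}
    for name in manifest.get("functions", []):
        path = os.path.join(manifest_dir, name + ".py")
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        modules[name] = module
    return modules


# Demonstration with a throwaway data directory.
with tempfile.TemporaryDirectory() as data_dir:
    with open(os.path.join(data_dir, "new_cohort.py"), "w") as handle:
        handle.write("def custom_function(value):\n    return value.upper()\n")
    manifest = {"functions": ["new_cohort"]}  # stands in for the parsed manifest.yml
    loaded = load_function_modules(manifest, data_dir)
    result = loaded["new_cohort"].custom_function("biopsy")
```

Loading by file path keeps cohort-specific code in the data directory, so it does not need to be installed as a package.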

Following the format in the generic `mappings.py`, write your own functions in your python file to translate the data.

To use a custom mapping function in the template, you must specify the file and function using dot-separated notation:

`DONOR.INDEX.primary_diagnoses.INDEX.basis_of_diagnosis,{new_cohort.custom_function(DATA_SHEET.field_name)}`
DONOR.INDEX.primary_diagnoses.INDEX.basis_of_diagnosis,{**new_cohort.custom_function**(DATA_SHEET.field_name)}
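
To make the dot-separated notation concrete, the sketch below splits a template line into the target field, the module name, the function name, and the arguments. The regular expression and the function `parse_mapping_call` are assumptions written for this example, not the parser used by the project.

```python
import re

# Hypothetical pattern for entries of the form {module.function(arguments)}.
CALL_PATTERN = re.compile(
    r"^\{(?P<module>\w+)\.(?P<function>\w+)\((?P<args>.*)\)\}$"
)


def parse_mapping_call(template_line):
    """Split 'field,{module.function(args)}' into its parts."""
    field, _, call = template_line.partition(",")
    match = CALL_PATTERN.match(call.strip())
    if match is None:
        raise ValueError(f"not a module.function(...) mapping: {call!r}")
    args = [arg.strip() for arg in match.group("args").split(",") if arg.strip()]
    return {
        "field": field.strip(),
        "module": match.group("module"),
        "function": match.group("function"),
        "args": args,
    }


parsed = parse_mapping_call(
    "DONOR.INDEX.primary_diagnoses.INDEX.basis_of_diagnosis,"
    "{new_cohort.custom_function(DATA_SHEET.field_name)}"
)
```

Here `parsed["module"]` is the file listed under `functions` in the manifest, and `parsed["function"]` is the function defined in that file.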

Examples:

To map input values to output values (in case your data capture used different values than the model):
Map input values to output values (in case your data capture used different values than the model):

```
def sex(data_value):
9 changes: 8 additions & 1 deletion sample_inputs/new_cohort.py
@@ -1,5 +1,12 @@
## Additional mappings customised to my special cohort
import os
import sys
# Include src/ directory in the module search path.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(os.sep.join([parent_dir, "src"]))
from clinical_etl import mappings

## Additional mappings customised to my special cohort
def sex(data_value):
# make sure we only have one value
mapping_val = mappings.single_val(data_value)
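
The `sex` function is truncated in this diff view. As a generic illustration of the value-mapping pattern the documentation describes, the sketch below translates captured values to model values through a lookup table. The helper `single_value`, the lookup table, and the list-shaped input are all assumptions made for this example; they are not the signature of `mappings.single_val` or the body of the real `sex` function.

```python
def single_value(values):
    """Return the only distinct non-empty value in `values`, or None if there is none.

    Hypothetical stand-in written for this example.
    """
    distinct = {value for value in values if value not in (None, "")}
    if len(distinct) > 1:
        raise ValueError(f"expected one value, got {sorted(distinct)}")
    return distinct.pop() if distinct else None


# Illustrative lookup table; the real captured and model values may differ.
SEX_LOOKUP = {"M": "Male", "F": "Female"}


def translate_sex(values):
    """Map a captured value to the model's value, or None if unrecognised."""
    raw = single_value(values)
    if raw is None:
        return None
    return SEX_LOOKUP.get(raw.strip().upper())


translated = translate_sex(["m", "m"])
```

Returning `None` for unrecognised input, rather than passing the raw value through, keeps values outside the model's vocabulary out of the output.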
10 changes: 7 additions & 3 deletions src/clinical_etl/CSVConvert.py
@@ -1,17 +1,21 @@
#!/usr/bin/env python
# coding: utf-8

import sys
import os
from copy import deepcopy
import importlib.util
import json
from clinical_etl import mappings
import os
import pandas
import csv
import re
import sys
import yaml
import argparse
# Include clinical_etl parent directory in the module search path.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(parent_dir)
from clinical_etl import mappings


def verbose_print(message):
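
The path bootstrapping added to these files follows one pattern: compute a directory relative to the current file, append it to `sys.path`, then import the package. A minimal sketch of that pattern follows, using a temporary directory and a made-up package name `demo_pkg` in place of `clinical_etl`; the helper `add_to_search_path` is hypothetical.

```python
import importlib
import os
import sys
import tempfile


def add_to_search_path(directory):
    """Append `directory` to sys.path once, so repeated calls do not duplicate it."""
    if directory not in sys.path:
        sys.path.append(directory)


# Build a throwaway layout: <root>/src/demo_pkg/mappings.py
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "src", "demo_pkg")
os.makedirs(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("")
with open(os.path.join(package_dir, "mappings.py"), "w") as handle:
    handle.write("def single_val(values):\n    return values[0]\n")

# Equivalent of: sys.path.append(os.sep.join([parent_dir, "src"]))
add_to_search_path(os.path.join(root, "src"))
importlib.invalidate_caches()

from demo_pkg import mappings

value = mappings.single_val(["a"])
```

Because the directory is appended rather than inserted at the front, a copy of the package found earlier on `sys.path` (for example, an installed one) takes precedence over the local checkout.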
5 changes: 5 additions & 0 deletions src/clinical_etl/validate_coverage.py
@@ -3,6 +3,11 @@
import sys
import mappings
import importlib.util
import os
# Include clinical_etl parent directory in the module search path for a later import.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(parent_dir)
# from jsoncomparison import Compare
# from copy import deepcopy
# import yaml
9 changes: 7 additions & 2 deletions tests/test_data_ingest.py
@@ -1,9 +1,14 @@
import pytest
import yaml
import os
import sys
import json
# Include src/ directory in the module search path.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(os.sep.join([parent_dir, "src"]))
from clinical_etl import CSVConvert
from clinical_etl import mappings
import json
import os
from clinical_etl.mohschema import MoHSchema

# read sheet from given data pathway
6 changes: 6 additions & 0 deletions tests/testmap.py
@@ -1,3 +1,9 @@
import os
import sys
# Include src/ directory in the module search path.
current_dir = os.path.dirname(os.path.abspath(__file__))
parent_dir = os.path.dirname(current_dir)
sys.path.append(os.sep.join([parent_dir, "src"]))
import clinical_etl.mappings

def indexed_on_if_absent(data_values):