Merge pull request #47 from CanDIG/daisieh/updates
daisieh authored Dec 19, 2023
2 parents 1369dd8 + 01c9541 commit f23bd22
Showing 7 changed files with 12 additions and 15 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -8,7 +8,7 @@ jobs:
 runs-on: ubuntu-latest
 strategy:
 matrix:
-python-version: ["3.10"]
+python-version: ["3.12"]

 steps:
 - uses: actions/checkout@v2
Binary file modified dist/clinical_ETL-2.0.0-py3-none-any.whl
Binary file not shown.
3 changes: 1 addition & 2 deletions pyproject.toml
@@ -6,12 +6,11 @@ build-backend = "setuptools.build_meta"
 version = "2.0.0"
 name = "clinical_ETL"
 dependencies = [
-"pandas~=1.3.4",
+"pandas>=2.1.0",
 "pytest>=7.2.0",
 "pyYAML>=5.4.1",
 "dateparser>=1.1.0",
 "openpyxl>=3.0.9",
-"jsoncomparison>=1.1.0",
 "requests>=2.29.0",
 "jsonschema~=4.19.2",
 "openapi-spec-validator>=0.7.1",
9 changes: 4 additions & 5 deletions requirements.txt
@@ -1,10 +1,9 @@
-pandas~=1.3.4
-pytest>=6.2.5
-pyYAML>=5.4.1
-dateparser~=1.1.0
+pandas~=2.1.4
+pytest>=7.2.0
+pyYAML>=6.0.1
+dateparser~=1.2.0
 openpyxl~=3.0.9
 requests~=2.29
 jsonschema~=4.19.1
-jsoncomparison~=1.1.0
 openapi-spec-validator~=0.7.1
 pdoc3>=0.10.0
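A note on the pin changes above: `~=` is the compatible-release operator, so `pandas~=1.3.4` accepts only 1.3.x releases at or above 1.3.4 and can never resolve to pandas 2.x. The sketch below illustrates that rule for plain numeric versions only. The helper names `parse` and `compatible_release` are hypothetical, and this is not how pip itself evaluates specifiers (pre-releases, epochs and similar cases are ignored).

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '2.1.4' into integers."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec` (numeric versions only)."""
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    cand = parse(candidate)

    # Pad with zeros so that '2.29' and '2.29.0' compare as equal.
    width = max(len(cand), len(base))
    cand = cand + (0,) * (width - len(cand))
    padded_base = base + (0,) * (width - len(base))

    # ~=X.Y.Z means: >= X.Y.Z and the same X.Y prefix.
    prefix = base[:-1]
    return cand >= padded_base and cand[: len(prefix)] == prefix


# The old pin cannot be satisfied by any pandas 2.x release.
print(compatible_release("1.3.5", "1.3.4"))  # True
print(compatible_release("2.1.4", "1.3.4"))  # False
# The new pin in requirements.txt stays within the 2.1 series.
print(compatible_release("2.1.4", "2.1.4"))  # True
print(compatible_release("2.2.0", "2.1.4"))  # False
```

By contrast, the `>=2.1.0` specifier used in `pyproject.toml` sets only a lower bound, so a later pandas 2.x release would also satisfy it.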
6 changes: 3 additions & 3 deletions src/clinical_ETL.egg-info/PKG-INFO
@@ -6,12 +6,11 @@ Project-URL: Repository, https://github.com/CanDIG/clinical_ETL_code
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: pandas~=1.3.4
+Requires-Dist: pandas>=2.1.0
 Requires-Dist: pytest>=7.2.0
 Requires-Dist: pyYAML>=5.4.1
 Requires-Dist: dateparser>=1.1.0
 Requires-Dist: openpyxl>=3.0.9
-Requires-Dist: jsoncomparison>=1.1.0
 Requires-Dist: requests>=2.29.0
 Requires-Dist: jsonschema~=4.19.2
 Requires-Dist: openapi-spec-validator>=0.7.1
@@ -26,8 +25,9 @@ Specifically, this code was designed to convert clinical data for the MOHCCN pro
 ## Using clinical_etl as a package
 You can import this module as a package by including the following in your `requirements.txt`:
 ```
-clinical_etl@git+https://github.com/CanDIG/clinical_ETL_code.git@main
+clinical_etl@git+https://github.com/CanDIG/clinical_ETL_code.git@stable
 ```
+If you need the latest version, you can replace `stable` with `develop`.

 ## CSVConvert
 Most of the heavy lifting is done in the [`CSVConvert.py`](CSVConvert.py) script. See sections below for setting up the inputs and running the script.
3 changes: 1 addition & 2 deletions src/clinical_ETL.egg-info/requires.txt
@@ -1,9 +1,8 @@
-pandas~=1.3.4
+pandas>=2.1.0
 pytest>=7.2.0
 pyYAML>=5.4.1
 dateparser>=1.1.0
 openpyxl>=3.0.9
-jsoncomparison>=1.1.0
 requests>=2.29.0
 jsonschema~=4.19.2
 openapi-spec-validator>=0.7.1
4 changes: 2 additions & 2 deletions src/clinical_etl/CSVConvert.py
@@ -316,8 +316,8 @@ def process_data(raw_csv_dfs):
 print(f"Processing sheet {page}...")
 df = raw_csv_dfs[page].dropna(axis='index', how='all') \
 .dropna(axis='columns', how='all') \
-.applymap(str) \
-.applymap(lambda x: x.strip()) \
+.map(str) \
+.map(lambda x: x.strip()) \
 .drop_duplicates() # drop absolutely identical lines

 # Sort by identifier and then tag any dups
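The rename in this hunk tracks the pandas upgrade: `DataFrame.map` was added in pandas 2.1.0 as the elementwise replacement for `DataFrame.applymap`, which is deprecated from that release on. The following is a minimal sketch of the same cleaning chain on made-up data. The helper `strip_all_cells` and the sample frame are hypothetical, and the fallback to `applymap` is an assumption for environments still on older pandas, not something this commit does.

```python
import pandas as pd


def strip_all_cells(df: pd.DataFrame) -> pd.DataFrame:
    """Convert every cell to a string and strip surrounding whitespace."""
    # DataFrame.map exists from pandas 2.1.0; fall back to applymap before that.
    elementwise = df.map if hasattr(df, "map") else df.applymap
    return elementwise(lambda x: str(x).strip())


# Hypothetical sheet: one padded duplicate row and one completely empty row.
raw = pd.DataFrame({"id": [" A1 ", "A1", None], "value": ["x", "x ", None]})

cleaned = (
    raw.dropna(axis="index", how="all")      # drop rows with no data at all
    .dropna(axis="columns", how="all")       # drop columns with no data at all
    .pipe(strip_all_cells)                   # stringify and strip each cell
    .drop_duplicates()                       # rows now identical collapse to one
)
print(cleaned)
```

Doing the string conversion and the strip in a single pass is a small simplification over the two chained calls in the diff; the result should be the same for this kind of input.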