v4.2.0: Performance fixes, site curator role #124

Merged on Oct 4, 2024 (92 commits)
c651b2c
update requirements for authx
daisieh May 15, 2024
6f29859
update store_aws_credential to match
daisieh May 15, 2024
7c8a972
add get/delete s3-credential endpoints
daisieh May 15, 2024
905b178
add list programs endpoint
daisieh May 16, 2024
2dcbb50
Update requirements.txt
daisieh May 22, 2024
879d12e
change aws method names to s3
daisieh May 22, 2024
8ff7cc9
Merge pull request #97 from CanDIG/daisieh/s3-token
daisieh May 22, 2024
e276bf8
remove OPA_SECRET
daisieh May 23, 2024
0869c84
bump authx to v2.4.2
daisieh May 27, 2024
a9e8aef
Merge branch 'develop' into daisieh/no-service-token
daisieh May 27, 2024
011f7a2
Merge pull request #100 from CanDIG/daisieh/no-service-token
daisieh May 28, 2024
04851fa
hotfix: authx methods are still named aws
daisieh May 28, 2024
bcb03b0
Replace 'single quote' in PR titles (#102)
mshadbolt Jun 17, 2024
ec37fcb
Bump requests from 2.31.0 to 2.32.2 (#99)
dependabot[bot] Jun 18, 2024
6f0e24b
implement batch ingest
SonQBChau Jun 18, 2024
9b75985
format clean up
SonQBChau Jun 18, 2024
5636ff2
add batch size to api
SonQBChau Jun 19, 2024
1c7b3db
fix default batch_size
SonQBChau Jun 19, 2024
354c0e7
Merge pull request #103 from CanDIG/sonchau/katsu_batch
SonQBChau Jun 26, 2024
4a44541
Update katsu_ingest.py
SonQBChau Jun 26, 2024
dd8f44f
Merge pull request #104 from CanDIG/sonchau/fix_katsu_error
SonQBChau Jun 27, 2024
13562c8
DIG-1658: Update commandline ingest instructions (#105)
mshadbolt Jul 3, 2024
052bb5f
log-master instead of master
daisieh Jul 27, 2024
bd6b7fb
switch to model 3 ETL branch
SonQBChau Aug 7, 2024
93b8283
use V3 schema
SonQBChau Aug 7, 2024
18b28bc
Update clinical_ingest.json
SonQBChau Aug 7, 2024
841eddc
update url
SonQBChau Aug 7, 2024
755adfa
Update config.py
SonQBChau Aug 7, 2024
acbd709
add testing branch checkout
mshadbolt Aug 9, 2024
277eed5
use candidate branch for synth data
mshadbolt Aug 9, 2024
ddabd7f
update jsons, tests
mshadbolt Aug 9, 2024
7057060
clinical_etl develop
mshadbolt Aug 10, 2024
6941c2b
change default tmp
mshadbolt Aug 13, 2024
40a641b
switch to new logging
daisieh Aug 15, 2024
f3f1591
update clinical_etl
mshadbolt Aug 16, 2024
57cd9da
Merge pull request #106 from CanDIG/daisieh/logging
daisieh Aug 16, 2024
c653dca
Merge branch 'develop' into model_3
SonQBChau Aug 19, 2024
d22b727
split by program
mshadbolt Aug 20, 2024
bad8704
Merge branch 'model_3' of github.com:CanDIG/candigv2-ingest into model_3
mshadbolt Aug 20, 2024
f867083
fix file path
mshadbolt Aug 20, 2024
c0dfd5e
fix other path
mshadbolt Aug 20, 2024
aff720d
Update genomic_ingest.json
SonQBChau Aug 20, 2024
97b0866
Merge branch 'model_3' of https://github.com/CanDIG/candigv2-ingest i…
SonQBChau Aug 20, 2024
07f7b70
Merge pull request #107 from CanDIG/model_3
SonQBChau Aug 21, 2024
2a229cb
remove branch checkout
mshadbolt Aug 21, 2024
a597c78
add print for splitting
mshadbolt Aug 21, 2024
035f8fc
Merge pull request #108 from CanDIG/mshadbolt/update-mohccn-v
SonQBChau Aug 21, 2024
56f25a9
logging v1.0.0
daisieh Aug 22, 2024
9537313
Update requirements.txt
daisieh Aug 23, 2024
25f00b9
Merge pull request #109 from CanDIG/daisieh/refresh-token
daisieh Aug 26, 2024
b1e62b5
remove "new auth model" stuff
daisieh Aug 30, 2024
c4bb085
add updated operations
daisieh Aug 30, 2024
b594b13
add daemon
daisieh Aug 30, 2024
f4dabff
renamed ingest_clinical_data
daisieh Aug 30, 2024
12a1d7d
Update README.md
daisieh Sep 4, 2024
0116e80
Merge pull request #110 from CanDIG/daisieh/ingest-daemon
daisieh Sep 5, 2024
de3e555
switch to gunicorn
daisieh Sep 7, 2024
7a5e9eb
Merge pull request #112 from CanDIG/daisieh/gunicorn
daisieh Sep 9, 2024
5608ee8
remove commandline instructions (#111)
mshadbolt Sep 10, 2024
35d2cc6
switch to gunicorn
daisieh Sep 7, 2024
aa08f38
remove commandline instructions (#111)
mshadbolt Sep 10, 2024
ac3920e
validating htsget
daisieh Sep 12, 2024
1ddea7d
api and daemon know htsget vs katsu
daisieh Sep 12, 2024
5ad2f8c
separate validation from ingest in htsget
daisieh Sep 12, 2024
f62dec2
add do_not_index flag
daisieh Sep 16, 2024
1c9f5c4
fix test
daisieh Sep 16, 2024
717d29a
Merge pull request #113 from CanDIG/daisieh/htsget-async
daisieh Sep 16, 2024
3cc6731
Update katsu_ingest.py
daisieh Sep 19, 2024
f408e90
remove unused req
daisieh Sep 19, 2024
da2bb0f
wrap in a try
daisieh Sep 19, 2024
840de95
abstract out check_default_site_admin
daisieh Sep 19, 2024
5f74862
add warning for schema mismatch
daisieh Sep 19, 2024
ca66782
update urls
daisieh Sep 19, 2024
d517317
Merge pull request #114 from CanDIG/daisieh/active-katsu-validation
daisieh Sep 20, 2024
3bc24df
Update authx requirements
daisieh Sep 23, 2024
d93c439
hotfix: add back default warnings for site admin to endpoints (#116)
daisieh Sep 25, 2024
129cfb0
update clinical_etl (#117)
mshadbolt Sep 25, 2024
fad4640
DIG-1712: Ingest should prevent ingest if program auth doesn't exist …
daisieh Sep 30, 2024
abec432
Add an endpoint for users to exchange refresh tokens for new ones
OrdiNeu Sep 26, 2024
63994c2
Instead of grabbing a new refresh token, just return the user's
OrdiNeu Sep 27, 2024
5c756c6
Apply suggestions by @daisieh
OrdiNeu Oct 1, 2024
024b070
Merge pull request #118 from CanDIG/feature/frontend-refresh-token
OrdiNeu Oct 1, 2024
d29c4aa
Add option to automatically delete existing tmp directory (#120)
mshadbolt Oct 2, 2024
eabc319
Fixing auto delete (#121)
mshadbolt Oct 3, 2024
50569f2
update readme
mshadbolt Oct 3, 2024
99645df
minor update
mshadbolt Oct 3, 2024
4df2909
updates
mshadbolt Oct 3, 2024
eff11ed
check to make sure the email isn't already in the role
daisieh Oct 3, 2024
52d1481
Merge pull request #123 from CanDIG/hotfix/dup-role
daisieh Oct 3, 2024
fe6185e
updates based on feedback
mshadbolt Oct 3, 2024
8ccb8aa
Merge pull request #122 from CanDIG/mshadbolt/update-README
daisieh Oct 3, 2024
fb97049
Merge branch 'stable' into stable-candidate-v4.2.0
daisieh Oct 4, 2024
31 changes: 17 additions & 14 deletions .github/workflows/dispatch-actions.yml
@@ -1,7 +1,8 @@
name: Submodule PR
on:
push:
pull_request:
branches: [develop]
types: [closed]
jobs:
CanDIG-dispatch:
runs-on: ubuntu-latest
@@ -10,21 +11,23 @@ jobs:
CHECKOUT_BRANCH: 'develop'
PR_AGAINST_BRANCH: 'develop'
OWNER: 'CanDIG'
if: github.event.pull_request.merged == true
steps:
- name: Check out repository code
uses: actions/checkout@v4
- name: get PR data
uses: actions/github-script@v7
id: get_pr_data
with:
script: |
return (
await github.rest.repos.listPullRequestsAssociatedWithCommit({
commit_sha: context.sha,
owner: context.repo.owner,
repo: context.repo.repo,
})
).data[0];
shell: python
run: |
import json
import os
with open('${{ github.event_path }}') as fh:
event = json.load(fh)
escaped = event['pull_request']['title'].replace("'", '"')
pr_number = event["number"]
print(escaped)
with open(os.environ['GITHUB_ENV'], 'a') as fh:
print(f'PR_TITLE={escaped}', file=fh)
print(f'PR_NUMBER={pr_number}', file=fh)
- name: Create PR in CanDIGv2
id: make_pr
uses: CanDIG/github-action-pr-expanded@v4
@@ -33,7 +36,7 @@ jobs:
parent_repository: ${{ env.PARENT_REPOSITORY }}
checkout_branch: ${{ env.CHECKOUT_BRANCH}}
pr_against_branch: ${{ env.PR_AGAINST_BRANCH }}
pr_title: '${{ github.repository }} merging: ${{ fromJson(steps.get_pr_data.outputs.result).title }}'
pr_description: "PR triggered by update to develop branch on ${{ github.repository }}. Commit hash: `${{ github.sha }}`. PR link: [#${{ fromJson(steps.get_pr_data.outputs.result).number }}](https://github.com/${{ github.repository }}/pull/${{ fromJson(steps.get_pr_data.outputs.result).number }})"
pr_title: "${{ github.repository }} merging: ${{ env.PR_TITLE }}"
pr_description: "PR triggered by update to develop branch on ${{ github.repository }}. Commit hash: `${{ github.sha }}`. PR link: [#${{ env.PR_NUMBER }}](https://github.com/${{ github.repository }}/pull/${{ env.PR_NUMBER }})"
owner: ${{ env.OWNER }}
submodule_path: lib/candig-ingest/candigv2-ingest
1 change: 1 addition & 0 deletions .github/workflows/test.yml
@@ -11,6 +11,7 @@ jobs:
python-version: ["3.12"]
env:
CANDIG_URL: "http://localhost"
IS_TESTING: true
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
3 changes: 2 additions & 1 deletion .gitignore
@@ -12,4 +12,5 @@ tmp/
tests/small_dataset_clinical_ingest.json
tests/small_dataset_genomic_ingest.json
tests/clinical_data_validation_results.json
.DS_Store
.DS_Store
tests/SYNTH*
163 changes: 87 additions & 76 deletions README.md
@@ -6,11 +6,11 @@ This repository can either be run standalone or as a Docker container.

## What you'll need for ingest

* A valid user for CanDIGv2 that has site administration credentials.
* A valid user for CanDIGv2 that has site administrator, site curator or program curator privileges for the programs you intend to ingest.
* List of users that will have access to this dataset.
* Clinical data, saved as either an Excel file or as a set of csv files.
* Genomic data files in vcf, bam or cram format with paired index files for each.
* File map of genomic files in a csv file, linking genomic sample IDs to the clinical samples.
* Locations of Genomic data files in vcf, bam or cram format with paired index files for each.
* File map of genomic files in a csv or json file, linking genomic sample IDs to the clinical samples.
* (if needed) Credentials for s3 endpoints: url, access ID, secret key.
* Reference genome used for the variant files.
* Manifest and mappings for [`clinical_ETL_code`](https://github.com/CanDIG/clinical_ETL_code) conversion.
@@ -22,62 +22,98 @@ Using a Python 3.10+ environment, run the following:
pip install -r requirements.txt
```

### Set environment variables
## How to use candigv2-ingest

`candigv2-ingest` can be used as a local API server or a docker container and is generally expected to be used as part of a running [CanDIGv2 stack](https://github.com/CanDIG/CanDIGv2). To use the local API, set your environment variables, run `python app.py`, and follow the API instructions in the sections below. The API will be available at `localhost:1236`. A swagger UI is also available at `/ui`. Docker instructions can be found at the [bottom of this document](#Run-as-Docker-Container). To authorize yourself for these endpoints, you will need to set the Authorization header to a keycloak bearer token (in the format `"Bearer ..."` without the quotes).
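
The header format described above can be sketched in Python. This is an illustration only: `bearer_headers` is a hypothetical helper, not part of `candigv2-ingest`, and the `accept` and `Content-Type` values simply mirror the curl examples in this README.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Build request headers carrying a Keycloak bearer token.

    Hypothetical helper for illustration; not part of candigv2-ingest.
    """
    token = token.strip()
    # Accept a token pasted with or without the "Bearer " prefix.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    return {
        "Authorization": f"Bearer {token}",
        "accept": "application/json",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers of any HTTP client request to the ingest endpoints.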

* CANDIG_URL (same as TYK_LOGIN_TARGET_URL, if you're using CanDIGv2's example.env)
* KEYCLOAK_PUBLIC_URL
* CANDIG_CLIENT_ID
* CANDIG_CLIENT_SECRET
* CANDIG_SITE_ADMIN_USER
* CANDIG_SITE_ADMIN_PASSWORD
### Getting a bearer token
<details><summary> </summary>

For convenience, you can generate a file `env.sh` from your [`CanDIGv2`](https://github.com/CanDIG/CanDIGv2) repo:
Users can obtain a bearer token by logging into the CanDIG data portal, clicking the cog in the top right corner, clicking `*** Get API Token` and clicking the token to copy it.

Site administrators or users using a local candig install can also obtain a token programmatically using the following curl commands from the CanDIGv2 repo:

```bash
cd CanDIGv2
python settings.py
source env.sh
```

## How to use candigv2-ingest
```bash
CURL_OUTPUT=$(curl -s --request POST \
--url $KEYCLOAK_PUBLIC_URL'/auth/realms/candig/protocol/openid-connect/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data grant_type=password \
--data client_id=$CANDIG_CLIENT_ID \
--data client_secret=$CANDIG_CLIENT_SECRET \
--data username=$CANDIG_SITE_ADMIN_USER \
--data password=$CANDIG_SITE_ADMIN_PASSWORD \
--data scope=openid)
```

`candigv2-ingest` can be used as either a command-line tool, a local API server or a docker container. To run the command line scripts, set your environment variables and follow the command line instructions in the sections below. To use the local API, set your environment variables, run `python app.py`, and follow the API instructions in the sections below. The API will be available at `localhost:1236`. A swagger UI is also available at `/ui`. Docker instructions can be found at the [bottom of this document](#Run-as-Docker-Container). To authorize yourself for these endpoints, you will need to set the Authorization header to a keycloak bearer token (in the format `"Bearer ..."` without the quotes).
```bash
export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | cut -d '"' -f3)
```

## 1. Clinical data
</details>
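
For scripts that prefer Python over curl, the same password-grant request might look like the sketch below. It assumes the Keycloak realm path and form fields shown in the curl command; `build_token_request` and `extract_access_token` are hypothetical names, and parsing the response as JSON replaces the `grep`/`cut` step.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(keycloak_url, client_id, client_secret, username, password):
    """Prepare (but do not send) the token request shown in the curl example."""
    url = keycloak_url.rstrip("/") + "/auth/realms/candig/protocol/openid-connect/token"
    form = {
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
        "scope": "openid",
    }
    # urlencode produces an application/x-www-form-urlencoded body.
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def extract_access_token(response_body: str) -> str:
    """Read the access_token field from the JSON token response."""
    return json.loads(response_body)["access_token"]


# Usage (requires a reachable Keycloak and the env.sh variables):
#   import os
#   req = build_token_request(
#       os.environ["KEYCLOAK_PUBLIC_URL"], os.environ["CANDIG_CLIENT_ID"],
#       os.environ["CANDIG_CLIENT_SECRET"], os.environ["CANDIG_SITE_ADMIN_USER"],
#       os.environ["CANDIG_SITE_ADMIN_PASSWORD"])
#   with urllib.request.urlopen(req) as resp:
#       token = extract_access_token(resp.read().decode("utf-8"))
```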

### i. Prepare clinical data
## 1. Program registration

Before being ingested, data must be transformed to the CanDIG MoH data model. This can be done using CanDIG's [`clinical_ETL_code`](https://github.com/CanDIG/clinical_ETL_code) repository. Please visit that repository for full instructions and return to ingest when you have a valid JSON file with a set of donors. An example file can be found at [tests/single_ingest.json](tests/single_ingest.json)
Programs need to be registered before any data can be ingested. Initial program registration can be done by either a site admin or site curator. More information about assigning [site admins](#4-adding-or-removing-site-administrators) and [site curators](#5-adding-or-removing-site-curators) is in sections 4 and 5 below.

### ii. Ingest clinical data
To register a program, use the `/ingest/program/` [endpoint](https://github.com/CanDIG/candigv2-ingest/blob/4257929feca00be0d4384433793fcdf1b4e4137b/ingest_openapi.yaml#L114) to add, update, or delete authorization information for a program. Authorization headers for a site admin or site curator user must be provided. A POST request replaces a program authorization, while a DELETE request revokes it.

The preferred method for clinical data ingest is using the API.
During program registration, users can be assigned one of two levels of authorization:
* Team members are researchers of a program and are authorized to read and access all donor-specific data for a program.
* Program curators are users that are authorized to curate data for the program: they can ingest and delete data.

#### API
The following is an example of the payload you would need to `POST` to `/ingest/program` to add the following user roles to `TEST-PROGRAM-1`:
- `[email protected]` as a Team member
- `[email protected]` as a Program curator

The clinical ingest API runs at `/ingest/clinical`. Simply send a request with an authorized bearer token and a JSON body with your `DonorWithClinicalData` object. See the swagger UI/[schema](ingest_openapi.yaml) for the response format.
```
{"program_id": "TEST-PROGRAM-1", "team_members":["[email protected]"], "program_curators": ["[email protected]"]}
```

An example `curl` command that adds two program curators and two team members is below:

#### Command line
```bash
curl -s --request POST \
--url $CANDIG_URL'/ingest/program' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer '$TOKEN \
-d '{"program_id": "PROGRAM_ID", "program_curators": ["[email protected]", "[email protected]"], "team_members": ["[email protected]", "[email protected]"]}'
```

This method is mainly used for development work but may also be used if the JSON body is too big to send easily via POST.
See [Getting a bearer token](#getting-a-bearer-token) for how to get a token.

To ingest via the commandline script, the location of your clinical data JSON must be specified. This can be done either by:
> [!CAUTION]
> A POST request to the `/ingest/program` endpoint replaces any existing registration data for that program. Before adding program_curators or team_members, first send a GET request to see which users are currently authorized for the program, and include them in the POST body.
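
Because a POST replaces the whole registration, one way to add users safely is to merge them into the current registration before posting. The sketch below assumes the GET response has the same shape as the POST payload (`program_id`, `program_curators`, `team_members`); that shape is an assumption, and `merged_program_payload` is a hypothetical helper.

```python
def merged_program_payload(existing: dict, add_curators=(), add_members=()) -> dict:
    """Return a POST payload that keeps current users and appends new ones.

    `existing` is assumed to look like the payload documented above.
    """
    def merge(current, extra):
        merged = list(current)
        for email in extra:
            # Skip users who already hold the role.
            if email not in merged:
                merged.append(email)
        return merged

    return {
        "program_id": existing["program_id"],
        "program_curators": merge(existing.get("program_curators", []), add_curators),
        "team_members": merge(existing.get("team_members", []), add_members),
    }
```

The result can be serialized with `json.dumps` and sent as the body of the POST request.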

supplying it as an argument to the script:
## 2. Clinical data

```commandline
python katsu_ingest.py --input path/to/clinical/data/
```
### i. Prepare clinical data

Or by exporting an environment variable `CLINICAL_DATA_LOCATION`, then running the script:
Before being ingested, data must be transformed to the CanDIG MoH data model. This can be done using CanDIG's [`clinical_ETL_code`](https://github.com/CanDIG/clinical_ETL_code) repository. Please visit that repository for full instructions and return to ingest when you have a valid JSON file with a set of donors. An example file can be found at [tests/single_ingest.json](tests/single_ingest.json)

### ii. Ingest clinical data

The preferred method for clinical data ingest is using the API.

#### API

The clinical ingest API runs at `$CANDIG_URL/ingest/clinical`. Simply send a request with an [authorized bearer](#getting-a-bearer-token) token and a JSON body with your clinical data json output from clinical_etl. See the swagger UI/[schema](ingest_openapi.yaml) for the response format. The request will return a response with a queue ID. You can check the status of your ingest using that ID at `$CANDIG_URL/ingest/status/{queue_id}`.

Example curl POST to ingest clinical data:
```bash
export CLINICAL_DATA_LOCATION=path/to/clinical/data/
source env.sh
python katsu_ingest.py
curl -X 'POST' \
$CANDIG_URL'/ingest/clinical' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer '$TOKEN \
-d '@/absolute/path/to/clinical_map.json'
```
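
Checking the status of a queued ingest could be scripted roughly as follows. The README does not specify the response fields, so this sketch takes the HTTP call (`fetch_status`) and the completion check (`is_finished`) as caller-supplied functions; both names, and `wait_for_ingest` itself, are hypothetical.

```python
import time
from typing import Callable


def status_url(candig_url: str, queue_id) -> str:
    """Build the status URL for a queue ID returned by the ingest request."""
    return f"{candig_url.rstrip('/')}/ingest/status/{queue_id}"


def wait_for_ingest(
    fetch_status: Callable[[str], dict],
    url: str,
    is_finished: Callable[[dict], bool],
    interval_s: float = 5.0,
    max_checks: int = 60,
) -> dict:
    """Poll the status URL until is_finished(status) is true or checks run out."""
    for _ in range(max_checks):
        status = fetch_status(url)
        if is_finished(status):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"ingest not finished after {max_checks} checks: {url}")
```

Injecting the fetch function keeps the polling logic independent of any particular HTTP client and of the exact status schema.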

## 2. Genomic data
## 3. Genomic data

**First**, ensure that the relevant clinical data is ingested, as this must be completed before your genomic data is ingested.

@@ -88,7 +124,7 @@ Accepted file types:
* Aligned reads (`.bam` or `.cram`) with paired index files (`.bai`, `.crai`)

For each file, you need to have a note of:
* The `submitter_sample_id` that the file should link to
* The `submitter_sample_id`(s) that the file should link to
* How that sample is referred to within the file, e.g. the `sample ID` in a VCF or `@RG SM` in BAM/CRAM
* Where the file is located in relation to the running htsget server

@@ -201,57 +237,32 @@ The file should contain an array of dictionaries, where each item represents a s
### iv. Ingest genomic files

#### API
Use the `/ingest/genomic` endpoint with the proper Authorization headers and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id.

#### Command line

To ingest using an S3 container, once the files have been added, you can run the htsget_ingest.py script:
Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper [Authorization headers](#getting-a-bearer-token) and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id.

Example curl POST request to ingest genomic data:
```bash
python htsget_ingest.py --samplefile [JSON-formatted sample data as specified above]
curl -X 'POST' \
$CANDIG_URL'/ingest/genomic' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer '$TOKEN \
-d '@/absolute/path/to/genomic.json'
```

## 3. Assigning users to programs
The preferred method to assign user authorizations to programs is to use the API. Users can be assigned one of two levels of authorization:
* Team members are researchers of a program and are authorized to read and access all donor-specific data for a program.
* Program curators are users that are authorized to curate data for the program: they can ingest data.

The script `opa_ingest.py` can be used only to add Team member authorizations to a program that already has an existing authorization.

### API
Use the `/ingest/program/` [endpoint](https://github.com/CanDIG/candigv2-ingest/blob/4257929feca00be0d4384433793fcdf1b4e4137b/ingest_openapi.yaml#L114) to add, update, or delete authorization information for a program. Authorization headers for a site admin user must be provided. A POST request adds authorization, while a DELETE request revokes it.

The following is an example of the payload you would need to `POST` to `/ingest/program/{program_id}` to add the following user roles to `TEST-PROGRAM-1`:
- `[email protected]` as a Team member
- `[email protected]` as a Program curator

```
{"program_id": "TEST-PROGRAM-1", "team_members":["[email protected]"], "program_curators": ["[email protected]"]}
```

### Command line

The `opa_ingest.py` script can be used to add Team members to a program only. To add Program curators, the API described above must be used.

The script will add a single user or a list of users to a specified program (`--dataset`). Single users are added using the `--user` flag, while a list of users can be specified in a plain text file, with one user email specified per line, using the `--user-file` flag to specify the path to the file. If the `--remove` flag is used, the users will be removed, rather than added to the program.

example usage:
```bash
python opa_ingest.py --user|userfile [either a user email or a file of user emails] --dataset [name of dataset/program] [--remove]
```
See [Getting a bearer token](#getting-a-bearer-token) for how to get a token.

## 4. Adding or removing site administrators
Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role.
Use the `/ingest/site-role/admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint.

## 5. Adding or removing site curators
Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns whether the user is a site curator as a boolean, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint.
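
As a rough illustration, the site-role URLs could be built like this. The `$CANDIG_URL` prefix follows the other examples in this README; `site_role_url` is a hypothetical helper, and percent-encoding everything in the email except `@` is an assumption about what the server accepts.

```python
from urllib.parse import quote

# Role path segments taken from the two endpoints described above.
SITE_ROLES = ("admin", "curator")


def site_role_url(candig_url: str, role: str, user_email: str) -> str:
    """Build the /ingest/site-role/{role}/{user_email} URL."""
    if role not in SITE_ROLES:
        raise ValueError(f"unknown site role: {role!r}")
    # Keep "@" literal; encode anything else that is unsafe in a path segment.
    return f"{candig_url.rstrip('/')}/ingest/site-role/{role}/{quote(user_email, safe='@')}"
```

Send a POST to this URL to add the user to the role, or a DELETE to remove them, using a site administrator token.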

## 5. Approving/rejecting pending users
## 6. Approving/rejecting pending users
Use the `/user/pending` endpoint to list pending users. A site admin can approve either a single or multiple pending users by POSTing to the `/user/pending/{user}` or `/user/pending` endpoints, and likewise reject with DELETEs to the same endpoints. DELETE to the bulk endpoint clears the whole pending list.


## 6. Adding a DAC-style program authorization for a user
## 7. Adding a DAC-style program authorization for a user
An authorized user can be approved to view a program for a particular timeframe by a POST to the `/user/{user_id}/authorize` endpoint. The body should be a json that contains the `program_id`, `start_date`, and `end_date`. Re-posting a new json with the same program ID will update the user's authorization. An authorization for a program can be revoked by a DELETE to the `/user/{user_id}/authorize/{program_id}` endpoint.
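
A request body for this endpoint might be assembled as in the sketch below. The three field names come from the description above; the ISO `YYYY-MM-DD` date format is an assumption (check `ingest_openapi.yaml` for the expected format), and `dac_authorization_body` is a hypothetical helper.

```python
import json
from datetime import date


def dac_authorization_body(program_id: str, start_date: date, end_date: date) -> str:
    """Serialize a DAC-style authorization for POSTing to /user/{user_id}/authorize."""
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    return json.dumps(
        {
            "program_id": program_id,
            # Assumed format: ISO 8601 calendar dates (YYYY-MM-DD).
            "start_date": start_date.isoformat(),
            "end_date": end_date.isoformat(),
        }
    )
```

Posting a new body with the same `program_id` updates the existing authorization, as described above.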


## Run as Docker Container
The containerized version runs the API as specified above within a Docker container (which is how this repository is used in the CanDIGv2 stack).
To run, ensure you have docker installed and CanDIGv2 running, then run the following commands:
@@ -284,7 +295,7 @@ The script `generate_test_data.py` can be used to generate a json files for inge

To run:

* Set up a virtual environment and install requirements (if you haven't already). If running inside the ingest docker container, this shouldn't be needed.
* Set up a virtual environment and install requirements (if you haven't already). If running inside the ingest docker container, this shouldn't be needed.
```commandline
pip install -r requirements.txt
```
@@ -293,7 +304,7 @@ pip install -r requirements.txt
Usage:
```commandline
python generate_test_data.py -h
usage: generate_test_data.py [-h] [--prefix PREFIX] --tmp
usage: generate_test_data.py [-h] [--prefix PREFIX] --tmp

A script that copies and converts data from mohccn-synthetic-data for ingest into CanDIG platform.

8 changes: 6 additions & 2 deletions app.py
@@ -1,14 +1,18 @@
from connexion import FlaskApp
from flask import url_for, redirect
import os
import candigv2_logging.logging

candigv2_logging.logging.initialize()

logger = candigv2_logging.logging.CanDIGLogger(__file__)

def root():
    return redirect(url_for('ingest_operations_get_service_info'))

def create_app():
    if not os.getenv("CANDIG_URL"):
        print("ERROR: CANDIG_URL not found. CanDIG stack environment variables likely not set. Please do so before "
              "running the service.")
        logger.warning("CANDIG_URL not found. CanDIG stack environment variables likely not set. Please do so before running the service.")
        exit()
    connexionApp = FlaskApp(__name__, specification_dir='./')
    connexionApp.add_api('ingest_openapi.yaml', pythonic_params=True, strict_validation=True)