From 50569f23c5f71c0cc9ba1684772248ccc3af84d3 Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 3 Oct 2024 12:45:44 -0700 Subject: [PATCH 1/4] update readme --- README.md | 64 ++++++++++++++++++++++++++++++++++--------------------- 1 file changed, 40 insertions(+), 24 deletions(-) diff --git a/README.md b/README.md index a63eace..e62633e 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ This repository can either be run standalone or as a Docker container. ## What you'll need for ingest -* A valid user for CanDIGv2 that has site administration credentials. +* A valid user for CanDIGv2 that has site administrator, site curator or program curator privileges for the programs you intend to ingest. * List of users that will have access to this dataset. * Clinical data, saved as either an Excel file or as a set of csv files. * Locations of Genomic data files in vcf, bam or cram format with paired index files for each. @@ -24,9 +24,41 @@ pip install -r requirements.txt ## How to use candigv2-ingest -`candigv2-ingest` can be used as either a command-line tool, a local API server or a docker container. To run the command line scripts, set your environment variables and follow the command line instructions in the sections below. To use the local API, set your environment variables, run `python app.py`, and follow the API instructions in the sections below. The API will be available at `localhost:1236`. A swagger UI is also available at `/ui`. Docker instructions can be found at the [bottom of this document](#Run-as-Docker-Container). To authorize yourself for these endpoints, you will need to set the Authorization header to a keycloak bearer token (in the format `"Bearer ..."` without the quotes). +`candigv2-ingest` can be used as a local API server or a docker container and is generally expected to be used as part of a running [CanDIGv2 stack](https://github.com/CanDIG/CanDIGv2). 
To use the local API, set your environment variables, run `python app.py`, and follow the API instructions in the sections below. The API will be available at `localhost:1236`. A swagger UI is also available at `/ui`. Docker instructions can be found at the [bottom of this document](#Run-as-Docker-Container). To authorize yourself for these endpoints, you will need to set the Authorization header to a keycloak bearer token (in the format `"Bearer ..."` without the quotes). -## 1. Clinical data +## 1. Program registration + +Programs need to be registered before any data can be ingested. Initial program registration can be done by either a site admin or site curator. More information about assigning [site admins](#4-adding-or-removing-site-administrators) and [site curators](#5-adding-or-removing-site-curators) is in sections 4 and 5 below. + +To register a program, use the `/ingest/program/` [endpoint](https://github.com/CanDIG/candigv2-ingest/blob/4257929feca00be0d4384433793fcdf1b4e4137b/ingest_openapi.yaml#L114) to add, update, or delete authorization information for a program. Authorization headers for a site admin or site curator user must be provided. A POST request replaces a program authorization, while a DELETE request revokes it. + +During program registration, users can be assigned one of two levels of authorization: +* Team members are researchers of a program and are authorized to read and access all donor-specific data for a program. +* Program curators are users that are authorized to curate data for the program: they can ingest and delete data. 
+ +The following is an example of the payload you would need to `POST` to `/ingest/program` to add the following user roles to `TEST-PROGRAM-1`: +- `user1@test.ca` as a Team member +- `user2@test.ca` as a Program curator + +``` +{"program_id": "TEST-PROGRAM-1", "team_members":["user1@test.ca"], "program_curators": ["user2@test.ca"]} +``` + +An example `curl` command that adds two program curators and 2 team members is below: + +```bash +curl -s --request POST \ + --url $CANDIG_URL'/ingest/program' \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -H 'Authorization: Bearer '$TOKEN \ + -d '{"program_id": "PROGRAM_ID", "program_curators": ["curator1@test.ca", "curator2@test.ca"], "team_members": ["user2@test.ca", "user1@test.ca"]}' +``` + +> [!CAUTION] +> A POST request to the `ingest/program` replaces any existing program registration data for that program, so to add a curator, existing users plus the additional user would need to be added. + +## 2. Clinical data ### i. Prepare clinical data @@ -40,7 +72,7 @@ The preferred method for clinical data ingest is using the API. The clinical ingest API runs at `$CANDIG_URL/ingest/clinical`. Simply send a request with an authorized bearer token and a JSON body with your `DonorWithClinicalData` object. See the swagger UI/[schema](ingest_openapi.yaml) for the response format. The request will return a response with a queue ID. You can check the status of your ingest using that ID at `$CANDIG_URL/ingest/status/{queue_id}`. -## 2. Genomic data +## 3. Genomic data **First**, ensure that the relevant clinical data is ingested, as this must be completed before your genomic data is ingested. @@ -166,34 +198,18 @@ The file should contain an array of dictionaries, where each item represents a s #### API Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper Authorization headers and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id. -## 3. 
Authorizing users to programs -Users can be assigned one of two levels of authorization: -* Team members are researchers of a program and are authorized to read and access all donor-specific data for a program. -* Program curators are users that are authorized to curate data for the program: they can ingest and delete data. - -### API -Use the `/ingest/program/` [endpoint](https://github.com/CanDIG/candigv2-ingest/blob/4257929feca00be0d4384433793fcdf1b4e4137b/ingest_openapi.yaml#L114) to add, update, or delete authorization information for a program. Authorization headers for a site admin user must be provided. A POST request adds authorization, while a DELETE request revokes it. - -The following is an example of the payload you would need to `POST` to `/ingest/program/{program_id}` to add the following user roles to `TEST-PROGRAM-1`: -- `user1@test.ca` as a Team member -- `user2@test.ca` as a Program curator - -``` -{"program_id": "TEST-PROGRAM-1", "team_members":["user1@test.ca"], "program_curators": ["user2@test.ca"]} -``` - ## 4. Adding or removing site administrators Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. +## 5. Adding or removing site curators +Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns true/false, while a DELETE request removes the user from the role. -## 5. Approving/rejecting pending users +## 6. Approving/rejecting pending users Use the `/user/pending` endpoint to list pending users. A site admin can approve either a single or multiple pending users by POSTing to the `user/pending/{user}` or `user/pending` endpoints, and likewise reject with DELETEs to the same endpoints. DELETE to the bulk endpoint clears the whole pending list. - -## 6. 
Adding a DAC-style program authorization for a user +## 7. Adding a DAC-style program authorization for a user An authorized user can be approved to view a program for a particular timeframe by a POST to the `/user/{user_id}/authorize` endpoint. The body should be a json that contains the `program_id`, `start_date`, and `end_date`. Re-posting a new json with the same program ID will update the user's authorization. An authorization for a program can be revoked by a DELETE to the `/user/{user_id}/authorize/{program_id}` endpoint. - ## Run as Docker Container The containerized version runs the API as specified above within a Docker container (which is how this repository is used in the CanDIGv2 stack). To run, ensure you have docker installed and CanDIGv2 running, then run the following commands: From 99645dfbff0693b8c1f926b70eebf733303fcd3a Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 3 Oct 2024 12:49:04 -0700 Subject: [PATCH 2/4] minor update --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index e62633e..baa576c 100644 --- a/README.md +++ b/README.md @@ -56,7 +56,7 @@ curl -s --request POST \ ``` > [!CAUTION] -> A POST request to the `ingest/program` replaces any existing program registration data for that program, so to add a curator, existing users plus the additional user would need to be added. +> A POST request to the `ingest/program` replaces any existing program registration data for that program, so to add a curator, existing users plus the additional users would need to be added to the payload. ## 2. Clinical data @@ -202,7 +202,7 @@ Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper Authorization head Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. ## 5. 
Adding or removing site curators -Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns true/false, while a DELETE request removes the user from the role. +Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns whether the user is a site curator as a boolean, while a DELETE request removes the user from the role. ## 6. Approving/rejecting pending users Use the `/user/pending` endpoint to list pending users. A site admin can approve either a single or multiple pending users by POSTing to the `user/pending/{user}` or `user/pending` endpoints, and likewise reject with DELETEs to the same endpoints. DELETE to the bulk endpoint clears the whole pending list. From 4df2909b96267baf1caccce8c4bcf3f0e09aa3f2 Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 3 Oct 2024 13:24:30 -0700 Subject: [PATCH 3/4] updates --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index baa576c..ec70cda 100644 --- a/README.md +++ b/README.md @@ -199,10 +199,10 @@ The file should contain an array of dictionaries, where each item represents a s Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper Authorization headers and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id. ## 4. Adding or removing site administrators -Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. +Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint. 
## 5. Adding or removing site curators -Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns whether the user is a site curator as a boolean, while a DELETE request removes the user from the role. +Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. A POST request adds the user as a site curator, a GET request returns whether the user is a site curator as a boolean, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint. ## 6. Approving/rejecting pending users Use the `/user/pending` endpoint to list pending users. A site admin can approve either a single or multiple pending users by POSTing to the `user/pending/{user}` or `user/pending` endpoints, and likewise reject with DELETEs to the same endpoints. DELETE to the bulk endpoint clears the whole pending list. From fe6185e2cdecde2cf6c38ff0294b1507cf219d62 Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 3 Oct 2024 14:47:41 -0700 Subject: [PATCH 4/4] updates based on feedback --- README.md | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 57 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index ec70cda..25a4a3e 100644 --- a/README.md +++ b/README.md @@ -26,6 +26,35 @@ pip install -r requirements.txt `candigv2-ingest` can be used as a local API server or a docker container and is generally expected to be used as part of a running [CanDIGv2 stack](https://github.com/CanDIG/CanDIGv2). To use the local API, set your environment variables, run `python app.py`, and follow the API instructions in the sections below. The API will be available at `localhost:1236`. A swagger UI is also available at `/ui`. Docker instructions can be found at the [bottom of this document](#Run-as-Docker-Container). 
To authorize yourself for these endpoints, you will need to set the Authorization header to a keycloak bearer token (in the format `"Bearer ..."` without the quotes). +### Getting a bearer token +
+ +Users can obtain a bearer token by logging into the CanDIG data portal, clicking the cog in the top right corner, clicking `*** Get API Token` and clicking the token to copy it. + +Site administrators or users using a local candig install can also obtain a token programmatically using the following curl commands from the CanDIGv2 repo: + +```bash +source env.sh +``` + +```bash +CURL_OUTPUT=$(curl -s --request POST \ + --url $KEYCLOAK_PUBLIC_URL'/auth/realms/candig/protocol/openid-connect/token' \ + --header 'Content-Type: application/x-www-form-urlencoded' \ + --data grant_type=password \ + --data client_id=$CANDIG_CLIENT_ID \ + --data client_secret=$CANDIG_CLIENT_SECRET \ + --data username=$CANDIG_SITE_ADMIN_USER \ + --data password=$CANDIG_SITE_ADMIN_PASSWORD \ + --data scope=openid) +``` + +```bash +export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | cut -d '"' -f3) +``` + +
+ ## 1. Program registration Programs need to be registered before any data can be ingested. Initial program registration can be done by either a site admin or site curator. More information about assigning [site admins](#4-adding-or-removing-site-administrators) and [site curators](#5-adding-or-removing-site-curators) is in sections 4 and 5 below. @@ -55,8 +84,10 @@ curl -s --request POST \ -d '{"program_id": "PROGRAM_ID", "program_curators": ["curator1@test.ca", "curator2@test.ca"], "team_members": ["user2@test.ca", "user1@test.ca"]}' ``` +See [Getting a bearer token](#getting-a-bearer-token) for how to get a token. + > [!CAUTION] -> A POST request to the `ingest/program` replaces any existing program registration data for that program, so to add a curator, existing users plus the additional users would need to be added to the payload. +> A POST request to the `ingest/program` endpoint replaces any existing program registration data for that program. Before POSTing, first use a GET request to see which users are currently authorized for the program, then include them in the payload alongside any additional program_curators and/or team_members. ## 2. Clinical data @@ -70,7 +101,17 @@ The preferred method for clinical data ingest is using the API. #### API -The clinical ingest API runs at `$CANDIG_URL/ingest/clinical`. Simply send a request with an authorized bearer token and a JSON body with your `DonorWithClinicalData` object. See the swagger UI/[schema](ingest_openapi.yaml) for the response format. The request will return a response with a queue ID. You can check the status of your ingest using that ID at `$CANDIG_URL/ingest/status/{queue_id}`. +The clinical ingest API runs at `$CANDIG_URL/ingest/clinical`. Simply send a request with an authorized [bearer token](#getting-a-bearer-token) and a JSON body containing the clinical data JSON output by clinical_etl. See the swagger UI/[schema](ingest_openapi.yaml) for the response format. The request will return a response with a queue ID. 
You can check the status of your ingest using that ID at `$CANDIG_URL/ingest/status/{queue_id}`. + +Example curl POST to ingest clinical data: +```bash +curl -X 'POST' \ + $CANDIG_URL'/ingest/clinical' \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -H 'Authorization: Bearer '$TOKEN \ + -d '@/absolute/path/to/clinical_map.json' +``` ## 3. Genomic data @@ -196,10 +237,22 @@ The file should contain an array of dictionaries, where each item represents a s ### iv. Ingest genomic files #### API -Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper Authorization headers and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id. +Use the `$CANDIG_URL/ingest/genomic` endpoint with the proper [Authorization headers](#getting-a-bearer-token) and your genomic JSON as specified above for the body to ingest and link to the clinical dataset program_id. + +Example curl POST request to ingest genomic data: +```bash +curl -X 'POST' \ + $CANDIG_URL'/ingest/genomic' \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -H 'Authorization: Bearer '$TOKEN \ + -d '@/absolute/path/to/genomic.json' +``` + +See [Getting a bearer token](#getting-a-bearer-token) for how to get a token. ## 4. Adding or removing site administrators -Use the `/ingest/site-role/site_admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint. +Use the `/ingest/site-role/admin/{user_email}` endpoint to add or remove site administrators. A POST request adds the user as a site admin, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint. ## 5. Adding or removing site curators Use the `/ingest/site-role/curator/{user_email}` endpoint to add or remove site curators. 
A POST request adds the user as a site curator, a GET request returns whether the user is a site curator as a boolean, while a DELETE request removes the user from the role. A valid site administrator token must be used with this endpoint.
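
As an illustration of the site curator endpoint described above, the following is a minimal Python sketch that builds (but does not send) a request for a given HTTP method. The helper name `site_curator_request`, the base URL `https://candig.example.org`, and the token value are hypothetical and used only for illustration; only the path `/ingest/site-role/curator/{user_email}`, the three methods, and the `Bearer` header format come from the text. Leaving `@` unencoded in the path is also an assumption about what the server accepts.

```python
"""Sketch: prepare requests for the site curator endpoint without sending them."""
from urllib.parse import quote
from urllib.request import Request

# Methods the README describes for this endpoint.
ALLOWED_METHODS = {"POST", "GET", "DELETE"}


def site_curator_request(base_url: str, token: str, user_email: str, method: str) -> Request:
    """Return a prepared request for /ingest/site-role/curator/{user_email}.

    Hypothetical helper: it is not part of candigv2-ingest.
    """
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    # Assumption: '@' is kept readable in the path; other reserved
    # characters in the email are percent-encoded.
    path = "/ingest/site-role/curator/" + quote(user_email, safe="@")
    return Request(
        base_url.rstrip("/") + path,
        method=method,
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values, for illustration only.
req = site_curator_request("https://candig.example.org", "my-token", "user1@test.ca", "GET")
print(req.get_method(), req.full_url)
```

To actually send the request, the prepared object could be passed to `urllib.request.urlopen(req)`; as noted above, the token would need to belong to a site administrator.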