cleanup
drkthomp committed Jan 13, 2021
1 parent 620e29f commit b0ffb4c
Showing 73 changed files with 29 additions and 23,058 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
bd2k-extras/
pimmuno.py
pimmuno_2.py
*.pyc
58 changes: 15 additions & 43 deletions MANUAL.md
@@ -27,87 +27,59 @@ ProTECT is implemented in the [Toil](https://github.com/BD2KGenomics/toil.git) f
runs the workflow described in [protect/Flowchart.txt](
https://github.com/BD2KGenomics/protect/blob/master/Flowchart.txt).

**This manual is a quick adaptation of the original ProTECT manual for the Python 3 port.**


# Installation

ProTECT requires Toil and we recommend installing ProTECT and its requirements in a
[virtualenv](http://docs.python-guide.org/en/latest/dev/virtualenvs/).

ProTECT also requires [s3am](https://github.com/BD2KGenomics/s3am.git) version 2.0.1 to download and
~ProTECT also requires [s3am](https://github.com/BD2KGenomics/s3am.git) version 2.0.1 to download and
upload files from S3. We recommend installing s3am in its own virtualenv using the directions in
the s3am manual, then putting the s3am binary on your $PATH. ProTECT will NOT attempt to install
s3am during installation.
s3am during installation.~

Currently a work in progress. For now, **only references to local files will work**; anything that requires s3am (access to S3 buckets) will **fail**.

ProTECT uses pkg_resources from setuptools to verify versions of tools during install. As of setuptools
~ProTECT uses pkg_resources from setuptools to verify versions of tools during install. As of setuptools
39.0.1, some modules were moved to the packaging module. If your machine has setuptools >=39.0.1, you
will need the packaging module.
will need the packaging module.~

Lastly, ProTECT uses [docker](https://www.docker.com/) to run the various sub-tools in a
reproducible, platform independent manner. ProTECT will NOT attempt to install docker during
installation.

### Method 1 - Using PIP (recommended)

First create a virtualenv at your desired location (Here we create it in the folder ~/venvs)

    virtualenv ~/venvs/protect

Activate the virtualenv

    source ~/venvs/protect/bin/activate

NOTE: Installation was tested using pip 7.1.2 and 8.1.1. We have seen issues with the installation
of pyYAML with lower versions of pip and recommend upgrading pip before installing ProTECT.

    pip install --upgrade pip

Install Toil

    pip install toil[aws]==3.5.2

Install packaging (required if setuptools>=39.0.1)

    pip install packaging

Install ProTECT and all dependencies in the virtualenv

    pip install protect

~Method 1 - Using PIP (recommended)~
### Method 2 - Installing from Source

This will install ProTECT in an editable mode.

Obtain the source from Github

    git clone https://www.github.com/BD2KGenomics/protect.git
    git clone https://www.github.com/Dranion/protect.git

Create and activate a virtualenv in the project folder (Important since the Makefile checks for
this and will fail if it detects that you are not in a virtual environment)

    cd protect
    virtualenv venv
    virtualenv --python=python3 venv
    source venv/bin/activate

Install Toil and pytest

    make prepare

Install packaging (required if setuptools>=39.0.1)
Install the Python 3 conversions of bd2k and s3am. *s3am is untested, as I am only running locally.*

    pip install packaging
    make special_install

Install ProTECT

    make develop

## Method 3 - Using Docker
~Method 3 - Using Docker~

Dockerized versions of ProTECT releases can be found at https://quay.io/organization/ucsc_cgl. These
Docker containers run the ProTECT pipeline in single machine mode. The only difference between the
Docker and Python versions of the pipeline is that the Docker container takes the config options,
described below, as command line arguments as opposed to a config file. Running the container
without any arguments will list all the available options. Also, currently the dockerized version of
ProTECT only supports local file export.

# Running ProTECT

@@ -173,7 +145,7 @@ in the pipeline, and the information on the input samples. Elements before a `:`
dictionary read into ProTECT and should **NOT** be modified (Barring the patient ID key in the
patients dictionary). Only values to the right of the `:` should be edited.
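
The "edit values, not keys" rule can be checked mechanically. The following is a minimal sketch, assuming the shipped template and the edited config have both already been parsed into nested dictionaries (for example by a YAML loader) and that the patients dictionary sits under a top-level `patients` key. The function names `key_paths` and `altered_keys` and all sample keys are hypothetical and are not part of ProTECT.

```python
def key_paths(node, prefix=()):
    """Return the set of key paths found in a nested dict.

    Keys directly under "patients" are patient IDs, which the manual allows
    users to change, so they are replaced by a fixed placeholder.
    """
    paths = set()
    if isinstance(node, dict):
        for key, value in node.items():
            label = "<patient_id>" if prefix == ("patients",) else key
            path = prefix + (label,)
            paths.add(path)
            paths |= key_paths(value, path)
    return paths


def altered_keys(template, edited):
    """Return the key paths that appear in only one of the two configs."""
    return key_paths(template) ^ key_paths(edited)


# Illustrative data only; these are NOT the real ProTECT config keys.
template = {
    "patients": {"PATIENT_ID": {"tumor_fastq": "/path/to/tumor_1.fastq"}},
    "some_tool": {"index": "/path/to/index.tar.gz"},
}
edited = {
    "patients": {"sample-01": {"tumor_fastq": "/data/tumor_1.fastq"}},
    "some_tool": {"index": "/data/index.tar.gz"},
}

# Only values and the patient ID changed, so no key path differs.
print(altered_keys(template, edited))  # set()
```

A non-empty result lists the key paths that were renamed, added, or dropped relative to the template.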

Every required reference file is provided in the AWS bucket `cgl-pipeline-inputs` under the folder
Every required reference file is provided in the AWS bucket `protect-data` under the folder
`protect/hg19_references` or `protect/hg38_references`. The `README` file in the same location
describes in detail how each file was generated. To use a file located in an s3 bucket, replace
`/path/to` in the following descriptions with `s3://<databucket>/<folder_in_bucket>`.
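
As an illustration of the substitution described above, here is a small sketch that swaps the `/path/to` placeholder for an `s3://` prefix. The helper name `to_s3_url` and the file name are hypothetical; the bucket and folder are the ones named in this hunk. Note that the manual's own WIP note says S3 access is expected to fail in this port for now, so this only shows the string rewrite.

```python
def to_s3_url(local_path, bucket, folder, placeholder="/path/to"):
    """Swap the leading placeholder directory for an s3:// bucket prefix."""
    if not local_path.startswith(placeholder + "/"):
        raise ValueError(
            "expected a path under {!r}: {!r}".format(placeholder, local_path)
        )
    # Drop the placeholder and the slash that follows it.
    remainder = local_path[len(placeholder) + 1:]
    return "s3://{}/{}/{}".format(bucket, folder.strip("/"), remainder)


# "example_index.tar.gz" is a made-up file name used only for illustration.
print(to_s3_url("/path/to/example_index.tar.gz",
                "protect-data", "protect/hg19_references"))
# s3://protect-data/protect/hg19_references/example_index.tar.gz
```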
15 changes: 10 additions & 5 deletions Makefile
@@ -53,6 +53,12 @@ green=\033[0;32m
normal=\033[0m
red=\033[0;31m

# WIP
special_install: check_venv
	git clone https://github.com/Dranion/bd2k-extras.git
	make -C bd2k-extras/bd2k-python-lib develop
	make -C bd2k-extras/s3am develop

prepare: check_venv
	@$(pip) install toil pytest

@@ -106,11 +112,10 @@ clean_pypi:

clean: clean_develop clean_sdist clean_pypi

#always fails, even though in a venv
#check_venv:
# @$(python) -c 'import sys; sys.exit( int( not hasattr(sys, "real_prefix") ) )' \
# || ( echo "$(red)A virtualenv must be active.$(normal)" ; false )

check_venv:
	@$(python) -c 'import sys; sys.exit( int( not (hasattr(sys, "real_prefix") or ( hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix ) ) ) )' \
		|| [ ! -z "${VIRTUAL_ENV}" ] \
		|| ( echo "$(red)A virtualenv must be active.$(normal)\n" ; false )
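
The `check_venv` recipe packs three tests into one line. Spelled out, the logic looks roughly like the sketch below. The function name `in_virtualenv` and its injectable arguments are hypothetical and exist only to make the logic testable. As far as I know, `sys.real_prefix` is set only by older virtualenv releases, while the stdlib `venv` module and newer virtualenv make `sys.base_prefix` differ from `sys.prefix`, which would explain why a check on `real_prefix` alone fails under Python 3.

```python
import os
import sys


def in_virtualenv(sys_module=sys, environ=None):
    """Return True if the interpreter appears to run inside a virtual environment."""
    if environ is None:
        environ = os.environ
    # Test 1: legacy virtualenv marks the interpreter with sys.real_prefix.
    if hasattr(sys_module, "real_prefix"):
        return True
    # Test 2: inside a venv, base_prefix points at the base installation
    # and therefore differs from prefix.
    base_prefix = getattr(sys_module, "base_prefix", None)
    if base_prefix is not None and base_prefix != sys_module.prefix:
        return True
    # Test 3: fall back to the VIRTUAL_ENV variable exported by `activate`.
    return bool(environ.get("VIRTUAL_ENV"))


if __name__ == "__main__":
    # Same exit-code convention as the recipe: 0 inside a virtualenv, 1 otherwise.
    sys.exit(int(not in_virtualenv()))
```

The third test mirrors the `[ ! -z "${VIRTUAL_ENV}" ]` clause: it trusts the environment rather than the interpreter, so it also passes when `$(python)` points at an interpreter outside the activated environment.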

check_clean_working_copy:
@echo "$(green)Checking if your working copy is clean ...$(normal)"
169 changes: 0 additions & 169 deletions ProTECT_config.yaml

This file was deleted.

4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# ProTECT
### **Pr**ediction **o**f **T**-Cell **E**pitopes for **C**ancer **T**herapy

Adaptation of ProTECT to use python 3.8 instead of 2.7. Currently have tested a complete run using fastq files from [HCC1395 WGS Exome RNA Seq Data](https://github.com/genome/gms/wiki/HCC1395-WGS-Exome-RNA-Seq-Data), but have not checked results against the [original ProTECT](https://github.com/BD2KGenomics/protect) with TCGA PRAD yet.
Adaptation of ProTECT to use python 3.8 instead of 2.7. Currently have tested a complete run using fastq files from [HCC1395 WGS Exome RNA Seq Data](https://github.com/genome/gms/wiki/HCC1395-WGS-Exome-RNA-Seq-Data), with identical results in both versions of python.

Adaptation done using 2to3 and manual bug testing. Manual changes recorded [at changes.md](https://github.com/Dranion/protect/blob/master/changes.md). Since s3am is python2, **currently is local only**, however an untested python3 version of s3am exists [here](https://github.com/Dranion/bd2k-extras/tree/main). Continuing to the original README:

@@ -23,6 +23,6 @@ All docker images used in this pipeline are available at


To learn how the pipeline can be run on a sample, head over to the [ProTECT Manual](
https://github.com/BD2KGenomics/protect/blob/master/MANUAL.md)
https://github.com/Dranion/protect/blob/master/MANUAL.md)

ProTECT is currently in its infancy and is under continuous development. We would appreciate users sharing the level 3 data produced by ProTECT with us such that we can better train our predictive models.