Commit
updated to python3 and debugged
dranion authored and drkthomp committed Jan 13, 2021
1 parent ab9e041 commit 620e29f
Showing 135 changed files with 23,597 additions and 318 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@ pimmuno_2.py
develop_data/
venv/
.cache/
jobStore/
test-report.xml
__pycache__
*.DONE
8 changes: 8 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

7 changes: 7 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

18 changes: 18 additions & 0 deletions .idea/protect3.iml

13 changes: 13 additions & 0 deletions .idea/runConfigurations/Basic_Run_.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

6 changes: 3 additions & 3 deletions MANUAL.md
@@ -547,7 +547,7 @@ purposes:
12: g/f/jobO4yiE4 return self.run(fileStore)
13: g/f/jobO4yiE4 File "/home/ucsc/arjun/tools/dev/toil_clean/src/toil/job.py", line 1406, in run
14: g/f/jobO4yiE4 rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
-15: g/f/jobO4yiE4 File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python2.7/site-packages/protect/binding_prediction/common.py", line 566, in merge_mhc_peptide_calls
+15: g/f/jobO4yiE4 File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python3/site-packages/protect/binding_prediction/common.py", line 566, in merge_mhc_peptide_calls
16: g/f/jobO4yiE4 raise RuntimeError('No peptides available for ranking')
17: g/f/jobO4yiE4 RuntimeError: No peptides available for ranking
18: g/f/jobO4yiE4 ERROR:toil.worker:Exiting the worker because of a failed job on host sjcb10st7
@@ -581,9 +581,9 @@ do not store logs from tools (see BD2KGenomics/protect#275). The error looks sim
Z/O/job1uH92D return self.run(fileStore)
Z/O/job1uH92D File "/home/ucsc/arjun/tools/dev/toil_clean/src/toil/job.py", line 1406, in run
Z/O/job1uH92D rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
-Z/O/job1uH92D File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python2.7/site-packages/protect/mutation_calling/radia.py", line 238, in run_filter_radia
+Z/O/job1uH92D File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python3/site-packages/protect/mutation_calling/radia.py", line 238, in run_filter_radia
Z/O/job1uH92D tool_version=radia_options['version'])
-Z/O/job1uH92D File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python2.7/site-packages/protect/common.py", line 138, in docker_call
+Z/O/job1uH92D File "/home/ucsc/arjun/tools/protect_toil_clean/local/lib/python3/site-packages/protect/common.py", line 138, in docker_call
Z/O/job1uH92D 'for command \"%s\"' % ' '.join(call),)
Z/O/job1uH92D RuntimeError: docker command returned a non-zero exit status (1)for command "docker run --rm=true -v /scratch/bio/ucsc/toil-681c097c-61da-4687-b734-c5051f0aa19f/tmped2fnu/f041f939-5c0d-40be-a884-68635e929d09:/data --log-driver=none aarjunrao/filterradia:bcda721fc1f9c28d8b9224c2f95c440759cd3a03 TCGA-CH-5788 17 /data/radia.vcf /data /home/radia/scripts -d /data/radia_dbsnp -r /data/radia_retrogenes -p /data/radia_pseudogenes -c /data/radia_cosmic -t /data/radia_gencode --noSnpEff --noBlacklist --noTargets --noRnaBlacklist -f /data/hg38.fa --log=INFO -g /data/radia_filtered_chr17_radia.log"
Z/O/job1uH92D ERROR:toil.worker:Exiting the worker because of a failed job on host sjcb10st1
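
The traceback above ends in a bare `RuntimeError` because, as the MANUAL text notes, logs from the tools are not stored. A minimal sketch of a wrapper that keeps the tail of a tool's stderr in the exception message is shown here. It is not ProTECT's `docker_call`: the name `run_tool` and the `tail_lines` parameter are hypothetical, and only the standard `subprocess` module is used.

```python
import subprocess
import sys


def run_tool(call, tail_lines=20):
    """Run a command; on failure raise RuntimeError carrying the end of its stderr.

    Hypothetical helper for illustration only; not part of ProTECT.
    """
    # text=True decodes stdout/stderr to str (Python 3.7+).
    result = subprocess.run(
        call, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if result.returncode != 0:
        tail = "\n".join(result.stderr.splitlines()[-tail_lines:])
        raise RuntimeError(
            'command returned a non-zero exit status (%d) for command "%s"\n%s'
            % (result.returncode, " ".join(call), tail)
        )
    return result.stdout


if __name__ == "__main__":
    # A deliberately failing command, run with the current interpreter so
    # that no external tool is assumed to be installed.
    failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
    try:
        run_tool(failing)
    except RuntimeError as err:
        print(err)
```

Keeping the last lines of stderr in the exception means the worker log already contains the tool's own complaint, rather than only the exit status.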
15 changes: 7 additions & 8 deletions Makefile
100644 → 100755
@@ -45,17 +45,16 @@ help:
@echo "$$help"


-python=python2.7
-pip=pip2.7
+python=python
+pip=pip
tests=src/protect/test/unit
extras=

green=\033[0;32m
normal=\033[0m
red=\033[0;31m

prepare: check_venv
-@$(pip) install toil==3.8.0 pytest==2.8.3
+@$(pip) install toil pytest

develop: check_venv
$(pip) install -e .$(extras)
@@ -107,10 +106,10 @@ clean_pypi:

clean: clean_develop clean_sdist clean_pypi


-check_venv:
-@$(python) -c 'import sys; sys.exit( int( not hasattr(sys, "real_prefix") ) )' \
-|| ( echo "$(red)A virtualenv must be active.$(normal)" ; false )
+#always fails, even though in a venv
+#check_venv:
+# @$(python) -c 'import sys; sys.exit( int( not hasattr(sys, "real_prefix") ) )' \
+# || ( echo "$(red)A virtualenv must be active.$(normal)" ; false )


check_clean_working_copy:
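
The commented-out `check_venv` recipe tests for `sys.real_prefix`. That attribute was set by older releases of the third-party `virtualenv` tool, but Python 3's built-in `venv` module does not set it, which would explain why the check fails inside a venv. Under `venv` (PEP 405), `sys.prefix` differs from `sys.base_prefix` instead. A sketch of a check covering both cases follows; the function name `in_virtualenv` is illustrative and not part of this repository.

```python
import sys


def in_virtualenv():
    """Return True when running inside a virtualenv or venv environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (PEP 405) leaves sys.base_prefix pointing at the base installation
    # while sys.prefix points at the environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

In a Python 3 only Makefile, the recipe's one-liner could reduce to `import sys; sys.exit(int(sys.prefix == sys.base_prefix))`, which exits non-zero outside an environment, matching the convention of the original recipe.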
169 changes: 169 additions & 0 deletions ProTECT_config.yaml
@@ -0,0 +1,169 @@
## Copyright 2016 UCSC Computational Genomics Lab
## Original contributor: Arjun Arkal Rao
##
## Licensed under the Apache License, Version 2.0 (the "License");
## you may not use this file except in compliance with the License.
## You may obtain a copy of the License at
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
## See the License for the specific language governing permissions and
## limitations under the License.
####################################################################################################
####################################################################################################
## This is the input parameter file for the precision immuno pipeline. The parameters for each of
## the tools are provided here. The file is written in the YAML format. A nice description of the
## format can be found at http://docs.ansible.com/ansible/YAMLSyntax.html
##
## You can add comments anywhere in this file by prefixing them with a '#'
##
## Unless otherwise mentioned, all fields must be filled.
##
####################################################################################################
####################################################################################################

# Any number of patients/samples can be listed here
patients:
    TEST:
        tumor_dna_fastq_1: /home/drkthomp/gerald_D1VCPACXX_1.bam
        normal_dna_fastq_2: /home/drkthomp/gerald_D1VCPACXX_6.bam
        tumor_rna_fastq_2: /home/drkthomp//gerald_C1TD1ACXX_8_ACAGTG.bam
        tumor_type: 'SKCM'

# These are options that are used by most tools
Universal_Options:
    dockerhub: aarjunrao
    java_Xmx: 12G
    reference_build: hg38 # Acceptable options are hg19, hg38, GRCh37, GRCh38
    # sse_key: /path/to/master.key # Path to the AWS master key. Required if using AWS else optional
    # sse_key_is_master: True # True or False. Required if using AWS else optional
    # gdc_download_token: /path/to/token.txt # Path to the user's GDC download token.
    storage_location: Local # Local or aws:<bucket_name> for where the output must go
    #storage_location: aws:protect-run-xyz
    output_folder: /home/drkthomp/results # Path to where the output must go.
    #mail_to: [email protected] # Email for sending success report.


# These options are for each module. You probably don't need to change any of this!
alignment:
    cutadapt:
        a: AGATCGGAAGAG
        A: AGATCGGAAGAG
        # version: 1.9.1
    star:
        type: star # use starlong if your reads are > 150bp
        index: /home/drkthomp/e/protect-data/star_with_fusion_100bp_readlen_indexes.tar.gz # Use star_without if you set star_fusion = False
        # version: 2.5.2b
    bwa:
        index: /home/drkthomp/e/protect-data/bwa_index.tar.gz
        # version: 0.7.9a
    post:
        samtools:
            # version: 1.2
        picard:
            # version: 1.135

expression_estimation:
    rsem:
        index: /home/drkthomp/e/protect-data/rsem_index.tar.gz
        # version: 1.2.0

mutation_calling:
    indexes:
        chromosomes: canonical_chr, chrM
        genome_fasta: /home/drkthomp/e/protect-data/hg38.fa.tar.gz
        genome_fai: /home/drkthomp/e/protect-data/hg38.fa.fai.tar.gz
        genome_dict: /home/drkthomp/e/protect-data/hg38.dict.tar.gz
        cosmic_vcf: /home/drkthomp/e/protect-data/CosmicCodingMuts.vcf.tar.gz
        cosmic_idx: /home/drkthomp/e/protect-data/CosmicCodingMuts.vcf.idx.tar.gz
        dbsnp_vcf: /home/drkthomp/e/protect-data/dbsnp_coding.vcf.gz
        dbsnp_idx: /home/drkthomp/e/protect-data/dbsnp_coding.vcf.idx.tar.gz
        dbsnp_tbi: /home/drkthomp/e/protect-data/dbsnp_coding.vcf.gz.tbi
    mutect:
        java_Xmx: 2G
        # version: 1.1.7
    muse:
        # version: 1.0rc_submission_b391201
    radia:
        cosmic_beds: /home/drkthomp/e/protect-data/radia_cosmic.tar.gz
        dbsnp_beds: /home/drkthomp/e/protect-data/radia_dbsnp.tar.gz
        retrogene_beds: /home/drkthomp/e/protect-data/radia_retrogenes.tar.gz
        pseudogene_beds: /home/drkthomp/e/protect-data/radia_pseudogenes.tar.gz
        gencode_beds: /home/drkthomp/e/protect-data/radia_gencode.tar.gz
        # version: 398366ef07b5911d8082ed61cbf03d487a41f286
    somaticsniper:
        # version: 1.0.4
        samtools:
            # version: 0.1.8
        bam_readcount:
            # version: 0.7.4
    star_fusion:
        #run: True
        #version: 1.0.0
    fusion_inspector:
        #run_trinity: True
        #version: 1.0.1
    strelka:
        # version: 1.0.15
        config_file: /home/drkthomp/e/protect-data/strelka_bwa_WXS_config.ini.tar.gz


mutation_annotation:
    snpeff:
        index: /home/drkthomp/e/protect-data/snpeff_index.tar.gz
        # version: 3.6
        java_Xmx: 20G

mutation_translation:
    transgene:
        gencode_peptide_fasta : /home/drkthomp/e/protect-data/gencode.v25.pc_translations_NOPARY.fa.tar.gz
        gencode_transcript_fasta : /home/drkthomp/e/protect-data/gencode.v25.pc_transcripts_NOPARY.fa.tar.gz
        gencode_annotation_gtf : /home/drkthomp/e/protect-data/gencode.v25.annotation_NOPARY.gtf.tar.gz
        genome_fasta : /home/drkthomp/e/protect-data/hg38.fa.tar.gz
        # version: 2.2.2

haplotyping:
    phlat:
        index: /home/drkthomp/e/protect-data/phlat_index.tar.gz
        # version: 1.0

mhc_peptide_binding:
    mhci:
        method_file: /home/drkthomp/e/protect-data/mhci_restrictions.json.tar.gz
        pred: IEDB_recommended
        # version: 2.13
    mhcii:
        method_file: /home/drkthomp/e/protect-data/mhcii_restrictions.json.tar.gz
        pred: IEDB_recommended
        # version: 2.13
    netmhciipan:
        # version: 3.1

prediction_ranking:
    rankboost:
        mhci_args:
            npa: 0.0
            nph: 0.0
            nMHC: 0.32
            TPM: 0.0
            overlap: 0.68
            tndelta: 0.0
        mhcii_args:
            npa: 0.2
            nph: 0.2
            nMHC: 0.2
            TPM: 0.2
            tndelta: 0.2
        # version: 2.0.3

reports:
    mhc_pathways_file: /home/drkthomp/e/protect-data/mhc_pathways.tsv.tar.gz
    itx_resistance_file: /home/drkthomp/e/protect-data/itx_resistance.tsv.tar.gz
    immune_resistance_pathways_file: /home/drkthomp/e/protect-data/immune_resistance_pathways.json.tar.gz
    car_t_targets_file: /home/drkthomp/e/protect-data/car_t_targets.tsv.tar.gz
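
The config above points at many machine-specific files under `/home/drkthomp/`. Before launching a run on another machine, it can help to list every path-like value and flag the ones that do not exist. A minimal sketch follows, assuming the YAML has already been loaded into nested dicts (for example with PyYAML's `yaml.safe_load`, which is deliberately not used here so the sketch stays stdlib-only). `iter_paths` and `missing_paths` are hypothetical helper names, and the sample dict is a small excerpt of the file, not the full config.

```python
import os


def iter_paths(node, prefix=()):
    """Yield (key_path, value) for every string value that looks like an absolute path."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_paths(value, prefix + (str(key),))
    elif isinstance(node, str) and node.startswith("/"):
        yield ".".join(prefix), node


def missing_paths(config):
    """Return the (key_path, value) pairs whose file does not exist locally."""
    return [(key, path) for key, path in iter_paths(config) if not os.path.exists(path)]


if __name__ == "__main__":
    # Excerpt of the config above, written as the dict a YAML loader would
    # return. Sections with only commented-out keys load as None.
    config = {
        "Universal_Options": {"reference_build": "hg38", "storage_location": "Local"},
        "alignment": {
            "bwa": {"index": "/home/drkthomp/e/protect-data/bwa_index.tar.gz"},
            "post": {"samtools": None, "picard": None},
        },
    }
    for key, path in missing_paths(config):
        print("missing: %s -> %s" % (key, path))
```

Values such as `hg38` or `Local` are skipped because they do not start with `/`; `aws:` storage locations would be skipped for the same reason.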
5 changes: 4 additions & 1 deletion README.md
@@ -1,7 +1,10 @@
[![Stories in Ready](https://badge.waffle.io/BD2KGenomics/protect.png?label=ready&title=Ready)](https://waffle.io/BD2KGenomics/protect)
# ProTECT
### **Pr**ediction **o**f **T**-Cell **E**pitopes for **C**ancer **T**herapy

Adaptation of ProTECT to use Python 3.8 instead of 2.7. A complete run has been tested using fastq files from [HCC1395 WGS Exome RNA Seq Data](https://github.com/genome/gms/wiki/HCC1395-WGS-Exome-RNA-Seq-Data), but the results have not yet been checked against the [original ProTECT](https://github.com/BD2KGenomics/protect) with TCGA PRAD.

The adaptation was done using 2to3 and manual bug testing. Manual changes are recorded [in changes.md](https://github.com/Dranion/protect/blob/master/changes.md). Since s3am is written for Python 2, this port is **currently local only**; however, an untested Python 3 version of s3am exists [here](https://github.com/Dranion/bd2k-extras/tree/main). Continuing to the original README:

This repo contains the Python libraries for the Precision Immunology Pipeline developed at UCSC.

src/protect/pipeline/ProTECT.py - The python script for running the pipeline.
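
The README text above says the port relied on 2to3 plus manual debugging. One class of problem that 2to3 does not rewrite, and that commonly surfaces in code that shells out to external tools, is that Python 3 returns `bytes` from subprocess pipes where Python 2 returned `str`. The sketch below illustrates that general pitfall only; it is not taken from this commit's changes, and `tool_output` is a hypothetical helper.

```python
import subprocess
import sys


def tool_output(call):
    """Run a command and return its stdout as text (str) rather than bytes."""
    raw = subprocess.check_output(call)  # bytes in Python 3
    return raw.decode("utf-8")


if __name__ == "__main__":
    # The child process prints a tab-separated record.
    call = [sys.executable, "-c", "print('chr17\\t42')"]
    raw = subprocess.check_output(call)
    # Python 2 style code such as raw.split('\t') raises TypeError here,
    # because raw is bytes and the separator is str.
    print(type(raw).__name__)
    fields = tool_output(call).strip().split("\t")
    print(fields)
```

Decoding once at the boundary keeps the rest of the code working with `str`, which is usually less invasive than sprinkling `b'...'` literals through parsing code.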