HATCHet fails with new version of bcftools 1.21 #227

Open
simozacca opened this issue Oct 12, 2024 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@simozacca
Contributor

simozacca commented Oct 12, 2024

bcftools 1.21 requires the query command to state explicitly that the AD field is in INFO. The command used by HATCHet is built here:

https://github.com/raphael-group/hatchet/blob/783058042cc3ef78b3f46ff7478edb5b6dfd9f44/src/hatchet/utils/count_alleles.py#L236C22-L236C53

Specifically, the following bcftools command called by HATCHet fails and hangs with bcftools 1.21:

bcftools mpileup ${BAM} -Ou -f ${REF} --skip-indels -a INFO/AD -q 13 -Q 13 -d 30 | bcftools query -f '%CHROM\\t%POS\\t%REF,%ALT\\t%AD\\n' -i 'SUM(AD)<=1000 & SUM(AD)>=1' > output

failing with the message:

[mpileup] 1 samples in 1 input files
Error: ambiguous filtering expression, both INFO/AD and FORMAT/AD are defined in the VCF header.
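
The error message suggests that naming the INFO field explicitly in both the format string and the filter would remove the ambiguity. Below is a minimal sketch of that idea, not the code in count_alleles.py: the function name `build_query_args` and its parameters are hypothetical, and it relies only on the documented bcftools `%INFO/TAG` format token and `INFO/TAG` expression syntax. Whether this matches the fix eventually applied in HATCHet is not confirmed here.

```python
def build_query_args(min_cov=1, max_cov=1000, explicit_info=True):
    """Return the argv for the `bcftools query` half of the pipeline.

    Hypothetical helper for illustration; not part of HATCHet.
    """
    # With explicit_info=True the tag is always written as INFO/AD, so
    # bcftools does not have to choose between INFO/AD and FORMAT/AD.
    ad = "INFO/AD" if explicit_info else "AD"
    # "\\t" and "\\n" produce a literal backslash escape, which bcftools
    # itself expands to a tab or newline in the output.
    fmt = f"%CHROM\\t%POS\\t%REF,%ALT\\t%{ad}\\n"
    flt = f"SUM({ad})<={max_cov} & SUM({ad})>={min_cov}"
    return ["bcftools", "query", "-f", fmt, "-i", flt]


if __name__ == "__main__":
    print(" ".join(build_query_args()))
```

Passing the arguments as a list (rather than one shell string) also avoids a second layer of quoting around the format and filter expressions.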

The temporary solution is to install HATCHet with an earlier version of samtools/bcftools, such as 1.19, after correctly setting up the bioconda channels:

conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict

conda create -n hatchet hatchet "bcftools==1.19" "samtools==1.19"
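
To confirm which bcftools an environment actually resolved to, the first line of `bcftools --version` (for example `bcftools 1.21`) can be parsed. This is a rough sketch with hypothetical helper names; it treats 1.19 as the last version known to work only because that is the version named in this issue, and says nothing about 1.20.

```python
import re


def parse_bcftools_version(version_output):
    """Extract (major, minor) from the output of `bcftools --version`."""
    match = re.match(r"bcftools (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized bcftools version output")
    return int(match.group(1)), int(match.group(2))


def is_known_good(version_output, last_good=(1, 19)):
    """True if the version is at or below the one reported to work here.

    `last_good` is an assumption taken from this thread, not a tested bound.
    """
    return parse_bcftools_version(version_output) <= last_good
```

In practice the input would come from something like `subprocess.run(["bcftools", "--version"], capture_output=True, text=True).stdout`.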

In the meantime, @mmyers1, can we please fix this, either by making the bcftools command compatible or by changing the bioconda recipe of HATCHet to allow only a compatible version of bcftools/samtools?

@tahashmi

tahashmi commented Jan 15, 2025

Hi, is there any update on this bug? For some reason, the suggested command conda create -n hatchet hatchet "bcftools==1.19" "samtools==1.19" doesn't work on my side; it fails with dependency conflicts.

@RunpengLuo
Collaborator

RunpengLuo commented Jan 15, 2025

Hi tahashmi, I've pushed the fix, so it should work with the default conda recipe for HATCHet v2.1.0. Please try reinstalling it (without pinning specific versions of bcftools and samtools) and let me know if that fixes your problem. Thank you.

@tahashmi

Hi @RunpengLuo, thanks for this. Unfortunately, after removing and installing bcftools manually on our HPC, I am having some issues with my account. Hopefully they will be resolved today, and then I will try again.

@tahashmi

Hi, @RunpengLuo

I have installed a fresh hatchet conda env.

The issue still exists:

(hatchet) -bash-4.4$ python -m hatchet
HATCHet v2.1.0
Usage: hatchet <command> <arguments ..>

The following commands are supported:
 count-reads
 count-reads-fw
 genotype-snps
 count-alleles
 combine-counts
 combine-counts-fw
 cluster-bins
 plot-bins
 compute-cn
 plot-cn
 run
 check
 download-panel
 phase-snps
 cluster-bins-gmm
 plot-cn-1d2d
 plot-bins-1d2d
(hatchet) -bash-4.4$ vi run_1937.sh 
(hatchet) -bash-4.4$ vi hatchet_1937.ini
(hatchet) -bash-4.4$ vi hatchet_2009.ini 
(hatchet) -bash-4.4$ vi run_1937.out 
(hatchet) -bash-4.4$ vi 1937/normal_chr10_bcftools.log

[mpileup] 1 samples in 1 input files
Error: ambiguous filtering expression, both INFO/AD and FORMAT/AD are defined in the VCF header.
[mpileup] maximum number of reads per input file set to -d 1000
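
Since the error only shows up in the per-chromosome log files, it can help to scan a whole run directory for it. The following is a small sketch with a hypothetical helper; it assumes the logs follow the `*_bcftools.log` naming seen above (e.g. `1937/normal_chr10_bcftools.log`) and simply searches for the text of the error message.

```python
from pathlib import Path

# Substring of the error reported by bcftools 1.21 in this issue.
AMBIGUITY_MARKER = "ambiguous filtering expression"


def find_ambiguity_errors(run_dir):
    """Return sorted paths of *_bcftools.log files containing the error."""
    hits = []
    for log in sorted(Path(run_dir).glob("*_bcftools.log")):
        # errors="replace" keeps the scan going if a log has odd bytes.
        if AMBIGUITY_MARKER in log.read_text(errors="replace"):
            hits.append(log)
    return hits
```

For the run shown here, `find_ambiguity_errors("1937")` would list every chromosome log affected by the problem rather than only the one opened by hand.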

@RunpengLuo
Collaborator

hatchet_v2.1.0_conda.yaml.zip

Hi @tahashmi, sorry for the inconvenience. I think Bioconda hasn't synced the latest commit on the main branch yet; I will work on this ASAP and let you know when it is done. In the meantime, would you mind installing HATCHet directly from source? The detailed steps are as follows. Thank you.

  1. Remove the previous HATCHet installation made via Conda.
  2. Create and activate the conda environment from the attached YAML file (skip if all dependencies are already installed):
    • conda env create -f hatchet_v2.1.0.conda.yaml
    • conda activate hatchet-v2.1.0
  3. Download and install HATCHet directly from GitHub:
    • git clone https://github.com/raphael-group/hatchet.git
    • cd hatchet
    • export HATCHET_BUILD_NOEXT=1; #also works if unset.
    • module load gurobi/9.1.2; #unnecessary if using the cbc solver
    • export HATCHET_COMPUTE_CN_SOLVER=gurobi; #change to cbc if the cbc solver is used instead
    • pip install . #before this step, try cleaning the pip cache as well, most likely in $HOME/.cache/pip
    • hatchet check #check all dependencies.

@tahashmi

tahashmi commented Jan 24, 2025

Thanks for the prompt reply. I can wait until you fix this issue. My HATCHet workflow is already set up with the default installation method, and it already took some time to set up Gurobi.


6 participants