Variant Calling (Samtools Mpileup): Error
0
0
Entering edit mode
5.7 years ago
venuraherath ▴ 20

Hi,

I am trying to update a variant calling script came from the predecessors. I am getting the error "bash: mpileup : No such file or directory"

Following are the contents of the old script;

samtools mpileup -uD -f <reference_genome.fna> <input.bam> > <output1.out> [mpileup] 1 samples in 1 input files <mpileup> Set max per-file depth to 8000^C

bcftools view <output1.out>  > <output2.out>
samtools vcfutils.pl varFilter -d2 -D100 -a2 <output2.out> > output3.vcf

I did modify "samtools mpileup" to "bcftools mpileup" and "-uD to -u -t DP" to match new versions. The modified script is given below;

bcftools mpileup -u -t DP -f <reference_genome.fna> <input.bam> > <output1.out> [mpileup] 1 samples in 1 input files  <mpileup> Set max per-file depth to 8000^C

bcftools view <output1.out>  > <output2.out>
samtools vcfutils.pl varFilter -d2 -D100 -a2 <output2.out> > output3.vcf

When I run it , I get above error. I am using samtools 1.9 & bcftools 1.9

Can someone help me on this regard?

Thank you.

variant calling samtools mpileup bcftools • 4.1k views
ADD COMMENT
0
Entering edit mode

Umm, you know you aren't supposed to literally have those <> in your command line, right?

ADD REPLY
0
Entering edit mode

Are your referring to <> between output files? This is just the template. I used real file name and didnt include <>. except for <mpileup> Set max per-file depth to 8000^C

ADD REPLY
0
Entering edit mode

except for <mpileup> Set max per-file depth to 8000^C

You wrote this in your command line by yourself? That doesn't work. This have to be done with -d 8000 as a parameter for bcftools mpileup.

fin swimmer

ADD REPLY
0
Entering edit mode

If I use the following will it do the trick? Thanks.

bcftools mpileup -u -t DP -f -d 8000 <reference_genome.fna> <input.bam> > <output1.out> 

 [mpileup] 1 samples in 1 input files
ADD REPLY
1
Entering edit mode

You should first start bcftools mpileup without any parameter to see the help page. You will see:

  • that there is no paramter -u. You probably want to use -Ou to output as uncompressed bcf?
  • that -t is used for defining target regions . You probably want to use -a FORMAT/DP to add this value into the output?
  • that -f must be followed by the reference sequence.

You are aware that bcftools mpileup itself doesn't call variants? For this bcftools call is neccessary. And this needs the uncompressed bcf output from mpileup as input.

fin swimmer

ADD REPLY
0
Entering edit mode

Oh, got it. Thank you so much.

  • Yes I will use -Ou . Most of the examples available are based on samtools mpileup not bcftools mpileup.
  • Actually what i wanted there was was to keep read depth for each sample. So I will use -a FORMAT/DP
  • I will move -d 8000 before -f

    bcftools mpileup -Ou -a FORMAT/DP -d 8000 -f <reference_genome.fna> <input.bam> > <output1.out>
    [mpileup] 1 samples in 1 input files'

When I try that I get following error;

[E::hts_open_format] Failed to open file [mpileup] [mpileup] failed to open [mpileup]: No such file or directory

ADD REPLY
0
Entering edit mode

Could you please show the real command you use (replacing your placeholder by filenames)?

fin swimmer

ADD REPLY
0
Entering edit mode
bcftools mpileup -Ou -a FORMAT/DP -d 8000 -f GCF_000002315.5_GRCg6a_genomic.fna FCC717NANXX-wHAIPI020726-60.bam > FCC717NANXX-wHAIPI020726-60.out [mpileup] 1 samples in 1 input files
ADD REPLY
1
Entering edit mode

Remove this part: [mpileup] 1 samples in 1 input files

And use just

bcftools mpileup -Ou -a FORMAT/DP -d 8000 -f GCF_000002315.5_GRCg6a_genomic.fna FCC717NANXX-wHAIPI020726-60.bam > FCC717NANXX-wHAIPI020726-60.out

ADD REPLY
0
Entering edit mode

Thank you so much! It is running now. Regarding bcftools call ; I have received VCF files generated via original workflow (as given in the old script). Now I am confused. Do you think it is better to dump those VCF files and re do those samples as well, (esp since dif version of tools were used)?

Omitting rest ,Is it ok to use bcftools call -v -m -O z -o variantc.vcf.gz followed by filtering variants using rtg vcffilter? Sorry for asking too many questions. Thank you.

ADD REPLY
1
Entering edit mode

Hello venuraherath ,

you bcftools call looks fine to me. In general I recommend to normalize your indels after calling. You can chain all these command together:

$ bcftools mpileup -Ou -a FORMAT/DP -d 8000 -f GCF_000002315.5_GRCg6a_genomic.fna FCC717NANXX-wHAIPI020726-60.bam | bcftools call -v -m -Ou | bcftools norm -f GCF_000002315.5_GRCg6a_genomic.fna -Oz -o variantc.vcf.gz

I've never used rtg. My favorite is bcftools filter or bcftools view.

Do you think it is better to dump those VCF files and re do those samples as well, (esp since dif version of tools were used)?

This depends on the aim of your investigation.

fin swimmer

ADD REPLY
0
Entering edit mode

Thanks again. You were very helpful. Earlier script is still running since its using only one thread though 24 threads are available. Do you deposit your script there at Github? If so please give me the link. So I can look at them and learn more.

Have a great day! Venura

ADD REPLY

Login before adding your answer.

Traffic: 1512 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6