bcftools mpileup create an empty output file
1
0
Entering edit mode
3.5 years ago
Assa Yeroslaviz ★ 1.8k

I try to run bcftools to create a vcf file of my data, but I keep getting empty output files

The command I use are

bwa mem -t14 -M ~/genomes/Sce.R64-1-1/bwaIndex/Sce.R64-1-1 HHgDNA1.R1.fastq.gz HHgDNA1.R2.fastq.gz | samtools view -buh - | samtools sort -@ 14 -o bamFiles/bwa/HHgDNA1.sorted.bam
samtools index bamFiles/bwa/HHgDNA1.sorted.bam
bcftools mpileup -A  -t 10 -Ou -f $REF_FASTA bamFiles/bwa/HHgDNA1.sorted.bam

while $REF_FASTA is the fastA file also used to create the bwa indexed genome. I know the output goes to STDOUT, but I'm still trying to figure it out. looking at the bam files i have ~99% mapped and properly paired reads.

All I get is the header of the file, but nothing more.

any ideas what is going on?

thanks

bcftools mpileup samtools bwa bam • 1.8k views
ADD COMMENT
0
Entering edit mode

Are you using it correctly? - where is your bcftools call command? See here: https://github.com/kevinblighe/ClinicalGradeDNAseq/blob/master/AnalysisMasterVersion1.sh#L302-L316

ADD REPLY
0
Entering edit mode

I think I do. I have removed it after I got nothing from the bcftools call command to see why the output file is empty. This is also, why I output it to the STDOUT, but all I get is the header. From what I remember from the samtools mpileup one can also get an output file from calling only the mpileup. But here I get nothing.

ADD REPLY
0
Entering edit mode

So, it functions correctly but just calls nothing / no variants. Are your contigs named correctly?, e.g., 'chr1' versus '1'? Check, also, the header that's produced, as it should record commands that were used during processing - check for anything unusual.

Is $REF_FASTA being expanded correctly in your shell?

ADD REPLY
0
Entering edit mode

Are your contigs named correctly?, e.g., 'chr1' versus '1'? yes as far as I can tell

the header of my bam file

samtools view -H bamFiles/bwa/HHgDNaaA1.sorted.bam 
@HD     VN:1.5  SO:coordinate
@SQ     SN:I    LN:230218
@SQ     SN:II   LN:813184
@SQ     SN:III  LN:316620
@SQ     SN:IV   LN:1531933
@SQ     SN:V    LN:576874
@SQ     SN:VI   LN:270161
@SQ     SN:VII  LN:1090940
@SQ     SN:VIII LN:562643
@SQ     SN:IX   LN:439888
@SQ     SN:X    LN:745751
@SQ     SN:XI   LN:666816
@SQ     SN:XII  LN:1078177
@SQ     SN:XIII LN:924431
@SQ     SN:XIV  LN:784333
@SQ     SN:XV   LN:1091291
@SQ     SN:XVI  LN:948066
@SQ     SN:Mito LN:85779
@PG     ID:bwa  PN:bwa  VN:0.7.17-r1188 CL:bwa mem -t14 -M bwaIndex/Sce data/HHgDNA1.conc.R1.fastq.gz data/HHgDNA1.conc.R2.fastq.gz

the fastA file contains the chromosomes

$ grep ">" Sce.R64-1-1.fa
>I dna:chromosome chromosome:R64-1-1:I:1:230218:1 REF
>II dna:chromosome chromosome:R64-1-1:II:1:813184:1 REF
>III dna:chromosome chromosome:R64-1-1:III:1:316620:1 REF
>IV dna:chromosome chromosome:R64-1-1:IV:1:1531933:1 REF
>V dna:chromosome chromosome:R64-1-1:V:1:576874:1 REF
>VI dna:chromosome chromosome:R64-1-1:VI:1:270161:1 REF
>VII dna:chromosome chromosome:R64-1-1:VII:1:1090940:1 REF
>VIII dna:chromosome chromosome:R64-1-1:VIII:1:562643:1 REF
>IX dna:chromosome chromosome:R64-1-1:IX:1:439888:1 REF
>X dna:chromosome chromosome:R64-1-1:X:1:745751:1 REF
>XI dna:chromosome chromosome:R64-1-1:XI:1:666816:1 REF
>XII dna:chromosome chromosome:R64-1-1:XII:1:1078177:1 REF
>XIII dna:chromosome chromosome:R64-1-1:XIII:1:924431:1 REF
>XIV dna:chromosome chromosome:R64-1-1:XIV:1:784333:1 REF
>XV dna:chromosome chromosome:R64-1-1:XV:1:1091291:1 REF
>XVI dna:chromosome chromosome:R64-1-1:XVI:1:948066:1 REF
>Mito dna:chromosome chromosome:R64-1-1:Mito:1:85779:1 REF

Check, also, the header that's produced, as it should record commands that were used during processing - check for anything unusual.

This is the output I get from the bcftools mpielup call

bcftools mpileup -A -t 10 -Ou -f /home/yeroslaviz/poolFolders/pool-bcfngs/genomes/Sce.R64-1-1.fa bamFiles/bwa/HHgDNaaA1.sorted.bam 
[mpileup] 1 samples in 1 input files
BCFa
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed",IDX=0>
##bcftoolsVersion=1.8+htslib-1.8
##bcftoolsCommand=mpileup -A -t 10 -Ou -f /home/yeroslaviz/poolFolders/pool-bcfngs/genomes/Sce.R64-1-1.fa bamFiles/bwa/HHgDNaaA1.sorted.bam
##reference=file:///home/yeroslaviz/poolFolders/pool-bcfngs/genomes/Sce.R64-1-1.fa
##contig=<ID=I,length=230218,IDX=0>
##contig=<ID=II,length=813184,IDX=1>
##contig=<ID=III,length=316620,IDX=2>
##contig=<ID=IV,length=1531933,IDX=3>
##contig=<ID=V,length=576874,IDX=4>
##contig=<ID=VI,length=270161,IDX=5>
##contig=<ID=VII,length=1090940,IDX=6>
##contig=<ID=VIII,length=562643,IDX=7>
##contig=<ID=IX,length=439888,IDX=8>
##contig=<ID=X,length=745751,IDX=9>
##contig=<ID=XI,length=666816,IDX=10>
##contig=<ID=XII,length=1078177,IDX=11>
##contig=<ID=XIII,length=924431,IDX=12>
##contig=<ID=XIV,length=784333,IDX=13>
##contig=<ID=XV,length=1091291,IDX=14>
##contig=<ID=XVI,length=948066,IDX=15>
##contig=<ID=Mito,length=85779,IDX=16>
##ALT=<ID=*,Description="Represents allele(s) other than observed.">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.",IDX=1>
...
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  bamFiles/bwa/HHgDNaaA1.sorted.bam

I can't see anything out of the ordinary.

Is $REF_FASTA being expanded correctly in your shell?

yes, but I have tried it with both the full path and the variable with no differences in the results

ADD REPLY
1
Entering edit mode
3.5 years ago
Assa Yeroslaviz ★ 1.8k

ok, the problem was using the wrong threads parameter. Somehow I assumed that the threads number can be called with -t. But unfortunately this is a region parameter, and as such for a chromosome I don't have in my data set (as I have I,II,III, etc)

so calling it correctly would be

bcftools mpileup -A --threads 10 -Ou -f $REF_FASTA bamFiles/bwa/HHgDNA1.sorted.bam
ADD COMMENT
0
Entering edit mode

Thanks for posting the answer. -t may have been used for threads in previous versions of bcftools or samtools - not sure.

ADD REPLY

Login before adding your answer.

Traffic: 2434 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6