Samtools mpileup not providing correct count info
1
1
Entering edit mode
8.8 years ago
jgbradley1 ▴ 110

I simulated reads for a list of variant sites using GATK SimulateReadsForVariants tool. From that, I get a bam file output. Next I create a pileup using samtools mpileup.

samtools mpileup -t DP,DPR,DV,INFO/DPR -vuf GRCh37.fa -l snp.file.vcf simreads.bam > simreads.raw.vcf

The problem is that the output is not giving me correct counts at the variant sites. Here's an example of the first two SNP sites from the pileup.

#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    NA12878
1    837214    .    G    <X>    0    .    DP=20;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0;DPR=0,0    PL:DP:DV:DPR    0,0,0:0:0:0,0
1    851390    .    G    <X>    0    .    DP=20;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0;DPR=0,0    PL:DP:DV:DPR    0,0,0:0:0:0,0

So why would all the counts (with the exception of the DP tag) be 0? I opened up the bam file in IGV to check the first couple of sites and the reads do pileup there as they should.

pileup mpileup samtools • 3.6k views
ADD COMMENT
0
Entering edit mode

Can you post the BAM file (or a small subset of it) and the version of samtools that you're using?

ADD REPLY
2
Entering edit mode
8.8 years ago
jgbradley1 ▴ 110

I figured out the solution myself. I had simulated more error in my reads than I initially thought. For anyone else experiencing a similar issue, check the -Q, --min-BQ mpileup flag. Samtools automatically does not count reads in the other tags unless they are "high quality" (i.e. have a minimum base quality of 13). Depending on your experiment, if you're only interest is getting all the reads counted then consider using --min-BQ 1.

ADD COMMENT

Login before adding your answer.

Traffic: 2702 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6