Question: Samtools mpileup reference is different from input reference fasta
0
gravatar for julirsch
2.4 years ago by
julirsch0
julirsch0 wrote:

When I run samtools (v1.3.1) mpileup I am running into some very concerning issues. The reference bases for the output pileup file are not in the same position as the references in the input fasta. In fact, they appear to be oddly shifted. For example, here is output for 10 variants from the following samtools pileup command:

samtools mpileup -f mtDNA.fasta -r chrM -Q 0 -t AD,ADF,ADR --skip-indels input.bam > input.txt

chrM 1330 C 457 G$G$GGGgggtg...

chrM 1331 A 461 T$T$T$tttttT...

chrM 1332 A 455 c$c$c$c$c$CC...

chrM 1333 G 449 A$A$A$A$A$A$...

chrM 1334 G 463 A$,$,$AAaaaa...

chrM 1335 T 487 GGgggggggggg...

chrM 1336 G 529 .$.$,$,$,$,,...

chrM 1337 T 524 ,$,$,$,$,$,$...

chrM 1338 A 503 g$g$g$g$g$g$...

chrM 1339 G 497 TTcccctttTTT...

chrM 1340 C 506 A$A$ttttaaaA...

Here is output from faidx, showing the correct fasta sequence for these positions:

samtools faidx mtDNA.fasta chrM:1330-1340

chrM:1330-1340 GTCAAGGTGTA

Notice that the pileup format is shifted by 2 bases. Looking at this more closely, I identified some positions where the reference sequence would be "CCCCCCCCTCCCCCC" (8 Cs, 1 T, 6 Cs) but samtools had converted this to "CCCCCCCTCCCCC" (7 Cs, 1 T, 5 Cs).

samtools variant calling vcf • 995 views
ADD COMMENTlink modified 2.4 years ago by Brian Bushnell16k • written 2.4 years ago by julirsch0

Please don't post questions as tools. Instead, post them as questions.

ADD REPLYlink written 2.4 years ago by Brian Bushnell16k

Cross-posted to samtools-help and likely solved there. Please pick one forum at a time…

ADD REPLYlink modified 2.4 years ago • written 2.4 years ago by John Marshall1.5k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1115 users visited in the last hour