Tuning Consensus Calling with bcftools/samtools
1
1
Entering edit mode
7.6 years ago
nickp60 ▴ 60

I mapping fasta reads (200bp) to a short (~5kb) reference that has N's in it. Coverage is very low (1-4x), as this isn't NGS data, but rather a consensus calling project.

The procedure, in short, is as follows:

# index reference
smalt index reference.fasta reference.fasta 
# map reads to reference, sort, save as bam
smalt map reference.fasta read.fasta | samtools sort - > sorted.bam
# make pileup, call consensus, and index
bcftools mpileup -Ou -f reference.fasta sorted.bam | bcftools call -P 9.9e-1 -mv -Oz -o calls.vcf.gz ; tabix  calls.vcf.gz
# get consensus 
 cat reference.fasta | bcftools consensus calls.vcf.gz > consensus.fa

The issue I am having is that none of the N's are replaced, even when there is coverage. SNPs are called when applicable, but none of the N's are resolved. I am assuming that it does that because there is only a 1x coverage.

Here is a snippet of the result of samtools mpileup:

1_seq 5403 N 1 c ~

1_seq 5404 N 1 a ~

1_seq 5405 N 1 g ~

1_seq 5406 N 1 g ~

1_seq 5407 N 1 c ~

1_seq 5408 N 1 a ~

1_seq 5409 N 1 t ~

1_seq 5410 C 1 , ~

1_seq 5411 A 1 , ~

1_seq 5412 A 1 , ~

1_seq 5413 A 1 , ~

Question: Give this (non-standard) use case, is there a way to tune the consensus calling parameters to favor replacement of N's?

Thanks in advance!

bcftools samtools software error sequence • 2.6k views
ADD COMMENT
1
Entering edit mode
7.6 years ago
nickp60 ▴ 60

Well, having heard back nothing, here is the Python workaround I am using. It needs work, but if anyone else finds themselves in a similar situation, it may be a good starting point. This is a VERY crude consensus caller, where N's are replaced when there is uncontested resolution, and reference bases are replaced with mapped sequence when unanimous. Weighting/qual scores are ignored, and only the most cursory handling of indels is implemented. Github Gist for crude mpileup consensus calling. That being said, Any further direction is appreciated!

ADD COMMENT

Login before adding your answer.

Traffic: 2230 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6