samtools mpileup not calling small (6 base) INSERT
1
0
Entering edit mode
5.5 years ago
ALchEmiXt ★ 1.9k

Hi All,

I have probably a simple problem but which I cannot solve as of yet and puzzles me quite a bit. Using samtools mpileup and bcftools I am unable to call a small 6 base INSERT compared to my reference. Single SNPs are called ok. I have looked at many unanswered questions and potential answers here at BioStar and this question/answer gets closest but none actually has a working answer!

My play aorund data:

• Illumina PE 250 dataset of bacterial genome (2Mb at coverage 5k+)
• Assembled and reconstructed a genome from it
• If I check for SNPs by mapping the raw reads to the fresh assembly none are found (underlining the correct assembly)

Next:

• I created an artificial 2 position SNP and somewhere else I deleted 6 bases in the assembly
• I mapped the original reads which should detect the SNPs and the INSERT. Hoewever I only see the SNPs in the vcf and not the INSERT

My mapping is done with bowtie2 --local-sensitive settings to the ArtificialREFERENCE-6del sequence

Sam file is converted to bam and sorted using samtools

Then I call variants:

samtools mpileup -uf ArtificialREFERENCE-6del.fasta SORTED_BAMfile.bam -o SAMPLE_file.bcf

bcftools call -vm -o SAMPLE_file.vcf SAMPLE_file.bcf


I tried different bcftools settings like -c and -p 0.5 ... no luck...

Any help would be appreciated! It is probably something very obvious but I fail to see it from the documentation.

NB: I tried samtools/bcftools version 1.3.1 but also an older 1.1

NB2: I tried the additional options and FLAGS as proposed in the above linked post without luck.

NB3: The below is an IGV snippet of the involved BAM file clearly showing there IS a 6 base insert.

samtools mpileup INDEL prokaryote vcf • 2.1k views
0
Entering edit mode

No one sees this behaviour?

I can confirm that it not only occurs with my data but also with at least another case when mapping/consensus calling viral sequences!

0
Entering edit mode
5.2 years ago
sb1 • 0

Hi,

Did you find out why mpileup was unable to see the event? I seem to be having the same problem but with human data.

0
Entering edit mode

If you know a specific example in your data, please try to pileup the surrounding portion with  samtools mpileup --output-tags AD,DP -uf $REFERENCE -r$REGION_OF_INTEREST $BAMFILE | bcftools call -m  with $REGION_OF_INTEREST begin something like chr1: 10000-10050 if your expected indel is at chr1:10025, i.e., 25 bases up- and downstream of the expected position.

This command skips the -v, --variants-only to have all alternative alleles to be listed. If the insertion is listed, please check the genotype. My guess is, that the genotype is "0/0", which - from my understanding - suggests that samtools statistical methods infer the sample to be wild-type, i.e., not contain the insertion. If this is the case you will have to dig deep into the statistical methods of variant calling or ask the authors of samtools ;-)

If the insertion does not appear even you left away the -v, then there is probably some problem with the data or some bug.

0
Entering edit mode

will check that and post back later.

0
Entering edit mode

no not yet. I had to give it a rest for a while, but it still puzzles me.