Hi all, I have several samples that have been sequenced on a MiSeq platform using a paired end protocol. Each read set of each sample was individually aligned using BWA (to the relevant reference sequence). Paired read sets were merged using samtools (option sampe) to produce the .sam files. The .sam files were then compressed, sorted and indexed (again using samtools) to produced the .bam; .sorted.bam and .sorted.bam.bai files. Now, I use the mpileup command to create the pileup file (sample command: samtools mpileup -B -f $reference $sorted_bam_file > $out_pileup). Here is the problem: When I visualize a region of the alignment using IGV, the depth of coverage reported by IGV does not agree with that reported by the mpileup command in any given position. Has anyone else faced this problem? Is there a solution? TIA, Anjan
Samtools mpileup will ignore bases with Q<13. You could try using -Q 0 to disable filtering of low-quality bases. Also, note that samtools mpileup will ignore all duplicates. Again, this will affect the reported depth. I do not know of a way to disable this feature (and really wouldn't want to, anyway).
Edit: As pointed out by others, there are filters both on the samtools side and the IGV side. In general, when comparing the outputs of software, one needs to first determine if the outputs should be the same. In this case, there is good reason to believe that they should not be identical.
I am just cross-referencing another thread from SEQAnswers, http://seqanswers.com/forums/showthread.php?t=19702
Quote: "I found out that samtools filters reads before including them in the pileup; it reads the flag field in the bam file and discards reads that
- are not paired
- not properly mapped
- mate is not mapped
- alignment is not primary
- reads fail quality control of vendor
- is marked as PCR duplicates.
If the filters (1) and (3) are not desired, you can use the parameter -A. In addition, samtools performs realignment unless the parameter -B is used, and discards low quality reads unless -Q0 is used. Finally, it stops at a certain number of reads unless the -d parameter is invoqued."
Command I used:
samtools mpileup -ABQ0 -d 1000000000 -f ref.fasta -l exome.bed mybam.bam