Question: Samtools mpileup output: missing fields?
3.1 years ago, cjaln wrote:

Hi, I ran the following code in samtools version 1.3.1 and I am confused by the output, leaving me a little worried that I have missing data. So here's hoping someone can prove me wrong or right!

samtools mpileup -q 250 -Q 20 -f dmel-all-chromosome-r6.01.fasta -b Sample1Sample2.txt | gzip > Sample1Sample2.mpileup.gz

I set -q 250 because the RNA aligner I used, STAR, sets mapping quality to 255 for unique alignments, so a threshold of 250 keeps only those. The files listed in Sample1Sample2.txt were two BAM files, one for each sample. The reason I am confused about the output is the presence of < and > symbols. Here's an example of a line of my mpileup:

2R      14442944        G       48      >><>>><<<<<<<<<><<<><<<<><<<<<>><>>><><<>>>>>><.        FFBFFFF<FFFFFFFFFFFFFFFFFFFFFFFBBFFFFFFFFFF<FFFF       
75      <<<><<><<<<<<<<<<<<<<<<<<<<<<<<<>>><<<<<<<<<>>>><<>>>>>>>>>><>><>><>><><>><  FFF<FFF<FFBBBBBBBFBBBFBFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFF<FBFFFFFFFFFFFF

Why do < and > appear in the fifth column alongside base calls (e.g. ".") rather than in the sixth column as depicted in this article? I thought it might have something to do with symbolising spliced reads, as I am working with RNA-seq, but I just don't know.

Thanks in advance, Chris

(Edit: sorry, the text editor is making the line of pileup look funny, ignore the br= and p= stuff)


Visualize with IGV and check for an insertion/deletion at 2R:14442944.

(comment written 3.1 years ago by Pierre Lindenbaum)

3.1 years ago, John Marshall (Glasgow, Scotland) wrote:

The article you're looking at is an ancient web page discussing the old pileup format. That format is similar to, but simpler than, the mpileup format that replaced it when the samtools mpileup command arrived quite some years ago.

Instead you should look at the (admittedly terse) description of mpileup format in the mpileup section of the samtools man page. This notes that one of the additions in mpileup format is that < and > indicate reference skips.

So this particular position is covered by an N CIGAR operator in practically all of your reads.
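
If you want to check this across many positions rather than by eye, something like the following rough Python sketch could tally the symbols in the bases column for each sample. It assumes the default mpileup layout of three columns per sample (depth, bases, qualities) after chromosome, position and reference base; extra per-sample columns from other output options would break that assumption. The function names classify_pileup_bases and parse_mpileup_line are made up for illustration and are not part of samtools, and the example line is a shortened stand-in for the one in the question.

```python
from collections import Counter


def classify_pileup_bases(bases: str) -> Counter:
    """Tally the symbols in one mpileup bases column (hypothetical helper)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # '^' marks a read start and is followed by one mapping-quality character
            i += 2
            continue
        if c == "$":
            # '$' marks a read end and carries no base of its own
            i += 1
            continue
        if c in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j] or 0)
            counts["indel"] += 1
            i = j + length
            continue
        if c in "<>":
            counts["ref_skip"] += 1   # position lies inside an N CIGAR operation
        elif c in ".,":
            counts["match"] += 1      # matches the reference (forward/reverse strand)
        elif c in "*#":
            counts["deletion"] += 1   # placeholder for a deleted base
        else:
            counts["mismatch"] += 1   # A/C/G/T/N or lowercase equivalents
        i += 1
    return counts


def parse_mpileup_line(line: str):
    """Split one mpileup line, assuming three columns per sample (an assumption)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    samples = []
    for k in range(3, len(fields) - 2, 3):
        depth = int(fields[k])
        samples.append((depth, classify_pileup_bases(fields[k + 1])))
    return chrom, pos, ref, samples


if __name__ == "__main__":
    # Shortened, illustrative line: two samples, mostly reference skips
    example = "2R\t14442944\tG\t5\t>><<.\tFFBFF\t3\t<<>\tFFF"
    chrom, pos, ref, samples = parse_mpileup_line(example)
    for n, (depth, counts) in enumerate(samples, start=1):
        print(f"{chrom}:{pos} sample {n}: depth={depth} {dict(counts)}")
```

A position where ref_skip accounts for nearly all of the depth, as in your line, is one that sits inside the spliced-out (N) part of most reads rather than one with missing data.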


Thanks a lot John :)

(reply written 3.1 years ago by cjaln)