Question: Calculating Coverage From Pileup File To Find Gene Duplication Events
thecuriousbiologist (United States) wrote, 6.5 years ago:


I have a pileup file like the one below:

seq1 272 T 24  ,.$.....,,.,.,...,,,.,..^+. <<<+;<<<<<<<<<<<=<;<;7<&
seq1 273 T 23  ,.....,,.,.,...,,,.,..A <<<;<<<<<<<<<3<=<<<;<<+
seq1 274 T 23  ,.$....,,.,.,...,,,.,...    7<7;<;<<<<<<<<<=<;<;<<6
seq1 275 A 23  ,$....,,.,.,...,,,.,...^l.  <+;9*<<<<<<<<<=<<:;<<<<

I need to compute the per-gene coverage from this pileup file. If a gene's coverage is above a certain threshold, I want to treat that as a gene duplication event.

How can I go about solving this problem?

The only file that I have is the pileup file. I don't have a BAM file for this.

Joseph Hughes (Scotland, UK) wrote, 6.5 years ago:

The 5th column lists the read bases at that position. The symbols . and , match the reference allele on the forward and reverse strand respectively, while A, C, G, T (lowercase a, c, g, t on the reverse strand) are alternate alleles. A deleted base is represented by *, $ marks the end of a read, and ^ marks the start of a read; the single character right after ^ encodes the mapping quality of that read and is not a base. So all you need to do in your favourite scripting language is count the . , and A/C/G/T characters (either case) in column 5, skipping the character after each ^, and that will give you the coverage at that particular position.

Hope that helps, Joseph
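
As a rough illustration of the counting described above, here is a minimal Python sketch. The function name count_bases is hypothetical, and the sketch makes two assumptions beyond the answer: it also skips indel annotations of the form +2AG or -1t (a sign, a length, then that many bases), and it counts N/n as a base. Deleted-base placeholders (*) are not counted, following the answer.

```python
import re

# An indel annotation looks like "+2AG" or "-1t": sign, length, then the sequence.
_INDEL_LENGTH = re.compile(r"[+-](\d+)")


def count_bases(read_bases: str) -> int:
    """Count aligned bases in the 5th column of a pileup line."""
    depth = 0
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            # '^' starts a read; the next character is its mapping quality.
            i += 2
        elif ch in "+-":
            # Skip the length digits and the inserted/deleted sequence.
            m = _INDEL_LENGTH.match(read_bases, i)
            i = m.end() + int(m.group(1)) if m else i + 1
        elif ch in ".,ACGTNacgtn":
            depth += 1
            i += 1
        else:
            # '$' (end of read), '*' (deleted base) and anything else: not counted.
            i += 1
    return depth


# Column 5 of the first line in the question (position 272, reported depth 24).
print(count_bases(",.$.....,,.,.,...,,,.,..^+."))  # prints 24
```

On the four example lines in the question, this count agrees with the depth already reported in column 4.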


Thanks. Can I just use the 4th column directly to find the mean for specific regions, rather than looking at the 5th column?

Let's say I have a gene that covers the 2nd, 3rd and 4th positions in the example above (273 to 275). Can I not just add 23+23+23 and divide by 3? That would mean I have 23X coverage for this gene. Is that correct?

Reply written 6.5 years ago by thecuriousbiologist

Yes, you can simply use the 4th column, and your average coverage calculation is correct.

Reply written 6.5 years ago by JC
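
The averaging discussed in this exchange could be sketched in Python as follows. The names mean_depth and is_duplicated, the baseline depth, and the 1.5x ratio threshold are all hypothetical choices for illustration, not values from this thread. The sketch divides by the length of the region rather than by the number of pileup lines found, on the assumption that positions with no reads may be missing from the file and should count as zero.

```python
from typing import Iterable


def mean_depth(pileup_lines: Iterable[str], chrom: str, start: int, end: int) -> float:
    """Mean of column 4 (depth) over chrom:start-end, both ends inclusive."""
    total = 0
    for line in pileup_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            total += int(fields[3])
    # Positions absent from the pileup contribute zero depth.
    return total / (end - start + 1)


def is_duplicated(gene_depth: float, baseline_depth: float,
                  ratio_threshold: float = 1.5) -> bool:
    """Flag a gene whose depth is at least ratio_threshold times the baseline.

    ratio_threshold is an arbitrary placeholder; choose it for your own data.
    """
    return gene_depth >= ratio_threshold * baseline_depth


# First five columns of the example lines (quality column omitted for brevity).
lines = [
    "seq1 272 T 24 ,.$.....,,.,.,...,,,.,..^+.",
    "seq1 273 T 23 ,.....,,.,.,...,,,.,..A",
    "seq1 274 T 23 ,.$....,,.,.,...,,,.,...",
    "seq1 275 A 23 ,$....,,.,.,...,,,.,...^l.",
]

print(mean_depth(lines, "seq1", 273, 275))  # prints 23.0
```

A baseline for is_duplicated could be, for example, the mean or median depth over all genes in the same sample; which baseline is appropriate depends on the data and is left open here.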