Question: How To Figure Out The Base On The Special Positions
gravatar for camelbbs
6.7 years ago by
camelbbs670 wrote:

Assume I have RNASeq data and a list of chromosome loci. Can I use some tools to get an output like this:

chr 2032121  A:20  T:350  C:300  G:55
chr 12098311  A:190  T:0  C:20  G:760

I want to calculate in each loci, how many A/T/C/G in the RNASeq reads. I am using samtools mpileup -l FILE -f hg19.fa in1.bam. But seems that can only identify the base on the reference genome. Thanks for help.

rnaseq snp • 1.5k views
ADD COMMENTlink modified 6.7 years ago • written 6.7 years ago by camelbbs670

Pileup format gives you information about variations, it is shown here, I think this discussion is also highly relevant How to get a summary for a special chromosome position in BAM file

ADD REPLYlink modified 6.7 years ago • written 6.7 years ago by Pavel Senin1.9k

I wrote a script to parse that info from mpileup output a while back:

However, I am not sure that it works 100% correctly. So I would spot check the output.

ADD REPLYlink written 6.7 years ago by Damian Kao15k

Thanks Damian. I have a question here: I see if I have 2000 positions, but mpileup will only calculate about 1200 positions, is there a parameter that I can figure out on the all positions. Or does that mean the left 800 positions are in the sequencing gaps?

ADD REPLYlink written 6.6 years ago by camelbbs670
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1836 users visited in the last hour