How To Figure Out The Base On The Special Positions
0
0
Entering edit mode
10.9 years ago
camelbbs ▴ 710

Assume I have RNASeq data and a list of chromosome loci. Can I use some tools to get an output like this:

chr 2032121  A:20  T:350  C:300  G:55
chr 12098311  A:190  T:0  C:20  G:760

I want to calculate in each loci, how many A/T/C/G in the RNASeq reads. I am using samtools mpileup -l FILE -f hg19.fa in1.bam. But seems that can only identify the base on the reference genome. Thanks for help.

rnaseq snp • 2.1k views
ADD COMMENT
1
Entering edit mode

Pileup format gives you information about variations, it is shown here, I think this discussion is also highly relevant How to get a summary for a special chromosome position in BAM file

ADD REPLY
1
Entering edit mode

I wrote a script to parse that info from mpileup output a while back: http://blog.nextgenetics.net/?e=56

However, I am not sure that it works 100% correctly. So I would spot check the output.

ADD REPLY
0
Entering edit mode

Thanks Damian. I have a question here: I see if I have 2000 positions, but mpileup will only calculate about 1200 positions, is there a parameter that I can figure out on the all positions. Or does that mean the left 800 positions are in the sequencing gaps?

ADD REPLY

Login before adding your answer.

Traffic: 1918 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6