Screen reads with specific sub-sequences from BAM file
6.0 years ago
yuabrahamliu ▴ 60

Hi all,

Could anyone give me some help?

I have an RNA-seq BAM file. I want to select the reads in it that carry an extra run of consecutive A's at their 3' end. That is, each such read can be divided into two parts: the 5' part aligns well to the genome, while the 3' part does not align and consists entirely of consecutive A's. These reads are considered to derive from polyA addition to the RNA during transcription, and they can be used to define the polyA addition sites of the RNA.

My question is how to screen these reads from the BAM file. It may be easy for some people, but I really don't know how to do it. Could anyone give some suggestions? Thank you so much.

RNA-Seq screen bam reads sequence • 1.1k views
6.0 years ago

You should be able to do this with a little grep. Something like

samtools view sort.bam | cut -f 1,10 | grep 'AAAAAAA$'
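The grep above matches any read whose stored sequence ends in A's, whether or not those A's aligned. If the aligner soft-clips the unaligned tail, a stricter filter could look at the CIGAR string as well. Below is a rough sketch of that idea, not a tested pipeline: `polya_softclip` is a made-up helper name, the default minimum tail length of 7 is arbitrary, and it assumes the reads are in the sense orientation of the transcript. For reads mapped to the reverse strand (flag bit 0x10), BAM stores the reverse-complemented sequence, so the polyA tail shows up as soft-clipped T's at the start of the sequence.

```shell
# Hypothetical helper: reads SAM records on stdin and prints the name and
# sequence of reads whose soft-clipped 3' tail is made only of A's.
# $1 = minimum tail length (assumed default: 7).
polya_softclip() {
  awk -v min="${1:-7}" 'BEGIN { FS = OFS = "\t" }
  /^@/ { next }                        # skip SAM header lines
  {
    flag = $2; cigar = $6; seq = $10
    rev = int(flag / 16) % 2           # bit 0x10: read mapped to reverse strand
    if (!rev && match(cigar, /[0-9]+S$/)) {
      # forward strand: soft clip at the end of the stored sequence
      n = substr(cigar, RSTART, RLENGTH - 1) + 0
      tail = substr(seq, length(seq) - n + 1)
      if (n >= min && tail ~ /^A+$/) print $1, seq
    } else if (rev && match(cigar, /^[0-9]+S/)) {
      # reverse strand: tail appears as leading T bases in the stored sequence
      n = substr(cigar, 1, RLENGTH - 1) + 0
      head = substr(seq, 1, n)
      if (n >= min && head ~ /^T+$/) print $1, seq
    }
  }'
}
```

It would be used the same way as the one-liner, e.g. `samtools view sort.bam | polya_softclip 7`. Reads with hard clips next to the soft clip, or tails containing a sequencing error, are not handled by this sketch.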
