Retrieving Gene Names From 3'-End Aligned Reads
11.5 years ago
GPR ▴ 390

Hello, I am planning an experiment in which I will collect RNA-seq data from a 3'-polyA capture library. My question is: after alignment with Bowtie, what is the best way to retrieve the gene names of the loci to which the reads aligned? These are short reads that map to the 3' end of transcripts and that's it. Any ideas would be appreciated. G.

bowtie • 2.0k views

Do you have a gene model file for the organism? If so, after mapping it's basically the same procedure as for standard RNA-seq. You can just count the number of reads mapped to each gene, even though they map only to the 3' region of the gene. If necessary, to filter out certain mapping errors, you could also choose a window size at the 3' end of each gene and obtain counts over that region only. The genes with counts are the ones you are interested in (typically after applying a threshold on the read counts per million, cpm).
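
To make the window idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: genes are given as (name, chrom, start, end, strand) tuples with 1-based inclusive coordinates, each read is reduced to a single (chrom, position) pair, read strand is ignored, and the 500 bp window and the function names are hypothetical choices, not something from a particular tool.

```python
def three_prime_window(start, end, strand, window=500):
    """Return (start, end) of the last `window` bases of a gene.

    Coordinates are 1-based and inclusive. On the minus strand the
    3' end is the lowest coordinate, so the window sits at the start.
    """
    if strand == "+":
        return max(start, end - window + 1), end
    return start, min(end, start + window - 1)


def count_reads(genes, reads, window=500):
    """Count reads falling inside the 3' window of each gene.

    genes: iterable of (name, chrom, start, end, strand)
    reads: iterable of (chrom, position); strand is ignored (assumption)
    Naive O(genes * reads) loop, fine for a sketch but slow on real data.
    """
    counts = {name: 0 for name, *_ in genes}
    for name, chrom, start, end, strand in genes:
        w_start, w_end = three_prime_window(start, end, strand, window)
        for read_chrom, pos in reads:
            if read_chrom == chrom and w_start <= pos <= w_end:
                counts[name] += 1
    return counts


def cpm(count, total_mapped_reads):
    """Counts per million mapped reads."""
    return count / total_mapped_reads * 1_000_000


def genes_above_threshold(counts, total_mapped_reads, min_cpm=1.0):
    """Keep the gene names whose cpm reaches `min_cpm` (hypothetical cutoff)."""
    return sorted(
        name for name, c in counts.items()
        if cpm(c, total_mapped_reads) >= min_cpm
    )
```

For real data you would read the gene models from the annotation file and the read positions from the alignment output rather than from hand-written tuples, and you would want an interval index instead of the nested loop.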

11.5 years ago

You need a genome annotation file, preferably in GTF or GFF format, for the genes you want to look at.

I have used htseq-count (part of HTSeq) a lot for counting reads that overlap the features in an annotation: http://www-huber.embl.de/users/anders/HTSeq/doc/count.html#count
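
As a rough sketch of how this could be wired up from Python: the helper below only builds an htseq-count command line and parses its tab-separated output. The file names `aligned.sam` and `genes.gtf` are placeholders, the function names are hypothetical, and the option values (`-s no`, `-t exon`, `-i gene_id`, `-m union`) are assumptions that need to match your library and annotation. The parser also assumes the summary rows are prefixed with `__`, as in more recent HTSeq releases.

```python
def htseq_count_command(sam_path, gtf_path, stranded="no"):
    """Build an htseq-count command as an argument list.

    stranded: "yes", "no" or "reverse"; pick what matches the library prep.
    """
    return [
        "htseq-count",
        "-f", "sam",        # input alignment format
        "-s", stranded,     # strandedness of the library
        "-t", "exon",       # feature type to count over
        "-i", "gene_id",    # GTF attribute used as the gene name
        "-m", "union",      # overlap resolution mode
        sam_path,
        gtf_path,
    ]


def parse_htseq_output(text):
    """Turn 'gene<TAB>count' lines into a dict, skipping '__' summary rows."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t")
        if name.startswith("__"):  # e.g. __no_feature, __ambiguous
            continue
        counts[name] = int(value)
    return counts


if __name__ == "__main__":
    # Placeholder paths; print the command rather than running it.
    print(" ".join(htseq_count_command("aligned.sam", "genes.gtf")))
```

The resulting list can be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout handed to `parse_htseq_output`.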
