Quantification of genes of interest using Bowtie 2 and eXpress
8.6 years ago

I am interested in the expression levels of about 100 genes in fish and would like to measure them using publicly available RNA-Seq data. My strategy is to build a reference set containing only the sequences of the genes of interest, index it with Bowtie 2, and then align publicly available RNA-Seq data from SRA (filtered using the SRA Toolkit) against it. The resulting SAM file is then passed to eXpress to estimate the expression level of each gene as an FPKM value.
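To make the strategy concrete, here is a minimal sketch of the three command-line steps, written as a Python helper that only assembles the commands (it does not run them). The file names are hypothetical placeholders, single-end reads (`-U`) are an assumption, and the `-a` flag (report all alignments) is included because, as far as I know, eXpress expects to see multi-mapping reads; please check this against the documentation for your tool versions.

```python
from typing import List


def build_pipeline_commands(
    ref_fasta: str,
    index_base: str,
    reads_fastq: str,
    sam_out: str,
    express_outdir: str,
) -> List[List[str]]:
    """Return the index, align, and quantify commands as argument lists.

    All paths are placeholders supplied by the caller; nothing is executed here.
    """
    return [
        # 1. Index the reference containing only the genes of interest.
        ["bowtie2-build", ref_fasta, index_base],
        # 2. Align single-end reads (assumption); -a reports all alignments.
        ["bowtie2", "-a", "-x", index_base, "-U", reads_fastq, "-S", sam_out],
        # 3. Quantify with eXpress: target sequences first, then the alignments.
        ["express", "-o", express_outdir, ref_fasta, sam_out],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in build_pipeline_commands(
        "genes_of_interest.fa", "genes_idx", "reads.fastq", "hits.sam", "express_out"
    ):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the placeholder paths are replaced with real files.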

I have several questions about this strategy:

First, when building the reference for the genes of interest, what kind of sequences should I use if no genome is available? For example, gene A may have been studied by several groups, and their sequences can be found in the NCBI Nucleotide database but with different lengths. Which one should I choose? In addition, because introns are spliced out during RNA processing, should I use the sequence before or after splicing? This matters because the length of the gene affects the final FPKM value. And how can I tell whether a given RNA sequence is spliced or not?
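To illustrate why the reference length matters, here is a small sketch of the textbook FPKM formula (fragments per kilobase of sequence per million mapped fragments). The numbers are made up for illustration, and eXpress itself works with an effective length rather than the raw sequence length, so treat this as an approximation of the idea rather than what the tool computes.

```python
def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical gene: same fragment count, two candidate reference lengths.
    spliced = fpkm(fragments=500, length_bp=2000, total_mapped_fragments=10_000_000)
    unspliced = fpkm(fragments=500, length_bp=4000, total_mapped_fragments=10_000_000)
    print(spliced, unspliced)  # 25.0 12.5
```

With the same fragment count, doubling the reference length (for example by including introns that the reads do not cover) halves the reported FPKM.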

Second, is there any problem with the expression levels estimated using this strategy? Are they likely to be over- or underestimated?

Any other suggestions are very welcome!

ChIP-Seq alignment RNA-Seq gene • 2.1k views
