Question: RNA-Seq analysis and expression level
sbdk82 (United States) wrote, 5.6 years ago:

We need to find the expression level of a gene under different conditions. We have assembled the data from both RNA-seq samples using Velvet-Oases. Now I need to find the gene from a full-length cDNA library among the assembled contigs, and then map the reads to that gene to find its expression level. I think I need to use BLAST to find the gene from the cDNA library and then an alignment tool to find the expression level. Am I heading in the right direction? If not, could you please suggest what I should do? Also, what are the best tools for this? I am new to bioinformatics, so I would appreciate some help from the experts.

Charles Warden (Duarte, CA) wrote, 5.6 years ago:

I think this is a fairly common question.  I've compiled a list of pointers that I would suggest:

seidel (United States) wrote, 5.6 years ago:

If you have assembled contigs and you have RNA-seq data, you would use an aligner, such as Bowtie, to map the reads to your contigs so you can count them. Your contigs represent your "genes". Once you have counts on your genes, you can determine relative expression differences for genes between conditions using R and the edgeR or DESeq packages (you might have to look up how to summarize counts on genes; there are several resources available), or other methods. You can also estimate relative expression levels between contigs in a single condition, but "expression level" will only be relative to other transcripts (i.e. your contigs).

The edgeR package has a function for computing a normalized RPKM value, but it is simply a convenience, as RPKM can be affected by many things. To determine the actual expression level, you would have to use externally added control spikes. There is a set of 92 available from Ambion (Life Tech, ERCC Spike In Mix). Since these are present at known concentrations, you can determine a relationship between read counts and the concentration of a molecule in your experiment (i.e. expression level). Otherwise, you're just guessing.
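To make the RPKM and spike-in ideas concrete, here is a minimal Python sketch. It is not edgeR's implementation; the function names (`rpkm`, `fit_spike_ins`, `estimate_concentration`) are made up for illustration, and it assumes you already have a table of read counts per contig plus the contig lengths. The spike-in part assumes a simple straight-line relationship between log10(count) and log10(concentration), which you should check against your own data.

```python
import math


def rpkm(counts, lengths_bp, total_mapped=None):
    """Reads per kilobase of contig per million mapped reads.

    counts: dict of contig name -> read count
    lengths_bp: dict of contig name -> contig length in bases
    total_mapped: library size; defaults to the sum of the counts given
    """
    if total_mapped is None:
        total_mapped = sum(counts.values())
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return {
        name: count * 1e9 / (lengths_bp[name] * total_mapped)
        for name, count in counts.items()
    }


def fit_spike_ins(concentrations, spike_counts):
    """Least-squares fit of log10(count) = slope * log10(conc) + intercept.

    Spikes with zero reads must be dropped first, since log10(0) is undefined.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(n) for n in spike_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(count, slope, intercept):
    """Invert the fitted line to turn a read count into a concentration."""
    return 10 ** ((math.log10(count) - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical numbers, only to show the calls.
    values = rpkm({"contig_1": 500, "contig_2": 1500},
                  {"contig_1": 1000, "contig_2": 3000})
    print(values)
    slope, intercept = fit_spike_ins([1, 10, 100, 1000],
                                     [20, 200, 2000, 20000])
    print(estimate_concentration(500, slope, intercept))
```

Note that the spike-in fit does not correct for transcript length; in practice you would probably fit on length-normalized values rather than raw counts.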

If I follow you correctly, you would use BLAST to determine which of your assembled contigs represents your gene of interest. However, the contig itself would be the mapping target for your RNA-seq analysis, since it is what came from, and is represented by, your data (though this could be a sticky point).
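As a rough sketch of the BLAST step, the snippet below picks the best-matching contig for each query from BLAST tabular output (the standard 12-column `-outfmt 6` layout: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). It assumes the cDNA sequence was the query and the assembled contigs were the database. The function name `best_contig_hits`, the identity and e-value cutoffs, and the sample rows are all invented for illustration, so adjust them to your data.

```python
import csv
import io


def best_contig_hits(blast_tabular, min_identity=90.0, max_evalue=1e-10):
    """Return {query: best contig} from BLAST tabular (outfmt 6) text.

    "Best" here means highest bit score among hits passing the cutoffs;
    the cutoffs are arbitrary illustrative defaults, not recommendations.
    """
    best = {}
    reader = csv.reader(io.StringIO(blast_tabular), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, contig = row[0], row[1]
        identity = float(row[2])
        evalue = float(row[10])
        bitscore = float(row[11])
        if identity < min_identity or evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (contig, bitscore)
    return {query: contig for query, (contig, _) in best.items()}


if __name__ == "__main__":
    # Made-up hits: one gene matching two contigs, one weak hit to ignore.
    sample = (
        "my_gene\tcontig_12\t98.5\t1200\t18\t0\t1\t1200\t30\t1229\t0.0\t2100\n"
        "my_gene\tcontig_40\t91.0\t300\t27\t0\t1\t300\t5\t304\t1e-50\t400\n"
        "other\tcontig_7\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t90\n"
    )
    print(best_contig_hits(sample))
```

The contig reported for your gene is then the one whose read counts you would look up after mapping.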
