Question: Differential gene expression analysis - different methods
5.8 years ago by
jackuser1979 wrote:

One of my colleagues outsourced the data analysis, and the provider used the method below for the differential expression analysis.

First, they did a de novo assembly of controlA, controlB, treatedA and treatedB separately. The assembled transcripts from all the samples were iteratively clustered to produce uni-transcripts with the fewest redundant sequences (they did not mention how they clustered to produce the uni-transcripts). The pre-processed reads were then mapped to the uni-transcripts to carry out the expression analysis and differential expression analysis using tophat-cufflinks.

They used the following clustering parameters:
minimum identity for overlaps: 96%
minimum overlap length: 50 bp
maximum length of unmatched overhangs: 50 bp

A uni-transcript was considered differentially expressed if it met either criterion:
fold change greater than or equal to 4 and q-value less than 0.05
expression detected in only one sample condition and q-value less than 0.05
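To make the two rules concrete, here is a minimal Python sketch of how they could be applied to one uni-transcript. It is only an illustration, not the provider's actual code: the function name and arguments are hypothetical, "expression detected" is assumed to mean a non-zero FPKM, and the fold change is assumed to count in both directions (up or down), which the description does not state.

```python
def is_differentially_expressed(fpkm_control, fpkm_treated, q_value,
                                min_fold_change=4.0, max_q=0.05):
    """Apply the two quoted rules to a single uni-transcript.

    Assumptions (not stated by the provider): a zero FPKM means
    "not detected", and the fold change is symmetric (up or down).
    """
    # Both rules require q-value < 0.05.
    if q_value >= max_q:
        return False

    control_detected = fpkm_control > 0
    treated_detected = fpkm_treated > 0

    # Not expressed anywhere: nothing to call.
    if not control_detected and not treated_detected:
        return False

    # Rule 2: expression detected in only one condition.
    if control_detected != treated_detected:
        return True

    # Rule 1: fold change >= 4 in either direction.
    high = max(fpkm_control, fpkm_treated)
    low = min(fpkm_control, fpkm_treated)
    return high / low >= min_fold_change


if __name__ == "__main__":
    print(is_differentially_expressed(10.0, 40.0, 0.01))  # 4-fold up, significant
    print(is_differentially_expressed(0.0, 5.0, 0.01))    # only in treated
    print(is_differentially_expressed(10.0, 40.0, 0.20))  # fails q-value cutoff
```

The q-value check comes first because it is common to both rules; the one-condition case has to be handled before the ratio to avoid dividing by zero.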

In my experience, I usually de novo assemble all the samples (controlA, controlB, treatedA and treatedB) together into one reference transcript assembly. Then I map the reads from each sample (controlA, controlB, treatedA and treatedB) to it and get FPKM values for each sample. Then I do the differential expression analysis on all the mapped reads in one of the differential expression packages (edgeR, DESeq).
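For reference, the per-sample FPKM value mentioned above follows the standard definition: fragments mapped to a transcript, per kilobase of transcript length, per million mapped fragments in the sample. The sketch below is a plain Python illustration with made-up numbers; the function and variable names are hypothetical. Note that, as far as I know, edgeR and DESeq are designed to take raw read counts as input rather than FPKM values, so FPKM is mostly useful for reporting expression levels.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical counts for one 2,000 bp transcript in the four samples,
    # each paired with that sample's total number of mapped fragments.
    samples = {
        "controlA": (500, 20_000_000),
        "controlB": (450, 18_000_000),
        "treatedA": (2100, 21_000_000),
        "treatedB": (1900, 19_000_000),
    }
    for name, (count, library_size) in samples.items():
        print(name, round(fpkm(count, 2000, library_size), 2))
```

Because every sample is mapped to the same pooled assembly, the transcript length is identical across samples and the values are directly comparable per transcript.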

Do you think the method my colleague's provider used is correct? If it is correct, why would they want to do the clustering?



As a small side note, when you only have 2 samples in each group, I would be very wary of any findings you make. The statistical power of such a sample size is simply too low to draw any reliable conclusions.
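One way to see how little information two replicates per group carry is an exact permutation test on a single gene, sketched below in plain Python with made-up expression values. With 2 versus 2 samples there are only 6 ways to assign the group labels, so the two-sided permutation p-value can never drop below 2/6, however large the difference is. This is only an illustration: tools such as edgeR and DESeq use parametric models that share information across genes, so they can report smaller p-values, but the underlying shortage of replicates is the same.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    # Enumerate every way of choosing which samples are labelled "group A".
    for chosen in combinations(range(len(pooled)), n_a):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


if __name__ == "__main__":
    control = [1.0, 1.1]      # hypothetical expression values
    treated = [100.0, 101.0]  # roughly 100-fold higher
    print(exact_permutation_p(control, treated))
```

Even with a roughly 100-fold difference, the result stays well above the usual 0.05 threshold, because only the observed labelling and its mirror image are as extreme as the data.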

written 5.8 years ago by David Westergaard
5.8 years ago by
Charles Warden (Duarte, CA) wrote:

First, you probably should mention "de novo assembly" in the question and tags in order to get more responses.

This is probably not how I would do this type of analysis, but this is a common question. I've collected my own suggestions in the following post:

My recommendation would probably be to use a strategy that involves defining a single pooled reference, which you can then use for a normal differential expression analysis.  However, I noticed that a paper describing the Corset algorithm was recently published, which might be more similar to the strategy that you are describing.  You can take a look at that paper to see how it compared to your own "clustering" strategy:
