Question: What is the difference between calculating fold-changes from normalized counts vs. fitting a model?
snamjoshi87 wrote (2.6 years ago):

I am new to RNA-seq and I am trying to understand some of the theory behind the various analysis methods.

Once you have raw read counts, you can pass them to a package like DESeq or edgeR for differential expression analysis. These packages filter low counts, normalize, fit a model based on a distribution, and report a fold-change along with a significance measure such as an FDR-adjusted p-value.

However, you could also normalize the raw counts, filter out zero and very low counts, average across replicates, perform additional filtering for replicates that vary greatly in counts, and then divide the filtered counts for the experimental sample by those for the control sample. This would give you a "fold-change over baseline". This very simple approach does not use a model and has no associated FDR.

My question(s): What is the difference between these two approaches, and what do they tell you (or what is the limit of what they can tell you)? Is one "more valid" than the other?
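For concreteness, the simple approach described in the question might be sketched as follows. This is only an illustration: the gene names, counts, library sizes, and the `min_mean_cpm` threshold are all made up, and counts-per-million scaling is used as a stand-in normalization (it is not the normalization that DESeq or edgeR apply).

```python
import math

# Hypothetical raw counts: gene -> (control replicates, treated replicates).
raw_counts = {
    "geneA": ([100, 120, 110], [200, 200, 240]),
    "geneB": ([0, 1, 0], [2, 0, 1]),
    "geneC": ([500, 480, 520], [490, 510, 500]),
}
# Hypothetical library sizes (total reads per sample), in the same order.
lib_sizes = ([1_000_000, 1_200_000, 1_100_000],
             [1_000_000, 1_000_000, 1_200_000])


def cpm(counts, sizes):
    """Scale raw counts to counts per million reads."""
    return [c / s * 1e6 for c, s in zip(counts, sizes)]


def naive_log2_fold_changes(raw_counts, lib_sizes, min_mean_cpm=5.0):
    """Normalize, average replicates, drop low-count genes, take the ratio."""
    result = {}
    for gene, (ctrl, trt) in raw_counts.items():
        ctrl_mean = sum(cpm(ctrl, lib_sizes[0])) / len(ctrl)
        trt_mean = sum(cpm(trt, lib_sizes[1])) / len(trt)
        # Filter genes whose mean is very low in either condition
        # (the cutoff is an arbitrary assumption for this sketch).
        if min(ctrl_mean, trt_mean) < min_mean_cpm:
            continue
        result[gene] = math.log2(trt_mean / ctrl_mean)
    return result


fold_changes = naive_log2_fold_changes(raw_counts, lib_sizes)
for gene, lfc in fold_changes.items():
    print(f"{gene}: log2 fold-change = {lfc:.2f}")
```

Note that the output is a single number per gene: nothing in it says how much of that ratio could be explained by replicate-to-replicate noise.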

Devon Ryan (Freiburg, Germany) wrote (2.6 years ago):

Fold-change alone isn't a reliable indicator of significance, which is why a statistical model is used. This is the same reason you do a t-test (or similar) instead of just reporting the ratio of two groups of measurements. BTW, it's exactly the FDR (well, adjusted p-value) that everyone wants.
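A small sketch of that point, using made-up normalized expression values: both hypothetical genes below have exactly the same ratio of group means, but very different replicate variability, so a test statistic separates them where the ratio cannot. Welch's t statistic is used here purely as the "t-test (or similar)" analogy; it is not what DESeq or edgeR compute.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Hypothetical normalized values, three replicates per group.
tight_ctrl, tight_trt = [98, 100, 102], [196, 200, 204]   # consistent replicates
noisy_ctrl, noisy_trt = [20, 100, 180], [40, 200, 360]    # highly variable replicates

ratio_tight = mean(tight_trt) / mean(tight_ctrl)
ratio_noisy = mean(noisy_trt) / mean(noisy_ctrl)
t_tight = welch_t(tight_ctrl, tight_trt)
t_noisy = welch_t(noisy_ctrl, noisy_trt)

print(f"tight gene: fold-change = {ratio_tight:.1f}, t = {t_tight:.1f}")
print(f"noisy gene: fold-change = {ratio_noisy:.1f}, t = {t_noisy:.1f}")
```

The fold-changes are identical, yet the t statistic is large for the consistent gene and below 1 for the noisy one, which is the information a bare ratio throws away.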


I believe it's implied in your response, but it's also worth explicitly pointing out that significant p-values do not necessarily imply biologically significant or interesting differences.

Comment written 2.6 years ago by spvensko

Definitely, statistical significance isn't biological relevance. The usual strategy is to filter by both adjusted p-value and fold-change (i.e., the fit coefficient from the model). I should note that fitting a model also lets you apply various types of shrinkage (i.e., regression with a prior distribution) and profile out things like dispersion.

Comment written 2.6 years ago by Devon Ryan
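The "filter by both" strategy from the comment above can be sketched like this. The per-gene results, the 0.05 adjusted p-value cutoff, and the log2 fold-change cutoff of 1 are all assumptions chosen for illustration; the Benjamini-Hochberg procedure is used as one common way of obtaining adjusted p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene model output: (log2 fold-change, raw p-value).
results = {
    "geneA": (2.5, 0.0001),
    "geneB": (0.2, 0.0002),   # statistically significant, but a tiny effect
    "geneC": (3.0, 0.2),      # large effect, but weak evidence
    "geneD": (-1.8, 0.001),
    "geneE": (0.1, 0.8),
}

genes = list(results)
padj = dict(zip(genes, benjamini_hochberg([results[g][1] for g in genes])))

# Keep genes that pass BOTH thresholds (both cutoffs are assumptions).
hits = sorted(
    g for g in genes
    if padj[g] < 0.05 and abs(results[g][0]) >= 1.0
)
print(hits)
```

Here the hypothetical geneB is dropped despite its small p-value because its effect is tiny, and geneC is dropped despite its large fold-change because the evidence for it is weak.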