Question: Probability Of Expression Changes 2-, 5-, ...100-Fold
Israel Barrantes740 wrote:

In RNA-seq and other gene expression approaches, one usually calculates the probability of obtaining a value Y (measured in sample B) given a value X (measured in sample A), as in the case discussed by Audic and Claverie (Genome Res. 1997 Oct;7:986).

Now the case is the following: given the read counts of two samples (X and Y, for each transcript available), we would like to obtain a list of all transcript IDs whose true expression level is, with 95% confidence, at least 5-fold different between the two samples. Which statistical test could help in this case?

Ideally, we would like to choose between 95% and 99% confidence levels and between arbitrary x-fold expression cut-offs, receiving e.g. a list of all transcripts that are 2-fold, 20-fold or 100-fold overexpressed at the chosen error probability (p < a given value).

gene rna statistics • 2.8k views
ADD COMMENTlink modified 7.9 years ago by Marcin Cieslik520 • written 7.9 years ago by Israel Barrantes740

In common with the 2 answers so far, I don't understand the question. Could you add some additional information or consider re-wording it, because I'm not sure it's answerable in its current form.

ADD REPLYlink written 7.9 years ago by Daniel Swan13k

Here is the question, posed in a different way:

We have the read counts of two samples, X and Y, for each different transcript available.

The question is now the following: give me a list of all transcript names whose true expression level is, with 95% confidence, at least 5-fold different between the two samples.

Ideally, we would like to choose between 95% and 99% confidence levels and between arbitrary x-fold expression cut-offs, receiving e.g. a list of all transcripts that are 2-fold, 20-fold or 100-fold overexpressed at the chosen error probability (p < a given value).

ADD REPLYlink written 7.9 years ago by Israel Barrantes740

Ah, I see. This is something I have been asking myself for a long time, but I don't have a solution. I will keep watching this thread...

ADD REPLYlink written 7.9 years ago by Lyco2.3k

I edited the question accordingly.

ADD REPLYlink written 7.9 years ago by Israel Barrantes740
Lyco2.3k wrote:

I am not entirely sure what your question is. You can calculate the probability of finding 2x or 5x enrichment with the Audic & Claverie statistic, but of course the probability depends on the actual read counts, not only on the fold factor. There is an online server for performing the calculation and, according to their webpage, a 'unix version' of the program can be downloaded from http://www.igs.cnrs-mrs.fr/SpipInternet/spip.php?article168
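
To illustrate why the probability depends on the counts and not only on the fold factor, here is a minimal Python sketch of the Audic & Claverie conditional probability p(y | x) as it is commonly stated: the probability of seeing y reads in the second sample given x reads in the first, under the null hypothesis of equal expression. The function names and the example library sizes are my own assumptions for illustration; this is not the code of the program linked above.

```python
from math import lgamma, log, exp

def audic_claverie_pmf(y, x, n1, n2):
    """p(y | x): probability of y reads in sample 2 (n2 total reads),
    given x reads in sample 1 (n1 total reads), assuming equal expression.

    p(y|x) = (n2/n1)^y * (x+y)! / (x! * y! * (1 + n2/n1)^(x+y+1))
    Evaluated in log space to avoid overflow in the factorials.
    """
    r = n2 / n1
    log_p = (y * log(r)
             + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
             - (x + y + 1) * log(1 + r))
    return exp(log_p)

def audic_claverie_upper_tail(y, x, n1, n2):
    """P(Y >= y | x), computed as 1 minus the sum of p(k | x) for k < y."""
    lower = sum(audic_claverie_pmf(k, x, n1, n2) for k in range(y))
    return max(0.0, 1.0 - lower)

if __name__ == "__main__":
    # Hypothetical library sizes; both pairs of counts are exactly 5-fold apart.
    n1 = n2 = 1_000_000
    for x, y in [(2, 10), (20, 100)]:
        print(x, y, audic_claverie_upper_tail(y, x, n1, n2))
```

Both example pairs have the same 5-fold ratio, but the tail probability for (20, 100) is far smaller than for (2, 10). Note that this tests against "no change", not against a chosen minimum fold change, and that subtracting from 1 loses precision for extremely small tail probabilities.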

ADD COMMENTlink written 7.9 years ago by Lyco2.3k
Chris Evelo10.0k wrote:

I am not sure I completely understand the question. But the fold changes you find will really depend on what your samples are. If, for instance, you compare a knockout strain with a native strain, fold changes will be very high (or infinite if you assume the knockout really gave zero expression). The same goes for null alleles. A hundred-fold change would almost certainly be something like that. Copy number variations also tend to give high fold changes in expression.

On the other hand, we often search for effects of treatment in two samples that are otherwise as comparable as can be, e.g. the same individual before and after treatment. In nutritional interventions, for instance, we hardly ever find high fold changes. Two-fold would already be very high. But what do you expect? You normally don't get blond hair all of a sudden from eating candies (although some of these might give you blue hair).

ADD COMMENTlink written 7.9 years ago by Chris Evelo10.0k
Marcin Cieslik520 wrote:

(I write from memory as I do not have access to the paper, so this might not be accurate)

Given two read counts X and Y for a transcript and the total numbers of sequenced reads (A and B), the Poisson margin test (introduced here http://www.ncbi.nlm.nih.gov/pubmed/21385042) gives the probability of observing a count difference at least as large as D = Y - X purely by chance, assuming the Poisson processes that generated X and Y have the same (but unknown) rate. In other words, a low probability allows one to reject the hypothesis that there is no fold change.
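
For the "at least k-fold" part of the question, one possible approach (not the Poisson margin test from the paper above, but the standard exact conditional test for the ratio of two Poisson rates) goes as follows. If X and Y are Poisson with rates proportional to A and B, then conditional on n = X + Y, Y is Binomial(n, p) with p = k*B / (A + k*B), where k is the true fold change of Y over X. Testing the null hypothesis "fold change <= k0" then reduces to a one-sided binomial tail. The sketch below is only a sketch under those assumptions: all function names are hypothetical, and it ignores overdispersion between replicates and multiple-testing correction.

```python
from math import lgamma, log, exp

def binom_upper_tail(y, n, p):
    """P(K >= y) for K ~ Binomial(n, p), with each term computed in log space."""
    if y <= 0:
        return 1.0
    if y > n:
        return 0.0
    total = 0.0
    for k in range(y, n + 1):
        log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                    + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_term)
    return min(1.0, total)

def fold_change_pvalue(x, y, total_a, total_b, min_fold):
    """One-sided p-value for H0: (expression in B) / (expression in A) <= min_fold.

    x, y are the read counts of one transcript; total_a, total_b are the
    library sizes (A and B in the text above).
    """
    p0 = min_fold * total_b / (total_a + min_fold * total_b)
    return binom_upper_tail(y, x + y, p0)

def overexpressed(counts, total_a, total_b, min_fold=5.0, alpha=0.05):
    """IDs of transcripts whose fold change exceeds min_fold at level alpha.

    counts maps transcript ID -> (x, y).
    """
    return [tid for tid, (x, y) in counts.items()
            if fold_change_pvalue(x, y, total_a, total_b, min_fold) < alpha]

if __name__ == "__main__":
    # Made-up counts and library sizes, purely for illustration.
    counts = {"t1": (10, 200), "t2": (10, 50), "t3": (100, 110)}
    print(overexpressed(counts, 1_000_000, 1_000_000, min_fold=5.0, alpha=0.05))
```

Here "t2" is observed at exactly 5-fold, yet it is not reported, because the data do not exclude a true fold change below 5. To find transcripts that are k-fold lower in the second sample, swap the two samples. Changing `min_fold` and `alpha` gives the 2-, 20- or 100-fold lists at 95% or 99% confidence.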

A different approach is to (somehow) estimate the rates of the generating processes and to calculate the p-value exactly (using the negative binomial: http://precedings.nature.com/documents/4282/version/1/files/npre20104282-1.pdf or negative binomial differential: http://smithlab.cmb.usc.edu/histone/rseg/rseg-supp.pdf )

ADD COMMENTlink written 7.8 years ago by Marcin Cieslik520