Understanding Sleuth output for multiple samples
5.6 years ago

I have RNA-seq data from 10 different conditions, with two biological replicates each. I have run Kallisto on all the samples with 100 bootstraps.

I passed all the samples to Sleuth for analysis with a reference table as mentioned in the "getting started" page.

sample   condition
A1         A
A2         A
B1         B
B2         B
C1         C
C2         C


...and so on.

I fitted the model as described in the "getting started" page and exported the results obtained by

results_table <- sleuth_results(so, 'reduced:full', test_type = 'lrt')
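
For context, here is a minimal sketch of the fitting steps that normally precede this call, assuming the "getting started" walkthrough was followed unchanged. The wrapper name `fit_lrt` is hypothetical, and `s2c` is assumed to be a data frame with the columns `sample`, `condition` and `path` (the kallisto output directory for each sample).

```r
# Hypothetical wrapper around the getting-started steps (not part of sleuth).
# Assumes `s2c` has columns `sample`, `condition`, and `path`.
fit_lrt <- function(s2c) {
  so <- sleuth::sleuth_prep(s2c, ~condition)
  # Full model: an intercept plus one coefficient per non-baseline condition.
  so <- sleuth::sleuth_fit(so, ~condition, "full")
  # Reduced model: intercept only, i.e. no condition effect at all.
  so <- sleuth::sleuth_fit(so, ~1, "reduced")
  # Likelihood ratio test of the full model against the reduced model.
  so <- sleuth::sleuth_lrt(so, "reduced", "full")
  sleuth::sleuth_results(so, "reduced:full", test_type = "lrt")
}
```

Set up this way, the table holds one p-value and one q-value per transcript. Each tests whether the model with `condition` fits better than the intercept-only model, so it is a single test across all conditions rather than a set of pairwise comparisons.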


I exported this table as a CSV file; its columns include a p-value and a q-value (among other metrics). What I do not understand is what these p-values signify. What happens when multiple conditions are passed: does Sleuth do a pairwise comparison? If so, there should be a p-value and q-value (and the other values) for each pair, but I am not getting that.

When I input multiple conditions, do the p-values denote differential expression in all the conditions, or in at least one condition?

If possible, please also point me to a page or resource on what other models can be fit to the data. From the "getting started" example it appears that different models are possible, but the manual and the preprint do not mention anything other than the intercept-only model.

RNA-Seq sleuth kallisto • 4.1k views

I posted a question there but it was not answered!


Did you manage to find an answer to this question? I am running into the same problem: I want to compare multiple conditions (>2).


No. I didn't get an answer anywhere, and since it was not my own project (I was helping someone), I didn't look into it further. In fact, I have now forgotten exactly how sleuth works. I guess it fits a generalized linear model (perhaps with mixed effects). I use GLMs these days, but for a different analysis. If I am thinking correctly, condition would be one of the factors in the model, i.e. a nominal variable whose levels are the different conditions. From the analysis of deviance (using the LRT) you would know whether condition has a significant effect on the transcript count. With more than two conditions, though, the test does not tell you which condition caused the significant difference. Perhaps sleuth behaves that way too. In any case, I would recommend doing a pairwise analysis of differential expression (I think that's what I did in the end).
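
One way to set up such a pairwise comparison in sleuth, as a minimal sketch rather than a recommended pipeline: restrict the sample table to two conditions, refit, and run a Wald test on the single condition coefficient. The helper names `subset_pair`, `wald_beta_name` and `pairwise_wald` are hypothetical, and `s2c` is assumed to be a data frame with the columns `sample`, `condition` and `path` (the kallisto output directory for each sample).

```r
# Hypothetical helpers (not part of sleuth).

# Keep only the samples from two conditions and make `ref` the baseline level.
subset_pair <- function(s2c, ref, other) {
  pair <- s2c[s2c$condition %in% c(ref, other), , drop = FALSE]
  pair$condition <- factor(as.character(pair$condition), levels = c(ref, other))
  pair
}

# With default treatment contrasts, the coefficient for a non-baseline level
# is named "condition<level>".
wald_beta_name <- function(other) paste0("condition", other)

# Refit sleuth on the two-condition subset and Wald-test `other` against `ref`.
pairwise_wald <- function(s2c, ref, other) {
  pair <- subset_pair(s2c, ref, other)
  so <- sleuth::sleuth_prep(pair, ~condition)
  so <- sleuth::sleuth_fit(so, ~condition, "full")
  beta <- wald_beta_name(other)
  so <- sleuth::sleuth_wt(so, which_beta = beta)
  sleuth::sleuth_results(so, beta, test_type = "wt")
}
```

A call such as `res_AB <- pairwise_wald(s2c, "A", "B")` would then return one table for that pair. Running `sleuth_wt` on the full ten-condition fit instead (for example with `which_beta = "conditionB"`) compares each condition only against the baseline level, not against every other condition.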