Question

How to interpret DEXseq results in therms of significance

1

Entering edit mode

6.6 years ago

Lila M ★ 1.2k

Hi guys I have a question for DEXseq results. After running it in R, I've just realized that some featuresIDs (e.g. E001) reported a p-value > 0.05. I'm wondering why DEXseq result consider this feature as differential y usage in therms of exons. Is because of the log2fold? Is correct if I filter the featuresID considering only those with pvalue < 0.05?

groupID          featureID  exonBaseMean  dispersion    pvalue  ...   log2fold
ENSG000000044577      E001      0.12106875    0.067       0.145     ... -1.683

Many thanks!

DEXseq RNA-Seq exon • 3.7k views

ADD COMMENT • link 6.6 years ago by Lila M ★ 1.2k

0

Entering edit mode

Which command did you use to generate your listing of differentially expressed exons?

The object that you create with the DEXSeqResults() function contains all results irrespective of significance and this can easily be written to disk with:

write.table(as.data.frame(DEXSeqResults(MyObj)), "Results.tsv", quote=FALSE, sep="\t", dec=".")

The default cut-off of DEXSeq for other functions, such as DEXSeqHTML(), is set to FDR Q < 0.1.

ADD REPLY • link 6.6 years ago by Kevin Blighe 87k

0

Entering edit mode

Thank you but this not answer my question at all. Anyway the command line that I used:

dsd = estimateSizeFactors(dsd)
dsd = estimateDispersions(dsd)
dsd = plotDispEsts(dsd)
dsd = testForDEU(dsd)
dsd = estimateExonFoldChanges(dsd, fitExpToVar="Disease_status")
dxr1 = DEXSeqResults(dsd)
write.table(dxr1, "DEXSeq_results" , sep="\t", col.names=T, row.names = T) 
DEXSeqHTML( dxr1, FDR=0.05, color=c("#FF000080", "#0000FF80"), fitExpToVar="Disease_status")

So in the DEXSeqHTML are some featuresID with p-value < 0.05 and I would like why.

Thanks

ADD REPLY • link 6.6 years ago by Lila M ★ 1.2k

1

Entering edit mode

DEXSeq, specifically the DEXSeqHTML() function, will output all genes that contain at least 1 feature (e.g. an exon) that passes the specified threshold, in your case FDR<0.05.

For example, I used that same function and one of my genes is IL16. DEXSeq outputs all exons in the gene, even though only 1 is significant. In my output below, only E003 (exon 3) is different.

groupID featureID   exonBaseMean    dispersion  pvalue  padj    seqnames    start   end width   strand
IL16    E001    0.1039758   0.162   0.067   1.000   15  81489219    81489478    260 +
IL16    E002    4.3899951   0.698   0.004   1.000   15  81517640    81518052    413 +
IL16    E003    1.5172541   0.229   0.000   0.049   15  81552113    81552221    109 +
IL16    E004    2.1541030   0.400   0.001   1.000   15  81558000    81558142    143 +
IL16    E005    2.0957201   0.332   0.000   0.330   15  81561879    81561989    111 +
IL16    E006    1.4555771   0.280   0.000   0.516   15  81565431    81565545    115 +
IL16    E007    1.9268769   0.383   0.000   0.344   15  81571158    81571231    74  +
IL16    E008    1.8824670   0.540   0.014   1.000   15  81571899    81572115    217 +
IL16    E009    1.7349858   0.503   0.023   1.000   15  81574980    81575097    118 +
IL16    E010    1.9800806   0.600   0.019   1.000   15  81578039    81578171    133 +
IL16    E011    1.4733303   0.460   0.036   1.000   15  81582794    81582881    88  +
IL16    E012    6.6046320   0.400   0.034   1.000   15  81584897    81585378    482 +
IL16    E013    0.6449802   0.395   0.209   1.000   15  81589254    81589268    15  +
IL16    E014    13.2419637  0.246   0.660   1.000   15  81589269    81589419    151 +
IL16    E015    103.8995139 0.308   0.919   1.000   15  81591721    81592816    1096    +
IL16    E016    38.4906121  0.368   0.841   1.000   15  81593685    81593853    169 +
IL16    E017    38.4013161  0.034   0.204   1.000   15  81595890    81595991    102 +
IL16    E018    68.1948279  0.009   0.005   1.000   15  81598249    81598507    259 +
IL16    E019    12.0726132  0.009   0.788   1.000   15  81598761    81598763    3   +
IL16    E020    42.3443750  0.171   0.562   1.000   15  81598764    81598886    123 +
IL16    E021    287.6214095 0.205   0.557   1.000   15  81600946    81605104    4159    +

The logic is that you only need to observe 1 exon with statistically significant different usage/dosage in order to infer that different splice isoforms are using the exons differently.

Kevin

ADD REPLY • link 6.6 years ago by Kevin Blighe 87k

0

Entering edit mode

Many thanks, more clear now :). So the p-value is of course the only parameter that I have to look in order to accept that at least one of the condition has an effect on exon usage. However the log2fold change give us the information in terms of differential expressed exon in two different condition. Many thanks

ADD REPLY • link 6.6 years ago by Lila M ★ 1.2k

1

Entering edit mode

Yes, exactly. It just takes a single exon to make the gene interesting :)

If you are aiming to publish, though, you may have to go by the FDR Q value (even if it is only significant at 10% FDR).

Best of luck

ADD REPLY • link 6.6 years ago by Kevin Blighe 87k