Question: How is the bootstrapping information from kallisto treated in DESeq2 when imported using tximport?
divya.nandakumar wrote (3 months ago):

I have transcript abundances from kallisto run with 100 bootstraps. My understanding is that the bootstrapping gives information about the variability in the abundance estimates. If I use tximport to import this abundance information for use in DESeq2, is the variance information from bootstrapping used by DESeq2 in any way, or does DESeq2 calculate the variance in a different way?

I see in the tximport manual that the inferential replicate values can be imported by setting txOut=TRUE, and that varReduce summarizes the inferential replicates into one variance value per transcript. But is this information used by DESeq2 in any way during the differential expression analysis?

Also, does RSEM perform any variance calculation for the estimated counts?

Background: I am trying to compare kallisto -> sleuth with featureCounts -> DESeq2. kallisto followed by sleuth shows no significantly differentially expressed genes (at the transcript or gene level), while featureCounts -> DESeq2 shows several differentially expressed genes. To find out whether this is an effect of having the variance data, I wanted to try running the kallisto transcript abundances through DESeq2.

ATpoint (Germany) wrote (3 months ago):

No, tximport does not import bootstrap information from kallisto when summarizing to the gene level. The tximport manual states this clearly: https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#kallisto

"Because the kallisto_boot directory also has inferential replicate information, it was imported as well (and because txOut=TRUE). As with Salmon, inferential replicate information will not be summarized to the gene level."
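To make concrete what "one variance value per transcript" means, here is a minimal, language-agnostic sketch in Python. It is not tximport's code: the transcript IDs and bootstrap counts are made up, the function name is hypothetical, and it assumes the summary is the ordinary sample variance across bootstrap replicates within a single sample.

```python
from statistics import mean, variance

# Hypothetical bootstrap estimated counts for ONE sample:
# transcript ID -> one estimated count per bootstrap replicate.
bootstraps = {
    "tx1": [100.0, 104.0, 96.0, 102.0, 98.0],  # stable estimate
    "tx2": [10.0, 30.0, 0.0, 25.0, 5.0],       # uncertain estimate
}


def reduce_inferential_replicates(reps):
    """Collapse replicates to (mean, sample variance) per transcript.

    Assumption: the sample variance (n - 1 denominator) is used.
    """
    return {tx: (mean(vals), variance(vals)) for tx, vals in reps.items()}


summary = reduce_inferential_replicates(bootstraps)
for tx, (avg, var) in summary.items():
    print(f"{tx}: mean={avg:.1f} variance={var:.1f}")
```

The point of the sketch is that this variance describes quantification uncertainty within one sample. It is a different quantity from the between-replicate dispersion that a count-based method estimates from the counts of several biological replicates.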

I think your comparison is not informative, since you are comparing two different quantification methods (kallisto: pseudoalignment; featureCounts: counting reads from traditional alignments) and, on top of that, two different statistical frameworks. For a meaningful comparison, keep either the quantification or the downstream statistics constant. As it stands, the differences could simply be due to the quantification method. If you get no DEGs, check whether you have enough power, whether the samples are of good quality and sufficient depth, and whether they cluster well in a PCA. It is also always possible that the biological truth is that there are no DEGs at all.


Thank you for replying. I saw that statement in the manual, but I took it to mean that the information will not be used for gene-level analysis and would still apply if I were to look at differential expression at the transcript level. What is the purpose of the varReduce argument when importing the data?

I understand they are completely different methods of analysis. Based on other experimental data (qPCR, microarray), we know that there are differentially expressed genes (the mutant is of a transcription factor), and the DEGs from DESeq2 are consistent with what we would expect. It is also quite odd that the PCA from the kallisto data showed poor separation of samples (particularly for one replicate), while the PCA plot from featureCounts + DESeq2 showed substantial separation of samples along one axis. I was wondering whether the bootstrapping was bringing out any underlying problems between the replicates. kallisto-sleuth would be more convenient to use simply because of the speed of the analysis, and I was trying to see whether it is comparable to DESeq2.

(reply written 3 months ago by divya.nandakumar)

If one method disagrees with expectations that have been confirmed by other methods, then do not use it, right? There are alternatives such as salmon if you want a lightweight quantifier. Salmon offers several handy features, such as GC and sequence bias correction, and can now use decoy sequences and selective alignment to improve accuracy.

(reply written 3 months ago by ATpoint)