Question: Using TPM for miRNA sequencing
21 months ago by
H-K0 wrote:

Hi. I have two libraries, one treatment and one control, with no replicates. Can I use the TPM values of each library as input for DESeq2? How can I calculate the log2 fold change with GFOLD, and is there other software that can calculate it? How can I calculate a reliable fold change and obtain DEGs when I do not have any replicates? What do you suggest?


rna-seq • 925 views
written 21 months ago by H-K0

DESeq2, edgeR and limma all work on count data, not TPM, for the statistics that build the model for DE analysis, so do not use TPM. Having said that, DESeq2's estimation does not work with one sample per condition, and the developers have always maintained that such results should not be trusted. TPM is mostly used for visualization, not for identifying DEGs. GFOLD works without replicates, but as far as I know it still uses count data. Take a look here. I would not advise using other DE tools for testing in this situation.

written 21 months ago by ivivek_ngs4.8k

Thanks vchris. Does that mean I should use the read counts produced by miRDeep2 as input for DESeq2? What is the seq(norm) value produced by miRDeep2? If I don't have any replicates, what can I do to find differentially expressed miRNAs?

written 21 months ago by H-K0

Do you have a BAM file? You can use featureCounts or HTSeq to extract the count data from your BAM file. I have never used miRDeep2, so I cannot tell you much about it. From what you describe, seq(norm) is probably the count normalized by library size by miRDeep2, but you need to check that in the manual. However, the link I sent you earlier shows how you can extract counts with GFOLD as well and feed them to GFOLD for DE analysis.

written 21 months ago by ivivek_ngs4.8k

A BAM file produced by what? I have a SAM file from Bowtie 2. You are right about seq(norm). Can I now use seq(norm) as DESeq2 input? It's very confusing. Thanks a lot.

written 21 months ago by H-K0

You are missing the point. Please read the GFOLD manual at the link I posted; it is your job to understand how and why you need to use these tools. That link comes with an explicit tutorial on how to get count data from SAM files and use GFOLD for DE analysis without replicates.

I would never use DESeq2 for 1 vs 1, and on top of that, no, you cannot use seq(norm) data in DESeq2: DESeq2's own normalization relies on geometric means computed from the raw counts, which will not work in your case. So it is better to avoid it. However, if you are hell-bent on using DESeq2, please read the link that ATpoint has provided below. Without reading and understanding it, nothing here will help.
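
To make the normalization point a bit more concrete, here is a rough base-R sketch of the median-of-ratios idea behind DESeq2's size factors. It is a simplified illustration on made-up counts, not DESeq2's actual code, and the matrix and variable names are hypothetical.

```r
# Toy raw count matrix: 4 miRNAs (rows) x 2 libraries (columns); values are made up.
counts <- matrix(c(100, 200,
                    50, 100,
                    10,  20,
                     0,   5),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("mir", 1:4), c("control", "treatment")))

# Keep only features with non-zero counts in every library,
# because the geometric mean of a row containing a zero is zero.
keep <- rowSums(counts == 0) == 0

# Per-feature geometric mean across libraries, computed on the log scale.
log_geo_mean <- rowMeans(log(counts[keep, , drop = FALSE]))

# Size factor per library: median ratio of its counts to the geometric means.
size_factors <- apply(counts[keep, , drop = FALSE], 2, function(col) {
  exp(median(log(col) - log_geo_mean))
})

# Divide each library (column) by its size factor.
norm_counts <- t(t(counts) / size_factors)
```

The size factors are estimated from the raw counts themselves, so feeding in values that were already scaled by library size, such as seq(norm), would mean normalizing twice.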

written 21 months ago by ivivek_ngs4.8k

Ok. Thank you very much.

written 21 months ago by H-K0
21 months ago by
ATpoint21k wrote:

How can I calculate reliable fold change and obtain DEGs when I have not any replicates

Unfortunately, you cannot. That is why replicates are strongly recommended. The whole idea of using replicates (I am no statistician, so my vocabulary might not be fully accurate) is to test the significance of your data by comparing within-group and between-group variance, to distinguish differences due to technical variability from true biological effects. Michael Love recommends running DESeq() as usual for unreplicated data, but points out that the results will only be exploratory and not reliable. Alternatively (I have seen this in papers quite often) you can take the log2 fold change of every gene and consider it a DEG if the FC in either direction is larger than, let's say, 1.5. Still, this is poor practice, and your "level of desperation", meaning how urgently you need exactly this dataset, will decide whether you do it or not.
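
For illustration only, the fold-change cutoff approach could look roughly like the sketch below. The counts are made up, and the counts-per-million scaling, the pseudocount of 1 and the reading of "1.5" as a plain fold change (not a log2 value) are my assumptions, not something prescribed by any of the tools discussed.

```r
# Made-up raw counts for two unreplicated libraries.
counts <- matrix(c(100, 400,
                    50,  55,
                   200,  40,
                     0,   8),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("mir", 1:4), c("control", "treatment")))

# Scale each library to counts per million to account for sequencing depth.
cpm <- t(t(counts) * 1e6 / colSums(counts))

# A pseudocount (assumed value of 1 here) avoids log2(0) and division by zero.
pseudo <- 1
log2fc <- log2((cpm[, "treatment"] + pseudo) / (cpm[, "control"] + pseudo))

# Flag features whose fold change exceeds 1.5 in either direction.
fc_cutoff <- 1.5
candidates <- names(log2fc)[abs(log2fc) > log2(fc_cutoff)]
```

Note that mir4 is flagged purely because it goes from 0 to 8 reads. With no replicates there is no way to tell whether that is noise, which is a large part of why this approach should only be treated as exploratory.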

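TPM can be calculated from a count matrix in R like this (counts.mat is a genes x samples matrix of raw counts, gene.length is a vector of gene lengths in the same row order):

# Normalize by gene length (the vector is recycled down each column, so every row is divided by its own length):
x <- counts.mat / gene.length

# Scale each sample so that its values sum to one million (TPM):
tpm.mat <- t( t(x) * 1e6 / colSums(x) )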
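
As a quick sanity check, the same two steps can be run on made-up numbers. The toy matrix, the lengths and the name check are hypothetical additions for illustration.

```r
# Hypothetical inputs: raw counts and feature lengths in bases.
counts.mat <- matrix(c(10, 20,
                       30, 60,
                       60, 20),
                     ncol = 2, byrow = TRUE,
                     dimnames = list(c("g1", "g2", "g3"), c("control", "treatment")))
gene.length <- c(g1 = 1000, g2 = 2000, g3 = 3000)

# gene.length must be in the same row order as counts.mat,
# otherwise the element-wise division silently uses the wrong lengths.
stopifnot(identical(names(gene.length), rownames(counts.mat)))

# Length-normalize, then rescale each sample to one million.
x <- counts.mat / gene.length
tpm.mat <- t(t(x) * 1e6 / colSums(x))

# Every library should now sum to one million.
colSums(tpm.mat)
```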
modified 21 months ago • written 21 months ago by ATpoint21k

I would second that. As I also mentioned, there is only one tool, GFOLD, that does this kind of ranking-based DE for 1 vs 1 without replicates, although its results should only be used for exploration. However, the OP seems to have no way to get replicates, so GFOLD can be used for exploratory purposes. DESeq2 could be used too, but I would still not use it in the way M. Love proposed.

written 21 months ago by ivivek_ngs4.8k

I don't think the OP has the gene.length vector, so I proposed GFOLD, as it is published for DE analysis on conditions without replicates. However, I still have doubts about it as well.

written 21 months ago by ivivek_ngs4.8k