Question: Sinking into the swamp by Using DESeq without replicates !!
gravatar for Whoknows
5.6 years ago by
Whoknows770 wrote:

Hi friends

I have a project on Rice and my conditions (paired-end) unfortunately have no replicate.

I used tophat2 and after conversion to sam file is used HTSeq and I merged HTSeq output.

And today the bigest problem emereged, How can i handle DeSeq without replicate you can see my following codes:

pasillaCountTable <- read.table("MyCountTable.txt")
pasillaDesign = data.frame(row.names = colnames( pasillaCountTable ),condition = c( "C1","S1"),libType = c( "paired-end", "paired-end" ))
pairedSamples = pasillaDesign$libType == "paired-end"
countTable = pasillaCountTable[ , pairedSamples ]
condition = pasillaDesign$condition[ pairedSamples ]
cds = newCountDataSet( countTable, condition )
cds = estimateSizeFactors( cds )


and main part was,  is it fine or not?

cds = estimateDispersions( cds, method="blind", sharingMode="fit-only" ,fitType="local")

and rest of the codes:

res_C1S1 = nbinomTest( cds, "C1", "S1" )
write.table(res_C1S1,"C1_S1.txt",sep= "\t")

Thanks alot.


rna-seq deseq tophat htseq • 2.7k views
ADD COMMENTlink modified 5.5 years ago by Craig Henry20 • written 5.6 years ago by Whoknows770

If you want to do analysis of differentially expressed genes, that just won't be possible without replicates. If you don't know what's the variance within the samples from the same group (either condition or control), you cannot possibly tell if the difference you see between different groups (condition vs control) is significant or not. You may try to take a look at dispersion values from the very similar experiment if such data is available.

ADD REPLYlink written 5.6 years ago by Biomonika (Noolean)3.1k
gravatar for Jelena Aleksic
5.5 years ago by
Cambridge, UK
Jelena Aleksic910 wrote:

You just can't do a statistical analysis of differential expression without replicates. Basically all you can do is calculate the ratio between your sample and control, and then rank these, but it gives you no measure of confidence in the results, for obvious reasons. You can use DEseq to perform the normalization step with size factors, then calculate the log2 ratios with your normalized counts with just a bit of R code (you don't need DEseq for this part). But what this will give you, at best, is a list of top potential genes changing expression - you can filter by the magnitude of the log2 fold change and how highly expressed the genes are to begin with - there's usually a lot of variability in the low expressed genes. After that, you either need to do a ton of biological validation, or get a few more replicates!

ADD COMMENTlink written 5.5 years ago by Jelena Aleksic910
gravatar for Craig Henry
5.5 years ago by
Craig Henry20
United States
Craig Henry20 wrote:

I'd start to look how your transcripts are grouping together (the package "MBCluster.Seq" works great for this). Depending on the comparisons you want to make, an estimated dispersion based on these groups can be used as a prior for inference/significance testing (whether NB, logl, or EM), e.g., prior.dispersion = mu + mu^2*d.  Further, I would make it a point to work with the raw count data using library-size as your normalizing factor and filtering low-expressed transcripts before any testing or normalization takes place - in addition, transcript filtering post-normalization may introduce more bias into your analysis. At any rate, if you want to feel better about not having any replicates, try integrating more information as covariates into your analysis, e.g., gender, tissue, k-means, etc. This may help explain some of the variance confounded as a result of prior dispersion estimates. I'm a big fan of the package "edgeR" when it comes to RNA-Seq analysis - it gives you an option of using common and tag-wise dispersion estimates as well as a bit more intuitive script.


ADD COMMENTlink modified 5.5 years ago • written 5.5 years ago by Craig Henry20
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 769 users visited in the last hour