Complex and Complicated RNAseq analysis
6.6 years ago
mfahim ▴ 10

I carried out RNAseq and smallRNAseq on WT, Mutant1, Mutant2 and Double Mutant M1M2 with NO replicates at two different temperatures.

Will the absence of Replicates affect my data analysis?

Can anyone advise how I can correlate the transcriptome data (RNAseq) with the data generated from smallRNAseq (ncRNAs)?

How can I run GO and Pathway enrichment analysis?

Is cummeRbund going to help me make sense of the data?


Differential Analysis cummeRbund Enrichment • 1.7k views
6.6 years ago
iraun ★ 3.9k
  • Yes, it will. It is always good practice to have biological replicates of your samples, because their absence leaves you with low statistical power. I always recommend at least 3 biological replicates. If you do two, how do you know one isn't bad? If you do three and one is bad, you can at least eliminate it and continue. For example, if you want to identify the genes that are differentially expressed between two strains of yeast, you will most likely grow each strain in a different flask. Growing the strains in different flasks introduces biological variance: even two flasks of the same strain would show somewhat different expression. Biological replicates are therefore required in this experiment, so that the variance caused by the different flasks can be estimated and separated from the real differences between strains.
    You can read this thread about this topic: Rna-Seq Biological Replicates...
  • What do you mean by correlate? What is your goal?
  • There are different tools for doing Gene Ontology enrichment analysis and Pathway enrichment analysis. For the first one you can use:
    - DAVID
    - GOrilla
    - Ontologizer
    - R packages available through Bioconductor, such as GOstats
    In this thread you can find more (Tools To Find Gene Ontology Term Enrichment).

    For the pathway analysis I generally use KEGG. See this thread for more info: Best Way To Do Pathway Analysis Of A Set Of Genes?.
  • cummeRbund creates an SQLite database from your analysis results that describes the relationships between genes, transcripts, transcription start sites, and CDS regions. Once the data is in the database, it can be retrieved efficiently and easily, letting you explore subfeatures of individual genes, or gene sets, as the analysis requires. So yes, if you use cufflinks and cuffdiff to perform your DE analysis, it could be a good idea to create a database with the cummeRbund package in order to store, access, explore and manipulate your data in a clear and easy way. Furthermore, cummeRbund provides numerous plotting functions for commonly used visualizations.
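
For intuition about what a GO enrichment tool reports, here is a minimal Python sketch of a one-sided hypergeometric over-representation test, a common basis for this kind of analysis (individual tools may use other statistics and will add multiple-testing correction). The function name and all gene counts are made up for illustration; they do not come from this thread.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """One-sided p-value P(X >= n_overlap).

    n_background: genes in the background set (e.g. all expressed genes)
    n_annotated:  background genes annotated with the GO term
    n_selected:   genes in your list of interest (e.g. DE genes)
    n_overlap:    genes of interest that carry the GO term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    # Sum the hypergeometric probabilities from the observed overlap upwards.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumptions): 20000 background genes, 100 annotated with
# the term, 200 genes of interest, 5 of them annotated.
p = hypergeom_enrichment_p(20000, 100, 200, 5)
print(f"enrichment p-value: {p:.3g}")
```

In practice one such test is run per GO term, so the resulting p-values need correcting for multiple testing before any term is called enriched.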

    Hope it helps.
6.6 years ago
mfahim ▴ 10

Thanks Airan,

I am working with Arabidopsis. I grew the plants on different plates and then pooled them before sending them for RNAseq.

I want to see whether the ncRNAs (smallRNAseq) are somehow related to their targets in the transcriptome (longRNAseq).
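
One simple way to look at this kind of relationship is a rank correlation between a small RNA and its predicted target across the samples. The Python sketch below is only an illustration under stated assumptions: the expression values are invented, the pairing of small RNA to target is assumed to come from a separate target prediction, and with a handful of unreplicated samples the result is descriptive rather than a significance test.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Invented expression values for one small RNA and one predicted target,
# one value per sample (4 genotypes x 2 temperatures = 8 samples).
small_rna = [120.0, 95.0, 30.0, 150.0, 110.0, 80.0, 25.0, 160.0]
target = [8.5, 10.1, 22.4, 6.0, 9.2, 12.3, 25.0, 5.5]
print(f"Spearman rho: {spearman(small_rna, target):.2f}")
```

A strongly negative value would be consistent with the small RNA repressing its target, but it does not prove it; the same pattern can arise from any factor that moves both in opposite directions.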

Here is the complication with my data (if I still decide to go ahead with the analysis).

1. m1 at 16C

2. m2 at 16C

3. m1m2 at 16C

4. WT at 16C

5. m1 at 23C

6. m2 at 23C

7. m1m2 at 23C

8. WT at 23C

What kind of magic will help me get the best out of this? You can understand that I have already spent a lot on doing the two types of RNAseq on all these samples (doing them in triplicate would have bankrupted me).

I have the cuffdiff output from cufflinks and am trying my luck with R and cummeRbund. Is there a way forward with cummeRbund? (Please note I am a beginner in R.)

