Question: Time series RNA-Seq with Time-Matched Controls
jmsyl.hong wrote (2.4 years ago):

Hey guys,

I'm new to doing bioinformatics so bear with me.

My experimental design is as follows:

  • 4 timepoints (3d, 7d, 14d, 56d)
  • 2 conditions, each with time matched controls
  • For every condition, there are 5 replicates per time point, 3 replicates for each time-matched control

That is, condition 1 at 3d has n = 5, and control of condition 1 at 3d has n = 3 etc...
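For reference, the design described above can be written out as a sample table. This is only a sketch: the labels `cond1`, `cond2`, `treated` and `control` and the sample naming scheme are hypothetical placeholders, not names from the actual experiment.

```python
from itertools import product

TIMEPOINTS = ["3d", "7d", "14d", "56d"]
CONDITIONS = ["cond1", "cond2"]            # hypothetical condition labels
REPLICATES = {"treated": 5, "control": 3}  # replicates per condition and time point


def build_sample_sheet():
    """Enumerate one row per sample for the design described in the post."""
    rows = []
    for cond, time, (group, n_rep) in product(CONDITIONS, TIMEPOINTS, REPLICATES.items()):
        for rep in range(1, n_rep + 1):
            rows.append({
                "sample": f"{cond}_{group}_{time}_rep{rep}",  # hypothetical naming scheme
                "condition": cond,
                "group": group,
                "time": time,
            })
    return rows


samples = build_sample_sheet()
print(len(samples))  # 2 conditions x 4 time points x (5 + 3) replicates = 64
```

A table like this is what most differential expression packages expect alongside the count matrix, with one row per sample.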

The core facility provided the complete Cufflinks output and a merged.gtf file generated by Cuffmerge.

How do I:

  1. Find DE genes between every time point of each condition and its time-matched control? Do I run edgeR separately at every time point, or is DESeq or Cuffdiff preferred?
  2. Find DE genes between the conditions (should I just filter for the genes DE relative to controls as found in #1?)
  3. Find the biological pathways (GO/KEGG) that are distinct between the two conditions?
  4. Visualize these distinct networks (e.g. put them into figures)? With Cytoscape?

Package names would help a lot, I have some proficiency in R and Java.

Thanks so much in advance!

rna-seq • 1.3k views
Ron (United States) wrote (2.4 years ago):

1. Which package you use is really up to you. You can use either DESeq or edgeR. Check out this post on differential expression analysis: Rnaseq Differential Expression

2. No. Take all the time points of one condition versus all time points of the other condition, and then run the differential expression analysis again.

3. For pathway analysis, GSEA from the Broad Institute is one of the most widely used methods: http://software.broadinstitute.org/gsea/msigdb/ You can choose different signatures to test for enrichment in the comparison.

4. Several tools are available, such as Circos and Cytoscape. See also: Gene Network Construction... Web Based Tool


Thanks so much, Circos looks amazing. I played with Cytoscape a bit but found it really difficult to reproduce what you see in publication figures (those look amazing, but the default layouts in Cytoscape aren't that great). Also, is there a reason not to filter the genes in 2? I assume I would be taking the output of 2 into steps 3 and 4, so what is the point of 1?

Thanks again!

Reply written 2.4 years ago by jmsyl.hong

The differentially expressed genes depend on the comparisons you make. In 1), the genes reported are those that change between different time points, whereas in 2), the genes reported are those that change between conditions irrespective of time point.

Reply written 2.4 years ago by Ron

WouterDeCoster (Belgium) wrote (2.4 years ago):

1) Find DE genes between every time point of each condition and its time-matched control? Do I run edgeR separately at every time point, or is DESeq or Cuffdiff preferred?

There are optimized workflows for time course experiments, googling will give you quite a lot of hits. Example: http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments

2) Find DE genes between the conditions (should I just filter for the genes DE to controls as found in #1?)

You can test conditions by taking all time points for condition 1 versus all time points for condition 2, but specify the time point groups in the model for the differential expression analysis, so that time is accounted for rather than ignored.
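As an illustration of what "specifying the time point in the model" means, here is a minimal sketch of an additive design matrix with one indicator column per non-reference time point plus one column for the condition. The labels `cond1` and `cond2` are hypothetical, and this only shows the encoding; the actual model fitting would be done by the differential expression package.

```python
TIMEPOINTS = ["3d", "7d", "14d", "56d"]  # first entry is the reference level
CONDITIONS = ["cond1", "cond2"]          # hypothetical labels; first is the reference

# Column names of the additive design: intercept + time + condition
COLUMNS = ["intercept"] + [f"time_{t}" for t in TIMEPOINTS[1:]] + [f"condition_{CONDITIONS[1]}"]


def design_row(condition, time):
    """Return the design-matrix row for one sample (treatment coding)."""
    row = [1]                                                # intercept
    row += [1 if time == t else 0 for t in TIMEPOINTS[1:]]   # time indicators
    row.append(1 if condition == CONDITIONS[1] else 0)       # condition indicator
    return row


for cond in CONDITIONS:
    for time in TIMEPOINTS:
        print(cond, time, dict(zip(COLUMNS, design_row(cond, time))))
```

With this encoding, the coefficient on the last column estimates the condition difference after adjusting for time point, which is the effect tested in comparison 2.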

3) Find the biological pathways (GO/KEGG) that are distinct between the two conditions?

You can use GSEA, or tools like Enrichr, to analyze overrepresented pathways.
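To give a feel for what an overrepresentation test computes, here is a minimal sketch of a one-sided hypergeometric test for a single pathway. It is not the implementation used by GSEA or Enrichr (GSEA in particular uses a rank-based statistic); the function name and the example numbers are made up for illustration.

```python
from math import comb


def overrepresentation_p(overlap, n_de, pathway_size, n_background):
    """P(X >= overlap) when drawing n_de genes from n_background genes,
    of which pathway_size belong to the pathway (hypergeometric upper tail)."""
    upper = min(n_de, pathway_size)
    total = comb(n_background, n_de)
    tail = sum(
        comb(pathway_size, i) * comb(n_background - pathway_size, n_de - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 500 DE genes out of 20000, pathway of 100 genes, 10 in the overlap
p = overrepresentation_p(overlap=10, n_de=500, pathway_size=100, n_background=20000)
print(f"p = {p:.3g}")
```

In practice this is repeated for every pathway, and the resulting p-values need multiple-testing correction.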

4) How do I then visualize these distinct networks (e.g. put them into figures), cytoscape?

Enrichr provides some basic networks. Cytoscape is another good option, although it probably requires some experience to get nice figures. You can also find some easy but informative visualizations (and code) in this post from Getting Genetics Done.


Thanks!

Would you still use DESeq2 for 2)?

Reply written 2.4 years ago by jmsyl.hong

I would use DESeq2, edgeR and/or limma-voom. The results and statistics are quite similar.

Reply written 2.4 years ago by WouterDeCoster

One more question: if I'm using the original .bam files I was given, do I use the merged.gtf (from Cuffmerge) to get the raw counts, or the up-to-date rat genome GTF file from Ensembl? What are the differences? And would I need to normalize or filter the raw counts before passing them to DESeq2?

Reply written 2.4 years ago by jmsyl.hong (modified 2.4 years ago)

Interesting suggestion, and out of curiosity I would try both GTFs. I have no experience with Cuffmerge, but I can't think of anything problematic right now with using the GTF it produced.

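You absolutely shouldn't normalize your data before running DESeq2, as it expects raw counts and estimates its own normalization factors. Genes with low counts are also filtered out by default (independent filtering when the results are extracted), so that's something you don't have to worry about either.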
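To show why raw counts are the right input, here is a minimal sketch of the median-of-ratios idea that DESeq2 uses to estimate per-sample size factors from the raw counts. This is an illustration in plain Python with a made-up toy matrix, not DESeq2's actual code.

```python
from math import exp, log
from statistics import median


def size_factors(counts):
    """counts: list of genes, each a list of raw counts per sample.
    Returns one size factor per sample (median of ratios to the
    per-gene geometric mean, using genes with no zero counts)."""
    n_samples = len(counts[0])
    ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c == 0 for c in gene):
            continue  # geometric mean is zero for this gene, skip it
        geo_mean = exp(sum(log(c) for c in gene) / n_samples)
        for j, c in enumerate(gene):
            ratios[j].append(c / geo_mean)
    return [median(r) for r in ratios]


# Toy matrix (made up): sample 2 was sequenced twice as deeply as sample 1
toy_counts = [
    [10, 20],
    [100, 200],
    [35, 70],
    [0, 3],   # skipped: contains a zero
]
sf = size_factors(toy_counts)
print(sf, sf[1] / sf[0])
```

If the counts were already normalized, these factors would be estimated on the wrong scale and the count-based variance model would no longer apply, which is why raw counts should go in.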

Reply written 2.4 years ago by WouterDeCoster