Question: Advice for RNAseq timepoint analysis
0
gravatar for macmath
16 months ago by
macmath130
France
macmath130 wrote:

I have three different time points

12 weeks WT vs KO 4 biological replicates and 4 technical replicates

4 weeks WT vs KO 3 biological replicates and 3 technical replicates

0 weeks WT vs KO 3 biological and 3 technical replicates

Need suggestions for the design, to find the kinetics and also identify differential genes in different conditions. I have all counts derived from Htseq-count. How to proceed with the analysis. I would also like to see comparison in all combinations.

rna-seq • 559 views
ADD COMMENTlink modified 16 months ago by mforde841.2k • written 16 months ago by macmath130
3
gravatar for mforde84
16 months ago by
mforde841.2k
mforde841.2k wrote:

Hi macmath,

I'd suggest looking into limma and edgeR for analysis.

You have to set up the coefficients and contrasts for the linear model. It looks like you have 2 experimental factors and one error term, one being KO status, and the second being time, and the last being known batch groups. You want to correct for batch as it could potentially effect your result. If you don't know the batch status, you can use something like SVA to do a generalized batch correction.

So starting off let's generate the model. I assume you probably have an excel file or csv file with each sample in a row and each column being either KO status, time, and sequencing batch. So read those in and generate a design for a linear model:

factor.table <- read.table("omics.csv", header=TRUE, row.names=1, sep="\t")
ko_status <- as.factor(factor.table$ko_status)
time_status <- factor.table$time_status 
batch <- as.factor(factor.table$rnaseq_batch)
design <- model.matrix(~0 + ko_status + time_status + batch + ko_status:time_status)

So that's the basic design, and what it means is that the linear model will generate an intercept for each factor plus any interactions between KO and time status. You then generate contrasts to test for significant differences between defined groups, eg:

cont.matrix <- makeContrasts(KO_true = ko_status1 - ko_status2,
    ..., levels = design)

You'll have to play around with this a bit to generate the proper contrasts. There are working examples in the limma documentation which will help you figure it out, so do your reading.

Next you filter and normalized the data, then run the linear model:

y <- DGEList(counts = raw_counts)
y <- y[!rowSums(y$counts == 0) == ncol(raw_counts),] 
A <- aveLogCPM(y)
y2 <- y[A>1,]
y3 <- calcNormFactors(y2, method = "TMM")
dge <- voomWithQualityWeights(y3, design=design, normalization="quantile", plot=FALSE)
fit <- lmFit(dge, design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit3 <- eBayes(fit2, trend = TRUE, robust = TRUE)

I believe fit2 gives you the main effects, and fit3 will give you the DEG for whichever contrast. I also believe you can use one of the fits to do a repeated measures anova, but I think you can flush out the effect using the interaction term in the design.

ADD COMMENTlink modified 16 months ago • written 16 months ago by mforde841.2k

Thank you for the needful suggestions I have a doubt. I forgot to mention in the description of the question.

12 weeks WT vs KO Condition A
4 weeks WT vs KO Condition B
0 week WT vs KO Condition C

These KO are specific for each condition they are not common among the three

ADD REPLYlink modified 16 months ago • written 16 months ago by macmath130

Not really sure what your objective is. Could you explain your groups in biological terminology.

ADD REPLYlink written 16 months ago by mforde841.2k

WT = Wild-type, KO = knockout/mutant . The KO are not identical across the three conditions so these are separate experiments. Unclear how the time component fits in the picture.

ADD REPLYlink modified 16 months ago • written 16 months ago by genomax58k

To clarify, biologically the KO are same and I am a bit confused with the KO status, time, and sequencing batch. My files compose of count files generated by HTseq count for each biological replicate and its respective technical replicates.

12 weeks WT vs KO 4 biological replicates and 4 technical replicates i.e. 32 count files

4 weeks WT vs KO 3 biological replicates and 3 technical replicates i.e. 27 count files

0 weeks WT vs KO 3 biological and 3 technical replicates i.e. 27 count files

I would also like to know if the technical replicates should be collapsed?

ADD REPLYlink modified 16 months ago • written 16 months ago by macmath130

These KO are specific for each condition they are not common among the three

and

To clarify, biologically the KO are same

Those two statements (made in two separate posts) are conflicting. Which of those statements is correct? I would think the KO must be identical otherwise the time component has no meaning since the three sets would be separate non-comparable experiments.

Technical replicates for sequencing are not needed since sequencing has become pretty standard (unless you used different sequencer models that use different chemistry for sequencing e.g. NextSeq,HiSeq). You could collapse them.

ADD REPLYlink written 16 months ago by genomax58k

so what are you knocking out?

ADD REPLYlink written 16 months ago by mforde841.2k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1602 users visited in the last hour