Question: differential gene set testing combining genes from multiple panels
4 weeks ago by
glocke01160
United States
glocke01160 wrote:

I have RNA-seq data from two targeted panels. So, for each sample/mouse, I have one counts matrix for one set of genes, and another matrix for another set of genes. The major problem here is that the library size for a given sample is different across the two panels, so just slapping the two matrices together and proceeding as normal is right out (one of the samples has like 100 times fewer reads than the rest, but, surprisingly, other quality metrics look great). I've used limma+voom to test differential expression of individual genes, applying these to each panel separately.

I now wish to ask, "is this set of genes expressed higher in this group than in that?" The problem is that the gene sets I'm looking at combine genes from both panels.

I've applied cameraPR to the t statistics obtained from each panel separately, using the Broad's canonical pathways. I'm not sure if this is hinky statistically, but it seems ok? If so, it also makes sense scientifically as a way of asking whether those gene sets are changing in specific contrasts across groups.

However, the null hypothesis tested by CAMERA is not "the level of expression for these genes is the same across groups." That's (roughly) the null for ROAST, and I want to test against that null for some specific gene signatures. Is there any hope for me?

Worst case, I'll use something like log-CPM and do parametric testing.

(Searching for combining panels mostly produces results about combining samples that have the same rows, but that's not the problem facing me. Any help would be most appreciated.)

voom rna-seq limma pathway • 116 views
4 weeks ago by
Gordon Smyth750
Australia
Gordon Smyth750 wrote:

You could rbind the two count matrices together and analyse the data in edgeR rather than limma. In edgeR, you can set the offset matrix to be the log library size, which allows you to set the library size differently for each row of data. The offset matrix then over-rides the library size vector itself.
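A minimal sketch of that setup, assuming simulated counts in place of the real panels. The object names (`panelA`, `panelB`, `libA`, `libB`) are made up for illustration, and using `scaleOffset()` rather than assigning `y$offset` directly is one possible choice, not necessarily the recommended one:

```r
library(edgeR)

set.seed(1)
n_samples <- 6
sample_ids <- paste0("S", seq_len(n_samples))

# Hypothetical stand-ins for the two panels: same samples, different genes,
# very different sequencing depth.
panelA <- matrix(rnbinom(50 * n_samples, mu = 200, size = 10), nrow = 50,
                 dimnames = list(paste0("A_gene", 1:50), sample_ids))
panelB <- matrix(rnbinom(30 * n_samples, mu = 20, size = 10), nrow = 30,
                 dimnames = list(paste0("B_gene", 1:30), sample_ids))

# Stack the panels row-wise into one count matrix.
counts <- rbind(panelA, panelB)
y <- DGEList(counts)

# Library sizes computed separately for each panel.
libA <- colSums(panelA)
libB <- colSums(panelB)

# Offset matrix with the same dimensions as the counts: each row holds the
# log library sizes of the panel that the gene came from.
offset <- rbind(
  matrix(log(libA), nrow = nrow(panelA), ncol = n_samples, byrow = TRUE),
  matrix(log(libB), nrow = nrow(panelB), ncol = n_samples, byrow = TRUE)
)
dimnames(offset) <- dimnames(counts)

# Store the offsets in the DGEList. scaleOffset() recentres each row, which
# shifts the baseline level but not the differences between samples.
y <- scaleOffset(y, offset)
```

Once `y$offset` is set, the edgeR GLM functions use it in place of the single library size vector in `y$samples`.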

edgeR has all the same gene set tests as limma does. I'd suggest using fry() as an edgeR equivalent of limma::roast().
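As a rough illustration, calling fry() on a DGEList might look like the sketch below. The data are simulated, and the group labels and the gene sets in `index` are invented for the example; with real data, the per-row offset matrix would be set on `y` before the dispersions are estimated:

```r
library(edgeR)

set.seed(2)
group <- factor(rep(c("ctrl", "trt"), each = 4))
design <- model.matrix(~ group)

# Simulated combined matrix: pretend rows 1-120 come from one panel and
# rows 121-200 from the other.
counts <- matrix(rnbinom(200 * 8, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("S", 1:8)))
y <- DGEList(counts, group = group)

# fry() on a DGEList needs dispersion estimates to be present.
y <- estimateDisp(y, design)

# Hypothetical gene signatures given as row indices; the second one mixes
# genes from both "panels".
index <- list(
  signatureA = 1:15,
  signatureB = c(100:110, 150:160)
)

# Self-contained gene set test for the trt vs ctrl coefficient.
res <- fry(y, index = index, design = design, contrast = 2)
print(res)
```

The result has one row per gene set, with the number of genes, the direction of change, and p-values for the directional and mixed alternatives.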


awesome, thanks!!

Reply written 29 days ago by glocke01160

Thanks again, Gordon. I'm not sure that I'm using scaleOffset correctly, so I've asked another question requesting further assistance: "How do I properly use edgeR::scaleOffset to combine two targeted RNA-seq panels?" Any help from you or Aaron or anyone would be great.

Reply written 29 days ago by glocke01160