Question

How to analyze gene expression datasets with rich controls

7.1 years ago

moxu ▴ 510

We treat tumor cells with compound A, compound B, A + B mixed, and A-B linked, each at several concentrations, and we also have controls (five groups in total). We measure gene expression for each group.

One of our goals is to find out whether A-B linked and A + B mixed differ in the gene expression they cause and, if so, which genes are affected. "A-B linked" means that A and B are chemically joined into a new compound.

The straightforward approach is to compare A-B linked directly with A + B mixed. (A nested question: should we fit a linear regression of gene expression level ~ compound concentration for each group and compare the betas? I am afraid DESeq or edgeR might not be appropriate.) But we also have experimental data for the other three groups (control, A, B). Can we make use of those datasets as well?
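To make the nested question concrete, here is a minimal sketch of the "regress on concentration and compare the betas" idea for a single gene. It is only an illustration under stated assumptions: the expression values are made up, the concentrations (0, 0.1, 0.2, 0.5 mM) are placeholders, the function names `fit_line` and `slope_difference` are hypothetical, and the data are treated as log-scale values with ordinary least squares, which ignores the count nature of RNA-Seq data.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns the slope b, its standard error, and the residual degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se = math.sqrt(rss / df / sxx)
    return slope, se, df

def slope_difference(x1, y1, x2, y2):
    """Difference of two independently fitted slopes and a t-like statistic."""
    b1, se1, _ = fit_line(x1, y1)
    b2, se2, _ = fit_line(x2, y2)
    diff = b1 - b2
    t_like = diff / math.sqrt(se1 ** 2 + se2 ** 2)
    return diff, t_like

# Made-up log2 expression of one gene; concentrations in mM (assumed values).
conc = [0.0, 0.1, 0.2, 0.5]
linked = [5.0, 5.4, 5.9, 7.1]   # A-B linked
mixed = [5.0, 5.1, 5.3, 5.6]    # A + B mixed

diff, t_like = slope_difference(conc, linked, conc, mixed)
print(f"slope difference = {diff:.3f}, t-like statistic = {t_like:.2f}")
```

Fitting both groups in one model with a group-by-concentration interaction term would test the same slope difference while pooling the residual variance; the two-fit version above is just the simplest form to read.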

gene R RNA-Seq • 1.5k views

ADD COMMENT • link updated 7.1 years ago by zjhzwang ▴ 180 • written 7.1 years ago by moxu ▴ 510

score 0 · Answer 1 · 2017-03-01

7.1 years ago

zjhzwang ▴ 180

For question 1:
I think many tools can do differential expression analysis, such as DESeq2 (if your data came from RNA-seq) or limma (if your data came from microarrays).
For question 2:
If you want to use all of the experimental data, I think you can do a multivariate analysis of variance (MANOVA).

ADD COMMENT • link 7.1 years ago by zjhzwang ▴ 180

Q1: My understanding is that DESeq or limma only do group-versus-group comparisons, while we have more than just two groups: a control (no compound) and compound treatments at 0.1 mM, 0.2 mM, 0.5 mM ...

Q2: What would the model be, given the annotation in the question? Something like gene_expression = A + B + A*B + A-linked-B? In that case the A-only and B-only treatments seem to be redundant.
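One possible way to frame Q2, as a rough sketch rather than a settled model: treat the five groups as a one-way layout and express each question as a contrast of group means. Everything here is an assumption for illustration: the group labels, the made-up log2 values for a single gene at a single concentration, and the names `CONTRASTS` and `contrast_estimate`. It computes point estimates only, with no variance model or testing.

```python
from statistics import mean

# Made-up log2 expression values for one gene at one concentration,
# three hypothetical replicates per group.
expr = {
    "control":    [5.0, 5.2, 4.9],
    "A":          [5.6, 5.5, 5.8],
    "B":          [5.3, 5.4, 5.2],
    "A_plus_B":   [6.0, 6.2, 6.1],
    "A_linked_B": [7.0, 7.3, 7.1],
}

# Each contrast is a set of weights on group means; the weights sum to zero.
CONTRASTS = {
    # Direct question: does linking differ from mixing?
    "linked_vs_mixed": {"A_linked_B": 1, "A_plus_B": -1},
    # 2x2 interaction: does the mix differ from the sum of the single effects?
    "mix_interaction": {"A_plus_B": 1, "A": -1, "B": -1, "control": 1},
}

def contrast_estimate(data, weights):
    """Weighted sum of group means for one contrast."""
    return sum(w * mean(data[group]) for group, w in weights.items())

estimates = {name: contrast_estimate(expr, w) for name, w in CONTRASTS.items()}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Under this framing the A-only and B-only groups are not redundant: the interaction contrast, which asks whether the mixture does more than A and B add up to separately, cannot be estimated without them.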

ADD REPLY • link 7.1 years ago by moxu ▴ 510