Question

How to find differential expression in lowly expressed transcripts?

0

3.7 years ago

Marcel • 0

Hello,

I have RNA-seq data from a wildtype and a knockout, each in two different conditions (3 replicates each). When pooling the data from my replicates, I noticed that some lowly expressed transcripts are down in WT_condition_1. This exactly fits my expectations.

After annotating these lowly expressed transcripts, I ran DESeq2, hoping to find them differentially expressed. This is how the example from above looks for the individual replicates.

Unfortunately, there are not enough reads for the transcript to be called significantly differential after FDR correction. In fact, not a single one of my lowly expressed transcripts is significantly differentially expressed after FDR correction.

Is there anything I can do, other than telling the wetlab to re-sequence deeper?

Doing without FDR correction is probably not okay.
I was thinking about using another tool that does not require replicates and running it on the 4 files I get when pooling the replicates (the data from the first image). But I understand that this carries the risk of giving too much weight to potential outliers.

RNA-Seq • 1.3k views

ADD COMMENT • link updated 3.7 years ago by ATpoint 82k • written 3.7 years ago by Marcel • 0

score 2 · Accepted Answer · 2020-08-27

2

3.7 years ago

ATpoint 82k

A couple of things:

1) Try to avoid loading BAM files directly into the browser in order to check expression levels. BAM files are not normalized at all, and even if the files contain exactly the same number of reads, the raw coverage is still not representative. If you want to check browser tracks, normalize them properly, as suggested in "A: ATAC-seq sample normalization (quantil normalization)". That post talks about ATAC-seq, but the same holds true for RNA-seq; just consider "regions" as genes.

2) It is not surprising that lowly-expressed genes are not significant at n=3. You will probably need many more replicates to have the power to call them DE. Can you show the output of plotMA and indicate where the genes you are interested in fall?

3) "Doing without FDR correction is probably not okay." Yes, that would not be acceptable. Do more replicates if you are interested in lowly-expressed genes. The rule is simple: genes with large effect sizes and high expression need fewer replicates to reach a given power than lowly-expressed genes and/or genes with small effect sizes.

4) "I was thinking about using another software..." Please don't. Proper statistics need replicates, and DESeq2 is absolutely fine. Your study is probably simply underpowered. n=3 is in fact the bare minimum for a DE analysis, so you cannot expect to get even close to the power needed to find all DE genes among the lowly-expressed ones.
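Regarding point 1) above, here is a minimal sketch of what "normalize them properly" can mean in practice. It is a simplified pure-Python re-implementation of the median-of-ratios idea that DESeq2 uses for its size factors, not the DESeq2 code itself. The function names and the toy counts are made up for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    Genes with a zero in any sample are skipped, since their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # One factor per sample: the median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]

def normalize(counts, factors):
    """Divide every raw count by the size factor of its sample."""
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]

# Toy data (hypothetical): sample 2 was sequenced twice as deep as sample 1.
toy = [[10, 20], [100, 200], [5, 10], [0, 3], [50, 100]]
sf = size_factors(toy)
norm = normalize(toy, sf)
print(sf)
print(norm[0])
```

For a browser track, the reciprocal of the size factor would be the per-sample scaling factor to apply when the coverage file is generated, so that tracks from different samples are on a comparable scale.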

"Is there anything I can do, other than telling the wetlab to re-sequence deeper?"

Don't sequence deeper; do more replicates. This is far more important than read depth. You can explore the effect of read depth by simply multiplying your raw counts by a factor of 2, 3, 10... and seeing how things change. This is of course a bit artificial, but I doubt that you gain anything at n=3 by just sequencing deeper. You need more replicates.
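A rough way to see the depth-versus-replicates trade-off is a back-of-the-envelope calculation. The sketch below assumes negative-binomial counts (variance = mean + dispersion × mean²) and uses a delta-method approximation for the standard error of a two-group log2 fold change. The mean count of 5 and the dispersion of 0.1 are made-up values, and the approximation is crude at very low counts, so treat the numbers as an illustration only, not as a power analysis.

```python
import math

def approx_se_log2fc(mean_count, dispersion, n_per_group):
    """Approximate standard error of a two-group log2 fold change.

    Assumes negative-binomial counts, equal group sizes, and a
    delta-method approximation: Var(ln mean) ~ CV^2 / n per group.
    """
    cv2 = 1.0 / mean_count + dispersion   # squared CV of a single count
    var_ln_fc = 2.0 * cv2 / n_per_group   # two groups contribute equally
    return math.sqrt(var_ln_fc) / math.log(2)

base = approx_se_log2fc(mean_count=5, dispersion=0.1, n_per_group=3)
deeper = approx_se_log2fc(mean_count=10, dispersion=0.1, n_per_group=3)   # 2x depth
more_reps = approx_se_log2fc(mean_count=5, dispersion=0.1, n_per_group=6) # 2x replicates
depth_floor = approx_se_log2fc(mean_count=1e9, dispersion=0.1, n_per_group=3)

print(f"n=3, mean 5:          {base:.3f}")
print(f"n=3, mean 10:         {deeper:.3f}")
print(f"n=6, mean 5:          {more_reps:.3f}")
print(f"n=3, unlimited depth: {depth_floor:.3f}")
```

Under these assumptions, doubling the depth only shrinks the sampling-noise term (1 / mean), while doubling the replicates shrinks both the sampling noise and the biological dispersion. No amount of extra depth pushes the standard error below the floor set by the dispersion at a fixed n.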

ADD COMMENT • link 3.7 years ago by ATpoint 82k

0

Thank you for your detailed answer, I'll have a look at the post you suggested.

Here's an MA plot with all the lowly expressed transcripts I previously annotated, with the example from above marked. It should not contain protein-coding genes, except for some falsely annotated ones that I still have to filter out.

Okay, so I guess the only thing I can do is ask them to check some of my potential candidates by qPCR, and if a candidate is indeed differential, they will surely be convinced to go for more replicates.

ADD REPLY • link 3.7 years ago by Marcel • 0

1

What are the p- and FDR values of the genes you are interested in? Are they like p-value = 1 so no sign of differential expression or are they at least trending towards being DE?

ADD REPLY • link 3.7 years ago by ATpoint 82k

0

FDR values are all close to 1. Regarding raw p-values: I have about 100 that are < 0.05 and might be candidates.

ADD REPLY • link 3.7 years ago by Marcel • 0

1

A high FDR is expected if power is lacking, but low nominal p-values are promising, so you should have a chance with more replicates.
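To make the FDR reasoning concrete, here is a minimal plain-Python sketch of the Benjamini-Hochberg adjustment, which is the method DESeq2 uses by default for its adjusted p-values. The 2000 tested transcripts are a made-up figure, since the thread does not say how many were tested, and the p-values are spread evenly as they would be if nothing were differential.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone in the rank of the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical null-like scenario: 2000 tests with evenly spread p-values.
m = 2000
pvals = [(k + 0.5) / m for k in range(m)]
n_nominal = sum(p < 0.05 for p in pvals)
padj = benjamini_hochberg(pvals)

print(n_nominal)   # number of nominal p-values below 0.05
print(min(padj))   # smallest adjusted p-value

# A small check on four hand-picked p-values.
small = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(small)
```

In this made-up scenario, 100 of the 2000 p-values fall below 0.05 by chance alone and no adjusted value drops below 0.5. So whether roughly 100 nominal hits are more than chance would produce depends on how many transcripts went into the test, which is worth checking before planning the follow-up.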

ADD REPLY • link 3.7 years ago by ATpoint 82k

0

Thank you very much for your advice!

ADD REPLY • link 3.7 years ago by Marcel • 0

0

Sorry to bother you with this simple question, but does this also apply to the p_value and p_adjusted of the DESeq2 output? Thanks!