Hi, I'm new to RNA-seq and differentially expressed gene (DEG) analysis using DESeq2, the R package. But I encountered a problem need for help.
The experiment has samples of 3 group (control, treat1, treat2) with 3 replications each. Having RNA-sequenced, the reads counts were inputed to DESeq2 for DEG analysis, according to DESeq2's manual.
dds=DESeqDataSetFromMatrix(countData = counts,
colData = samples,
design = ~ group)
dds = DESeq(dds)
result=results(dds, contrast=c("group", "treat1", "control"))
In the result, there was log2FoldChange, p value, padj, etc for each gene. The question is when I compared the log2FoldChange with the raw reads of some genes it seemed that the log2FoldChange compared "control VS treat1" (here treat1 was the base level) or "treat1 VS treat2", instead of "treat1 VS control" (here control was the base level, which was my expectation). So it would be the case that a gene up expressed in treat1 (by looking at the raw reads) was down expressed in the results.
Below is a subset of the result.
<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#ccc;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:0px;overflow:hidden;word-break:normal;border-color:#ccc;color:#333;background-color:#fff;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:0px;overflow:hidden;word-break:normal;border-color:#ccc;color:#333;background-color:#f0f0f0;}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
symbol | control-1_reads | control-2_reads | control-3_reads | treat1-1_reads | treat1-2_reads | treat1-3_reads | treat2-1_reads | treat2-2_reads | treat2-3_reads | control-1_norm | control-2_norm | control-3_norm | treat1-1_norm | treat1-2_norm | treat1-3_norm | treat2-1_norm | treat2-2_norm | treat2-3_norm | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gene1 | 104 | 113 | 155 | 518 | 514 | 471 | 0 | 1 | 0 | 121.8857286 | 99.70389301 | 129.1534889 | 489.6363946 | 454.1863516 | 426.6076245 | 0 | 1.150867377 | 0 | 191.3693721 | 10.14006488 | 1.185315798 | 8.554736972 | 1.18E-17 | 6.17E-17 |
gene2 | 44 | 36 | 47 | 903 | 1020 | 1033 | 2 | 1 | 1 | 51.56703902 | 31.76407211 | 39.16267082 | 853.5553366 | 901.303655 | 935.6383781 | 2.048006023 | 1.150867377 | 1.223606775 | 313.0459591 | 9.221995767 | 0.729090182 | 12.64863524 | 1.14E-36 | 1.02E-35 |
gene3 | 28 | 30 | 38 | 133 | 132 | 119 | 0 | 0 | 0 | 32.81538847 | 26.47006009 | 31.66343599 | 125.7174527 | 116.6392965 | 107.784092 | 0 | 0 | 0 | 49.00996952 | 9.133180831 | 1.200947709 | 7.604977935 | 2.85E-14 | 1.30E-13 |
gene4 | 457 | 573 | 721 | 34150 | 35421 | 35562 | 65 | 65 | 62 | 535.5940189 | 505.5781477 | 600.7720354 | 32280.08277 | 31299.09487 | 32210.23427 | 66.56019573 | 74.80637953 | 75.86362004 | 10849.84292 | 8.790220784 | 0.109694018 | 80.13400281 | 0 | 0 |
gene5 | 2 | 1 | 1 | 14 | 9 | 15 | 473 | 395 | 372 | 2.343956319 | 0.882335336 | 0.833248315 | 13.23341607 | 7.952679309 | 13.58623008 | 484.3534243 | 454.592614 | 455.1817202 | 159.217736 | -5.33149575 | 0.272127405 | -19.59191047 | 1.81E-85 | 3.54E-84 |
gene6 | 34 | 55 | 44 | 158 | 162 | 187 | 3498 | 3292 | 2806 | 39.84725743 | 48.5284435 | 36.66292588 | 149.3485528 | 143.1482276 | 169.3750017 | 3581.962533 | 3788.655406 | 3433.44061 | 1265.663218 | -4.548224914 | 0.088341971 | -51.48430413 | 0 | 0 |
gene7 | 1170 | 1703 | 1749 | 337 | 339 | 424 | 8399 | 7277 | 7116 | 1371.214447 | 1502.617078 | 1457.351304 | 318.5472297 | 299.5509206 | 384.0374369 | 8600.601292 | 8374.861905 | 8707.185809 | 3446.218602 | -4.679805864 | 0.064073627 | -73.03794184 | 0 | 0 |
gene8 | 25 | 31 | 29 | 1194 | 1279 | 1196 | 22834 | 21585 | 19438 | 29.29945399 | 27.35239543 | 24.16420115 | 1128.621342 | 1130.164093 | 1083.275412 | 23382.08476 | 24841.47234 | 23784.46849 | 8381.211387 | -4.429334677 | 0.041503905 | -106.7209146 | 0 | 0 |
Could you please tell me why? Was there I'm doing something wrong? Thank you very much!
The object generated by
results()
should spell out pretty clearly which comparison you did as in the following example:Thanks for your reply. Yes, the lines in the head of result did indicate the logFC corresponding to treat1/control of group. But, as mentioned above, some of the logFC looked odd.
in your snapshot above, which logFC values seem odd to you?
gene5, gene6, gene7, gene8.
what do the rlog- or vst-transformed values look like?
gene 7 looks fine to me:
gene7 1170 1703 1749 337 339 424
EDIT: I agree, your table contains a couple of puzzling results. Here's a couple of thoughts:
Sorry for replying late.
I updated the DESeq2 normalized counts;
Yeah, gene7 is fine, I pasted the wrong line;
If I didn't count wrong, there about 2% genes seemed non-intuitive like this.
I merged the original counts data to the result by gene symbol;
Thanks for your reply.
Did you check whether those 2% fall into the category of 'lowly expressed'?
I think, having the counts in treatment 2 in there plays a role (just look at the
baseMean
values). I'm not super familiar with the nitty-gritty details of how DESeq uses thecontrast
argument for the logFC generation, but I have a feeling that the results would probably look more intuitive if you ran the same analysis using only the values for treat1 and control. I recommend you post that question over on https://support.bioconductor.org/ where the authors of DESeq2 are going to see the question. Do post the answer here once you have it.EDIT: Out of curiosity, can you post the results of
resultsNames(dds)
?I supposed the treat2 looked to have been used in the "contrast";
I had already posted to https://support.bioconductor.org, thanks;
Below is the output of resultsNames(dds):
What was the response at the bioconductor support page?
What happens when you use "group_treat1_vs_control" as the contrast name for retrieving the results?
Rather than the raw counts, post the normalized counts:
Sorry for replying late.
I updated the DESeq2 normalized counts above.
Thanks for your reply.
Is it possible that you've mixed up the "group" labeling with respect to samples?
Thanks for your reply, I checked but I didn't mix up.