I was hoping to get some clarification on the meanings of "differential expression" and "differential abundance."
Briefly, this is the RNA-seq experiment:
Align reads to genes (coding regions) with HISAT2.
Count reads with featureCounts.
Perform differential analysis with edgeR, using DGEList to create the matrix of read counts and running glmQLFTest to make comparisons between samples and identify genes that have differential read counts.
In this scenario, are the terms "expression" and "abundance" equivalent to describe the reads? Am I measuring "gene expression" or "gene abundance"?
Van den Berge et al. (2019) say, "One of the main uses of RNA-seq is to assess gene- and transcript-level abundances. Accurate abundance estimation is crucial to common downstream applications, including assessing all the notions of DE (differential expression)." So maybe "abundance" and "expression" are interchangeable?
Does anyone have any insight on this or any papers that discuss this?
Don't get caught up in the terminology; there's too much nuance. Just use the terms however you think is best and have a large language model refine your language. In any case...
A gene can be "expressed" but a gene cannot be "abundant". A gene is literally a heritable part of a DNA molecule, and you aren't measuring how much DNA there is. What you can measure is how much "expression" is occurring from that DNA (since the DNA is being transcribed into the RNA that you're measuring). Transcripts, which are RNA molecules, can be "abundant", and you can also refer to them as "expressed". Expression refers to the process of biological production: if I'm in a chemistry lab artificially synthesizing RNA molecules, I could say those RNA molecules are "abundant", but I couldn't say they're "expressed", right?
You could say "gene-level abundance", however, because that refers to the process of summing up the transcript counts assigned to each gene (-level refers to resolution; just like how someone might say "population-level" in certain fields).
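To make the "gene-level" idea concrete, here is a minimal Python sketch of summing transcript counts up to their parent genes. The transcript names, counts, and the helper summarize_to_gene_level are all made up for illustration; they are not from your data or from any library. Note that in your pipeline featureCounts already assigns reads per gene, so you wouldn't need this step yourself.

```python
from collections import defaultdict

# Hypothetical transcript-level read counts (made-up numbers).
transcript_counts = {
    "geneA-tx1": 120,
    "geneA-tx2": 30,
    "geneB-tx1": 45,
}

# Hypothetical mapping from each transcript to its parent gene.
transcript_to_gene = {
    "geneA-tx1": "geneA",
    "geneA-tx2": "geneA",
    "geneB-tx1": "geneB",
}


def summarize_to_gene_level(tx_counts, tx2gene):
    """Sum the counts of all transcripts assigned to the same gene."""
    gene_counts = defaultdict(int)
    for transcript, count in tx_counts.items():
        gene_counts[tx2gene[transcript]] += count
    return dict(gene_counts)


gene_level_abundance = summarize_to_gene_level(transcript_counts, transcript_to_gene)
print(gene_level_abundance)  # {'geneA': 150, 'geneB': 45}
```

The numbers that come out are still amounts of RNA, just reported at gene resolution. That is why "gene-level abundance" works as a phrase even though the gene itself isn't what's abundant.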