There are now many packages available for statistical analysis of gene expression data to assess differential expression. Many of them, including mainstays such as edgeR and DESeq, carefully estimate variance parameters for individual genes/contigs (through various fancy means), then fit (generalized) linear models and ultimately rank your genes/contigs by p-value (or similar). The subset of contigs you believe to be differentially expressed is the set that survives some FDR correction procedure.
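To make the last step concrete, here is a minimal sketch of one common FDR procedure, the Benjamini-Hochberg step-up rule. It assumes the per-contig p-values have already been computed. The p-values and the function name `benjamini_hochberg` are made up for illustration, and this is not a description of what edgeR or DESeq do internally.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return sorted indices of hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    # Sort indices by ascending p-value so that rank 1 is the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical per-contig p-values (illustrative numbers, not real data).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(pvals, alpha=0.05)
print(significant)  # indices of contigs called differentially expressed
```

Note that each contig is tested on its own here; the only coupling between contigs is through the multiple-testing correction.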
Another way to look at it is as a classification problem: given gene expression measurements from two conditions, you could fit a suitably regularized logistic regression, say with a LASSO (L1) sparsity penalty on the coefficients, and then pull out the subset of genes/contigs with nonzero coefficients after tuning the strength of the sparsity penalty through some procedure (e.g. cross-validation).
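As a rough sketch of the classification view, the snippet below fits an L1-penalized logistic regression by proximal gradient descent (ISTA) on a tiny, made-up, already-standardized expression matrix and reports the genes with nonzero coefficients. Everything here (the data, the function names, the two penalty values) is an assumption for illustration. A real analysis would use an established solver and choose the penalty by something like cross-validation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, penalty, step=0.5, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    X: list of samples, each a list of per-gene expression values.
    y: list of 0/1 condition labels. The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus label, per sample.
        resid = [
            sigmoid(intercept + sum(c * x for c, x in zip(coefs, row))) - label
            for row, label in zip(X, y)
        ]
        # Gradient of the mean logistic loss.
        grad_intercept = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        # Gradient step, then soft-thresholding to apply the L1 penalty.
        intercept -= step * grad_intercept
        coefs = [
            soft_threshold(c - step * g, step * penalty)
            for c, g in zip(coefs, grad)
        ]
    return intercept, coefs


# Toy data: 8 samples x 4 genes. Only gene 0 differs between conditions;
# genes 1-3 follow the same pattern in both conditions.
X = [
    [-1.0, 0.5, -0.3, 1.2], [-1.0, -0.5, 0.3, -1.2],
    [-1.0, 1.0, 0.8, 0.4], [-1.0, -1.0, -0.8, -0.4],
    [1.0, 0.5, -0.3, 1.2], [1.0, -0.5, 0.3, -1.2],
    [1.0, 1.0, 0.8, 0.4], [1.0, -1.0, -0.8, -0.4],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

_, weak = lasso_logistic(X, y, penalty=0.1)
_, strong = lasso_logistic(X, y, penalty=0.6)
selected_weak = [j for j, c in enumerate(weak) if c != 0.0]
selected_strong = [j for j, c in enumerate(strong) if c != 0.0]
print(selected_weak)    # [0]
print(selected_strong)  # []
```

Unlike per-gene testing, all genes enter one joint model, so the selected set depends on the penalty strength and on how genes relate to each other, not only on each gene's marginal association with the condition.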
What is the reason to prefer one over the other? Are the two methods really conceptually equivalent, differing only in the scientific communities they stem from?