Does importing htseq-count output with tximport correct for gene length?
6.0 years ago
salamandra ▴ 550

I have imported htseq-count files into R with the 'DESeqDataSetFromHTSeqCount' function, in order to do differential gene expression analysis with the 'DESeq2' package. However, I read online that by default 'DESeq2' does not normalize for gene length, and that to normalize for gene length we need to import with the tximport() function.

Reading the tximport() manual, I got confused: does tximport() perform this gene length correction by itself, or only when counts are imported from 'Salmon', 'Sailfish', 'kallisto', 'RSEM' and 'StringTie'? In other words: does importing htseq-count output with tximport correct for length, or do we need to use tximport with counts produced by the tools listed above?

RNA-Seq deseq R • 1.8k views

Note that tximport imports transcript-level estimates and aggregates them to the gene level. By default, featureCounts and htseq-count already quantify at the gene level, so using tximport does not make sense here anyway. DESeq2 does not take gene length into account, and it does not need to: in a differential expression analysis each gene is compared with itself across samples, so its length is the same on both sides of the comparison. Length correction would only be needed when comparing different genes within one sample (leaving aside whether that makes sense). So if you load your count matrix into R/DESeq2 and follow the manual, you are fine.
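To make the "length cancels out" argument concrete, here is a minimal sketch in plain Python. It does not use DESeq2 or tximport, and it ignores library-size normalization and any statistical modelling. All gene names, lengths and counts are made-up values for illustration, and the helper names `fold_change` and `per_kb` are hypothetical.

```python
# Hypothetical numbers for illustration only; none of them come from the thread.
gene_length_kb = {"geneA": 1.0, "geneB": 10.0}
counts_control = {"geneA": 100, "geneB": 1000}
counts_treated = {"geneA": 200, "geneB": 1000}


def fold_change(treated, control):
    """Per-gene ratio of treated to control values (between-sample comparison)."""
    return {gene: treated[gene] / control[gene] for gene in control}


def per_kb(counts, lengths):
    """Divide each gene's count by its length in kilobases."""
    return {gene: counts[gene] / lengths[gene] for gene in counts}


# Between samples: the same gene length appears in numerator and denominator,
# so dividing by length leaves the fold change unchanged.
fc_raw = fold_change(counts_treated, counts_control)
fc_length_scaled = fold_change(
    per_kb(counts_treated, gene_length_kb),
    per_kb(counts_control, gene_length_kb),
)

# Within one sample: comparing geneA with geneB does depend on length.
control_per_kb = per_kb(counts_control, gene_length_kb)

print(fc_raw)
print(fc_length_scaled)
print(control_per_kb)
```

In this toy example the two fold-change dictionaries are identical, while the within-sample picture changes: geneB has ten times the raw count of geneA in the control sample, but the same count per kilobase.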

6.0 years ago

Do not try to correct for gene length. That is only needed when using tools like salmon and kallisto, for which you would use tximport; it does not apply to featureCounts or htseq-count.

