Question

RNAseq: Scaling or not scaling with rlog transformed counts

0

14 months ago

Laurie • 0

Dear Biostars,

I'm working on RNAseq data. I have rlog-transformed counts (from the rlog transformation in the DESeq2 package). I want to use them as input for creating heatmaps and for subsequent clustering to identify potential functional groups of transcripts.

I'm wondering whether these rlog-transformed counts should be scaled before attempting any clustering, either hierarchical or k-means. Or should they be used as is? I can't really decide... So any insight would be hugely appreciated!

Thanks a lot!

scale rlog RNAseq clustering DESeq2 • 725 views

ADD COMMENT • link 14 months ago by Laurie • 0

score 2 · Accepted Answer · 2023-02-01

2

14 months ago

ATpoint 81k

See the related post: "Scaling RNA-Seq data before clustering?"

ADD COMMENT • link 14 months ago by ATpoint 81k

0

The rlog transformation is similar to a log2 transformation for genes with high counts, while for genes with low counts it shrinks the values of the different samples toward each other. The data are already roughly homoskedastic after the rlog transformation.

So if scaling means transforming the data to the Z-scale ("deviation from the mean of all samples for that gene"), and the variances are already approximately the same, isn't it redundant?

I read this post before, and I'm surely missing something, which is why I'm asking here...

ADD REPLY • link 14 months ago by Laurie • 0

1

rlog and vst (and standard log2) still preserve expression-level differences, so you still have genes with large and genes with low counts. The Z-scale measures deviation from the gene's own mean, so the magnitude of the counts is eliminated. These are two different concepts.
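
To make the distinction concrete, here is a minimal sketch in plain Python. The gene names and values are made up for illustration and are not output from DESeq2. The two hypothetical genes have exactly the same variance across samples (so they are "homoskedastic") but very different expression levels. Without row-wise Z-scaling they end up far apart in Euclidean distance; after scaling, only their pattern across samples remains.

```python
from statistics import mean, stdev

# Hypothetical rlog-like values on the log2 scale (invented for illustration).
# Keys are genes, list positions are samples.
rlog = {
    "geneA_high": [12.0, 12.5, 13.0, 13.5],
    "geneB_low": [3.0, 3.5, 4.0, 4.5],
}


def zscore_row(values):
    """Center one gene on its own mean and divide by its sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Row-wise (per-gene) Z-scaling.
scaled = {gene: zscore_row(vals) for gene, vals in rlog.items()}

# Same variance in both genes, but very different expression level:
# the unscaled distance is dominated by the level difference.
dist_unscaled = euclidean(rlog["geneA_high"], rlog["geneB_low"])

# After scaling, both genes show the same pattern across samples,
# so the distance between them collapses.
dist_scaled = euclidean(scaled["geneA_high"], scaled["geneB_low"])

print("unscaled distance:", dist_unscaled)
print("scaled distance:", dist_scaled)
```

So equal variances do not make scaling redundant: whether to scale depends on whether genes should group by expression level or by pattern across samples. In R, the same row-wise scaling of a genes-by-samples matrix (here a hypothetical `mat`) is commonly written `t(scale(t(mat)))`.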

ADD REPLY • link 14 months ago by ATpoint 81k

0

OK, this is where I was completely lost.

Thanks for your help and answer.

ADD REPLY • link 14 months ago by Laurie • 0