Gene expression PCA with DESeq2 filtration
6.5 years ago
Picasa ▴ 640

Hi,

I would like to make a simple PCA at gene level of my RNA samples.

I am trying to follow this tutorial:

http://bioconductor.org/help/workflows/rnaseqGene/#pre-filtering-the-dataset

From what I understand, I need to filter out lowly expressed genes because they carry no information:

http://bioconductor.org/help/workflows/rnaseqGene/#pre-filtering-the-dataset

In this workflow, the method seems to be removing genes based on the sum of their counts across samples. But I remember reading somewhere that it is better to filter on CPM (counts per million) values rather than raw counts, because CPM accounts for differences in sequencing depth between samples.
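To make the difference concrete, here is a minimal Python sketch of how I understand the two filters. The counts, library sizes, and thresholds (`min_total`, `min_cpm`, `min_samples`) are all made up for illustration and are not taken from the workflow.

```python
# Toy counts: rows are genes, columns are samples (values are made up).
# Sample 3 is assumed to be sequenced much more deeply than samples 1 and 2.
counts = {
    "geneA": [2, 3, 40],
    "geneB": [0, 1, 12],
    "geneC": [150, 180, 2100],
    "geneD": [6, 5, 4],
}
library_sizes = [1_000_000, 1_200_000, 15_000_000]  # assumed total reads per sample


def filter_by_count_sum(counts, min_total=10):
    """Keep genes whose raw counts, summed over all samples, reach min_total."""
    return [gene for gene, row in counts.items() if sum(row) >= min_total]


def cpm(row, library_sizes):
    """Counts per million: scale each count by its sample's library size."""
    return [c / n * 1e6 for c, n in zip(row, library_sizes)]


def filter_by_cpm(counts, library_sizes, min_cpm=1.0, min_samples=2):
    """Keep genes with CPM >= min_cpm in at least min_samples samples."""
    kept = []
    for gene, row in counts.items():
        n_pass = sum(value >= min_cpm for value in cpm(row, library_sizes))
        if n_pass >= min_samples:
            kept.append(gene)
    return kept


kept_by_sum = filter_by_count_sum(counts)
kept_by_cpm = filter_by_cpm(counts, library_sizes)
print("count-sum filter keeps:", kept_by_sum)
print("CPM filter keeps:      ", kept_by_cpm)
```

In this toy example the count-sum filter keeps geneB only because of the deeply sequenced third sample, whereas the CPM filter drops it.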

I am a bit confused now. Do you have any advice?

Thanks

PCA rna filtration • 2.5k views
6.5 years ago

Hello,

This has been well answered already.

It is not necessary to filter low-variance transcripts prior to performing PCA. The default PCA function in the DESeq2 package (`plotPCA`) does perform this filtering step, keeping only the most variable genes (500 by default, via its `ntop` argument), but it's really not necessary. Then again, it's also perfectly fine to do this type of filtering; you just have to be aware of it and report it when you do.

As PCA is fundamentally based on variance, you may get very different results in the PC1 versus PC2 bi-plot between filtered and unfiltered data.
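As a rough illustration of this kind of variance-based selection, here is a minimal Python sketch on made-up, log-scale values. The parameter name `ntop` mirrors the argument of DESeq2's `plotPCA`, but the code is a standalone toy and not DESeq2's implementation; the helper `share_of_total_variance` is hypothetical and only there to show how much of the summed per-gene variance the selected genes carry.

```python
from statistics import variance

# Toy log-scale expression values: rows are genes, columns are samples (made up).
expr = {
    "gene1": [5.0, 5.1, 4.9, 5.0],  # almost flat across samples
    "gene2": [2.0, 8.0, 2.5, 7.5],  # strongly variable
    "gene3": [6.0, 6.5, 5.5, 6.2],  # mildly variable
    "gene4": [1.0, 1.0, 9.0, 9.0],  # strongly variable
}


def top_variable_genes(expr, ntop):
    """Return the ntop gene names with the highest sample variance."""
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:ntop]


def share_of_total_variance(expr, genes):
    """Fraction of the summed per-gene variance carried by the given genes."""
    total = sum(variance(row) for row in expr.values())
    return sum(variance(expr[gene]) for gene in genes) / total


selected = top_variable_genes(expr, ntop=2)
share = share_of_total_variance(expr, selected)
print("selected genes:", selected)
print("share of total variance:", round(share, 3))
```

Whether the selected subset and the full gene set give similar-looking PCA plots depends on how much variance sits in the genes that were dropped, which is why it matters to report the filtering.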

It is best to perform the PCA on regularised log (rlog) counts, or on other variance-stabilised, approximately normally distributed data, rather than on raw counts.
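To show why a log-type transformation matters for a variance-based method, here is a small Python sketch with made-up counts. It uses a plain `log2(count + 1)` as a simple stand-in; the regularised log in DESeq2 is more involved and is not reproduced here.

```python
from math import log2
from statistics import variance

# Two made-up genes with the same relative pattern across four samples,
# but expressed at very different levels.
raw = {
    "low_gene": [8, 12, 10, 14],
    "high_gene": [8000, 12000, 10000, 14000],
}


def log2p1(row):
    """Simple log2(count + 1) transform (a stand-in, not DESeq2's rlog)."""
    return [log2(c + 1) for c in row]


raw_var = {gene: variance(row) for gene, row in raw.items()}
log_var = {gene: variance(log2p1(row)) for gene, row in raw.items()}

# Ratio of variances between the highly and the lowly expressed gene.
raw_ratio = raw_var["high_gene"] / raw_var["low_gene"]
log_ratio = log_var["high_gene"] / log_var["low_gene"]
print("variance ratio on raw counts:", raw_ratio)
print("variance ratio on log2(count + 1):", round(log_ratio, 2))
```

On the raw scale the highly expressed gene would dominate the principal components purely because of its magnitude; after the log transform the two genes contribute variance of a similar order.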

Kevin
