Question: Getting rid of noise in gene expression
2.4 years ago by
ghunt10 wrote:

Hi, I am working with a data set containing gene expression values for cancer patients, and I have been told that the data may be noisy. The expression values range from 0 to 20, and there are close to 2000 patients. There are close to 50K gene expression values, each with an Illumina ID.

What would be the best way to filter out the noise caused by errors in the Illumina sequencing technique? Is there a general technique for removing noise?


noise-removal genome • 853 views
modified 2.4 years ago by informatics bot560 • written 2.4 years ago by ghunt10

If the data contain values from 0 to 20 and an "illumina id", it is not sequencing data. It is most likely microarray data.

written 2.4 years ago by igor7.1k
2.4 years ago by
United States
informatics bot560 wrote:

There are many ways to reduce noise in RNA-seq gene expression data. I personally have found the following approach useful when dealing with heterogeneous tissue and >100 samples.

1.) Remove genes with low gene expression.

2.) Remove samples that lack adequate sequencing depth (my lab usually requires at least 8 million mapped reads per sample).

3.) Remove outlier samples, judged by how many standard deviations they lie from the mean on a PCA/MDS plot.

4.) Use R packages such as PEER and sva/ComBat to remove batch effects from the data.

5.) Profile your data with tools such as WGCNA, and see whether any individual samples are driving nonsensical modules that don't relate to biology.
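
Steps 1 and 3 above can be sketched roughly as follows. This is only a minimal illustration, not the pipeline used by the poster: the function names, the thresholds (`min_value`, `min_fraction`, `n_sd`), and the choice to look at the first principal component only are all assumptions made for the example. It uses pure Python with a simple power iteration in place of a full PCA, and it assumes the data are held as a mapping from gene/probe ID to a list of per-sample values.

```python
import math
import random
from statistics import mean, pstdev


def filter_low_expression(expr, min_value=1.0, min_fraction=0.2):
    """Keep genes at or above min_value in at least min_fraction of samples.

    expr maps a gene/probe ID to a list of per-sample expression values.
    Both thresholds are illustrative assumptions, not recommended defaults.
    """
    kept = {}
    for gene, values in expr.items():
        n_expressed = sum(1 for v in values if v >= min_value)
        if n_expressed / len(values) >= min_fraction:
            kept[gene] = values
    return kept


def first_pc_scores(expr, n_iter=200, seed=0):
    """Return one score per sample on the first principal component.

    Uses power iteration on the sample-by-sample matrix X^T X, where X is
    the gene-by-sample matrix with each gene centered across samples.
    """
    centered = []
    for values in expr.values():
        mu = mean(values)
        centered.append([v - mu for v in values])
    n_samples = len(centered[0])
    rng = random.Random(seed)
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iter):
        # loadings = X @ scores (one value per gene)
        loadings = [sum(x * s for x, s in zip(row, scores)) for row in centered]
        # updated = X^T @ loadings (one value per sample)
        updated = [
            sum(row[j] * w for row, w in zip(centered, loadings))
            for j in range(n_samples)
        ]
        norm = math.sqrt(sum(v * v for v in updated))
        if norm == 0.0:  # no variance left after centering
            return [0.0] * n_samples
        scores = [v / norm for v in updated]
    return scores


def flag_outliers(scores, sample_ids, n_sd=3.0):
    """Return IDs of samples more than n_sd standard deviations from the mean score."""
    mu, sd = mean(scores), pstdev(scores)
    if sd == 0.0:
        return []
    return [sid for sid, s in zip(sample_ids, scores) if abs(s - mu) > n_sd * sd]
```

A typical call would be `kept = filter_low_expression(expr)` followed by `flag_outliers(first_pc_scores(kept), sample_ids)`. On a matrix the size described in the question (about 50K values across roughly 2000 patients), a numerical library would be far more practical than these pure-Python loops, and flagged samples are worth inspecting before they are dropped.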

modified 2.4 years ago • written 2.4 years ago by informatics bot560