Question: Getting rid of noise in gene expression
23 months ago by
ghunt10 wrote:

Hi, I am working with a data set containing gene expression values for cancer patients, and I have been told that the data can be noisy. The expression values range from 0 to 20, and there are close to 2,000 patients. Each patient has close to 50K expression values, each keyed by an Illumina ID.

What would be the best way to filter out noise caused by errors in the Illumina sequencing technique? Is there a general technique for removing noise?


noise-removal genome • 684 views
modified 23 months ago by informatics bot530 • written 23 months ago by ghunt10

If the data contains values from 0 to 20 and an "Illumina ID", it is not sequencing data. It is most likely microarray data.

written 23 months ago by igor6.2k
23 months ago by
United States
informatics bot530 wrote:

There are many ways to reduce noise in RNA-seq gene expression data. I have personally found the following approach useful when dealing with heterogeneous tissue and more than 100 samples.

1.) Remove genes with low gene expression.

2.) Remove samples that lack adequate sequencing depth (my lab usually requires at least 8 million mapped reads per sample).

3.) Remove outlier samples based on how many standard deviations they lie from the mean on a PCA/MDS plot.

4.) Use R packages such as PEER and sva (which includes ComBat) to remove batch effects from the data.

5.) Profile your data with tools such as WGCNA, and check whether any individual samples are driving nonsensical modules that don't relate to biology.
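
As a rough illustration of steps 1 and 3 only, here is a minimal NumPy sketch. It is not the code I use in my lab: the function names, the genes-by-samples matrix layout, and every threshold (`min_value`, `min_fraction`, `n_sd`) are assumptions chosen for the example, and sensible cutoffs depend on your platform and normalization. PCA is done with a plain SVD of the centered matrix.

```python
import numpy as np


def filter_low_expression(expr, min_value=1.0, min_fraction=0.1):
    """Step 1: keep genes expressed above `min_value` in at least
    `min_fraction` of samples. `expr` is a genes x samples array.
    Both thresholds are illustrative assumptions, not recommendations."""
    expr = np.asarray(expr, dtype=float)
    keep = (expr > min_value).mean(axis=1) >= min_fraction
    return expr[keep], keep


def pca_outlier_samples(expr, n_components=2, n_sd=3.0):
    """Step 3: flag samples lying more than `n_sd` standard deviations
    from the mean on any of the first `n_components` principal components.
    Returns a boolean array with one entry per sample."""
    expr = np.asarray(expr, dtype=float)
    samples = expr.T                              # samples x genes
    centered = samples - samples.mean(axis=0)     # center each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # PC scores per sample
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    return np.any(np.abs(z) > n_sd, axis=1)


if __name__ == "__main__":
    # Synthetic stand-in for a real expression matrix (200 genes, 50 samples).
    rng = np.random.default_rng(0)
    expr = rng.normal(8.0, 1.0, size=(200, 50))
    filtered, kept = filter_low_expression(expr)
    flagged = pca_outlier_samples(filtered)
    print(filtered.shape, int(flagged.sum()))
```

Filter genes first and then look for outlier samples on the filtered matrix. Flagged samples are worth inspecting by hand before you drop them, since a sample can be extreme for real biological reasons.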

modified 23 months ago • written 23 months ago by informatics bot530