Question: Should zero expression values be removed in eQTL analysis?
2.1 years ago, whtopjazz wrote:

For example, the RNA-seq expression values for gene1 in 10 people are GENE1=[0, 0, 1, 2, 3, 4, 3, 4, 2, 7]. SNP1 has alleles A and G, and its genotypes in the same 10 people are SNP1=[0, 1, 2, 1, 1, 2, 2, 2, 0, 0], where 0 means GG, 1 means AG, and 2 means AA.

What I want to do is eQTL analysis. Simply put, I want to fit a linear model to find out whether the expression of GENE1 is regulated by SNP1. Should I remove the zero values from the GENE1 expression values before fitting the regression model? Note that for many genes, if I removed the zeros, most of the samples would also be removed.

snp rna-seq
2.1 years ago, Fabio Marroni wrote:


I would only remove genes that have zero expression in a very large proportion of samples.
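A gene-level filter of this kind could be sketched as follows. The helper names, the 0.8 cutoff, and the GENE2 values are assumptions made up for illustration; the answer does not state a specific threshold, so the cutoff should be chosen for your own data set.

```python
def zero_fraction(values):
    """Fraction of samples whose expression is exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def drop_mostly_zero_genes(expression_by_gene, max_zero_fraction=0.8):
    """Keep genes whose share of zero samples does not exceed the cutoff.

    The 0.8 default is an arbitrary placeholder, not a recommended value.
    """
    return {
        gene: values
        for gene, values in expression_by_gene.items()
        if zero_fraction(values) <= max_zero_fraction
    }


expression_by_gene = {
    # From the question: 2 of 10 samples are zero.
    "GENE1": [0, 0, 1, 2, 3, 4, 3, 4, 2, 7],
    # Made-up gene for illustration: 9 of 10 samples are zero.
    "GENE2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
}

kept = drop_mostly_zero_genes(expression_by_gene)
print(sorted(kept))
```

Note that the filter removes whole genes and leaves the zero values of retained genes such as GENE1 in place, so no samples are lost.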

More generally, you might want to filter out genes with low variance across samples (see e.g. this paper, in the eQTL mapping paragraph of the Materials and Methods section), since such genes are not informative for the analysis.
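As a rough sketch of a variance filter, assuming expression is stored as one list of values per gene: the function name `drop_low_variance_genes`, the 0.5 cutoff, and the GENE3 values are hypothetical, and the paper referred to above may define its filter differently.

```python
from statistics import variance


def drop_low_variance_genes(expression_by_gene, min_variance=0.5):
    """Keep genes whose sample variance across individuals meets the cutoff.

    The 0.5 default is an arbitrary placeholder, not a recommended value.
    """
    return {
        gene: values
        for gene, values in expression_by_gene.items()
        if variance(values) >= min_variance
    }


expression_by_gene = {
    # From the question: expression varies widely across the 10 samples.
    "GENE1": [0, 0, 1, 2, 3, 4, 3, 4, 2, 7],
    # Made-up gene for illustration: almost constant across samples.
    "GENE3": [3, 3, 3, 3, 3, 3, 3, 3, 3, 4],
}

informative = drop_low_variance_genes(expression_by_gene)
print(sorted(informative))
```

In practice the cutoff is often set relative to the distribution of variances across all genes rather than as a fixed number.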
