Question: QQ plots for eqtl data
jamespower40 wrote (20 months ago):

Hi,

What would you expect from a QQ plot of associations between SNPs and gene expression across a whole chromosome? It should be extremely inflated, but does anyone have an idea of how inflated? For example, on chromosome 22, out of ~2 million SNPs analyzed, about 200,000 have a p-value < 0.05. Is this expected?
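As a rough baseline for this question: if every test were truly null and the p-values were uniform, the expected number of tests below a threshold is simply the number of tests times the threshold, so about 100,000 of ~2 million tests would fall below 0.05 by chance. The sketch below does that arithmetic in Python. The function names are made up for illustration, the observed count is left as an input, and the calculation ignores the fact that neighbouring SNPs are correlated through LD.

```python
def expected_hits(n_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha if every null hypothesis is true."""
    return n_tests * alpha


def enrichment(observed_hits: int, n_tests: int, alpha: float) -> float:
    """Ratio of observed to expected hits at alpha (1.0 means no excess)."""
    return observed_hits / expected_hits(n_tests, alpha)


# Numbers taken from the post: ~2 million SNPs, threshold 0.05.
n_snps = 2_000_000
alpha = 0.05
baseline = expected_hits(n_snps, alpha)  # roughly 100,000 under the null
```

Calling `enrichment(observed, n_snps, alpha)` with the observed count then gives the fold excess over chance at that threshold.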

Thank you for any insight!

eqtl matrixeqtl • 1.2k views
ADD COMMENTlink modified 20 months ago • written 20 months ago by jamespower40

...sample size?; ...number of cases?; ...number of controls?; ...any covariate adjustments?; ...any pre-filtering of SNPs?

Thank you!

ADD REPLYlink modified 20 months ago • written 20 months ago by Kevin Blighe52k

Hi Kevin, thanks for the reply. The sample size is 1,000 and the trait is quantitative (gene expression). Assuming all pre-filters and adjustments are correct (filtering follows the regular thresholds, using Hardy-Weinberg and keeping common SNPs only), is that what we would expect?

ADD REPLYlink modified 20 months ago • written 20 months ago by jamespower40

That's a large enough dataset. I believe that any chromosome will show a skewed QQ plot if it is not well covered by markers, as is usually the case for chr22 (I believe). Which MAF threshold did you use for filtering?

ADD REPLYlink written 20 months ago by Kevin Blighe52k

MAF is >5% and these numbers and inflated QQ plots are similar across chromosomes actually...

ADD REPLYlink written 20 months ago by jamespower40

Okay, could be related to your disease-alleles. Would help if you shared [some of] your QQ plots. You can do this via ImgBB, for example, by uploading and then pasting the HTML URL here.

ADD REPLYlink written 20 months ago by Kevin Blighe52k

Thank you, here it is... [Image: QQ plot for chr22] Note that I don't have disease alleles; this is an association with gene expression... Thank you for any feedback.

ADD REPLYlink modified 20 months ago by genomax75k • written 20 months ago by jamespower40

Wait, thanks for giving the extra reminder at the very bottom, i.e., that these are expression quantitative trait loci (eQTL) p-values. Those are not expected to follow the quantile distribution that one would typically expect from GWAS. In summary: I do not believe that you need to worry too much about this. Please take a look at other literature where QQ plots have been generated from eQTL p-values.
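For comparing against published eQTL QQ plots, it can help to compute the plot coordinates and a summary of the inflation directly from the p-values. The following is a minimal sketch using only the Python standard library, not the output of any particular eQTL package. The function names are hypothetical, p-values are assumed to lie in (0, 1], and the lambda shown is the usual median-based genomic inflation factor for 1-df tests, which is a GWAS convention and may not be the most informative summary for eQTL data.

```python
import math
from statistics import NormalDist, median


def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs, both sorted ascending."""
    n = len(pvalues)
    observed = sorted(-math.log10(p) for p in pvalues)
    # Expected quantiles of n uniform p-values: (i - 0.5) / n for i = 1..n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


def genomic_inflation(pvalues):
    """Median observed 1-df chi-square statistic divided by the null median."""
    norm = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square as z**2 with z = Phi^-1(p / 2).
    chi2 = [norm.inv_cdf(p / 2) ** 2 for p in pvalues]
    null_median = norm.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / null_median
```

A lambda near 1 means the bulk of the distribution looks null, while values well above 1 indicate that even the middle of the distribution is shifted towards small p-values.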

ADD REPLYlink written 20 months ago by Kevin Blighe52k

Great, thank you Kevin, that makes sense.

ADD REPLYlink written 20 months ago by jamespower40

I agree with Kevin. However, may I ask which software you used for the eQTL analysis? In my experience, when using matrixEQTL I observed a lot of significant results (with very low p-values) when one of the three genotypic classes is rare (and that was AFTER filtering for rare alleles). Some of those might be false positives.
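A quick calculation shows why a rare genotype class can survive an allele-frequency filter. Under Hardy-Weinberg proportions, a SNP with minor allele frequency q in n samples is expected to have n·q² minor homozygotes. The sketch below works this out in Python for the figures mentioned in this thread (1,000 samples, MAF of 5%); the function name is made up for illustration.

```python
def hwe_expected_counts(n_samples: int, maf: float):
    """Expected (major homozygote, heterozygote, minor homozygote) counts
    under Hardy-Weinberg proportions."""
    q = maf
    p = 1.0 - q
    return (n_samples * p * p, n_samples * 2 * p * q, n_samples * q * q)


# Figures from this thread: 1,000 samples, SNPs kept at MAF > 5%.
hom_major, het, hom_minor = hwe_expected_counts(1000, 0.05)
# hom_minor is about 2.5, i.e. only two or three samples in the rarest class.
```

So at the filtering boundary, the rarest genotype class is expected to contain only a handful of samples, and a couple of extreme expression values among them could drive a very small p-value.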

ADD REPLYlink written 20 months ago by Fabio Marroni2.4k

Hi Fabio, thanks for your feedback! I have used matrixEQTL. That is worrying indeed, especially since it happens even after filtering out rare SNPs before the association analysis... Do you know of any way to filter those cases further, or of other software that controls for this better? (Or did you just resort to tabulating the genotypes for each associated SNP and removing those cases?)

ADD REPLYlink written 20 months ago by jamespower40

The last option :-( I actually remove SNPs for which I observe a rare genotype BEFORE the analysis, but after a preliminary check to get an idea of "suspect results". For example, the one in the figure is, in my opinion, suspect (-1 is missing data; 0, 1 and 2 are the three possible genotypes).

[Image not available]

But you can easily find even worse cases if you look at your 10^-200 p-values! I was considering doing a first pass with matrixEQTL, selecting only a subset of genes/SNPs, and then analyzing that subset with EMMAX (which is presumably more robust, but of course will be slower). For the moment, I am happy with filtering, but I am still in the exploratory phase.
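The genotype-tabulation filter described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the code I actually run: genotypes are coded 0/1/2 with -1 for missing as in the figure, the threshold `min_count=5` is an arbitrary placeholder, the SNP identifiers are invented, and a class with zero samples is not treated as "rare" because it cannot contribute outliers.

```python
from collections import Counter

MISSING = -1  # missing-data code, as in the figure


def genotype_counts(genotypes):
    """Count samples per genotype class (0, 1, 2), ignoring missing calls."""
    counts = Counter(g for g in genotypes if g != MISSING)
    return {g: counts.get(g, 0) for g in (0, 1, 2)}


def has_rare_class(genotypes, min_count=5):
    """True if any observed genotype class has fewer than min_count samples.
    min_count is a hypothetical threshold, to be tuned to the data."""
    return any(0 < c < min_count for c in genotype_counts(genotypes).values())


def filter_snps(genotype_matrix, min_count=5):
    """Keep only SNPs without a rare genotype class.
    genotype_matrix maps a SNP id to its list of genotype codes."""
    return {
        snp: calls
        for snp, calls in genotype_matrix.items()
        if not has_rare_class(calls, min_count)
    }


# Invented example: snp_b has only 2 samples in class 2 and is dropped.
example = {
    "snp_a": [0] * 50 + [1] * 30 + [2] * 20,
    "snp_b": [0] * 90 + [1] * 8 + [2] * 2 + [MISSING] * 3,
}
kept = filter_snps(example)
```

Running the filter before the association step keeps the suspect SNPs out of the results altogether, at the cost of losing any real signal they might carry.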

ADD REPLYlink modified 20 months ago • written 20 months ago by Fabio Marroni2.4k

Thank you Fabio for helping. I know that you have been doing a lot of GWAS work recently. Please continue the discussion! :)

ADD REPLYlink written 20 months ago by Kevin Blighe52k

I am just struggling to learn, actually!

ADD REPLYlink written 20 months ago by Fabio Marroni2.4k

All right!

ADD REPLYlink written 20 months ago by Kevin Blighe52k