Question: MatrixEQTL giving discrete p-value distribution for SNP-phenotype pairs
3.8 years ago, lc574 wrote:

Hey guys,

I am running MatrixEQTL on the TCGA BRCA exome sequencing dataset, which contains 83070 somatic SNPs extracted from the MAF files of 976 primary tumour cases. I am testing these SNPs for association with 22 phenotypes calculated from the gene expression matrix for the same 976 TCGA cases.

My SNP input to MatrixEQTL is an 83070 x 976 binary matrix, with a 1 if the sample carries the SNP at a given position and a 0 if it does not. My gene expression file is a 22 x 976 matrix of standardised phenotype values. I also include 6 covariates for each case. I am running the analysis in R.

The analysis returns a thousand or so associations, and several of them are extremely significant (p values around 10^-308). The issue is that a single phenotype will have, say, 30 SNPs associated with it, all with exactly the same p value. I looked into this further and realised it happens because only one sample carries the SNPs in question, so the p values have become discretised.

Is this association false, given that only 1 sample carries the series of SNPs associated with a given phenotype? My SNP matrix is very sparse, so the majority of SNPs are carried by at most one sample.
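To make the situation concrete, here is a minimal sketch in plain Python (not R, and not MatrixEQTL itself) on a made-up toy matrix. It counts carriers per SNP, groups SNPs whose 0/1 rows are identical, and keeps only SNPs with some minimum number of carriers. SNPs with identical rows enter the model as the same predictor, so with the same phenotype and covariates they get the same test statistic and p value. The function names, the toy data and the `min_carriers=3` threshold are all assumptions chosen for illustration, not a recommended cutoff.

```python
from collections import defaultdict


def carrier_counts(snp_rows):
    """Return the number of samples carrying each SNP (rows are 0/1 lists)."""
    return [sum(row) for row in snp_rows]


def group_identical_rows(snp_rows):
    """Map each distinct genotype pattern to the indices of SNPs sharing it."""
    groups = defaultdict(list)
    for idx, row in enumerate(snp_rows):
        groups[tuple(row)].append(idx)
    return dict(groups)


def filter_by_min_carriers(snp_rows, min_carriers):
    """Return indices of SNPs carried by at least `min_carriers` samples."""
    counts = carrier_counts(snp_rows)
    return [i for i, n in enumerate(counts) if n >= min_carriers]


# Hypothetical toy data: 5 SNPs (rows) x 6 samples (columns).
toy = [
    [1, 0, 0, 0, 0, 0],  # carried only by sample 0
    [1, 0, 0, 0, 0, 0],  # same single carrier, so an identical predictor
    [0, 0, 1, 0, 0, 0],  # carried only by sample 2
    [0, 1, 0, 1, 1, 0],  # carried by three samples
    [1, 0, 0, 0, 0, 0],  # again only sample 0
]

counts = carrier_counts(toy)
groups = group_identical_rows(toy)
kept = filter_by_min_carriers(toy, min_carriers=3)  # threshold is an assumption

print("carriers per SNP:", counts)
print("SNPs sharing the sample-0-only pattern:", groups[(1, 0, 0, 0, 0, 0)])
print("SNPs kept after filtering:", kept)
```

On the real 83070 x 976 matrix, the same grouping would show how many of the tied p values come from SNPs that are simply copies of one another, and the carrier counts would show how few SNPs survive any given minimum-carrier threshold.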

Thanks for your help.

Tags: gwas, genomics, next-gen, R, matrixeqtl