Calculate p-value and associated z-score for a SNP-gene pair
10 months ago
rheab1230 ▴ 140

Hello everyone, I have a genotype file and a gene expression file. I want to see whether three of the rsIDs in my genotype file are associated with gene expression. I used the MatrixEQTL R package to generate p-values and check whether the selected SNPs are associated with the gene's expression. I used this code:

me = Matrix_eQTL_engine(
  snps = snps,
  gene = gene,
  cvrt = cvrt,
  output_file_name = output_file_name,
  pvOutputThreshold = pvOutputThreshold,
  useModel = useModel,
  errorCovariance = errorCovariance,
  verbose = TRUE,
  pvalue.hist = TRUE,
  min.pv.by.genesnp = FALSE,
  noFDRsaveMemory = FALSE)


unlink(output_file_name)
show(me$all$eqtls)
Output:
Processing covariates
Task finished in 0.004 seconds
Processing gene expression data (imputation, residualization)
Task finished in 0.004 seconds
Creating output file(s)
Task finished in 0.022 seconds
Performing eQTL analysis
100.00% done, 0 eQTLs
No significant associations were found.

I looked up the SNPs, and they have an effect on this gene in the GTEx models, but the MatrixEQTL analysis doesn't show any association.

I don't want to run a full eQTL analysis; I just want to generate the p-value and associated z-score for each SNP-gene pair. Is there any software or tool to do that?

snp association gene pvalue • 1.1k views
10 months ago
LChart 3.9k

The simplest thing to do would be to 'force' MatrixEQTL to output everything by setting pvOutputThreshold to 1. This should give you the association statistics between all SNPs in snps and all genes in gene, saved to the file named by output_file_name (and, with noFDRsaveMemory = FALSE as in your call, also returned in me$all$eqtls).


Thank you so much for this. Yes, in this case I get all the SNPs. I also have one more question:

SNP         gene                beta               t-stat            p-value              FDR
rs12160750  ENSG00000100116.16  0.191086848260097  2.78574185437024  0.00555828233127971  0.286529082199489
rs2285177   ENSG00000100116.16  0.187323375911886  2.72992752554224  0.00657473357007616  0.286529082199489

In these results, a p-value below 0.05 means the SNP-gene pair is significant. What do the other columns, such as t-stat and FDR, mean? Thank you.


Beta is the effect size (for rs12160750, going from 0 alleles to 1 allele raises expression by about 0.191 on average). The t-stat is the "z-statistic" you asked for in your original post; strictly speaking it is a t-statistic rather than a z-statistic, though for high degrees of freedom the two distributions are practically indistinguishable. FDR is the p-value adjusted for multiple testing using the false discovery rate method, and this is the column you should use to determine significance, rather than p-value.
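To make those columns concrete, here is a minimal base-R sketch of the same kind of per-pair linear regression, fitted with lm() on simulated data. The variable names (dosage, expression) and the simulated effect size are illustrative assumptions, not the data from this thread, and covariates are omitted for brevity.

```r
# Simulated data for one SNP-gene pair (placeholders, not the thread's data).
set.seed(42)
n <- 200
dosage <- rbinom(n, size = 2, prob = 0.3)   # 0, 1 or 2 copies of the allele
expression <- 0.2 * dosage + rnorm(n)       # assumed true effect of 0.2 per copy

# Additive linear model: expression ~ allele dosage.
fit <- lm(expression ~ dosage)
coef_row <- summary(fit)$coefficients["dosage", ]

beta   <- coef_row[["Estimate"]]    # effect size per additional allele copy
t_stat <- coef_row[["t value"]]     # t-statistic (beta / standard error)
p_t    <- coef_row[["Pr(>|t|)"]]    # two-sided p-value from the t distribution

# Treating the same statistic as a z-score gives a slightly smaller p-value;
# the gap shrinks as the residual degrees of freedom grow.
df_resid <- fit$df.residual
p_z <- 2 * pnorm(-abs(t_stat))

print(c(beta = beta, t_stat = t_stat, df = df_resid, p_t = p_t, p_z = p_z))

# Normal-approximation p-value for the t-stat reported above for rs12160750.
print(2 * pnorm(-abs(2.78574185437024)))
```

With a few hundred samples the t-based and z-based p-values agree closely, which is why the t-stat column can stand in for a z-score in practice.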


Okay, I got it. Thank you so much for the response. Can you tell me what FDR cutoff should be used to call a result significant?


I think most researchers would agree that FDR > 0.05 is not significant, though some would argue that an FDR between 0.05 and 0.1 is OK for an initial finding.

An informal survey I performed of bioinformatics post-docs had about 40% saying FDR < 0.05 is significant, 40% saying FDR < 0.01 is significant, and 20% were convinced by a then-recent article arguing for FDR < 0.005. All of these positions are defensible.
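As a rough illustration of how the choice of cutoff changes the call, the sketch below adjusts a handful of made-up p-values with base R's p.adjust() and counts how many pass each threshold mentioned above. It assumes a Benjamini-Hochberg adjustment; the p-values are invented for the example and are not from this thread.

```r
# Made-up raw p-values for six SNP-gene pairs (illustrative only).
pvals <- c(0.0004, 0.003, 0.012, 0.04, 0.2, 0.6)

# Benjamini-Hochberg adjustment (assumed here as the FDR method).
fdr <- p.adjust(pvals, method = "BH")

# Count how many pairs pass each candidate FDR cutoff.
cutoffs <- c(0.1, 0.05, 0.01, 0.005)
n_pass <- sapply(cutoffs, function(cutoff) sum(fdr < cutoff))

print(data.frame(pvalue = pvals, FDR = fdr))
print(data.frame(cutoff = cutoffs, n_significant = n_pass))
```

Note that the adjusted values depend on how many tests went into the adjustment, so the same raw p-value can land on either side of a cutoff depending on how many SNP-gene pairs were tested.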


Thank you so much for the information.
