GTEx eQTL data
4.1 years ago
jt ▴ 20

Hi!

Can someone explain the difference between the *.egenes.txt.gz and *.signif_variant_gene_pairs.txt.gz files? They are from the GTEx v8 single-tissue eQTL data. I would also like to know what ma_count and ma_samples mean.

eQTL GTEx gene • 3.0k views
4.1 years ago

eGene and significant variant-gene associations based on permutations. The archive contains a *.egenes.txt.gz and *.signif_variant_gene_pairs.txt.gz file for each tissue. Note that the *.egenes.txt.gz files contain data for all genes tested; to obtain the list of eGenes, select the rows with 'qval' ≤ 0.05.

[source: https://gtexportal.org/home/datasets]
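As a minimal sketch of the filtering step described in the quote, the following reads an egenes file and keeps the rows with qval ≤ 0.05. It uses only the Python standard library. The 'qval' column name comes from the quote above; the 'gene_id' column and the file name in the usage comment are assumptions, so check them against the header of your own download.

```python
import csv
import gzip


def read_tsv(path):
    """Read a tab-separated file (gzip-compressed if it ends in .gz) into a list of dicts."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def to_float(value):
    """Convert a text field to float; unparsable or empty fields become NaN."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return float("nan")


def select_egenes(rows, qval_threshold=0.05):
    """Keep the rows whose 'qval' is at or below the threshold (NaN never passes)."""
    return [row for row in rows if to_float(row["qval"]) <= qval_threshold]


# Usage (hypothetical file name; replace with the tissue file you downloaded):
# rows = read_tsv("Some_Tissue.v8.egenes.txt.gz")
# egenes = select_egenes(rows)
# print(len(rows), "genes tested;", len(egenes), "eGenes")
```

Rows with a missing or non-numeric qval are dropped rather than raising an error, which is a design choice of this sketch and not something the GTEx documentation prescribes.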

How you use these files will depend on your downstream analyses. Take a look at the contents and you should be able to infer what each one represents.

----------------------

Regarding ma_count and ma_samples:

  • ma_samples: number of samples carrying the minor allele
  • ma_count: total number of minor alleles across individuals

[source: https://storage.googleapis.com/gtex_analysis_v8/single_tissue_qtl_data/README_eQTL_v8.txt]
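To make the two definitions concrete, here is a small sketch that computes both values from per-sample genotypes. The encoding (0, 1 or 2 copies of the ALT allele per sample, None for a missing call), the function name, and the maf calculation are assumptions made for illustration; they are not taken from the GTEx pipeline.

```python
def minor_allele_stats(alt_dosages):
    """Compute ma_samples, ma_count and maf from per-sample ALT allele counts.

    alt_dosages: iterable of 0, 1 or 2 (copies of the ALT allele), None if missing.
    """
    called = [d for d in alt_dosages if d is not None]
    total_alleles = 2 * len(called)
    alt_count = sum(called)

    # The minor allele is whichever allele is rarer in this set of samples.
    alt_is_minor = alt_count <= total_alleles - alt_count
    minor_dosages = called if alt_is_minor else [2 - d for d in called]

    ma_samples = sum(1 for d in minor_dosages if d > 0)  # carriers of the minor allele
    ma_count = sum(minor_dosages)                        # minor alleles in total
    maf = ma_count / total_alleles if total_alleles else float("nan")
    return {"ma_samples": ma_samples, "ma_count": ma_count, "maf": maf}


# Six samples: three carry the minor allele, one of them is homozygous.
print(minor_allele_stats([0, 1, 2, 0, 1, 0]))
```

A homozygous carrier adds one to ma_samples but two to ma_count, which is why ma_count can exceed ma_samples.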

Kevin


I have noticed that the *.signif_variant_gene_pairs file contains all the genes from the *.egenes file that have qval ≤ 0.05. But I do not understand why the *.signif_variant_gene_pairs file lists many variants for the same gene, when the *.egenes file has only one variant per gene.


I understand. I think that the egenes file is merely a sort of 'annotation reference', and that the *.signif_variant_gene_pairs.txt.gz file is the main one that you should use.

Look at it another way: if you filter the egenes file for qval ≤ 0.05, you arrive at the list of genes that have at least one statistically significant association. You then have to look in the other file to find the variants that make up those significant associations for each gene.
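The two-step lookup described above can be sketched as follows, working on rows that have already been read into dictionaries (for example with csv.DictReader). The column names 'gene_id', 'variant_id' and 'qval' are assumptions based on the v8 README, and the function name is made up for this example.

```python
from collections import defaultdict


def variants_per_egene(egene_rows, pair_rows, qval_threshold=0.05):
    """Map each eGene to the variant listed in the egenes file and to every
    variant listed for that gene in the signif_variant_gene_pairs file."""
    # Step 1: genes with at least one significant association.
    egenes = {
        row["gene_id"]: row["variant_id"]
        for row in egene_rows
        if float(row["qval"]) <= qval_threshold
    }

    # Step 2: collect all significant variants for those genes.
    significant = defaultdict(list)
    for row in pair_rows:
        if row["gene_id"] in egenes:
            significant[row["gene_id"]].append(row["variant_id"])

    return {
        gene: {
            "egenes_variant": variant,
            "significant_variants": significant.get(gene, []),
        }
        for gene, variant in egenes.items()
    }


# Toy rows with made-up identifiers, only to show the shape of the result.
egene_rows = [
    {"gene_id": "GENE_A", "variant_id": "var_1", "qval": "0.001"},
    {"gene_id": "GENE_B", "variant_id": "var_9", "qval": "0.5"},
]
pair_rows = [
    {"gene_id": "GENE_A", "variant_id": "var_1"},
    {"gene_id": "GENE_A", "variant_id": "var_2"},
]
print(variants_per_egene(egene_rows, pair_rows))
```

In this toy input GENE_A has one row in the egenes table but two rows in the pairs table, which mirrors the one-to-many pattern asked about in the comment above.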

You should confirm with the website, though.

2.9 years ago
Shicheng Guo ★ 9.4k

I noticed several different beta values for the eQTL between ENSG00000008128 and rs28544273. How should I interpret this? Thanks.

molecular_trait_id  chromosome  position    ref alt variant ma_samples  maf pvalue  beta    se  type    ac  an  r2  molecular_trait_object_id   gene_id median_tpm  rsid
ENSG00000008128.grp_1.contained.ENST00000356200 1   815963  T   A   chr1_815963_T_A 163 0.144112    0.84493 0.0240479   0.122892    SNP 178 1138    0.45909 ENSG00000008128.contained   ENSG00000008128 5.626   rs28544273
ENSG00000008128.grp_1.contained.ENST00000356937 1   815963  T   A   chr1_815963_T_A 163 0.144112    0.777514    -0.0337367  0.119338    SNP 178 1138    0.45909 ENSG00000008128.contained   ENSG00000008128 5.626   rs28544273
ENSG00000008128.grp_1.contained.ENST00000358779 1   815963  T   A   chr1_815963_T_A 163 0.144112    0.135332    0.178658    0.119457    SNP 178 1138    0.45909 ENSG00000008128.contained   ENSG00000008128 5.626   rs28544273
ENSG00000008128.grp_1.contained.ENST00000378633 1   815963  T   A   chr1_815963_T_A 163 0.144112    0.355045    -0.114477   0.123676    SNP 178 1138    0.45909 ENSG00000008128.contained   ENSG00000008128 5.626   rs28544273

Maybe it is caused by the different ENST transcripts?
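One way to check this guess is to group the rows by gene and variant and list which molecular_trait_id each beta belongs to. The sketch below uses a subset of the columns from the four rows pasted above. Taking the text after the last dot of molecular_trait_id as the transcript ID is an assumption about the ID format.

```python
import csv
import io
from collections import defaultdict

# Subset of the columns from the rows shown above (tab-separated).
TABLE = (
    "molecular_trait_id\tgene_id\trsid\tpvalue\tbeta\n"
    "ENSG00000008128.grp_1.contained.ENST00000356200\tENSG00000008128\trs28544273\t0.84493\t0.0240479\n"
    "ENSG00000008128.grp_1.contained.ENST00000356937\tENSG00000008128\trs28544273\t0.777514\t-0.0337367\n"
    "ENSG00000008128.grp_1.contained.ENST00000358779\tENSG00000008128\trs28544273\t0.135332\t0.178658\n"
    "ENSG00000008128.grp_1.contained.ENST00000378633\tENSG00000008128\trs28544273\t0.355045\t-0.114477\n"
)


def betas_by_pair(rows):
    """Group (transcript, beta) tuples by (gene_id, rsid)."""
    grouped = defaultdict(list)
    for row in rows:
        # Assumed ID layout: the transcript is the last dot-separated field.
        transcript = row["molecular_trait_id"].rsplit(".", 1)[-1]
        grouped[(row["gene_id"], row["rsid"])].append((transcript, float(row["beta"])))
    return dict(grouped)


rows = list(csv.DictReader(io.StringIO(TABLE), delimiter="\t"))
for pair, entries in betas_by_pair(rows).items():
    print(pair)
    for transcript, beta in entries:
        print("   ", transcript, beta)
```

In these four rows every beta is attached to a different molecular_trait_id, so each row is a separate test of the same variant against a different molecular trait of the same gene.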
