Assign Intergenic SNPs To Gene
11.0 years ago
Peixe ▴ 660

Hi,

This is a trivial question and I am surprised I could not find it anywhere else...

How can you map an intergenic SNP to a unique gene? I mean, is there a "canonical" good way to do it? Several options come to mind: the closest gene by distance, the gene whose first polymorphic position has the highest R² with the SNP, etc. But I would like to know whether there is any published precedent for doing this. GWAS usually report both flanking genes (upstream and downstream), but I would rather have a single hit.

Does anyone know a good way to do it?

snp mapping • 4.9k views

I don't have time to write a full answer now, but have a look at this paper: Habegger et al, 2012. VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.


In this recent paper (Raj et al., 2013, "Common Risk Alleles for Inflammatory Diseases Are Targets of Recent Positive Selection"), they assigned SNPs to the closest gene inside the LD block.
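A rough sketch of the "closest gene inside the LD block" rule, for illustration only. How the LD block is defined is not stated here, so the block boundaries are simply taken as given inputs; the function name `closest_gene_in_ld_block` and the tuple layout are hypothetical, and all coordinates are assumed to be on the same chromosome.

```python
def closest_gene_in_ld_block(snp_pos, block_start, block_end, genes):
    """Return (gene_name, distance) for the gene nearest to the SNP among
    genes overlapping the LD block, or None if no gene overlaps the block.

    genes: iterable of (name, start, end) tuples, start <= end.
    """
    best = None
    for name, start, end in genes:
        # Skip genes that do not overlap the LD block at all.
        if end < block_start or start > block_end:
            continue
        # Distance is 0 when the SNP lies within the gene boundaries.
        dist = max(start - snp_pos, snp_pos - end, 0)
        if best is None or dist < best[1]:
            best = (name, dist)
    return best


# Hypothetical example: GENE_C lies outside the block and is ignored.
genes = [("GENE_A", 100, 200), ("GENE_B", 900, 1000), ("GENE_C", 5000, 6000)]
print(closest_gene_in_ld_block(600, 150, 950, genes))
```

If no gene overlaps the block, the function returns `None`, which leaves the SNP unassigned rather than forcing a gene onto it.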


I know... & you know we are doing a journal club on it soon... ;) hehehe...

11.0 years ago

Intergenic SNPs are likely either to affect transcription (by mapping to a promoter or enhancer of a nearby gene) or to map within a currently undescribed gene (e.g., a novel lncRNA). One approach is to check whether the SNP, or a SNP in LD with it, is an eQTL for a nearby gene. These data (from the Genevar and UChicago eQTL tools) are mostly for protein-coding genes. If you are far from such genes, you may be in an enhancer, and then it is harder to discern function and assign a relationship to a gene. There are enhancer tools and databases, but evidence linking an enhancer to the gene it controls is more difficult to obtain. Lastly, some non-coding RNAs show sequence conservation across species, but not always, and not necessarily at a precise position upstream or downstream of some anchor such as a protein-coding gene. All of this means it can be difficult to say that your SNP maps to a gene encoding a novel non-coding RNA.
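As a minimal sketch of the eQTL idea, assuming you have already exported eQTL results to a flat table and already have a list of LD proxies for your SNP: the record layout (`snp`, `gene`, `pvalue`), the function name `best_eqtl_gene`, and the p-value cutoff are all hypothetical and do not reflect the actual output format of any particular eQTL tool.

```python
def best_eqtl_gene(snp, proxies, eqtl_records, p_threshold=1e-5):
    """Return the most significant eQTL record for the SNP or any of its
    LD proxies, or None if nothing passes the threshold.

    eqtl_records: iterable of dicts with keys "snp", "gene", "pvalue".
    p_threshold: arbitrary illustrative cutoff, not a recommended value.
    """
    query = {snp, *proxies}
    hits = [
        rec for rec in eqtl_records
        if rec["snp"] in query and rec["pvalue"] <= p_threshold
    ]
    if not hits:
        return None
    return min(hits, key=lambda rec: rec["pvalue"])


# Made-up identifiers and values, purely for illustration.
records = [
    {"snp": "rsA", "gene": "GENE_1", "pvalue": 1e-3},
    {"snp": "rsB", "gene": "GENE_2", "pvalue": 1e-8},
    {"snp": "rsC", "gene": "GENE_3", "pvalue": 1e-12},
]
print(best_eqtl_gene("rsA", ["rsB"], records))
```

Here the query SNP itself has no significant hit, but its proxy does, so the proxy's gene is reported; `rsC` is ignored because it is not in LD with the query SNP.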

More to the point of your question: the Framingham Heart Study, which has initiated many GWAS, has used a distance of up to 60 kbp to assign a SNP to a gene. A SNP farther away than this is either not assigned to a gene, or is assigned but flagged as "distant." You can always find the nearest gene, or the nearest one with an NM_# RefSeq mRNA (as opposed to a gene-model mRNA), along with the distance to that gene, and carry forward both values.
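A small sketch of that nearest-gene rule, carrying forward both the gene and the distance. The `Gene` tuple, the function names, and the three labels are hypothetical; the 60 kbp default is the cutoff mentioned above, and in practice the gene list would come from an annotation file (e.g., restricted to NM_ RefSeq entries).

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # gene start coordinate
    end: int    # gene end coordinate, start <= end


def distance_to_gene(pos, gene):
    """Distance in bp from a position to the gene; 0 if inside the gene."""
    if pos < gene.start:
        return gene.start - pos
    if pos > gene.end:
        return pos - gene.end
    return 0


def assign_nearest_gene(chrom, pos, genes, max_distance=60_000):
    """Return (gene_name, distance, label) for the nearest gene on the
    same chromosome. label is "assigned" within max_distance, "distant"
    beyond it, and "unassigned" if the chromosome has no genes."""
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return None, None, "unassigned"
    nearest = min(candidates, key=lambda g: distance_to_gene(pos, g))
    dist = distance_to_gene(pos, nearest)
    label = "assigned" if dist <= max_distance else "distant"
    return nearest.name, dist, label


# Made-up coordinates for illustration.
genes = [
    Gene("GENE_A", "chr1", 1_000, 5_000),
    Gene("GENE_B", "chr1", 100_000, 120_000),
]
print(assign_nearest_gene("chr1", 30_000, genes))
print(assign_nearest_gene("chr1", 300_000, genes))
```

Keeping the distance alongside the gene name means you can tighten or relax the cutoff later without recomputing anything.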


That's a nice explanation, @Larry_Parnell! My SNPs are mostly derived from GWAS, so I think I'd go for the "nearest gene" criterion. However, I think it is worth having a look at Genevar. Thanks!


Thank you. You should use as many eQTL resources as you can find. Estimates of the fraction of GWAS hits that affect transcription range from 30% upwards, so eQTL analysis will be very helpful for you.
