5.9 years ago by
Boston, MA USA
Intergenic SNPs are likely either to affect transcription (by mapping to a promoter or enhancer of a nearby gene) or to map within a currently undescribed gene (e.g., a novel lncRNA). One approach is to check whether the SNP, or a SNP in LD with it, acts as an eQTL for a nearby gene. The available eQTL data (from GenVar and the UChicago eQTL tools) are mostly for protein-coding genes. If you're far from any such gene, you may be in an enhancer, and then it is harder to discern function and to tie the SNP to a specific gene. There are enhancer tools and databases, but evidence linking an enhancer to the gene it controls is more difficult to obtain. Lastly, some non-coding RNAs show sequence conservation across species, but not all do, and not necessarily at the same position upstream/downstream of some anchor such as a protein-coding gene. All of this means it could be difficult to say that your SNP maps to a gene encoding a novel non-coding RNA.
More to the point of your question: the Framingham Heart Study, which has initiated many GWAS, has used a distance of up to 60 kbp to assign a SNP to a gene. A SNP farther away than this is either not assigned to a gene, or is assigned but flagged as "distant." You can always find the nearest gene, or the nearest one with an NM_# RefSeq mRNA (as opposed to a gene-model mRNA), along with the distance to that gene, and carry both values forward.
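If it helps, here is a minimal Python sketch of the "nearest gene plus distance" bookkeeping. It assumes you already have gene coordinates in hand (for example, parsed from a RefSeq annotation file); the `Gene` tuple, the function names, and the `assigned`/`distant` labels are my own placeholders, not anything published by Framingham. The 60 kbp cutoff is the only value taken from the text above.

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    # Hypothetical minimal gene record; coordinates are 1-based, inclusive.
    name: str
    chrom: str
    start: int
    end: int


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP position to a gene; 0 if the SNP is inside it."""
    if gene.start <= pos <= gene.end:
        return 0
    return gene.start - pos if pos < gene.start else pos - gene.end


def annotate_snp(chrom: str, pos: int, genes: Sequence[Gene],
                 max_dist: int = 60_000) -> dict:
    """Report the nearest gene on the same chromosome and the distance to it.

    Both values are always carried forward; the label only records whether
    the SNP falls within max_dist (60 kbp by default) of that gene.
    """
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return {"gene": None, "distance": None, "label": "unassigned"}
    nearest: Optional[Gene] = min(candidates,
                                  key=lambda g: distance_to_gene(pos, g))
    dist = distance_to_gene(pos, nearest)
    label = "assigned" if dist <= max_dist else "distant"
    return {"gene": nearest.name, "distance": dist, "label": label}


# Made-up coordinates purely for illustration.
example_genes = [
    Gene("GENE_A", "chr1", 100_000, 150_000),
    Gene("GENE_B", "chr1", 400_000, 450_000),
]
near = annotate_snp("chr1", 170_000, example_genes)
far = annotate_snp("chr1", 300_000, example_genes)
```

To restrict the search to genes with an NM_# RefSeq mRNA, filter the gene list before calling `annotate_snp`. For genome-wide SNP sets, an interval tool such as `bedtools closest` will be far faster than this linear scan.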