Question: Analysis of SNPs in Gene Deserts
8
9.3 years ago by
Manhattan, NY
Khader Shameer18k wrote:

I have identified multiple causal SNPs in gene deserts for my phenotype of interest.

What bioinformatics, statistical genetics, or experimental approaches and methods can I explore to link these gene-desert SNPs to my phenotype?

Please share your thoughts and related literature on analyzing such gene deserts and SNPs in gene deserts.

annotation snp gwas • 3.8k views
ADD COMMENTlink modified 23 months ago by Biostar ♦♦ 20 • written 9.3 years ago by Khader Shameer18k
3

I feel it is better to call the SNP associated rather than causal. "Causal" implies something more definitive regarding function.

ADD REPLYlink written 9.3 years ago by jvijai1.2k

When you say causal along with non-genic, it is not clear. Is there a specific functional motif that is changed by the SNP or is it associative in phenotype?

ADD REPLYlink written 9.3 years ago by jvijai1.2k

I mean the SNP is in a region of the genome where no known gene is annotated. The terms "gene desert" and "non-genic region" are generally used for such regions, for example: http://genome.cshlp.org/content/20/9/1191.full http://genome.cshlp.org/content/15/1/137.full

ADD REPLYlink written 9.3 years ago by Khader Shameer18k

I called it causal because of its significant p-value and odds ratio (OR).
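
For readers unfamiliar with the statistic, here is a minimal sketch of how an allelic odds ratio and an approximate 95% confidence interval (Woolf's log-scale method) can be computed from a 2x2 table of allele counts. The function name and the counts are hypothetical and are not data from this study. Note that a large OR with a tight interval measures the strength of association only; it does not show that the SNP is causal.

```python
import math

def odds_ratio_with_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale) confidence interval.

    Arguments are allele counts from a 2x2 case/control table.
    z=1.96 gives an approximate 95% interval.
    """
    if min(case_alt, case_ref, control_alt, control_ref) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(
    case_alt=300, case_ref=700, control_alt=200, control_ref=800
)
print(f"OR = {estimate:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```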

ADD REPLYlink written 9.3 years ago by Khader Shameer18k

So correlation == causation these days?

ADD REPLYlink written 8.6 years ago by Aaron Statham1.1k

@Vijai / Adrian: Question edited.

ADD REPLYlink written 8.6 years ago by Khader Shameer18k

@Vijai / Adrian / Aaron : Question edited.

ADD REPLYlink written 8.6 years ago by Khader Shameer18k
5
9.3 years ago by
Sean Davis26k
National Institutes of Health, Bethesda, MD
Sean Davis26k wrote:

You and everyone else doing GWAS....

In any case, besides looking for ncRNAs and undocumented transcripts, there is a load of data from the ENCODE and related projects, most easily accessed through the UCSC genome browser. In particular, you could look at the overlap between your SNP and DNase I hypersensitivity sites, transcription factor binding sites, maximally conserved elements, etc. If you want to get really crazy, you could look at the Hi-C paper by Dekker et al. that purports to give long-range genomic interactions between various regions of the genome.
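
As a rough illustration of the overlap check, the sketch below tests which annotation intervals contain each SNP. It assumes the intervals have already been exported in BED-style coordinates (0-based start, exclusive end) and that SNP positions are 1-based; the function name, SNP IDs, feature names, and coordinates are all made up for the example.

```python
from collections import defaultdict

def overlapping_features(snps, features):
    """Return {snp_id: [feature names]} for intervals that contain the SNP.

    snps: iterable of (snp_id, chrom, pos) with 1-based positions.
    features: iterable of (chrom, start, end, name) in BED style
              (0-based start, end exclusive).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in features:
        by_chrom[chrom].append((start, end, name))

    hits = {}
    for snp_id, chrom, pos in snps:
        zero_based = pos - 1  # convert the 1-based SNP position to BED coordinates
        hits[snp_id] = [
            name
            for start, end, name in by_chrom[chrom]
            if start <= zero_based < end
        ]
    return hits

# Hypothetical SNPs and regulatory intervals, for illustration only.
snps = [("rs_example_1", "chr8", 1000), ("rs_example_2", "chr8", 5000)]
features = [
    ("chr8", 900, 1100, "DNase_HS_peak"),
    ("chr8", 999, 1000, "conserved_element"),
    ("chr8", 5000, 5200, "TF_ChIP_peak"),
]
hits = overlapping_features(snps, features)
for snp_id, names in hits.items():
    print(snp_id, names or "no overlap")
```

A linear scan per SNP is fine for a handful of GWAS hits; for genome-wide sets an interval tree or a dedicated tool would be a better fit.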

ADD COMMENTlink written 9.3 years ago by Sean Davis26k

Thanks a lot, Sean. I will explore the ENCODE data and DNase hypersensitivity tracks from the UCSC genome browser, along with the Hi-C data. A quick search shows that the Hi-C resources do not offer a way to browse the data. By maximally conserved elements, do you mean those indicated in VISTA (http://genome.lbl.gov/vista/) and related comparative genomics resources?

I am already running a TFBS search in the region.

ADD REPLYlink written 9.3 years ago by Khader Shameer18k

There are several conservation tracks at UCSC, including the phastConsElements tables. As for TFBS, you could also look at the tfbsConsSites table. My point in the answer above was that there are other genomic features besides genes that might be of interest. And, unfortunately, as you note with the Hi-C data, accessing them may not always be straightforward.

ADD REPLYlink written 9.3 years ago by Sean Davis26k

Thanks Sean. I am currently exploring ENCODE data and other tracks in UCSC.

ADD REPLYlink written 9.3 years ago by Khader Shameer18k
3
9.3 years ago by
Gww2.7k
Canada
Gww2.7k wrote:

Perhaps you could use an RNA-mapping technique to look for non-coding RNAs containing the SNP you are interested in; 3'/5' RACE might be suitable. At least this would give you a general idea of whether the SNP is associated with a transcript. However, your SNP could be part of a long-range cis-enhancer region like the one they found in this paper, which could make experimental validation more complicated.

You could also try mining non-coding RNA lists in various databases such as Ensembl (e.g. small RNAs, lincRNAs) to see if any of them map to that region.

ADD COMMENTlink written 9.3 years ago by Gww2.7k

Thanks GWW. I am exploring the ncRNA / miRNA track now.

ADD REPLYlink written 9.3 years ago by Khader Shameer18k
2
9.3 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

Well, Khader, you should have been at ASHG last week - and should have then run from talk to talk where all kinds of folks presented very similar situations. The MYC paper GWW cites is an example that comes to mind as well. In fact, I saw talks by two of those authors. I also saw a presentation on using ENCODE data to annotate GWAS hits - so I agree with Sean's excellent idea. My notes from these talks are on my blog: Wasserman, Degner's use of ENCODE data, and Stamatoyannopoulos' talk on using ENCODE.

This is just a start, though. There are certainly a lot of ideas on this topic. Generally, SNPs in gene deserts either regulate expression of a distant protein-coding gene or lie in or near a non-protein-coding (RNA) gene. Ideally, you would have expression data, either from a genome-wide chip of mRNA probes or from RT-PCR across the region (in which you can identify novel transcription products), to compare with the GWAS data and look for a type of eQTL.
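
To make the eQTL idea concrete, here is a minimal sketch that regresses a transcript's expression level on allele dosage (0, 1, or 2 copies of the alternate allele) by ordinary least squares. It assumes an additive model with no covariates, which a real analysis would not; the function name and all values are invented for illustration.

```python
def fit_eqtl(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    dosages: allele counts (0, 1, 2) per sample.
    expression: expression level of one transcript in the same samples.
    Returns (slope, intercept).
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need matched vectors with at least 2 samples")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx  # expression change per extra copy of the allele
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical genotypes and expression values, for illustration only.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept = fit_eqtl(dosages, expression)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A slope clearly different from zero across enough samples would suggest the SNP, or something in LD with it, is associated with expression of that transcript.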

Added on 12 Jul 2011: The continued evolution of the epigenomebrowser.org site makes it a very good place to go to get the data mentioned here. There are now some very nice displays of Stamatoyannopoulos' DNaseI hypersensitivity sites, tissue-specific histone methylation and acetylation marks as well as transcription factor ChIP data.

ADD COMMENTlink modified 8.6 years ago • written 9.3 years ago by Larry_Parnell16k

True, Larry, I should have attended ASHG. Thanks for your points. I had a cursory look at your blog posts during the #ashg2010 tweets; I will read them in detail now. Thanks for the point on expression data. I will check this; eQTL is another interesting idea. Do you know of any curated or automated eQTL database?

ADD REPLYlink written 9.3 years ago by Khader Shameer18k

No, I know of no such database. It is unfortunate that these data end up in supplementary files that really need to be swept into a larger database. Creating such a database should be a priority of funding at NIH, IMHO.

ADD REPLYlink written 9.3 years ago by Larry_Parnell16k

Thanks, Larry. You are right: a literature-curated eQTL database should be a priority for the funding agencies.

ADD REPLYlink written 9.3 years ago by Khader Shameer18k
2
8.6 years ago by
Adrian Cortes520
Brisbane, Australia
Adrian Cortes520 wrote:

In my opinion you have to be careful when you say "you have identified a causal SNP". From your description you have identified a region of high association with no transcript annotation, AKA a gene desert. This is still not a causal SNP, as you don't know how this variation affects your phenotype or whether a variant in LD with your candidate is the true causal SNP.

As already suggested in some of the answers, you can combine your genotypes with expression data; here is a nice example in the literature.
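
To illustrate the LD point, the sketch below estimates r² between two SNPs as the squared Pearson correlation of their allele dosages. This is a common approximation for unphased genotypes, not a haplotype-based estimate; the function name and genotypes are hypothetical.

```python
def genotype_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between allele dosages at two SNPs."""
    n = len(dosages_a)
    if n != len(dosages_b) or n < 2:
        raise ValueError("need matched vectors with at least 2 samples")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    saa = sum((a - mean_a) ** 2 for a in dosages_a)
    sbb = sum((b - mean_b) ** 2 for b in dosages_b)
    if saa == 0 or sbb == 0:
        raise ValueError("a monomorphic SNP has no defined r^2")
    sab = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    return (sab * sab) / (saa * sbb)

# Hypothetical dosages (0, 1, 2) for six samples, for illustration only.
candidate = [0, 0, 1, 1, 2, 2]
perfect_proxy = [2, 2, 1, 1, 0, 0]   # same information, alleles flipped
unlinked = [0, 2, 1, 1, 0, 2]

print(genotype_r2(candidate, perfect_proxy))
print(genotype_r2(candidate, unlinked))
```

Any variant with r² close to 1 against the top hit carries essentially the same association signal, so the statistics alone cannot tell which of them, if any, is causal.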

You can also take a look at this eQTL browser.

Cheers!

ADD COMMENTlink modified 5 months ago by RamRS25k • written 8.6 years ago by Adrian Cortes520
1
9.2 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

Just to add a different point - one probably already known to Khader. This type of question is being considered at WikiGenes and the article here. Many of us (those into genome variation) either need to join this effort or keep abreast of what they write.

ADD COMMENTlink modified 9.2 years ago • written 9.2 years ago by Larry_Parnell16k

Sure, Larry. I would like to add a few points that I learned from my interaction with BioStar; I hope they will be useful for those who are interested in post-GWAS analysis. I will be adding my ideas on the discussion page. See you at the WikiGenes page.

ADD REPLYlink written 9.2 years ago by Khader Shameer18k