Question: Is there a package in R or software that can help in finding independent signals from a GWAS? (similar to SNAP)
5.4 years ago by
lillo.sim40
United Kingdom
lillo.sim40 wrote:

Hi,

I have found ~200 significant associations at a specific p-value threshold by running a GWAS between SNPs and a phenotype. I can map the significant SNPs to human genes, but I would like to find the independent signals associated with the phenotype, i.e. if the genes that the SNPs map to are nearby and there is LD between them, then I would like to count the associations as a single signal. I think this is a normal step for finding independent signals in post-GWAS processing.

My question is: is there a way to do this in R or any other software, i.e. retrieve the LD between a list of SNPs (something like biomaRt) and then use this information to find independent signals, maybe by creating LD clusters? I don't know if this is the usual way of finding independent signals in a GWAS; if not, could you tell me how this is normally done?

Thank you for any advice/help/suggestion!

EDIT by Michael: This boils down to the question of whether there is a tool like SNAP but for local installation.

gwas ld • 3.7k views
ADD COMMENTlink modified 5.4 years ago by yao.h.19880 • written 5.4 years ago by lillo.sim40

What do you mean by "consider as a unique signal"? Do you mean considering them jointly, or giving them a compound score? I wouldn't call that independent: because they are in LD, it is quite the opposite.

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Michael Dondrup45k

Hi Michael, thanks for your reply. No, I don't want a score, just the genes/signals found in this GWAS that are independent. For example, if SNP1 maps to gene1 and SNP2 maps to gene2, but gene1 and gene2 are in LD, then there will be only one signal from this region… Isn't this how the number of independent associated loci is usually found?

ADD REPLYlink written 5.4 years ago by lillo.sim40

Ok, I think I understand now. So, you wish to find if significant markers are in LD with each other given a cut-off for r2? LD is only measured for markers not for genes.

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Michael Dondrup45k

This is not an answer, but I think this paper may help:

Investigation of a genome wide association signal for obesity: synthetic association and haplotype analyses at the melanocortin 4 receptor gene locus.

ADD REPLYlink modified 5.4 years ago by Michael Dondrup45k • written 5.4 years ago by Medhat8.0k
5.4 years ago by
Bergen, Norway
Michael Dondrup45k wrote:

You can query pairwise LD for sets of markers using SNAP's pairwise LD query. This will retrieve a list of pairwise r2 values for your list of significant markers with the possibility of using different reference panels (e.g. 1000Genomes, HapMap) and populations. The markers absent from pairs or with r2 < threshold would then qualify for independence. Is that what you had in mind?
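To illustrate the thresholding step described above, here is a minimal Python sketch, not an official SNAP workflow. It assumes you have already exported pairwise r2 values as (variant_a, variant_b, r2) tuples and have a p-value per variant. The function name `ld_clusters`, the 0.8 cut-off and the rs IDs are made up for illustration. Variants linked by r2 at or above the threshold are merged into one cluster (single linkage via union-find), and the variant with the smallest p-value is reported as the lead of each cluster.

```python
def ld_clusters(pvalues, pairwise_r2, r2_threshold=0.8):
    """Group variants into LD clusters and pick a lead variant per cluster.

    pvalues: dict mapping variant ID -> association p-value
    pairwise_r2: iterable of (variant_a, variant_b, r2) tuples
    Returns a dict mapping lead variant -> sorted list of cluster members.
    """
    # Union-find: every variant starts as its own cluster.
    parent = {variant: variant for variant in pvalues}

    def find(variant):
        # Follow parent links to the cluster root, compressing the path.
        while parent[variant] != variant:
            parent[variant] = parent[parent[variant]]
            variant = parent[variant]
        return variant

    for a, b, r2 in pairwise_r2:
        # Pairs below the threshold, or involving unknown variants, are ignored.
        if a in parent and b in parent and r2 >= r2_threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for variant in pvalues:
        clusters.setdefault(find(variant), []).append(variant)

    # The lead variant of a cluster is the one with the smallest p-value.
    return {min(members, key=pvalues.get): sorted(members)
            for members in clusters.values()}


# Hypothetical example data (IDs, p-values and r2 values are invented).
example_pvalues = {"rs1": 1e-9, "rs2": 3e-8, "rs3": 5e-10, "rs4": 2e-8}
example_pairs = [("rs1", "rs2", 0.92), ("rs2", "rs3", 0.85), ("rs1", "rs4", 0.10)]
print(ld_clusters(example_pvalues, example_pairs))
```

The keys can be any string, so variants without rs IDs could be named by position (for example "12:1234:A:AT"). Note that single linkage chains clusters together: rs1 and rs3 end up in the same cluster through rs2 even if their own pairwise r2 is low, which may or may not be what you want.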

ADD COMMENTlink written 5.4 years ago by Michael Dondrup45k

Hi Michael, yes, this is what I had in mind to find independent associations, thank you. I guess this is how scientists normally do it when they say, for example, that there are N independent associations for a specific phenotype? The problem is that SNAP has a limit on the number of SNPs I can query, and I have more than 1,000 in other studies. The other problem is that I can only query by rs IDs, and I also have indels in my dataset. I also wanted this to be part of an automatic process, which is why I was hoping there was an equivalent R package to query the LD of SNPs and rare variants from 1000 Genomes for a large number of variants, and to cluster them, for example by LD. Do you or anybody else know whether something like this exists? Thank you!

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by lillo.sim40

I didn't see a note about a limit on the number of SNPs you can upload. SNAP also has 1kG pilot 1 data, if that is sufficient. If not, we have done something similar (http://services.cbu.uib.no/software/ldsnpr/), but it works sort of 'the other way around'. Still, I could give you an HDF5 file with computed LD values for HapMap and 1kG. This file is in the wrong format, though (organized by chromosome, no index on rsids), for fast search using rsids only: it would need to be loaded into a (SQL) database, and the rsid columns would need to be indexed.
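As a rough idea of the database step, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the LD pairs have already been read out of the HDF5 file into plain tuples; the table name `ld`, the column names and the helper functions are hypothetical and do not reflect the actual layout of the LDsnpR file.

```python
import sqlite3


def build_ld_db(rows, path=":memory:"):
    """Load (chrom, pos_a, pos_b, snp_a, snp_b, r2) tuples into SQLite."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE ld (chrom TEXT, pos_a INTEGER, pos_b INTEGER, "
        "snp_a TEXT, snp_b TEXT, r2 REAL)"
    )
    con.executemany("INSERT INTO ld VALUES (?, ?, ?, ?, ?, ?)", rows)
    # Index both rsid columns, because a variant can sit on either side of a pair.
    con.execute("CREATE INDEX idx_ld_snp_a ON ld (snp_a)")
    con.execute("CREATE INDEX idx_ld_snp_b ON ld (snp_b)")
    con.commit()
    return con


def ld_partners(con, snp, min_r2=0.8):
    """Return (partner, r2) for all variants in LD with `snp`, strongest first."""
    query = (
        "SELECT snp_b AS partner, r2 FROM ld WHERE snp_a = ? AND r2 >= ? "
        "UNION ALL "
        "SELECT snp_a, r2 FROM ld WHERE snp_b = ? AND r2 >= ? "
        "ORDER BY r2 DESC"
    )
    return con.execute(query, (snp, min_r2, snp, min_r2)).fetchall()


# Hypothetical rows; positions, IDs and r2 values are invented.
example_rows = [
    ("12", 100, 200, "rs1", "rs2", 0.9),
    ("12", 100, 300, "rs1", "rs3", 0.2),
    ("12", 200, 400, "rs2", "rs4", 0.85),
]
con = build_ld_db(example_rows)
print(ld_partners(con, "rs2"))
```

If lookups by position are needed instead of rsids, the same pattern should work with additional indexes on (chrom, pos_a) and (chrom, pos_b).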

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Michael Dondrup45k

Thanks, that would be great…I get an error using SNAP when I try to load a file with >1000 SNPs. I have variants by chromosome and positions, not by rs id, since as I was saying there are many variants with no rsid, so searching by positions would be better. Where could I download this list from please?

ADD REPLYlink written 5.4 years ago by lillo.sim40

I currently have an LD file generated for the EUR (meta) population only; it is 1.7 GB. It is not a list, though, but a file in compressed binary HDF5 format. Before you try to work with it, I would like you to try the chr12 HapMap test data to see if you can handle the format: http://www.ii.uib.no/svn/eSysBio/Rpackages/LDsnpR/inst/extdata/ld_chrom12.h5. If you want a population like CEU, I would have to generate it first using a makefile, which will take about one week. Please let me know if that data format is OK for you; it also contains positional annotation for each pair.

I could also post the Makefile so you can try to build LD files yourself, that requires some additional software though.

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Michael Dondrup45k

Thank you for your help… I am having trouble viewing the file in HDF5 format. Maybe I can do this by following this old thread, if there are no newer ways: 1000 genomes LD calculation

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by lillo.sim40

We have used Intersnp (http://www.ncbi.nlm.nih.gov/pubmed/19837719) for LD calculation instead of PLINK. I will post an update to the old question: A: 1000 genomes LD calculation.

ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Michael Dondrup45k
5.4 years ago by
lillo.sim40
United Kingdom
lillo.sim40 wrote:

This package, http://cran.r-project.org/web/packages/postgwas/postgwas.pdf, can map SNPs to the genes they are in LD with, but it does not give the LD between the SNPs. So if a SNP maps to, or is in LD with, genes that overlap, it still counts these genes as two different loci when it is in effect one.

ADD COMMENTlink modified 5.4 years ago • written 5.4 years ago by lillo.sim40
5.4 years ago by
Bioch'Ti1000
France (Avignon)
Bioch'Ti1000 wrote:

Hi, I think that the MLMM R/Python package (Segura et al., Nature Genetics, 2012) may be appropriate to answer your question. The principle is simple. In the first round, an MLM screens the genome for association. In the second round, the model takes the most strongly associated locus as a cofactor and performs a new round of detection, and so on until there is no genetic variance left. In this way, it avoids detecting all the SNPs in LD with the 'true' associated SNP you are searching for. It is a robust and fast/efficient method. You should give it a try: http://www.nature.com/ng/journal/v44/n7/full/ng.2314.html?WT.ec_id=NG-201207
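To make the stepwise idea concrete, here is a heavily simplified Python sketch. It is not the MLMM implementation: it uses plain least squares with no mixed model or kinship correction, it needs individual-level genotypes, and the function names and the `min_partial_r2` stopping rule are assumptions chosen for illustration. At each round the variant that explains most of the remaining phenotype variance is added as a cofactor, and the phenotype and all remaining genotypes are then residualized on it, so variants that were associated only through LD with the cofactor lose their signal.

```python
def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(vector, cofactor):
    # Remove the component of `vector` that is explained by `cofactor`.
    coef = dot(vector, cofactor) / dot(cofactor, cofactor)
    return [v - coef * c for v, c in zip(vector, cofactor)]


def forward_select(genotypes, phenotype, min_partial_r2=0.3):
    """Greedy forward selection of variants, conditioning on earlier picks.

    genotypes: dict mapping variant ID -> list of allele counts per sample
    phenotype: list of phenotype values per sample
    Returns the selected variant IDs in the order they were picked.
    """
    y = center(phenotype)
    remaining = {name: center(g) for name, g in genotypes.items()}
    selected = []
    while remaining:
        ss_y = dot(y, y)
        if ss_y < 1e-12:  # nothing left to explain
            break
        best_name, best_r2 = None, 0.0
        for name, g in remaining.items():
            ss_g = dot(g, g)
            if ss_g < 1e-12:  # fully explained by the cofactors already chosen
                continue
            r2 = dot(y, g) ** 2 / (ss_y * ss_g)  # squared partial correlation
            if r2 > best_r2:
                best_name, best_r2 = name, r2
        if best_name is None or best_r2 < min_partial_r2:
            break
        cofactor = remaining.pop(best_name)
        selected.append(best_name)
        y = project_out(y, cofactor)
        remaining = {n: project_out(g, cofactor) for n, g in remaining.items()}
    return selected


# Toy data: snp2 is in LD with snp1; the phenotype is 2*snp1 + snp3.
toy_genotypes = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 1, 1, 1, 1, 1],
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],
}
toy_phenotype = [0, 1, 0, 1, 2, 3, 2, 3]
print(forward_select(toy_genotypes, toy_phenotype))  # ['snp1', 'snp3']
```

In the toy data snp2 looks strongly associated on its own, but once snp1 is included as a cofactor its signal drops below the threshold, so only snp1 and snp3 are reported.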

Best, C.

ADD COMMENTlink written 5.4 years ago by Bioch'Ti1000

Hi C, this looks like a really nice tool, but I was asking about post-GWAS annotation, after I have already run the association analysis.

ADD REPLYlink written 5.4 years ago by lillo.sim40
5.4 years ago by
yao.h.19880
yao.h.19880 wrote:

I think Haploview 4.2 can find LD blocks among your significant SNPs. Then you can use another method, for example biomaRt, to find the genes in these LD blocks or near the independent SNPs.

ADD COMMENTlink written 5.4 years ago by yao.h.19880

Haploview is great for visualising an LD block, but I think it needs genotype data to compute the LD, and I don't have genotype data. I only have summary information and want to find the significant independent signals...

ADD REPLYlink written 5.4 years ago by lillo.sim40