Looking for LDL GWAS summary stats in hg38
15 months ago
arturtjaro ▴ 40

Hi All,

I think last time I posted on here was nearly 10 years ago (!)

I'm looking for a way to get summary statistics for a GWAS on LDL levels, where the statistics are in hg38. I found a study titled "Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA", with accession number GCST90132686. When I go to the GWAS Catalog and search for it, it returns the result here. Note that on the left, they claim that the catalog is in hg38.

If you click on the study, then find the study below, and click through the downloads to the study page, it claims to have summary statistics. This leads to the FTP site and download page here.

Now, the downloadable file specifically states it's in GRCh37, which is not what I need. It's also a recent study (2016), so it's surprising that it's not hg38.

In fact, if you go to the GWAS Catalog summary stats page and browse through recent studies, it looks like most summary statistics are in GRCh37/hg19, not GRCh38/hg38.

So my question: is there a way to specifically retrieve summary statistics for hg38, particularly for LDL levels, without resorting to a potentially error-prone liftover? Are most GWAS analyses still done in hg19?

Thanks! Artur

GWAS summary Catalog statistics hg38 • 1.2k views

GWAS Catalog summary statistics are available in two formats: the original files as submitted by the authors, and harmonised files that have already been lifted over to GRCh38. The harmonised file is in a subfolder called "harmonised". More info is on their methods page. Hope this helps!
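As a rough illustration of where that subfolder would sit, here is a minimal sketch that builds the directory URL from an accession. The function name `gwas_harmonised_url` is made up for this example, and the bucket layout (accessions grouped in blocks of 1000) is an assumption inferred from the single FTP path quoted elsewhere in this thread, so check the result in a browser before relying on it. File names inside the folder are not guessed here.

```shell
#!/bin/sh
# Sketch only: build the "harmonised" directory URL for a GWAS Catalog accession.
# ASSUMPTION: studies are grouped in buckets of 1000 accessions
# (e.g. GCST90132001-GCST90133000), inferred from one URL in this thread.
# ASSUMPTION: the numeric part has no leading zeros (true for GCST90132686).
gwas_harmonised_url() {
    _acc="$1"
    _num="${_acc#GCST}"                       # strip the GCST prefix
    _lo=$(( (_num - 1) / 1000 * 1000 + 1 ))   # first accession in the bucket
    _hi=$(( _lo + 999 ))                      # last accession in the bucket
    printf 'http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST%s-GCST%s/%s/harmonised/\n' \
        "$_lo" "$_hi" "$_acc"
}

# Print the candidate directory for the LDL study discussed here
gwas_harmonised_url GCST90132686
```

The printed URL is only a candidate location; whether a harmonised file exists for a given study still has to be checked on the FTP site itself.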

15 months ago

You could try a new framework that I recently implemented, which lets you lift over summary statistics with little manual effort. Using the BCFtools plugins here, and once all required resources are properly installed, you can obtain LDL summary statistics for GRCh38 with the following commands (they take ~1 minute to run after the download):

wget http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90132001-GCST90133000/GCST90132686/GCST90132686_buildGRCh37.txt.gz
bcftools +munge -Ou -C colheaders.tsv GCST90132686_buildGRCh37.txt.gz -f human_g1k_v37.fasta | \
bcftools +liftover -Ou -- -s human_g1k_v37.fasta -f Homo_sapiens_assembly38.fasta -c hg19ToHg38.over.chain.gz | \
bcftools sort -Oz -o GCST90132686_buildGRCh38.vcf.gz

However, this particular summary statistics file contains indels for which it was not recorded which allele is the reference allele. For some of these indels it is not immediately obvious whether the variant was an insertion or a deletion, and so it is also not immediately obvious how to perform a perfect liftover. The +munge plugin tags these indels with the FILTER tag IFFY. Unfortunately, this is a long-standing issue that plagues summary statistics that are not in the GWAS-VCF format, and it affects the GWAS-SSF format encouraged by the GWAS Catalog.

Out of 11,871,460 variants in the summary statistics file, 9,060 are lost and 20,252 have the reference and alternate alleles swapped as a result of a reference change (the +liftover plugin automatically reverses the effect size for the swapped ones). However, 831,454 out of 947,266 indels are classified as IFFY. Although the effect allele is not the alternate allele for only a small number of SNPs (8,512 out of 10,923,413), a similar fraction of indels could have the same property, and these could be handled improperly as a result. Had the authors released the summary statistics as GWAS-VCF files, this problem would not exist.
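To put those counts in proportion, here is a small back-of-the-envelope sketch using only the numbers quoted above. The helper names `pct` and `expected_count` are made up for this example, and the key assumption (that IFFY indels have a non-alternate effect allele at the same rate as SNPs) is exactly the "similar fraction" guess from the paragraph above, not something measured.

```shell
#!/bin/sh
# Counts quoted in the answer above
snps_total=10923413
snps_effect_not_alt=8512
indels_total=947266
indels_iffy=831454

# Percentage a/b, printed with a chosen number of decimals
pct() {
    awk -v a="$1" -v b="$2" -v d="$3" 'BEGIN { printf "%.*f\n", d, 100 * a / b }'
}

# Expected count if a group of size n follows the rate a/b
# ASSUMPTION: indels behave like SNPs in this respect (unverified)
expected_count() {
    awk -v a="$1" -v b="$2" -v n="$3" 'BEGIN { printf "%.0f\n", n * a / b }'
}

echo "IFFY indels (% of all indels):       $(pct "$indels_iffy" "$indels_total" 1)"
echo "SNPs with effect allele != ALT (%):  $(pct "$snps_effect_not_alt" "$snps_total" 3)"
echo "IFFY indels possibly affected:       $(expected_count "$snps_effect_not_alt" "$snps_total" "$indels_iffy")"
```

Under that assumption, the indels at risk of being handled improperly would number in the hundreds, even though the large majority of indels carry the IFFY tag.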

Note that BCFtools +liftover is the only VCF liftover tool that handles swapping of the reference and alternate alleles for indels and that reverses the effect sizes when a swap occurs as a result of a reference change. I would advise against using Picard LiftoverVcf or CrossMap VCF when working with summary statistics.
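To make concrete why the reversal matters, here is a toy sketch of what a reference/alternate swap implies for a summary statistic. This is not the +liftover implementation; the function name `flip_swapped` and the six-column table layout (id, REF, ALT, effect size relative to ALT, ALT allele frequency, swap flag) are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Toy illustration, NOT the plugin's code.
# Hypothetical tab-separated columns:
#   1 id, 2 REF, 3 ALT, 4 beta (relative to ALT), 5 ALT frequency, 6 swapped (1/0)
# When REF and ALT trade places, the effect size changes sign and the
# allele frequency becomes its complement.
flip_swapped() {
    awk 'BEGIN { FS = OFS = "\t" }
         $6 == 1 { tmp = $2; $2 = $3; $3 = tmp; $4 = -$4; $5 = 1 - $5 }
         { print }'
}

# rs1 is flagged as swapped, rs2 is left untouched
printf 'rs1\tA\tG\t0.12\t0.30\t1\nrs2\tC\tT\t-0.05\t0.40\t0\n' | flip_swapped
```

A tool that moves the coordinates but skips this step would leave the effect pointing at the wrong allele for every swapped variant, which is the concern raised above.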
