Export of SNP data from 1000Genomes for several hundred genes
8.4 years ago

Hello

I'm working on a project that requires SNP data for several hundred human genes. We have been using the 1000 Genomes browser so far, but downloading and filtering the data for each gene individually is taking a small eternity, and I'd like to automate the process.

Would BioMart allow me to export the same data that I can find through the 1000 Genomes browser / Ensembl? There are two separate datasets for SNP variation, with e.g. COSMIC in one and dbSNP in the other, whereas the 1000 Genomes browser displays all short variation data together. This makes me think I need to pull data from both datasets to get everything that the manual download from the 1000 Genomes browser page provides. The filtering options in BioMart differ from those in the browser, so it has been difficult to assess whether the resulting files would be identical.

Another option I have looked into is using a client to query the Ensembl REST interface directly.

I am filtering the data by variant consequence (missense, stop gained, etc.), which is why I am not simply taking a slice of the VCF files.
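For the REST route, a minimal sketch of what a per-gene query plus consequence filter might look like is below. It assumes the Ensembl REST `overlap/id` endpoint with `feature=variation`, and that each returned record carries a `consequence_type` field; the server URL, the function names and the `WANTED` set are illustrative choices, not something taken from the 1000 Genomes browser export. Whether the result matches the browser download would still need to be checked on a gene or two.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the default Ensembl REST server. A different server may be
# needed depending on which assembly the project uses.
SERVER = "https://rest.ensembl.org"

# Consequence terms to keep (illustrative; extend as needed).
WANTED = {"missense_variant", "stop_gained"}


def overlap_url(gene_id, server=SERVER):
    """Build the URL that asks for variation features overlapping a gene."""
    query = urllib.parse.urlencode({"feature": "variation"})
    return f"{server}/overlap/id/{urllib.parse.quote(gene_id)}?{query}"


def fetch_variants(gene_id, server=SERVER):
    """Download the variation records for one Ensembl gene ID as a list of dicts."""
    request = urllib.request.Request(
        overlap_url(gene_id, server),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def keep_consequences(records, wanted=WANTED):
    """Keep only records whose reported consequence is in `wanted`."""
    return [r for r in records if r.get("consequence_type") in wanted]
```

Looping `keep_consequences(fetch_variants(gene_id))` over a list of Ensembl gene IDs would then replace the manual per-gene download. With several hundred genes it is worth pausing between requests to stay within the server's rate limits.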

Thank you in advance for your time.

SNP 1000Genomes • 2.1k views

Maria, you might consider showing us an example of the data that you wish to retrieve. Perhaps you can show us the first few lines of a file that contains the data you need for a genomic locus? I don't think I understand exactly what you're trying to download.

8.3 years ago
trausch ★ 1.9k

Maybe just slice the 1000 Genomes VCFs (for region queries the file needs to be bgzip-compressed and indexed):

bcftools view <1000Genomes.vcf> [region] > slice.vcf

and then annotate with snpEff:

java -Xmx4g -jar snpEff.jar GRCh37.75 slice.vcf > slice.ann.vcf
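Since the question is about keeping only certain consequences, the annotated slice could then be filtered on the `ANN` field that snpEff writes into the INFO column. The following is a rough sketch, assuming the standard `ANN=Allele|Annotation|Impact|...` layout with multiple effects joined by `&`; the function names and the `WANTED` set are made up for illustration.

```python
# Consequence terms to keep (illustrative; extend as needed).
WANTED = {"missense_variant", "stop_gained"}


def ann_effects(info):
    """Return the set of effect terms found in the ANN entry of a VCF INFO string."""
    effects = set()
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            # One comma-separated annotation per transcript/allele combination.
            for annotation in entry[4:].split(","):
                fields = annotation.split("|")
                if len(fields) > 1:
                    # The second field holds the effect(s), joined by '&'.
                    effects.update(fields[1].split("&"))
    return effects


def filter_vcf(lines, wanted=WANTED):
    """Yield header lines plus records that have at least one wanted effect."""
    wanted = set(wanted)
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        columns = line.rstrip("\n").split("\t")
        # INFO is the eighth column of a VCF record.
        if len(columns) >= 8 and ann_effects(columns[7]) & wanted:
            yield line
```

To use it, open `slice.ann.vcf`, pass the file object to `filter_vcf`, and write the yielded lines to a new file. Repeating the slice, annotate and filter steps for each gene region covers the several hundred genes without going through the browser.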