Question: Downloading SNPs From 1000 Genomes For A Given Individual
xenophiliuslovegood wrote, 7.4 years ago:

Hi,

Can you suggest a way to download from 1000 Genomes a set of SNPs (for example, those located on chromosome 1) belonging to a certain individual?

Related to that, are all the individuals comparable when it comes to the reliability of the variant calling, or is there a subset that is more reliable than the others due to, say, better sequencing technology or library preparation?

I'd appreciate it if you could get me started on this.

genome samtools snp • 3.3k views
Maxime Lamontagne (Québec) wrote, 7.4 years ago:

You can do it with tabix and vcftools.

Isolate SNPs for your region:

tabix -fh ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20101123/interim_phase1_release/ALL.chr12.phase1.projectConsensus.genotypes.vcf.gz 12:2345295-2345295 > genotypes.vcf

Replace 12:2345295-2345295 with your own region, written as chromosome:start-end.
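To make the region syntax concrete, here is a minimal sketch of the same selection done offline with awk on a plain-text VCF. The function name filter_region is hypothetical, and this is not how tabix works internally: tabix uses the index to fetch overlapping records, while this sketch simply compares each record's POS with the 1-based, inclusive range, which should be close enough for SNPs.

```shell
# Hypothetical helper: keep header lines plus records whose POS lies in
# CHR:START-END (1-based, inclusive). Reads a plain-text VCF on stdin.
filter_region() {
    region=$1
    chrom=${region%%:*}      # text before the colon, e.g. 12
    range=${region#*:}       # text after the colon, e.g. 2345295-2345295
    start=${range%%-*}       # text before the dash
    end=${range#*-}          # text after the dash
    awk -F '\t' -v c="$chrom" -v s="$start" -v e="$end" '
        /^#/ { print; next }                       # keep all header lines
        $1 == c && $2 >= s + 0 && $2 <= e + 0      # default action prints the record
    '
}
```

For example, filter_region 12:2345295-2345295 < genotypes.vcf would keep only records at that single position.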

Isolate Individuals:

vcftools_0.1.4a/perl/vcf-subset -c NA06984,NA06986 genotypes.vcf > temp.vcf

NA06984 and NA06986 are example sample IDs; replace them with your individuals.
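If vcf-subset is not available, the column selection can be approximated with plain awk. This is a rough sketch under stated assumptions, not vcf-subset's own implementation: the function name subset_samples is hypothetical, the input is an uncompressed tab-separated VCF on stdin, and INFO fields such as allele counts are passed through unchanged rather than recomputed for the subset.

```shell
# Hypothetical helper: keep the 9 fixed VCF columns plus the sample
# columns named in the comma-separated list given as the first argument.
subset_samples() {
    awk -v wanted="$1" '
        BEGIN {
            FS = OFS = "\t"
            n = split(wanted, ids, ",")
            for (i = 1; i <= n; i++) want[ids[i]] = 1
        }
        /^##/ { print; next }                  # meta-information lines pass through
        /^#CHROM/ {                            # header line: note which columns to keep
            for (i = 10; i <= NF; i++) if ($i in want) keep[++k] = i
        }
        {
            line = $1
            for (i = 2; i <= 9 && i <= NF; i++) line = line OFS $i
            for (j = 1; j <= k; j++) line = line OFS $(keep[j])
            print line
        }
    '
}
```

Usage would look like subset_samples SAMPLE_A,SAMPLE_B < genotypes.vcf > temp.vcf, where the two names stand in for your own sample IDs. Samples are written in the order they appear in the file, not the order of the list.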

Convert to another format:

vcftools_0.1.4a/cpp/vcftools --vcf temp.vcf --plink --out Output

The other output formats are listed at vcftools.sourceforge.net/options.html


If I want all the SNPs from chr12, which positions should I specify?

Reply written 4.8 years ago by cvu

Probably 1 through the length of chr12, which you can look up in the 1000 Genomes browser, but it will take time. Alternatively, you can download the whole chr12 file and use it as input.
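As far as I know, tabix also accepts a bare sequence name such as 12 as the region, which selects the whole chromosome without needing its length. If you do want the explicit end coordinate, a sketch like the following reads it from the VCF header. The name contig_length is hypothetical, and it assumes the file carries "##contig=<ID=...,length=...>" header lines, which older releases may not include.

```shell
# Hypothetical helper: print the length recorded for one contig in the
# "##contig=<ID=...,length=...>" header lines of a plain-text VCF on stdin.
contig_length() {
    sed -n "s/^##contig=<ID=$1,.*length=\([0-9][0-9]*\).*/\1/p"
}
```

With a downloaded file, something like zcat chr12.vcf.gz | contig_length 12 would print the length if the header has a matching contig line, and nothing otherwise (chr12.vcf.gz is a placeholder name).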

Reply written 4.8 years ago by Maxime Lamontagne
Pascal (Barcelona) wrote, 7.4 years ago:

Also have a look at the 1KG project page explaining how to get a subsection of a VCF file (e.g. a given chromosome or a range of loci).

Link: how-do-i-get-sub-section-vcf-file

Regards.


The data slicer also allows you to pick a particular individual or population: http://www.1000genomes.org/data-slicer

Reply written 7.3 years ago by Laura
Damian Kao (USA) wrote, 7.4 years ago:

You can download the SNP files in VCF format from their FTP site.

You can also view the SNP data from their 2010 study using their genome browser.

You can read about how they did the SNP calling here.

It was such a collaborative project, with sequencing data from various sources, that I don't think a strong case can be made that any one individual has much better data than another.


Data quality does vary between individuals. Some are sequenced with old crappy 35bp reads to 4X coverage, while some with 100bp HiSeq reads to over 10X.

Reply written 7.4 years ago by lh3