Finding CEU / NA12891 data in the 1000 Genomes Project
7.1 years ago
jnowacki ▴ 100

Why is it so hard to find CEU / NA12891 on 1000 Genomes?

If I do a site search:

http://www.internationalgenome.org/search/?q=NA12891

Nothing comes up. If I browse the data portal and filter for CEU samples here:

http://www.internationalgenome.org/data-portal/sample

183 CEU samples show up, but NA12891 is not among them.

Yet clearly the data exists:

http://www.internationalgenome.org/data-portal/sample/NA12891

Why is it so hard to find? I found the thread linked below, which describes a goal of excluding "all samples who were at least 2nd degree relations of other samples in the set, this was to avoid double counting alleles when calculating allele frequencies." So I guess NA12891 doesn't fit the primary goal of 1000 Genomes, but is there a way to find ALL the data stored in that repository?

A: Why the CEU trio's parents NA12891 and NA12892 are not in 1000 Genomes phase 3

NA12891 CEU 1000 Genomes • 2.2k views

I don't understand what you mean that it's hard to find. You found it by doing a search, so what's hard to find about it?

7.1 years ago
$ curl -s "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | grep NA12891

ftp/data_collections/illumina_platinum_pedigree/data/CEU/NA12891/alignment  directory   263 Wed Oct 14 19:36:48 2015
ftp/data_collections/illumina_platinum_pedigree/data/CEU/NA12891/alignment/NA12891.alt_bwamem_GRCh38DH.20150706.CEU.illumina_platinum_ped.cram.crai file    2841310 Wed Oct 14 08:04:36 2015    a91d56f2e2705b33b0da27ba845707db
ftp/data_collections/illumina_platinum_pedigree/data/CEU/NA12891/alignment/NA12891.alt_bwamem_GRCh38DH.20150706.CEU.illumina_platinum_ped.bam.bas   file    631 Fri Aug 28 15:03:39 2015    37ca80626ca2b9838d689884c0dd878a
ftp/data_collections/illumina_platinum_pedigree/data/CEU/NA12891/alignment/NA12891.alt_bwamem_GRCh38DH.20150706.CEU.illumina_platinum_ped.cram file  39317315882 Wed Oct 14 06:52:46 2015    c98e5b10b9bc16c2f012adf09d16d72a
ftp/data_collections/1000_genomes_project/data/CEU/NA12891/exome_alignment  directory   215 Tue Nov 10 11:12:10 2015
ftp/data_collections/1000_genomes_project/data/CEU/NA12891/exome_alignment/NA12891.alt_bwamem_GRCh38DH.20150826.CEU.exome.cram  file    6932970668  Mon Nov  9 07:07:52 2015    505e4a4918970e5941d8044f074dca2c
ftp/data_collections/1000_genomes_project/data/CEU/NA12891/exome_alignment/NA12891.alt_bwamem_GRCh38DH.20150826.CEU.exome.bam.bas   file    610 Sun Sep 13 14:13:49 2015    88bf398f9d92c17ddae54bf3c652e335
ftp/data_collections/1000_genomes_project/data/CEU/NA12891/exome_alignment/NA12891.alt_bwamem_GRCh38DH.20150826.CEU.exome.cram.crai file    487506  Mon Nov  9 07:07:53 2015    c2f3a4d325f3d4870125da798bd16095
ftp/data_collections/1000_genomes_project/working/20160216_pgenome_fastas/NA12891_pgenome_hg19.tgz  file    1769735888  Tue Feb  9 01:19:41 2016    046b919cbd8d98a431e23ea6be831050
ftp/phase1/technical/other_exome_alignments/NA12891/exome_alignment directory   212 Tue Dec 13 12:53:15 2011
ftp/phase1/technical/other_exome_alignments/NA12891/exome_alignment/NA12891.mapped.ILLUMINA.BWA.CEU.exome.20110521.bam.bas  file    543 Tue Jul 12 13:36:09 2011    21a42f7542a0e2eda7905208e31d3105
ftp/phase1/technical/other_exome_alignments/NA12891/exome_alignment/NA12891.mapped.ILLUMINA.BWA.CEU.exome.20110521.bam.bai  file    7847928 Tue Jul 12 09:09:52 2011    12c72a2f258611a1d76e5e00705d1763
(...)
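
The listing above mixes directories and files. As a rough sketch, the same search can be narrowed to files only and turned into download URLs. This assumes that current.tree is tab-separated with the columns path, type, size, date, md5 (which is how the output above appears to be laid out), and that each "ftp/..." path sits under ftp://ftp.1000genomes.ebi.ac.uk/vol1/, as the URL in the curl command suggests. The function name list_sample_files and the variables TREE_URL and BASE_URL are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: filter a 1000 Genomes "current.tree" listing for one sample.
# Assumption: columns are tab-separated as path, type, size, date, md5.

# URL taken from the curl command in the answer
TREE_URL="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree"
# Assumed prefix for the "ftp/..." paths listed in the tree file
BASE_URL="ftp://ftp.1000genomes.ebi.ac.uk/vol1/"

# list_sample_files SAMPLE
# Reads tree lines on stdin and prints "size<TAB>md5<TAB>url" for every
# entry of type "file" whose path contains SAMPLE. Directories are skipped.
list_sample_files() {
    awk -F'\t' -v sample="$1" -v base="$BASE_URL" '
        $2 == "file" && index($1, sample) { print $3 "\t" $5 "\t" base $1 }
    '
}

# Example usage (needs network access):
#   curl -s "$TREE_URL" | list_sample_files NA12891
```

The md5 column is kept in the output so that a downloaded file can be checked afterwards, for example with md5sum.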

That is awesome. Thank you!


If it answered your question, please click the green check mark on the left to close the question. Thanks.
