Question: 1000 Genomes and ESP Populations in Exome Aggregation Consortium Data
Vivek (2.3k), Denmark, wrote 4.6 years ago:

Is there any documented information on how many of the ESP/1000 Genomes samples were included in the ExAC data release? I was under the impression that all samples were included, but when I annotate a few SNPs I see some discordance in the allele frequencies.

For example, this exonic SNP has an allele frequency of 0.119 in the 1000 Genomes Phase 3 dataset but cannot be found in the ExAC data:

5    131705587    rs13180043    C    T    100    PASS    AF=0.11901

Another example, present in both ESP and 1000 Genomes but not in ExAC:

1    156108976    rs7339    G    C    .    PASS    DBSNP=dbSNP_52;EA_AC=246,2936;AA_AC=561,823;TAC=807,3759;

1    156108976    rs7339    G    C    100    PASS    AF=0.185304;
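To put the two records for rs7339 on the same scale, the ESP allele counts can be turned into frequencies. The snippet below is a minimal sketch, assuming the ESP count fields (`EA_AC`, `AA_AC`, `TAC`) are ordered as alternate count, then reference count, and that the site is biallelic. The helper names `parse_info` and `alt_frequency` are made up for illustration.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    out = {}
    for item in info.strip().strip(";").split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def alt_frequency(count_field):
    """Alternate allele frequency from an 'alt,ref' count pair.

    Assumption: counts are ordered alt first, ref second, and there is
    exactly one alternate allele.
    """
    alt, ref = (int(x) for x in count_field.split(","))
    return alt / (alt + ref)


# INFO strings copied from the two rs7339 records in this post.
esp = parse_info("DBSNP=dbSNP_52;EA_AC=246,2936;AA_AC=561,823;TAC=807,3759;")
kg = parse_info("AF=0.185304;")

for key in ("EA_AC", "AA_AC", "TAC"):
    print(key, round(alt_frequency(esp[key]), 4))
print("1000G AF", float(kg["AF"]))
```

Under that assumption, the overall ESP frequency (from `TAC`) comes out close to the 1000 Genomes value, so a variant this common would be expected in ExAC if the position were covered.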

I'd like to know what kind of overlap exists between these three population sets and, if possible, which capture regions were used for the ExAC data.

modified 12 months ago by Biostar • written 4.6 years ago by Vivek (2.3k)

Is there any source for your assumption that all 1000g/ESP samples were included in ExAC? The About page speaks of an analysis from scratch, which would imply that the results are independent of 1000g or ESP.

written 4.6 years ago by RamRS (22k)

Check under contributing projects: http://exac.broadinstitute.org/about

Even if they did variant calling from scratch, if a sample set is included you would expect to see a SNP that has a high enough allele frequency in that population.

modified 4.6 years ago • written 4.6 years ago by Vivek (2.3k)

That makes sense. I guess you could always email them for specific details or check if they have a preprint that you could read.

written 4.6 years ago by RamRS (22k)
rbagnall (1.4k), Australia, wrote 4.6 years ago:

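Neither of the variants is in the called regions. Each position falls in the gap between two adjacent intervals (5:131705587 lies just before the SLC22A5 target, and 1:156108976 lies between the two LMNA targets):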

5	131701132	131701341	-	hsa-mir-3936
5	131705614	131706109	+	SLC22A5|ice_target_65179

1	156108820	156108949	+	LMNA|ice_target_14401
1	156109511	156109680	+	LMNA|ice_target_14402

See here for the full list of called regions (an interval_list file rather than a BED file, going by the extension):

ftp://ftp.broadinstitute.org/pub/ExAC_release/release0.1/exome_calling_regions.v1.interval_list
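The lookup in this answer can be reproduced for any position with a few lines of Python. This is a minimal sketch, assuming the file rows have the same layout as the four shown here (chromosome, start, end, strand, name), that coordinates are 1-based and inclusive, and that intervals on a chromosome do not overlap. The function names `load_intervals` and `find_interval` are hypothetical.

```python
import bisect
from collections import defaultdict


def load_intervals(lines):
    """Parse rows of: chrom, start, end, strand, name (whitespace-separated)."""
    by_chrom = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("@"):  # skip blanks and header lines
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[4] if len(fields) > 4 else ""
        by_chrom[chrom].append((start, end, name))
    for chrom in by_chrom:
        by_chrom[chrom].sort()
    return by_chrom


def find_interval(by_chrom, chrom, pos):
    """Return the (start, end, name) interval containing pos, or None.

    Assumes 1-based inclusive coordinates and non-overlapping intervals.
    """
    intervals = by_chrom.get(chrom, [])
    starts = [start for start, _, _ in intervals]
    i = bisect.bisect_right(starts, pos) - 1  # last interval starting at or before pos
    if i >= 0 and intervals[i][0] <= pos <= intervals[i][1]:
        return intervals[i]
    return None


# The four rows quoted in this answer; a real run would read the full file.
example_rows = """\
5 131701132 131701341 - hsa-mir-3936
5 131705614 131706109 + SLC22A5|ice_target_65179
1 156108820 156108949 + LMNA|ice_target_14401
1 156109511 156109680 + LMNA|ice_target_14402
""".splitlines()

regions = load_intervals(example_rows)
for chrom, pos in [("5", 131705587), ("1", 156108976)]:
    print(chrom, pos, find_interval(regions, chrom, pos))
```

Both query positions come back as `None` against these rows, which matches the observation that neither SNP sits inside a called interval. To check against the whole file, pass an open file handle to `load_intervals` instead of `example_rows`.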

written 4.6 years ago by rbagnall (1.4k)

Thank you! This is what I was looking for. They don't have this for the latest release (0.2), but I guess it's safe to assume it would be the same.

modified 4.6 years ago • written 4.6 years ago by Vivek (2.3k)