Control For Population Stratification In Exome Sequencing Data
12.6 years ago
Motor Genetic ▴ 110

I have next-generation exome sequencing data and genotype calls for some case-control samples, and I would like to confirm that an identified rare variant is not an artifact of population admixture. Is there a good source of sequencing data that can be used to control for this? I don't have GWAS data for these samples, only exome sequencing, so I cannot do the usual PC analysis. I think that if we perform imputation with phased haplotypes using IMPUTE2, the identified mutation will disappear if it doesn't match the background haplotype. Any suggestions? Thanks.

next-gen sequencing population • 3.4k views
12.6 years ago
Genotepes ▴ 950

I'm not sure I understand how the mutations would disappear. The ones of interest are usually rare, so they can arise on any haplotype.

Besides that, I think the exome still contains a useful number of "frequent" mutations (obviously not scores of variants with MAF > 0.3, as in GWAS genotypes). This is a good basis for PCA-like analyses (MDS from PLINK is not bad).
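To make the PCA idea concrete, here is a minimal sketch in Python with NumPy. It assumes you have already turned your exome calls into a samples x variants matrix of alternate-allele counts (0/1/2) with no missing values; the function name `pca_from_genotypes`, the 5% MAF cutoff, and the simulated data are all hypothetical choices for illustration, not something PLINK or any other tool prescribes. In a real analysis you would probably also LD-prune the variants first.

```python
import numpy as np

def pca_from_genotypes(genotypes, min_maf=0.05, n_components=2):
    """PCA on common variants of a samples x variants 0/1/2 matrix (no missing calls)."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0               # alternate-allele frequency per variant
    maf = np.minimum(freq, 1.0 - freq)
    keep = maf >= min_maf                     # keep only the "frequent" variants
    g, p = g[:, keep], freq[keep]
    # Centre and scale each variant by its expected binomial standard deviation
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components], keep

# Simulated example (assumption): two populations with drifted allele frequencies
rng = np.random.default_rng(0)
n_per_pop, n_variants = 30, 1000
freq_a = rng.uniform(0.1, 0.9, n_variants)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_variants), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_pop, n_variants)),
    rng.binomial(2, freq_b, size=(n_per_pop, n_variants)),
])

pcs, kept = pca_from_genotypes(genotypes)
pc1_a, pc1_b = pcs[:n_per_pop, 0], pcs[n_per_pop:, 0]
print("variants kept:", int(kept.sum()))
print("PC1 mean, population A vs B:", pc1_a.mean(), pc1_b.mean())
```

If cases and controls sit in different places along the leading components, that is a sign that stratification could be driving an association, and the components can be used as covariates or for matching.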

In Ng et al., they show that more than 70% of one person's nonsynonymous SNPs are common. I guess this may be even more the case for synonymous variants. So although you will not have the same resource as in GWAS, you can still match your samples against 1000 Genomes, for instance.
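Matching against a reference panel could look roughly like the sketch below: fit principal axes on reference samples with known population labels, project the study samples onto those axes, and see which reference cluster each one falls nearest to. This assumes study and reference genotypes are coded 0/1/2 over the same variants in the same order; the function names, the labels `POP_A`/`POP_B`, and the simulated data are hypothetical stand-ins, not real 1000 Genomes data.

```python
import numpy as np

def fit_reference_pcs(ref_genotypes, n_components=2):
    """Learn per-variant scaling and PC loadings from a reference panel (0/1/2 matrix)."""
    g = np.asarray(ref_genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0
    keep = (p > 0.0) & (p < 1.0)              # drop variants monomorphic in the panel
    p = p[keep]
    scale = np.sqrt(2.0 * p * (1.0 - p))
    z = (g[:, keep] - 2.0 * p) / scale
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return {"keep": keep, "p": p, "scale": scale, "loadings": vt[:n_components].T}

def project(genotypes, model):
    """Project samples onto the reference axes using the reference frequencies."""
    g = np.asarray(genotypes, dtype=float)[:, model["keep"]]
    return ((g - 2.0 * model["p"]) / model["scale"]) @ model["loadings"]

def nearest_population(scores, ref_scores, ref_labels):
    """Assign each sample to the reference population with the closest PC centroid."""
    labels = np.asarray(ref_labels)
    names = np.unique(labels)
    centroids = np.array([ref_scores[labels == n].mean(axis=0) for n in names])
    dist = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return names[dist.argmin(axis=1)]

# Simulated example (assumption): two reference populations, ten study samples
rng = np.random.default_rng(1)
n_ref, n_study, n_variants = 40, 5, 1000
freq_a = rng.uniform(0.1, 0.9, n_variants)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_variants), 0.05, 0.95)

def draw(freq, n):
    return rng.binomial(2, freq, size=(n, n_variants))

ref = np.vstack([draw(freq_a, n_ref), draw(freq_b, n_ref)])
ref_labels = ["POP_A"] * n_ref + ["POP_B"] * n_ref
study = np.vstack([draw(freq_a, n_study), draw(freq_b, n_study)])

model = fit_reference_pcs(ref)
assigned = nearest_population(project(study, model), project(ref, model), ref_labels)
print(assigned)
```

Projected samples tend to be pulled toward the origin relative to the reference samples, so a nearest-centroid call is only a rough guide; plotting study and reference samples together is usually more informative.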

12.6 years ago

With VAAST we distribute a background file containing many ethnic groups from the 1000 Genomes Project. This seems to offset population stratification. I would be happy to help you try VAAST. All you need are the VCF variant files. We have had luck identifying several disease-causing genes and genes underlying morphological traits.
