I just found this entry: "Retrieving All Available Frequency Data For A Snp Using Ensembl Api Tools", which is very close to what I need. Like Krisr, I would like to retrieve all population frequency data available from 1000 Genomes phase 1 for a SNP, if possible via SQL.
Ensembl's BioMart provides minor allele information for the ALL superpopulation only. Pierre Lindenbaum's solution almost gets me to the desired result, but when I run the SQL statement (on homo_sapiens_variation_69_37), I only get results from 1000Genomes:pilot_1, not from phase 1.
    SELECT DISTINCT V.name, S.handle, A.frequency, M.name, F.allele_string
    FROM ( allele AS A, variation AS V, subsnp_handle AS S, variation_feature AS F )
    LEFT JOIN sample AS M ON ( M.sample_id = A.sample_id )
    WHERE V.variation_id = A.variation_id
      AND S.subsnp_id = A.subsnp_id
      AND F.variation_id = V.variation_id
      AND V.name = 'rs3'
    ORDER BY 2;
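One thing that might be worth checking in this query: the condition `S.subsnp_id = A.subsnp_id` acts as an inner join, so any allele row without a matching subsnp_id is dropped. Whether the phase 1 rows actually lack a subsnp_id is an assumption I have not verified against the Ensembl database. The sketch below only illustrates the join behaviour on a toy in-memory SQLite database; the table layouts are heavily simplified, and all names and values in it (`rs_toy`, `POP_A`, `POP_B`, `HANDLE_X`, the `frequencies` helper) are made up for illustration.

```python
import sqlite3

# Toy stand-in for a few variation tables. The layout is heavily
# simplified and every value below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variation (variation_id INTEGER, name TEXT);
CREATE TABLE sample (sample_id INTEGER, name TEXT);
CREATE TABLE subsnp_handle (subsnp_id INTEGER, handle TEXT);
CREATE TABLE allele (variation_id INTEGER, subsnp_id INTEGER,
                     sample_id INTEGER, frequency REAL);

INSERT INTO variation VALUES (1, 'rs_toy');
INSERT INTO sample VALUES (10, 'POP_A'), (20, 'POP_B');
INSERT INTO subsnp_handle VALUES (100, 'HANDLE_X');
-- The POP_B row has no subsnp_id (the unverified assumption).
INSERT INTO allele VALUES (1, 100, 10, 0.25), (1, NULL, 20, 0.75);
""")


def frequencies(conn, variant_name, keep_missing_handle):
    """Return (population, handle, frequency) rows for one variant.

    keep_missing_handle=False joins subsnp_handle with an inner join;
    keep_missing_handle=True uses a LEFT JOIN so that allele rows
    without a subsnp_id are kept, with a NULL handle.
    """
    handle_join = "LEFT JOIN" if keep_missing_handle else "JOIN"
    query = f"""
        SELECT M.name, S.handle, A.frequency
        FROM allele AS A
        JOIN variation AS V ON V.variation_id = A.variation_id
        {handle_join} subsnp_handle AS S ON S.subsnp_id = A.subsnp_id
        LEFT JOIN sample AS M ON M.sample_id = A.sample_id
        WHERE V.name = ?
        ORDER BY M.name
    """
    return conn.execute(query, (variant_name,)).fetchall()


inner_rows = frequencies(conn, "rs_toy", keep_missing_handle=False)
left_rows = frequencies(conn, "rs_toy", keep_missing_handle=True)
print(inner_rows)
print(left_rows)
```

If the two result sets differ on the real database as they do here, moving `subsnp_handle` into a `LEFT JOIN` in the original statement would be the thing to try.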
Any suggestions on where I could find this data? Alternatively, is there a way to get the SQL statements out of BioPerl? Bert Overduin provided a nice Perl script, but I need SQL for my workflow.