Question: Getting Population Specific Allele Frequency For Illumina Genotyping Data
5.4 years ago by
yling10
United States
yling10 wrote:

I want to get allele frequencies for the SNPs on the Illumina Omni2.5 array in the European population only. My file contains chr, pos, allele1, and allele2. I thought there could be two ways of doing it:

1. Annotate the reference and alternate allele for each SNP (how??) and use ANNOVAR to get the 1000 Genomes population-specific allele frequency.

2. Download allele frequency information from HapMap/1000 Genomes directly.

Does anyone have any good suggestions on how to go about it, or can anyone recommend any software?

Thanks in advance! 

 

snp sequence • 2.2k views
modified 5.3 years ago by Charles Warden7.2k • written 5.4 years ago by yling10

I wouldn't use HapMap frequencies, since they didn't use the 2.5 chip in their genotyping (according to the genotyping platform filter in HapMart)

written 5.4 years ago by Katie D'Aco1000

@Katie D'Aco Then what would you use?

written 4.5 years ago by Danielson40

I think 1000 genomes frequencies would be fine to use here.

written 4.5 years ago by Katie D'Aco1000
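
As a rough illustration of the 1000 Genomes route, the sketch below pulls the European frequency out of one VCF data line. It assumes a 1000 Genomes Phase 3 style sites VCF whose INFO column carries an `EUR_AF` tag (the frequency of each ALT allele in the European super-population); check the header of the file you download to confirm the tag is present. The function name `eur_af` and the example line are made up for illustration.

```python
def eur_af(vcf_line):
    """Parse one tab-separated VCF data line.

    Returns (chrom, pos, ref, alts, eur_afs). eur_afs holds one frequency
    per ALT allele, or None when the INFO column has no EUR_AF tag.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, _snp_id, ref, alt = fields[:5]
    info = fields[7]

    freqs = None
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "EUR_AF":
            # Multi-allelic sites list one comma-separated value per ALT allele.
            freqs = [float(v) for v in value.split(",")]
    return chrom, int(pos), ref, alt.split(","), freqs


# Hypothetical data line, not a real variant.
line = "1\t1000\trs0\tA\tG\t100\tPASS\tAC=10;AF=0.25;EUR_AF=0.3\n"
result = eur_af(line)
print(result)
```

Because `EUR_AF` describes the ALT allele, the frequency of the reference allele at a biallelic site is one minus that value. Matching your chr/pos rows against such records only makes sense if both use the same genome build.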

I thought the HapMap frequencies were deduced from the 1000 Genomes project. I hoped newer projects, such as Kaviar, would supersede the original 1000 Genomes.

written 4.5 years ago by Danielson40
5.3 years ago by
Charles Warden7.2k
Duarte, CA
Charles Warden7.2k wrote:

SeattleSeq Annotation can use a "custom" file format that will accept the FinalReport allele calls with a bit of reformatting (remove the rows without allele calls):

http://snp.gs.washington.edu/SeattleSeqAnnotation138/

It will report the reference sequence and, specifically, the European HapMap frequency.
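
The reformatting step (dropping rows without allele calls) could be sketched roughly as follows. This assumes a tab-separated table with the columns named in the question (chr, pos, allele1, allele2) and assumes that no-calls are written as "-" or left empty; check how your own FinalReport export marks them. The function name `drop_no_calls` and the example rows are made up for illustration.

```python
import csv
import io

# Assumed no-call markers; adjust to match your own export.
NO_CALL = {"", "-", "--"}


def drop_no_calls(rows, allele_cols=("allele1", "allele2")):
    """Yield only the rows in which every allele column holds a call."""
    for row in rows:
        if all(row[col].strip() not in NO_CALL for col in allele_cols):
            yield row


# Made-up example table standing in for a reformatted FinalReport.
example = io.StringIO(
    "chr\tpos\tallele1\tallele2\n"
    "1\t1000\tA\tG\n"
    "1\t2000\t-\t-\n"
    "2\t3000\tC\tC\n"
)
kept = list(drop_no_calls(csv.DictReader(example, delimiter="\t")))
print([(row["chr"], row["pos"]) for row in kept])
```

For a real file, replace the `io.StringIO` object with an opened file handle.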

That said, be careful how you export your data. I think the default in GenomeStudio is to use the "TOP" format. However, this often won't match the genomic reference. Once you know the build for which the genome coordinates are defined (e.g. hg18 or hg19), you should use the "Plus" format to get the alleles with respect to that reference. If you can only get hg18 coordinates, you'll need to use liftOver (in Galaxy, etc.) to convert them to hg19.
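
If you want to sanity-check exported calls against the reference strand, a minimal sketch could look like the one below. This is a generic complement-and-compare heuristic, not what GenomeStudio does internally, and it does not replace exporting in "Plus" format. It assumes you already have the reference and alternate allele for each position on the same build; the function name `to_plus_strand` is made up for illustration. A/T and C/G SNPs are skipped because the two strands cannot be told apart from the alleles alone.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_plus_strand(allele1, allele2, ref, alt):
    """Express a genotype call in reference-strand alleles.

    Returns a tuple (allele1, allele2), or None when the call cannot be
    resolved: non-ACGT alleles, strand-ambiguous sites, or a mismatch on
    both strands.
    """
    calls = (allele1, allele2)
    if not all(a in COMPLEMENT for a in calls):
        return None  # e.g. no-calls or indel codes

    site = {ref, alt}
    if site in ({"A", "T"}, {"C", "G"}):
        return None  # complementing maps the site onto itself

    if set(calls) <= site:
        return calls  # already consistent with the reference strand

    flipped = tuple(COMPLEMENT[a] for a in calls)
    if set(flipped) <= site:
        return flipped  # consistent once complemented
    return None


print(to_plus_strand("T", "C", ref="A", alt="G"))
```

Calls that come back as `None` are worth inspecting by hand, since they may point to a build mismatch rather than a strand issue.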

modified 5.3 years ago • written 5.3 years ago by Charles Warden7.2k