Question: Getting derived alleles from alternate/reference allele data
3.8 years ago by
Mr Locuace90
Chile
Mr Locuace90 wrote:

Hello, I am new to genomics and a bit confused about an issue that might sound trivial to many of you. It has to do with ancestral and derived alleles from SNP data.

I have 2 data files (df1 and df2) containing SNP information from two human populations generated by PLINK (.frq format). These files contain the minor (A1) and major (A2) alleles for every SNP. As I understand, minor and major alleles correspond to alternate and reference alleles, respectively (is this correct?). Moreover, I understand that USUALLY reference alleles are the same as ancestral alleles.

So, for my SNP list, in order to get their ancestral alleles, I downloaded the file "SNPAncestralAllele.bcp" following the instructions given here: http://www.ncbi.nlm.nih.gov/books/NBK44409/#Build.allele_bcp_gz_and_snpancestralalle

Now comes the part where I am confused. When I match the ref alleles in df1 or df2 against the ancestral alleles in "SNPAncestralAllele.bcp", I find that approx. 2/5 of the ref alleles do not match the ancestral alleles. I am surprised that so many ref alleles do not match the ancestral alleles from "SNPAncestralAllele.bcp". Could this be due to major alleles from df1 and df2 occurring on the (-) DNA strand?

So, is my reasoning correct? And can I assume that the alternate (minor) alleles in the SNPs that do not match represent the derived alleles in my data?
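
To make the comparison described above concrete, here is a minimal Python sketch of how one SNP could be checked against its ancestral allele, including the (-) strand possibility. The function name `classify_snp` and the status labels are hypothetical, and the sketch assumes single-nucleotide alleles coded as A/C/G/T. It treats the minor allele as derived only when the major allele matches the ancestral one (directly or after complementing), and it flags A/T and C/G SNPs separately, since a strand flip cannot be detected from the alleles alone in those cases.

```python
# Complement of each base, used to test the opposite-strand hypothesis.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify_snp(a1, a2, ancestral):
    """Return (status, derived_allele) for one SNP.

    a1 / a2   : minor / major allele, as in the PLINK .frq columns A1 / A2
    ancestral : allele taken from the ancestral-allele table
    The function name and status labels are illustrative only.
    """
    a1, a2, ancestral = a1.upper(), a2.upper(), ancestral.upper()

    # Anything other than a plain base (missing, indel, 'N') is left unresolved.
    if not all(x in COMPLEMENT for x in (a1, a2, ancestral)):
        return ("unknown", None)

    # A/T and C/G SNPs read the same on both strands, so strand is ambiguous.
    if COMPLEMENT[a1] == a2:
        return ("strand_ambiguous", None)

    # Same strand: the allele that is not ancestral is the derived one.
    if ancestral == a2:
        return ("major_is_ancestral", a1)
    if ancestral == a1:
        return ("minor_is_ancestral", a2)

    # Opposite strand: complement the ancestral allele and try again.
    flipped = COMPLEMENT[ancestral]
    if flipped == a2:
        return ("major_is_ancestral_flipped", a1)
    if flipped == a1:
        return ("minor_is_ancestral_flipped", a2)

    # Ancestral allele is neither observed allele on either strand.
    return ("mismatch", None)


if __name__ == "__main__":
    for snp in [("A", "G", "G"), ("A", "G", "A"), ("A", "G", "C"), ("A", "T", "A")]:
        print(snp, classify_snp(*snp))
```

Counting the statuses over the whole SNP list would show how many of the mismatches are explained by strand and how many are cases where the minor allele is itself the ancestral one.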

Thanks in advance for your help!

 

 
