HLA Typing understanding
6.2 years ago

Hello All,

id          subgroup     HLA_A1                  
HG00096  GBR    01:01:01:01 / 01:01:01:02N  / 01:04N / 01:22N / 01:32 / 01:34N / 01:37

I downloaded the HLA typing data from the 1000 Genomes HLA project, but I do not understand how a single sample ID can have so many HLA_A1 antigen/allele combinations. Specifically, I do not understand what the "/" separators mean. The allele seems to be the same, but the antigens differ.

Kindly let me know if anyone has worked with this data.

SNP • 1.8k views
5.7 years ago
Michele Busby ★ 2.2k

I am not sure at what level you are confused, so forgive me if this is too basic.

This is an explainer on how the HLAs are named:

http://hla.alleles.org/nomenclature/naming.html

Functionally, you only care about the first few digits.

HLA-A: this is the gene. There are a bunch of HLA genes. HLA-A, HLA-B, and HLA-C are the class I ones.

HLA-A*02: this is the allele group, the major class of HLA alleles, where everything that is an HLA-A*02 is similar to each other, i.e. will bind similar peptides.

HLA-A*02:01 represents a protein. This is what binds to the peptides and is usually what you care about.

HLA-A*02:01:03 represents the same protein. The third field specifies different synonymous substitutions within the coding DNA sequence. These shouldn't affect peptide presentation but are interesting from the point of view of population genetics.

HLA-A*02:01:03:04 represents a variant in a noncoding region.

HLA-A*02:01:03:04N: the trailing letter flags a change in expression. N marks a null allele, i.e. one that is not expressed.
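If it helps to see the naming scheme as data, here is a minimal sketch that splits an allele name into the fields described above. The `HlaName` class and `parse_hla_name` function are made up for illustration and are not part of any HLA library; the regular expression only covers the simple name shapes used in this thread.

```python
import re
from typing import NamedTuple, Optional


class HlaName(NamedTuple):
    """Illustrative container; field meanings follow the explanation above."""
    gene: str                    # e.g. "A"
    allele_group: str            # field 1, e.g. "02"
    protein: Optional[str]       # field 2, specific protein
    synonymous: Optional[str]    # field 3, synonymous coding changes
    noncoding: Optional[str]     # field 4, noncoding changes
    suffix: Optional[str]        # expression suffix, e.g. "N"


# Optional "HLA-" prefix, gene, "*", one to four colon-separated
# numeric fields, then an optional single-letter suffix.
_PATTERN = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[A-Z])?$"
)


def parse_hla_name(name: str) -> HlaName:
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised HLA allele name: {name!r}")
    fields = match["fields"].split(":")
    fields += [None] * (4 - len(fields))  # pad missing fields with None
    return HlaName(match["gene"], *fields, match["suffix"])


print(parse_hla_name("HLA-A*02:01:03:04N"))
print(parse_hla_name("A*01:32"))
```

For a two-field name such as `A*01:32`, the third and fourth fields and the suffix come back as `None`.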

It is pretty easy to call Class I HLAs up to four digits (two fields, e.g. 02:01). OptiType does great. It is harder to call beyond that. So I am guessing they either grouped these alleles together, or didn't have the coverage to unambiguously type the HLA beyond the protein.
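To see what the "/" list from the question looks like at lower resolution, here is a rough sketch that splits the cell on "/" and truncates each entry to a chosen number of fields. The function name `collapse_ambiguity` is hypothetical, and dropping the expression suffix whenever fields are cut off is my own simplification, not something the 1000 Genomes files prescribe.

```python
from typing import List

# The HLA_A1 cell for HG00096, copied from the question.
CELL = "01:01:01:01 / 01:01:01:02N  / 01:04N / 01:22N / 01:32 / 01:34N / 01:37"


def collapse_ambiguity(cell: str, n_fields: int = 2) -> List[str]:
    """Split a '/'-separated ambiguity list and truncate to n_fields."""
    collapsed: List[str] = []
    for allele in cell.split("/"):
        allele = allele.strip()
        if not allele:
            continue
        # A trailing letter (such as N) is an expression suffix.
        suffix = allele[-1] if allele[-1].isalpha() else ""
        fields = (allele[:-1] if suffix else allele).split(":")
        name = ":".join(fields[:n_fields])
        # Keep the suffix only if no fields were cut off, because the
        # suffix describes the full allele, not the truncated name.
        if len(fields) <= n_fields:
            name += suffix
        if name not in collapsed:
            collapsed.append(name)
    return collapsed


print(collapse_ambiguity(CELL, n_fields=2))
print(collapse_ambiguity(CELL, n_fields=1))
```

At two fields the seven entries reduce to six distinct names, and at one field they all reduce to `01`, so every candidate in the list belongs to the same allele group.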

5.7 years ago
Garan ▴ 690

I think the readme on the download site might have the answer:

The HLA typing data of 1,267 individuals related to the 1000 Genomes Project (Table 1) covers 14 populations encompassing 4 major ancestral groups. After specific PCR amplification, exons were sequenced by Sanger technique. The sequences were compared to available sequence information in the HLA allele database on exons 2 and 3 for class I and on exon 2 class II genes, therefore any polymorphism occurring in exon 4 of class I allele or exon 3 of class II gene was not investigated. Typing ambiguities between alleles were allowed since HLA-A, HLA-B, HLA-C gene products have identical sequences in exon 2 and exon 3 antigen recognition sites. Similarly, for class II genes, typing ambiguities occur if HLA-DRB1, HLA-DQB1 gene products have identical sequences in exon 2 antigen recognition sites. (See Appendix S1 in supplemental material for further details). The Allele Database version used in the report is IMGT 2.26.0 (Jul 2009), effective Feb 2010.

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20140725_hla_genotypes/README_20140702_hla_diversity

The original paper has the same paragraph:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0097282

I guess that since the sequences for exons 2 and 3 of HLA-A (in this case as it's Class I) are the same for the 7 HLA types mentioned, you would need to sequence exon 4 to obtain the exact HLA allele.

HGVS nomenclature from IPD-IMGT/HLA:

A*01:01:01:01 NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 1077C>T]

A*01:01:01:02N NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 1077C>T]

A*01:04N (A*01:04:01:01N) NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 627_628dupC; 1077C>T]

A*01:22N NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 750G>C; 751delG]

A*01:32 NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 622C>A]

A*01:34N (A*01:01:38L) NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 705G>A; 1077C>T]

A*01:37 NM_002116.7:c.[203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; 521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; 570G>C; 571T>G; 755C>T; 1077C>T]
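To make the comparison easier to read, here is a small sketch that works out which variants all seven lists share and which ones tell the alleles apart. To keep it short, the 18 variants that appear in every list above are written once as `COMMON` and the remainder per allele as `EXTRA`; both are copied from the lists above, and the helper names are mine. The positions are cDNA coordinates, and mapping them to exons would need the exon boundaries of the reference transcript, which this sketch does not check.

```python
import re
from typing import Dict, List, Set

# Variants present in every list above (written once for brevity).
COMMON = (
    "203G>A; 271G>A; 282G>C; 299T>C; 301G>A; 341C>A; 385T>C; 489G>A; "
    "521C>T; 527A>C; 538T>C; 539T>G; 545C>T; 555T>G; 559A>C; 560C>G; "
    "570G>C; 571T>G"
)

# Remaining variants per allele, copied from the lists above.
EXTRA = {
    "A*01:01:01:01": "1077C>T",
    "A*01:01:01:02N": "1077C>T",
    "A*01:04N": "627_628dupC; 1077C>T",
    "A*01:22N": "750G>C; 751delG",
    "A*01:32": "622C>A",
    "A*01:34N": "705G>A; 1077C>T",
    "A*01:37": "755C>T; 1077C>T",
}


def parse_variants(text: str) -> Set[str]:
    """Split a ';'-separated variant list, ignoring empty entries."""
    return {v.strip() for v in text.split(";") if v.strip()}


def position(variant: str) -> int:
    """First cDNA coordinate of a variant such as '627_628dupC'."""
    return int(re.match(r"\d+", variant).group())


alleles: Dict[str, Set[str]] = {
    name: parse_variants(COMMON + "; " + extra) for name, extra in EXTRA.items()
}
shared = set.intersection(*alleles.values())
distinguishing: Dict[str, List[str]] = {
    name: sorted(variants - shared, key=position)
    for name, variants in alleles.items()
}

print(len(shared), "shared variants, highest position",
      max(position(v) for v in shared))
for name, variants in distinguishing.items():
    print(name, variants)
```

Every shared variant sits at position 571 or lower, while the variants that separate the alleles start at position 622. That is at least consistent with the guess above that the region typed (exons 2 and 3) cannot distinguish them. Note also that A*01:01:01:01 and A*01:01:01:02N have identical coding variant lists, as expected for names that differ only in the fourth (noncoding) field.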
