Training dataset for NGS HLA typing (reads >200bp from PCR amplicons)
9.6 years ago

I'm looking for a training set for human HLA typing with long reads (454, Ion Torrent or MiSeq 300bp) obtained by PCR (amplicon sequencing). I don't mind which MHC loci are covered, or whether the data are genomic or transcriptomic, but I need a dataset that contains:

- NGS reads
- Sequences of the primers used
- Sequences of the barcodes used to tag samples
- Reference genotypes of the samples to validate predictions (by Sanger sequencing or another well-established method)
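
For illustration, here is a minimal sketch of how these four pieces fit together: barcodes assign reads to samples, primers assign them to loci, and the reference genotypes score the predictions. Everything in it is an assumption made for the example: the barcode and primer sequences are invented toy strings, `demultiplex` and `concordance` are hypothetical helper names, and matching is exact and prefix-only (real data would need mismatch tolerance and reverse-complement handling).

```python
from collections import defaultdict

# Hypothetical toy inputs; none of these sequences come from a real dataset.
BARCODES = {"ACGT": "sample1", "TGCA": "sample2"}   # barcode -> sample
PRIMERS = {"GGATCC": "HLA-A", "CCTAGG": "HLA-B"}    # forward primer -> locus


def demultiplex(reads, barcodes, primers):
    """Bin reads by (sample, locus) using exact barcode + primer prefix matches.

    Reads without a recognised barcode followed by a recognised primer
    are left unassigned. The stored sequence has both prefixes trimmed.
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if not read.startswith(barcode):
                continue
            rest = read[len(barcode):]
            for primer, locus in primers.items():
                if rest.startswith(primer):
                    bins[(sample, locus)].append(rest[len(primer):])
                    break
            break  # a barcode matched; do not try the others
    return dict(bins)


def concordance(predicted, reference):
    """Fraction of reference genotypes that the predictions reproduce exactly.

    Both arguments map (sample, locus) to a pair of allele names;
    the two alleles are compared without regard to order.
    """
    if not reference:
        return 0.0
    hits = sum(
        1
        for key, alleles in reference.items()
        if sorted(predicted.get(key, ())) == sorted(alleles)
    )
    return hits / len(reference)


reads = [
    "ACGTGGATCCAAAA",  # sample1, HLA-A
    "TGCACCTAGGTTTT",  # sample2, HLA-B
    "ACGTCCTAGGCCCC",  # sample1, HLA-B
    "NNNNGGATCCAAAA",  # unknown barcode, stays unassigned
]
bins = demultiplex(reads, BARCODES, PRIMERS)
```

Without the barcode and primer sequences the first step cannot be done, and without reference genotypes the second step has nothing to compare against, which is why a dataset missing any of the four is of little use for training.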

It's very hard to find any public data in the literature. There are a lot of papers on the topic, but most of them are from companies (e.g. Roche) and they don't publish the data.

Thanks in advance.

PS: HapMap and 1000 Genomes reads are not suitable: they are not from PCR amplicons and they are too short ;)

NGS HLA Typing Amplicon • 2.8k views
6.4 years ago
Ömer An ▴ 260

The references for HLA typing tools might help, as the authors usually train their software on public datasets:

