High Accuracy and High Precision HLA Typing Using Illumina Reads: Typing Algorithm and Perspectives on Novel Allele Discovery

Authors: György Horváth, Krisztina Rigó, Tim Hague, Attila Bérces, Szilveszter Juhos

Wide acceptance of high-throughput next-generation sequencers in medical genomics and clinics has brought the promise of fast HLA typing. We present the algorithmic challenges and workflow of HLA typing, with special attention to ambiguous or poorly defined alleles, differences in intronic and UTR sequences, and reads mapping to homologous regions. Our findings show that high-quality paired reads covering most or all of the genomic reference make it possible to accurately determine novel alleles and to find small differences both inside and outside exons. HLA typing from NGS reads is possible even if no genomic reference is available; furthermore, the missing sequence parts can be estimated if a similar genomic reference is available. Results derived from public data show that novel alleles occur rather frequently. The presented typing algorithm produces quantitative measures that make quality control (QC) more reliable. Other issues related to reproducibility and precision are also discussed, e.g., the choice of the reference database, six- or eight-digit typing, allele dropout, and dealing with chimeric reads.

