PhD student, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
Aim: To develop a strategy to analyze HLA-B in exome data by combining unbiased read alignment and SNP detection with allele imputation using a multi-ethnic reference panel.
Methods: We first applied the hla-mapper v5 pipeline to BAM files from 233 Brazilian exomes (test samples) to correct mapping errors caused by the high sequence similarity among HLA genes. This step included targeted remapping of the HLA region, variant normalization, and decomposition of multiallelic variants. As a reference, we used a multi-ethnic panel of 5,196 high-coverage genomes (30X) from the 1000 Genomes, HGDP, and SABE projects. From this panel, we built an imputation model with the HIBAG package, using an ensemble of 200 classifiers (attribute bagging) that combines information from multiple SNPs in different subsets to predict HLA alleles. The most likely allele combination for each sample was determined from the posterior probability distribution produced by the model.
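The final step, choosing an allele combination from an ensemble's posterior probabilities, can be sketched in Python as follows. This is an illustration only, not the HIBAG implementation: it assumes each classifier yields a posterior distribution over allele pairs and that the ensemble combines them by simple averaging. The function name `call_genotype` and the example alleles and probabilities are hypothetical.

```python
from collections import defaultdict


def call_genotype(classifier_posteriors):
    """Average per-classifier posteriors and return the best allele pair.

    classifier_posteriors: list of dicts mapping an allele pair (tuple of
    two allele names) to that classifier's posterior probability.
    Assumption: the ensemble is combined by an unweighted mean.
    """
    if not classifier_posteriors:
        raise ValueError("at least one classifier posterior is required")

    totals = defaultdict(float)
    for posterior in classifier_posteriors:
        for pair, prob in posterior.items():
            # Allele order within a pair carries no meaning, so sort it.
            totals[tuple(sorted(pair))] += prob

    n_classifiers = len(classifier_posteriors)
    averaged = {pair: total / n_classifiers for pair, total in totals.items()}

    # The call is the pair with the highest averaged posterior.
    best_pair = max(averaged, key=averaged.get)
    return best_pair, averaged[best_pair]


# Hypothetical output of two classifiers for one sample.
example = [
    {("B*07:02", "B*15:01"): 0.75, ("B*07:02", "B*15:03"): 0.25},
    {("B*15:01", "B*07:02"): 0.50, ("B*07:02", "B*15:03"): 0.50},
]
best_pair, best_prob = call_genotype(example)
```

The averaged posterior of the selected pair is the kind of per-sample value that can then be compared against a quality threshold.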
Results: The method showed high reliability: 97.4% of samples reached posterior probabilities greater than 0.5, and 79% exceeded 0.8. These values are generated by HIBAG and are considered indicators of imputation quality. Systematic analysis of the test cohort also identified genomic regions more susceptible to unbalanced heterozygosity (where reads from one allele are captured substantially more often than reads from the other), particularly exons with a high concentration of polymorphisms. These regions tend to introduce bias in exome-based HLA typing.
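One way to screen for unbalanced heterozygosity is to compute, at each heterozygous site, the fraction of reads supporting the less-covered allele. The Python sketch below shows the idea under stated assumptions: the cutoffs `min_fraction=0.3` and `min_depth=10`, the function names, and the example positions and read counts are all hypothetical and are not taken from the study.

```python
def minor_allele_fraction(ref_reads, alt_reads):
    """Return the share of reads carrying the less-covered allele."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return min(ref_reads, alt_reads) / total


def flag_unbalanced(sites, min_fraction=0.3, min_depth=10):
    """Flag heterozygous sites whose minor allele is under-captured.

    sites: iterable of (position, ref_reads, alt_reads) tuples.
    Sites below min_depth are skipped because the fraction is unreliable.
    """
    flagged = []
    for position, ref_reads, alt_reads in sites:
        if ref_reads + alt_reads < min_depth:
            continue
        fraction = minor_allele_fraction(ref_reads, alt_reads)
        if fraction < min_fraction:
            flagged.append((position, fraction))
    return flagged


# Hypothetical heterozygous sites: (position, ref reads, alt reads).
het_sites = [(101, 25, 24), (205, 45, 5), (310, 3, 2)]
flagged_sites = flag_unbalanced(het_sites)
```

In this example only the second site is flagged: the first is balanced, and the third has too few reads to judge.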
Conclusion: Our strategy offers an alternative for HLA-B typing from exome data, especially relevant for population and clinical studies in diverse cohorts. It expands the possibilities for immunogenetic analysis across populations, including those that are challenging for existing imputation methods because they are underrepresented in reference panels. By reducing potential bias in allele definition, particularly in admixed groups such as Brazilians, our approach contributes to a more comprehensive and equitable understanding of HLA diversity worldwide.