Skip to content

GPU-based implementation for the HLA genotype imputation method (HIBAG)

License

Notifications You must be signed in to change notification settings

Mignot-Lab/HIBAG.gpu

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HIBAG.gpu – GPU-based implementation for the HLA genotype imputation method

GPLv3 GNU General Public License, GPLv3

R package – a GPU-based extension for the HIBAG package

Requirements

News

Changes in v0.99.0 (Y2021):

  • new implementation using half and mixed precisions

  • a new function hlaAttrBagging_MultiGPU() to leverage multiple GPU devices

v0.9.0 (Y2018):

  • initial GPU-based implementation for the HIBAG algorithm

Benchmarks for model training

1) Speedup factors using small training sets (~1,000 samples)

CPU / GPU Precision Factor
CPU (AVX2, 1 thread) double 1
CPU (AVX2, 20 threads) double 14.7
1x NVIDIA Tesla T4 half 71.8
1x NVIDIA Tesla V100 half 74.7
1x NVIDIA Tesla T4 mixed 61.2
1x NVIDIA Tesla V100 mixed 68.3
1x NVIDIA Tesla T4 single 52.4
1x NVIDIA Tesla V100 single 61.3
1x NVIDIA Tesla V100 double 22.7

2) Speedup factors using medium training sets (~5,000 samples)

CPU / GPU Precision Factor
CPU (AVX2, 1 thread) double 1
CPU (AVX2, 20 threads) double 17.5
1x NVIDIA Tesla T4 half 101.6
1x NVIDIA Tesla V100 half 125.2
1x NVIDIA Tesla T4 mixed 72.0
1x NVIDIA Tesla V100 mixed 98.9
1x NVIDIA Tesla T4 single 60.8
1x NVIDIA Tesla V100 single 92.4
1x NVIDIA Tesla V100 double 18.4

† ‘mixed’ is a mixed precision between half and single

† models built on HLA-A, -B, -C, -DRB1 using HIBAG v1.26.1 and HIBAG.gpu v0.99.0, and the average is reported

† CPU (AVX2, 1/20 threads), optimization with Intel AVX2 instruction, using Intel(R) Xeon(R) Gold 6248 2.50GHz, 20 cores (Cascade Lake)

† This work was made possible, in part, through HPC time donated by Microway, Inc. We gratefully acknowledge Microway for providing access to their GPU-accelerated compute cluster (http://www.microway.com/gpu-test-drive/).

Citation

  1. Zheng, X. et al. HIBAG-HLA genotype imputation with attribute bagging. Pharmacogenomics Journal 14, 192-200 (2014). http://dx.doi.org/10.1038/tpj.2013.18

  2. Zheng, X. (2018) Imputation-Based HLA Typing with SNPs in GWAS Studies. In: Boegel S. (eds) HLA Typing. Methods in Molecular Biology, Vol 1802. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8546-3_11

Installation

  • Development version from Github:
library("devtools")
install_github("zhengxwen/HIBAG.gpu")

The install_github() approach requires that you build from source, i.e. make and compilers must be installed on your system -- see the R FAQ for your operating system; you may also need to install dependencies manually.

wget --no-check-certificate https://github.com/zhengxwen/HIBAG.gpu/tarball/master -O HIBAG.gpu_latest.tar.gz
## or ##
curl -L https://github.com/zhengxwen/HIBAG.gpu/tarball/master/ -o HIBAG.gpu_latest.tar.gz

## Install ##
R CMD INSTALL HIBAG.gpu_latest.tar.gz

Examples

library(HIBAG.gpu)

## Loading required package: HIBAG
## HIBAG (HLA Genotype Imputation with Attribute Bagging)
## Kernel Version: v1.5 (64-bit, AVX2)
## 
## Available OpenCL device(s):
##     Dev #1: AMD, AMD Radeon Pro 560X Compute Engine, OpenCL 1.2
## 
## Using Device #1: AMD, AMD Radeon Pro 560X Compute Engine
##     Device Version: OpenCL 1.2 (>= v1.2: YES)
##     Driver Version: 1.2 (Jan 12 2021 22:17:03)
##     EXTENSION cl_khr_global_int32_base_atomics: YES
##     EXTENSION cl_khr_fp64: YES
##     EXTENSION cl_khr_int64_base_atomics: NO
##     EXTENSION cl_khr_global_int64_base_atomics: NO
##     CL_DEVICE_GLOBAL_MEM_SIZE: 4,294,967,296
##     CL_DEVICE_MAX_MEM_ALLOC_SIZE: 1,073,741,824
##     CL_DEVICE_MAX_COMPUTE_UNITS: 16
##     CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
##     CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
##     CL_DEVICE_MAX_WORK_ITEM_SIZES: 256,256,256
##     CL_DEVICE_LOCAL_MEM_SIZE: 32768
##     CL_DEVICE_ADDRESS_BITS: 32
##     CL_KERNEL_WORK_GROUP_SIZE: 256
##     CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE: 64
##     local work size: 256 (Dim1), 16x16 (Dim2)
##     atom_cmpxchg (enable cl_khr_int64_base_atomics): OK
## GPU device supports double-precision floating-point numbers
## Training uses a mixed precision between half and float in GPU
## By default, prediction uses 32-bit floating-point numbers in GPU (since EXTENSION cl_khr_int64_base_atomics: NO).
# make a "hlaAlleleClass" object
hla.id <- "A"
hla <- hlaAllele(HLA_Type_Table$sample.id,
    H1 = HLA_Type_Table[, paste(hla.id, ".1", sep="")],
    H2 = HLA_Type_Table[, paste(hla.id, ".2", sep="")],
    locus=hla.id, assembly="hg19")

# SNP predictors within the flanking region on each side
region <- 500   # kb
snpid <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id, HapMap_CEU_Geno$snp.position,
    hla.id, region*1000, assembly="hg19")
length(snpid)

# training and validation genotypes
train.geno <- hlaGenoSubset(HapMap_CEU_Geno, snp.sel=match(snpid, HapMap_CEU_Geno$snp.id))
summary(train.geno)

# train a HIBAG model
set.seed(100)
model <- hlaAttrBagging_gpu(hla, train.geno, nclassifier=100)
summary(model)
## Gene: A
## Training dataset: 60 samples X 266 SNPs
##     # of HLA alleles: 14
##     # of individual classifiers: 100
##     total # of SNPs used: 239
##     avg. # of SNPs in an individual classifier: 15.60
##         (sd: 2.73, min: 10, max: 24, median: 16.00)
##     avg. # of haplotypes in an individual classifier: 40.44
##         (sd: 17.46, min: 18, max: 105, median: 36.00)
##     avg. out-of-bag accuracy: 93.00%
##         (sd: 4.94%, min: 78.95%, max: 100.00%, median: 93.48%)
## Matching proportion:
##         Min.     0.1% Qu.       1% Qu.      1st Qu.       Median      3rd Qu.
## 0.0004657777 0.0004773751 0.0005817518 0.0040887403 0.0112166087 0.0282807795
##          Max.         Mean           SD
##  0.4263384035 0.0393261556 0.0919757944
## Genome assembly: hg19
# validation
pred <- hlaPredict_gpu(model, train.geno)
summary(pred)

# compare
(comp <- hlaCompareAllele(hla, pred, call.threshold=0)$overall)
## total.num.ind crt.num.ind crt.num.haplo   acc.ind acc.haplo call.threshold
##            60          59           119 0.9833333 0.9916667              0
##  n.call call.rate
##      60         1

About

GPU-based implementation for the HLA genotype imputation method (HIBAG)

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 51.0%
  • R 43.9%
  • C 5.1%