-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rearchitect and Reinstate GPU #114
base: master
Are you sure you want to change the base?
Conversation
onurulgen
commented
Aug 29, 2024
- Reinstated GPU into the project by fixing or implementing missing functions.
- Rearchitected the project to combine GPU and CPU implementations by using virtual functions.
- Increased the performance up to 11 times by using GPU.
- Implemented unit and regression tests for GPU and CPU by using Catch2 framework.
- Developed GitHub Actions for tests, coverage, static code analysis, and producing CPU and GPU executables for Windows, Linux and macOS.
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
✅Code Analysis Results - no issues found! ✅ |
I had to add Besides that, the build worked, and I was able to run the program. |
CMAKE_CUDA_ARCHITECTURES are set in here. If you need to set that, your cmake configure command must've failed while detecting CUDA, right? |
Yes, that is correct. It failed as soon as it reached line 156 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 670 out of 685 changed files in this pull request and generated 1 comment.
Files not reviewed (15)
- .github/workflows/linux.yml: Language not supported
- .github/workflows/macos.yml: Language not supported
- .github/workflows/windows.yml: Language not supported
- CMakeLists.txt: Language not supported
- Doxyfile.in: Language not supported
- cmake/FindOPENCL.cmake: Language not supported
- cmake/NIFTYREGConfig.cmake.in: Language not supported
- niftyreg_build_version.txt: Language not supported
- reg-apps/CMakeLists.txt: Language not supported
- reg-apps/reg_average.cpp: Language not supported
- reg-apps/reg_benchmark.cpp: Language not supported
- reg-apps/reg_gpuinfo.cpp: Language not supported
- reg-apps/reg_jacobian.cpp: Language not supported
- reg-apps/reg_measure.cpp: Language not supported
- reg-apps/reg_resample.cpp: Language not supported
Comments suppressed due to low confidence (1)
.github/workflows/tests.yml:32
- The use of
${{ matrix.sudo }}
might not be necessary for all platforms. Consider removing it or using a conditional statement.
${{ matrix.sudo }} cmake --build build/ --target install --config ${{ matrix.build_type }}
if (image == nullptr) | ||
throw std::runtime_error("Failed to read image from path " + path); | ||
|
||
size_t brickSize = image->nbyper * image->nx * image->ny * image->nz; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using the larger type (size_t
) to avoid overflow. This can be achieved by casting one of the operands to size_t
before performing the multiplication. This way, the entire multiplication will be done in the size_t
type, preventing overflow.
We will modify the line where the multiplication occurs to cast image->nbyper
to size_t
before the multiplication.
-
Copy modified line R1299 -
Copy modified line R1311
@@ -1298,3 +1298,3 @@ | ||
|
||
size_t brickSize = image->nbyper * image->nx * image->ny * image->nz; | ||
size_t brickSize = static_cast<size_t>(image->nbyper) * image->nx * image->ny * image->nz; | ||
image->data = calloc(1, nifti_get_volsize(image)); | ||
@@ -1310,3 +1310,3 @@ | ||
|
||
size_t brickSize = image->nbyper * image->nx * image->ny * image->nz; | ||
size_t brickSize = static_cast<size_t>(image->nbyper) * image->nx * image->ny * image->nz; | ||
image->data = calloc(1, nifti2_get_volsize(image)); |
nifti_image *niiImage = nullptr; | ||
if (readData) { | ||
|
||
uch *image_data = static_cast<uch*>(malloc(width * height * channels * sizeof(uch))); |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using a larger integer type to avoid overflow. This can be done by casting one of the operands to unsigned long
before performing the multiplication. This way, the multiplication will be done using the larger type, and the result will be correctly represented.
The specific change involves casting one of the variables (width
, height
, or channels
) to unsigned long
before the multiplication. This change should be made on line 83 of the file reg-io/png/reg_png.cpp
.
-
Copy modified line R83
@@ -82,3 +82,3 @@ | ||
|
||
uch *image_data = static_cast<uch*>(malloc(width * height * channels * sizeof(uch))); | ||
uch *image_data = static_cast<uch*>(malloc(static_cast<unsigned long>(width) * height * channels * sizeof(uch))); | ||
if (image_data == nullptr) |
reduction(+:gg, dgg) | ||
#endif | ||
for (i = 0; i < num; i++) { | ||
gg += array2Ptr[i] * array1Ptr[i]; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using the larger type (double
) to avoid overflow. This can be achieved by casting the operands to double
before performing the multiplication. Specifically, we need to update the lines where the multiplication occurs to cast array2Ptr[i]
and array1Ptr[i]
to double
.
-
Copy modified lines R277-R278 -
Copy modified lines R291-R292
@@ -276,4 +276,4 @@ | ||
for (i = 0; i < num; i++) { | ||
gg += array2Ptr[i] * array1Ptr[i]; | ||
dgg += (gradientPtr[i] + array1Ptr[i]) * gradientPtr[i]; | ||
gg += static_cast<double>(array2Ptr[i]) * static_cast<double>(array1Ptr[i]); | ||
dgg += (static_cast<double>(gradientPtr[i]) + static_cast<double>(array1Ptr[i])) * static_cast<double>(gradientPtr[i]); | ||
} | ||
@@ -290,4 +290,4 @@ | ||
for (i = 0; i < numBw; i++) { | ||
ggBw += array2PtrBw[i] * array1PtrBw[i]; | ||
dggBw += (gradientPtrBw[i] + array1PtrBw[i]) * gradientPtrBw[i]; | ||
ggBw += static_cast<double>(array2PtrBw[i]) * static_cast<double>(array1PtrBw[i]); | ||
dggBw += (static_cast<double>(gradientPtrBw[i]) + static_cast<double>(array1PtrBw[i])) * static_cast<double>(gradientPtrBw[i]); | ||
} |
#endif | ||
for (i = 0; i < num; i++) { | ||
gg += array2Ptr[i] * array1Ptr[i]; | ||
dgg += (gradientPtr[i] + array1Ptr[i]) * gradientPtr[i]; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using the larger type double
to avoid overflow. This can be done by casting the operands to double
before performing the multiplication. This change should be made in the file reg-lib/Optimiser.cpp
on line 278.
-
Copy modified lines R277-R278
@@ -276,4 +276,4 @@ | ||
for (i = 0; i < num; i++) { | ||
gg += array2Ptr[i] * array1Ptr[i]; | ||
dgg += (gradientPtr[i] + array1Ptr[i]) * gradientPtr[i]; | ||
gg += static_cast<double>(array2Ptr[i]) * static_cast<double>(array1Ptr[i]); | ||
dgg += (static_cast<double>(gradientPtr[i]) + static_cast<double>(array1Ptr[i])) * static_cast<double>(gradientPtr[i]); | ||
} |
reduction(+:dggBw) | ||
#endif | ||
for (i = 0; i < numBw; i++) { | ||
ggBw += array2PtrBw[i] * array1PtrBw[i]; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using double
precision to avoid overflow. This can be achieved by casting one of the operands to double
before performing the multiplication. Specifically, we will cast array2PtrBw[i]
to double
in the multiplication expression.
-
Copy modified line R291
@@ -290,3 +290,3 @@ | ||
for (i = 0; i < numBw; i++) { | ||
ggBw += array2PtrBw[i] * array1PtrBw[i]; | ||
ggBw += static_cast<double>(array2PtrBw[i]) * array1PtrBw[i]; | ||
dggBw += (gradientPtrBw[i] + array1PtrBw[i]) * gradientPtrBw[i]; |
this->totalBlock[i] = in->totalBlock[i]; | ||
|
||
this->referencePosition = (float *)malloc(this->activeBlockNumber * this->dim * sizeof(float)); | ||
this->warpedPosition = (float *)malloc(this->activeBlockNumber * this->dim * sizeof(float)); |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using a larger integer type to avoid overflow. This can be achieved by casting one of the operands to size_t
before performing the multiplication. This way, the multiplication will be done using the larger type, and the result will be correctly handled by the malloc
function.
Specifically, we will cast this->activeBlockNumber
to size_t
before multiplying it by this->dim
on lines 35 and 36. This change will ensure that the multiplication is performed using the size_t
type, preventing any potential overflow.
-
Copy modified lines R35-R36
@@ -34,4 +34,4 @@ | ||
|
||
this->referencePosition = (float *)malloc(this->activeBlockNumber * this->dim * sizeof(float)); | ||
this->warpedPosition = (float *)malloc(this->activeBlockNumber * this->dim * sizeof(float)); | ||
this->referencePosition = (float *)malloc((size_t)this->activeBlockNumber * this->dim * sizeof(float)); | ||
this->warpedPosition = (float *)malloc((size_t)this->activeBlockNumber * this->dim * sizeof(float)); | ||
for(int i=0; i<this->activeBlockNumber*this->dim ; ++i){ |
//params->activeBlock = (int *)malloc(params->activeBlockNumber * sizeof(int)); | ||
params->referencePosition = (float *)malloc(params->activeBlockNumber * params->dim * sizeof(float)); | ||
params->referencePosition = (float *)malloc(params->activeBlockNumber * params->dim * sizeof(float)); |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to cast the operands of the multiplication to a larger type before performing the multiplication. This will ensure that the multiplication is done using the larger type, preventing overflow. Specifically, we should cast params->activeBlockNumber
and params->dim
to size_t
before multiplying them. This change should be made on line 269 in the file reg-lib/cpu/_reg_blockMatching.cpp
.
-
Copy modified lines R269-R270
@@ -268,4 +268,4 @@ | ||
//params->activeBlock = (int *)malloc(params->activeBlockNumber * sizeof(int)); | ||
params->referencePosition = (float *)malloc(params->activeBlockNumber * params->dim * sizeof(float)); | ||
params->warpedPosition = (float *)malloc(params->activeBlockNumber * params->dim * sizeof(float)); | ||
params->referencePosition = (float *)malloc((size_t)params->activeBlockNumber * (size_t)params->dim * sizeof(float)); | ||
params->warpedPosition = (float *)malloc((size_t)params->activeBlockNumber * (size_t)params->dim * sizeof(float)); | ||
|
{ | ||
voxel[0]= (double) deformationFieldPtrX[index]; | ||
voxel[1]= (double) deformationFieldPtrY[index]; | ||
voxel[2]= (double) deformationFieldPtrZ[index]; | ||
} | ||
reg_mat44_mul(&transformationMatrix, voxel, position); | ||
Mat44Mul(transformationMatrix, voxel, position); |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using a larger integer type to avoid overflow. We can achieve this by casting one of the operands to size_t
before performing the multiplication. This will ensure that the multiplication is done using the larger type, preventing overflow.
- Cast one of the operands to
size_t
before performing the multiplication. - Specifically, cast
z
tosize_t
on line 106. - No additional methods, imports, or definitions are needed to implement this change.
-
Copy modified line R106
@@ -105,3 +105,3 @@ | ||
{ | ||
index=z*deformationFieldImage->nx*deformationFieldImage->ny; | ||
index=(size_t)z*deformationFieldImage->nx*deformationFieldImage->ny; | ||
voxel[2]=(double) z; |
const double *logHistoPtr = jointHistogramLog[currentTimePoint]; | ||
const double *entropyPtr = entropyValues[currentTimePoint]; | ||
const double nmi = (entropyPtr[0] + entropyPtr[1]) / entropyPtr[2]; | ||
const size_t referenceOffset = referenceBinNumber[currentTimePoint] * floatingBinNumber[currentTimePoint]; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using the larger type (size_t
) to prevent overflow. This can be achieved by casting one of the operands to size_t
before performing the multiplication. This way, the multiplication will be done using the larger type, and the result will be correctly stored in the size_t
variable.
The specific change involves casting referenceBinNumber[currentTimePoint]
to size_t
before multiplying it with floatingBinNumber[currentTimePoint]
.
-
Copy modified line R428
@@ -427,3 +427,3 @@ | ||
const double nmi = (entropyPtr[0] + entropyPtr[1]) / entropyPtr[2]; | ||
const size_t referenceOffset = referenceBinNumber[currentTimePoint] * floatingBinNumber[currentTimePoint]; | ||
const size_t referenceOffset = static_cast<size_t>(referenceBinNumber[currentTimePoint]) * floatingBinNumber[currentTimePoint]; | ||
const size_t floatingOffset = referenceOffset + referenceBinNumber[currentTimePoint]; |
const double *logHistoPtr = jointHistogramLog[currentTimePoint]; | ||
const double *entropyPtr = entropyValues[currentTimePoint]; | ||
const double nmi = (entropyPtr[0] + entropyPtr[1]) / entropyPtr[2]; | ||
const size_t referenceOffset = referenceBinNumber[currentTimePoint] * floatingBinNumber[currentTimePoint]; |
Check failure
Code scanning / CodeQL
Multiplication result converted to larger type High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 20 days ago
To fix the problem, we need to ensure that the multiplication is performed using a larger integer type to prevent overflow. This can be achieved by casting the operands to size_t
before performing the multiplication. This way, the multiplication will be done using the larger type, and the result will be correctly assigned to the size_t
variable.
-
Copy modified line R516
@@ -515,3 +515,3 @@ | ||
const double nmi = (entropyPtr[0] + entropyPtr[1]) / entropyPtr[2]; | ||
const size_t referenceOffset = referenceBinNumber[currentTimePoint] * floatingBinNumber[currentTimePoint]; | ||
const size_t referenceOffset = static_cast<size_t>(referenceBinNumber[currentTimePoint]) * static_cast<size_t>(floatingBinNumber[currentTimePoint]); | ||
const size_t floatingOffset = referenceOffset + referenceBinNumber[currentTimePoint]; |