
Commit 07be522

zhivko.tabakov@gmail.com authored and jimregan committed
Issue 1351: OpenCL build - kernel_ThresholdRectToPix() not accounting for padding bits in the output pix?!
https://code.google.com/p/tesseract-ocr/issues/detail?id=1351

What steps will reproduce the problem?
1. Use a tesseract build with OpenCL.
2. Pass a full-color image whose width is not a multiple of 32.
3. Recognition is far too slow and does not recognize anything.

I read the article at http://www.sk-spell.sk.cx/tesseract-meets-the-opencl-first-test and decided to give OCL a try. The initial result was as described in step 3 above. After some debugging I found that the OCL version of threshold rect generation does not account for the padding bits at the end of each line of the output pix. To confirm this, I made a quick fix in oclkernels.h, replacing the definition of kernel_ThresholdRectToPix.

Just a reminder: it is necessary to force OCL kernel recompilation after changing this source (e.g. delete “kernel - <device>.bin” from the exec folder).

The fix works, but I am not sure about it, since the original source apparently works for other people (as per the article). If I am right, the OS and GPU are irrelevant because the bug is algorithmic, but mine are Windows/AMD.

A similar fix applies to kernel_ThresholdRectToPix_OneChan(), but there the input array might have padding bytes as well, so its indexing will need further adjustments. I can come up with a proof or fix for it too; I have not played with it yet.

Disclaimer: I have no prior experience with image processing, the tesseract source, GPU computing, or OpenCL (but please do explain if I am wrong).
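The padding problem described in the message can be sketched on the host side in plain C. This is only an illustration, not code from tesseract: the helper names are hypothetical, and it assumes that `wpl` is the image width rounded up to whole 32-bit words for a 1-bit/pixel output. It shows how the flat mapping "output word w starts at input pixel 32*w" drifts by the padding amount on every line after the first.

```c
#include <assert.h>
#include <stdint.h>

#define PIXELS_PER_WORD 32

/* Assumption: words per output line = width rounded up to whole 32-bit words. */
uint32_t words_per_line(uint32_t width) {
    return (width + PIXELS_PER_WORD - 1) / PIXELS_PER_WORD;
}

/* Padding bits at the end of each output line (same formula as `pad` in the patch). */
uint32_t pad_bits(uint32_t width) {
    return PIXELS_PER_WORD * words_per_line(width) - width;
}

/* First input pixel read for output word w when padding is ignored. */
uint32_t naive_first_pixel(uint32_t w) {
    return PIXELS_PER_WORD * w;
}

/* First input pixel that belongs to output word w in a tightly packed input. */
uint32_t expected_first_pixel(uint32_t w, uint32_t width) {
    uint32_t wpl = words_per_line(width);
    return (w / wpl) * width + (w % wpl) * PIXELS_PER_WORD;
}

/* Self-check for a 100-pixel-wide image: 4 words per line, 28 padding bits. */
void check_width_100(void) {
    assert(words_per_line(100) == 4);
    assert(pad_bits(100) == 28);
    /* Word 4 is the first word of line 1: the naive index is 28 pixels too far. */
    assert(naive_first_pixel(4) - expected_first_pixel(4, 100) == 28);
    /* Word 8 is the first word of line 2: the error has doubled. */
    assert(naive_first_pixel(8) - expected_first_pixel(8, 100) == 56);
}
```

For widths that are a multiple of 32 the padding is zero and the two mappings agree, which may be why the original kernel worked for other users.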
1 parent 7bc6d3e commit 07be522


1 file changed, +8 -7 lines changed


opencl/oclkernels.h

@@ -1045,19 +1045,19 @@ KERNEL(
 // imageData is input image (24-bits/pixel)
 // pix is output image (1-bit/pixel)
 KERNEL(
-\n#define CHAR_VEC_WIDTH 8 \n
+\n#define CHAR_VEC_WIDTH 4 \n
 \n#define PIXELS_PER_WORD 32 \n
 \n#define PIXELS_PER_BURST 8 \n
 \n#define BURSTS_PER_WORD (PIXELS_PER_WORD/PIXELS_PER_BURST) \n
 typedef union {
     uchar s[PIXELS_PER_BURST*NUM_CHANNELS];
-    uchar8 v[(PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH];
+    uchar4 v[(PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH];
 } charVec;
 
 __attribute__((reqd_work_group_size(256, 1, 1)))
 __kernel
 void kernel_ThresholdRectToPix(
-    __global const uchar8 *imageData,
+    __global const uchar4 *imageData,
     int height,
     int width,
     int wpl, // words per line
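The hunk above narrows the vector type from uchar8 to uchar4. A plausible reading, which the commit message does not state, is that this makes one vector element equal to one pixel, so that a per-line correction counted in pixels can be subtracted directly from the vector index. The sketch below checks that arithmetic in plain C. It assumes NUM_CHANNELS is 4, which is defined outside the lines shown in this diff, and the function names are hypothetical.

```c
#include <assert.h>

#define PIXELS_PER_WORD  32
#define PIXELS_PER_BURST 8
#define BURSTS_PER_WORD  (PIXELS_PER_WORD / PIXELS_PER_BURST)

/* Vector elements consumed per output word, using the same expression
 * (and the same left-to-right integer arithmetic) as the kernel's index. */
int vec_elements_per_word(int num_channels, int char_vec_width) {
    return BURSTS_PER_WORD * (PIXELS_PER_BURST * num_channels) / char_vec_width;
}

/* Whole pixels covered by one vector element. */
int pixels_per_vec_element(int num_channels, int char_vec_width) {
    return char_vec_width / num_channels;
}

/* Self-check under the assumption NUM_CHANNELS == 4. */
void check_vector_widths(void) {
    /* uchar4: 32 elements per 32-pixel word, so the index counts pixels. */
    assert(vec_elements_per_word(4, 4) == 32);
    assert(pixels_per_vec_element(4, 4) == 1);
    /* uchar8: 16 elements per word, so each index step skips two pixels
     * and an odd padding count could not be expressed in these units. */
    assert(vec_elements_per_word(4, 8) == 16);
    assert(pixels_per_vec_element(4, 8) == 2);
}
```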
@@ -1066,6 +1066,7 @@ void kernel_ThresholdRectToPix(
     __global int *pix) {
 
     // declare variables
+    uint pad = PIXELS_PER_WORD * wpl - width;//number of padding bits at the end of each output line
     int pThresholds[NUM_CHANNELS];
     int pHi_Values[NUM_CHANNELS];
     for ( int i = 0; i < NUM_CHANNELS; i++) {
@@ -1076,22 +1077,22 @@ void kernel_ThresholdRectToPix(
     // for each word (32 pixels) in output image
     for ( uint w = get_global_id(0); w < wpl*height; w += get_global_size(0) ) {
         unsigned int word = 0; // all bits start at zero
-
+        //decrease the pixel index for the padding at the end of each output line (=number of lines * padding)
+        uint pxIdxOffset = ( w / wpl) * pad;// = ( ( PIXELS_PER_WORD * w) / ( width + pad)) * pad;
         // for each burst in word
         for ( int b = 0; b < BURSTS_PER_WORD; b++) {
-
             // load burst
             charVec pixels;
             for ( int i = 0; i < (PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH; i++ ) {
-                pixels.v[i] = imageData[w*(BURSTS_PER_WORD*(PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH) + b*((PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH) + i];
+                pixels.v[i] = imageData[w*(BURSTS_PER_WORD*(PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH) + b*((PIXELS_PER_BURST*NUM_CHANNELS)/CHAR_VEC_WIDTH) + i - pxIdxOffset];
             }
 
             // for each pixel in burst
             for ( int p = 0; p < PIXELS_PER_BURST; p++) {
                 for ( int c = 0; c < NUM_CHANNELS; c++) {
                     unsigned char pixChan = pixels.s[p*NUM_CHANNELS + c];
                     if (pHi_Values[c] >= 0 && (pixChan > pThresholds[c]) == (pHi_Values[c] == 0)) {
-                        word |= (0x80000000 >> ((b*PIXELS_PER_BURST+p)&31));
+                        word |= (((uint)0x80000000) >> ((b*PIXELS_PER_BURST+p)&31));
                     }
                 }
             }
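The two index changes in this hunk can be checked on the host with a small C sketch. It is an illustration under stated assumptions rather than the kernel itself: the function names are hypothetical, one input element is taken to be one pixel (NUM_CHANNELS of 4 with uchar4), and the input is assumed to be tightly packed with no padding of its own. The first pair of functions shows that subtracting `(w / wpl) * pad` from the flat index gives the same result as row-major indexing into the input. The last function mirrors the MSB-first bit placement with an explicitly unsigned constant.

```c
#include <assert.h>
#include <stdint.h>

#define PIXELS_PER_WORD 32

/* First input pixel for output word w, computed the way the patch does:
 * flat index minus the padding accumulated over the preceding lines. */
uint32_t patched_first_pixel(uint32_t w, uint32_t wpl, uint32_t width) {
    uint32_t pad = PIXELS_PER_WORD * wpl - width;
    uint32_t px_idx_offset = (w / wpl) * pad;
    return PIXELS_PER_WORD * w - px_idx_offset;
}

/* The same position written as row * width + column offset. */
uint32_t row_major_first_pixel(uint32_t w, uint32_t wpl, uint32_t width) {
    return (w / wpl) * width + (w % wpl) * PIXELS_PER_WORD;
}

/* Set the bit for pixel k of a word, with pixel 0 in the most significant bit. */
uint32_t set_pixel_bit(uint32_t word, uint32_t k) {
    return word | (UINT32_C(0x80000000) >> (k & 31));
}

/* Self-check: both index forms agree for every word of a 100 x 3 image. */
void check_patched_index(void) {
    const uint32_t width = 100, wpl = 4, height = 3;
    for (uint32_t w = 0; w < wpl * height; w++) {
        assert(patched_first_pixel(w, wpl, width) ==
               row_major_first_pixel(w, wpl, width));
    }
}
```

The sketch only covers where each word starts. The last word of every line still spans 32 input positions, so with this indexing its padding bits are computed from pixels of the following line, or from past the end of the buffer on the last line; the diff does not show whether that is handled elsewhere.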
