Thanks, JJ.

The bugs in my version are probably there because I forgot to include a routine that checks for numbers outside the range of the table, as you did with:

    cmp ebx, [esi]    ; <left?
    js @err
    cmp ebx, [edx]    ; >right?
    jg @err

But I was amazed that it was also fast (despite the errors).
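To make the role of that range check concrete, here is a minimal sketch in Python rather than assembly. It is not the routine from either version: `find_slot` and its "return -1 when out of range" convention are assumptions for illustration only.

```python
def find_slot(table, key):
    """Return the index i of the last entry with table[i] <= key.

    `table` must be sorted in ascending order. Returns -1 when the key
    lies outside [table[0], table[-1]] (hypothetical error convention).
    """
    # Range check first, like the "<left?" / ">right?" tests quoted above.
    if not table or key < table[0] or key > table[-1]:
        return -1

    left, right = 0, len(table) - 1
    # Invariant: table[left] <= key <= table[right]
    while right - left > 1:
        mid = (left + right) // 2
        if table[mid] <= key:
            left = mid
        else:
            right = mid
    # An exact hit on the right edge belongs to the right index.
    return right if table[right] == key else left
```

Without the first `if`, a key below the first entry or above the last one would still come back with a valid-looking index, which is the kind of silent error described above.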

My version is not foolproof yet, either: search the source for seed, there is one that produces in pos 5/6 two very close numbers.

Maybe one more final check on the LOWORD of the HIDWORD would do the trick. It would work like the original version, but immediately after

    sub edx, pTable    ; middle pos - original left
    sar edx, 3
    xchg eax, edx

So, it could compare the index value found at eax, with the LOWORD of the HIDWORD of the next value (or previous one). If it also matches, then we have found the proper values.
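A small Python sketch of why two very close Real8 values can confuse a search that looks only at the high dword, and how a second comparison resolves it. This is one interpretation of the idea above, not the actual routine: the helper names are made up, the tie-break uses the whole low dword, and it assumes non-negative values (for those, comparing the raw bits as integers gives the same order as comparing the numbers).

```python
import struct
from bisect import bisect_right

def split_real8(x):
    """Return (high dword, low dword) of a Real8 (little-endian IEEE 754 double)."""
    lo, hi = struct.unpack("<II", struct.pack("<d", x))
    return hi, lo

def search_hi_only(table, value):
    """Index of the last entry whose HIDWORD is <= the HIDWORD of value."""
    his = [split_real8(x)[0] for x in table]
    return bisect_right(his, split_real8(value)[0]) - 1

def search_hi_then_lo(table, value):
    """Same search, but ties on the HIDWORD are resolved by the low dword."""
    keys = [split_real8(x) for x in table]   # tuples compare hi first, then lo
    return bisect_right(keys, split_real8(value)) - 1

# Entries 1 and 2 differ only in the low dword, like the pos 5/6 case above.
TABLE = [0.5, 1.0, 1.0 + 2.0 ** -40, 2.0]
```

With this table, `search_hi_only(TABLE, 1.0)` lands on index 2 because both neighbours share the high dword `0x3FF00000`, while `search_hi_then_lo(TABLE, 1.0)` returns index 1.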

I would have to see concrete applications of such an algo. It is indeed fast, maybe that's the only thing that counts.

Well... I can think of a "few" practical applications where this could be used. A faster binary search (for Real8 in this particular case, scanning an ordered list) can be extremely useful for video, image and audio processing. In the functions I'm creating for image processing, such an algorithm is a must, especially because colorspace functions make heavy use of complex mathematical equations that can easily be replaced by simple pointers to tables, and this is where a faster binary search comes in.

For example, in my original version of CIELCHtoRGB, the part that actually converts the value (in KFactor) to the Red, Green or Blue color is given by the following formula:

FinalColor = ((TempColor+0.055)/1.055)^2.4

The final color is also computed after checking the threshold, as I mentioned earlier:

KFactor = [((ColorNormalized+Offset)/(1+Offset))^Gamma] * 100 ; if ColorNormalized (Color/255) is greater than Threshold

KFactor = (ColorNormalized/Slope) * 100 ; if ColorNormalized (Color/255) is less than or equal to Threshold
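As a quick illustration, the two KFactor branches can be written as one small Python function. The post only shows 0.055, 1.055 and 2.4 explicitly; the slope (12.92) and threshold (0.04045) used here are the standard sRGB values and are an assumption about what `F_Slope` and `F_Treshold` hold.

```python
GAMMA = 2.4
OFFSET = 0.055
SLOPE = 12.92         # assumption: standard sRGB slope
THRESHOLD = 0.04045   # assumption: standard sRGB threshold

def kfactor(color):
    """KFactor (linear intensity * 100) for an 8-bit channel value 0..255."""
    normalized = color / 255.0
    if normalized > THRESHOLD:
        linear = ((normalized + OFFSET) / (1.0 + OFFSET)) ** GAMMA
    else:
        linear = normalized / SLOPE
    return linear * 100.0
```

With these constants, colors 0 to 10 fall in the linear branch and 11 to 255 in the power branch, and the result grows strictly with the color value, which is what makes a sorted table possible.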

In my original version I did it the way people usually calculate this stuff: I put the formula in the final part of the color conversion routines, such as:

    Proc CieLCHtoRGB:
        Arguments @pLuminance, @pChroma, @pHue, @Red, @Green, @Blue, @Flag, @WhiteRef
        Structure @TempStorage 64, @pXDis 0, @pYDis 8, @pZDis 16, @TmpRedDis 24, @TmpGreenDis 32, @TmpBlueDis 40, @pAFactorDis 48, @pBFactorDis 56
        Uses esi, ebx, ecx, edx, edi

        finit

        (.....)

        lea ecx D@TmpRedDis | fld R@pXDis | fmul R$esi+FloatMatrices.M1Dis | fld R@pYDis | fmul R$esi+FloatMatrices.M2Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+FloatMatrices.M3Dis | faddp ST1 ST0 | fstp R$ecx
        lea ecx D@TmpGreenDis | fld R@pXDis | fmul R$esi+FloatMatrices.M4Dis | fld R@pYDis | fmul R$esi+FloatMatrices.M5Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+FloatMatrices.M6Dis | faddp ST1 ST0 | fstp R$ecx
        lea ecx D@TmpBlueDis | fld R@pXDis | fmul R$esi+FloatMatrices.M7Dis | fld R@pYDis | fmul R$esi+FloatMatrices.M8Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+FloatMatrices.M9Dis | faddp ST1 ST0 | fstp R$ecx

        call GammaLinearDecodingEx F_gamma, F_Offset, F_Slope, F_Treshold, F_OffsetPlusOne, D@Flag

        lea ecx D@TmpRedDis
        .Fpu_If R$ecx > R$F_Treshold
            ;Color = ((Color+0.055)/1.055)^2.4
            fld R$F_gamma | fld R$ecx | fyl2x | fld1 | fld ST1 | fprem | f2xm1 | faddp ST1 ST0 | fscale | fxch | fstp ST0
            fmul R$F_OffsetPlusOne | fsub R$F_Offset
        .Fpu_Else
            fld R$ecx | fmul R$F_Slope
        .Fpu_End_If
        mov ecx D@Red | fmul R$Float255 | fistp F$ecx

        lea ecx D@TmpGreenDis
        .Fpu_If R$ecx > R$F_Treshold
            ;Color = ((Color+0.055)/1.055)^2.4
            fld R$F_gamma | fld R$ecx | fyl2x | fld1 | fld ST1 | fprem | f2xm1 | faddp ST1 ST0 | fscale | fxch | fstp ST0
            fmul R$F_OffsetPlusOne | fsub R$F_Offset
        .Fpu_Else
            fld R$ecx | fmul R$F_Slope
        .Fpu_End_If
        mov ecx D@Green | fmul R$Float255 | fistp F$ecx

        lea ecx D@TmpBlueDis
        .Fpu_If R$ecx > R$F_Treshold
            ;Color = ((Color+0.055)/1.055)^2.4
            fld R$F_gamma | fld R$ecx | fyl2x | fld1 | fld ST1 | fprem | f2xm1 | faddp ST1 ST0 | fscale | fxch | fstp ST0
            fmul R$F_OffsetPlusOne | fsub R$F_Offset
        .Fpu_Else
            fld R$ecx | fmul R$F_Slope
        .Fpu_End_If
        mov ecx D@Blue | fmul R$Float255 | fistp F$ecx

        ; clip values outside the range
        mov ecx D@Red
        If D$ecx <s 0
            mov D$ecx 0
        Else_If D$ecx > 255
            mov D$ecx 255
        End_If

        mov ecx D@Green
        If D$ecx <s 0
            mov D$ecx 0
        Else_If D$ecx > 255
            mov D$ecx 255
        End_If

        mov ecx D@Blue
        If D$ecx <s 0
            mov D$ecx 0
        Else_If D$ecx > 255
            mov D$ecx 255
        End_If

    EndP

See the problem? To get the final color we need a bunch of mathematical operations, including power functions, to retrieve the proper value.

What I did was invert the formula and use the results as a table that is precalculated well before CieLCHtoRGB actually runs. I found out that the resulting values that generate the color are fixed fractions (I named them KFactor), and there are only 256 of them in total! So, one fraction per color.

All I had to do was create a routine that calculates all those values and inserts them into a huge structure, which I named "WSMatrix" (so far the size of this structure is 4388 bytes, but I'll probably increase it because I need 2 or 3 more tables). I created a function called **SetupWorkSpaceMatrixDataEx** that needs to run only **once**, for example when the application starts or when the user chooses.

The function **SetupWorkSpaceMatrixDataEx** precalculates literally everything needed to convert from RGB to CieLCH and vice versa (gamma, offset, white references, new matrices, multiplicand fractions, limits, etc.) and puts all of it in a single WSMatrix structure to be used internally.
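To give an idea of the KFactor part of that setup, here is a rough Python sketch that fills a 256-entry table once and packs it as Real8 values. It is not the real SetupWorkSpaceMatrixDataEx: `build_kfactor_map` is a made-up name, only this one table is built, and the default constants are the standard sRGB ones (an assumption).

```python
import struct

def build_kfactor_map(gamma=2.4, offset=0.055, slope=12.92, threshold=0.04045):
    """Precalculate the KFactor (linear value * 100) for every color 0..255."""
    table = []
    for color in range(256):
        n = color / 255.0
        if n > threshold:
            linear = ((n + offset) / (1.0 + offset)) ** gamma
        else:
            linear = n / slope
        table.append(linear * 100.0)
    return table

# Built once, e.g. at application start, then only read afterwards.
KFACTOR_MAP = build_kfactor_map()

# As a block of little-endian Real8 values this table takes 256 * 8 bytes.
PACKED = struct.pack("<256d", *KFACTOR_MAP)
```

The table comes out already sorted in ascending order, so it can be searched directly without any extra sorting step.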

Using a table and precalculating all those values is a must, especially because whenever I'm analysing an image, video, etc., I no longer need to do all those computations for every pixel. All the CieLCHtoRGB conversion needs to do is point to the precalculated data in WSMatrix and do simple math operations (if needed).

It's a major advantage in terms of speed because we are no longer calculating everything for each pixel. For example, I replaced all those monster computations with a simple:

    Proc CieLCHtoRGB_Ex:
        Arguments @pLuminance, @pChroma, @pHue, @Red, @Green, @Blue, @pMatrix

        finit

        mov edx D@pChroma
        lea edi D@pAFactorDis | mov esi D@pHue | fld R$esi | fmul R$Degree_Radian | fcos | fmul R$edx | fstp R$edi
        lea edi D@pBFactorDis | mov esi D@pHue | fld R$esi | fmul R$Degree_Radian | fsin | fmul R$edx | fstp R$edi

        (...)

        lea ecx D@TmpRedDis | fld R@pXDis | fmul R$esi+WS_Matrix.Inverted.Red_M1Dis | fld R@pYDis | fmul R$esi+WS_Matrix.Inverted.Green_M2Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+WS_Matrix.Inverted.Blue_M3Dis | faddp ST1 ST0 | fstp R$ecx
        lea ecx D@TmpGreenDis | fld R@pXDis | fmul R$esi+WS_Matrix.Inverted.Red_M4Dis | fld R@pYDis | fmul R$esi+WS_Matrix.Inverted.Green_M5Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+WS_Matrix.Inverted.Blue_M6Dis | faddp ST1 ST0 | fstp R$ecx
        lea ecx D@TmpBlueDis | fld R@pXDis | fmul R$esi+WS_Matrix.Inverted.Red_M7Dis | fld R@pYDis | fmul R$esi+WS_Matrix.Inverted.Green_M8Dis | faddp ST1 ST0 | fld R@pZDis | fmul R$esi+WS_Matrix.Inverted.Blue_M9Dis | faddp ST1 ST0 | fstp R$ecx

        lea ebx D$esi+WS_Matrix.KFactorMapDis

        mov edi D@Red
        fld R@TmpRedDis | fmul R$Float100 | fstp R@TmpColorCheckDis
        lea eax D@TmpColorCheckDis
        call BinarySearch ebx, D$eax+4, 256
        mov D$edi eax

        mov edi D@Green
        fld R@TmpGreenDis | fmul R$Float100 | fstp R@TmpColorCheckDis
        lea eax D@TmpColorCheckDis
        call BinarySearch ebx, D$eax+4, 256
        mov D$edi eax

        mov edi D@Blue
        fld R@TmpBlueDis | fmul R$Float100 | fstp R@TmpColorCheckDis
        lea eax D@TmpColorCheckDis
        call BinarySearch ebx, D$eax+4, 256
        mov D$edi eax

    EndP

I could also gain a bit more speed by inlining the **BinarySearch** function. But I chose to use it as a function call to see first whether it was working (and it is).

I didn't even need to benchmark, since the difference in speed when using the binary search rather than calculating "Color = ((Color+0.055)/1.055)^2.4" all the time is visible to the naked eye.
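The lookup that replaces the per-pixel power function can be sketched in Python like this. It is only an illustration under assumptions: the table is rebuilt here with the standard sRGB constants, `kfactor_to_color` is a made-up name, and it picks the nearest entry, which may differ from what the real BinarySearch returns for values between two entries.

```python
from bisect import bisect_left

def build_table():
    """256 KFactors (linear value * 100), one per 8-bit color, ascending."""
    table = []
    for color in range(256):
        n = color / 255.0
        # Assumed sRGB constants: threshold 0.04045, slope 12.92, offset 0.055, gamma 2.4
        linear = ((n + 0.055) / 1.055) ** 2.4 if n > 0.04045 else n / 12.92
        table.append(linear * 100.0)
    return table

def kfactor_to_color(table, value):
    """Return the color index (0..255) whose KFactor is nearest to value."""
    # Out-of-range values are clamped, which also replaces the clipping step.
    if value <= table[0]:
        return 0
    if value >= table[-1]:
        return len(table) - 1
    i = bisect_left(table, value)        # first index with table[i] >= value
    # Choose the closer of the two neighbours.
    if value - table[i - 1] <= table[i] - value:
        return i - 1
    return i

TABLE = build_table()
```

Each channel then costs at most 8 comparisons on a 256-entry table instead of an `fyl2x`/`f2xm1` sequence, and no pow call happens per pixel.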

If you have time and can make the binary search foolproof, it will be very handy.

Also, I'll try to find a way to optimize the sine, cosine and atan2 computations as well. Since both conversions (RGB to CieLCH and its reverse) make heavy use of trigonometric operations like those, an optimization is also needed to speed things up even more.

Just to give you a small idea of how much the optimization improves things: when I started all of those conversion routines, the images I'm posting here to transform gray to color took around 40 minutes to finish processing. After using the table technique, the conversion was done in 4 to 8 seconds (including the time to precalculate all of this). Now, with your optimization and the fixes I made to the conversion routines, it takes around 2 to 4 seconds, including the precalculation routines, to colorize a 640*480 grayscale image using a 960*720 reference color image.

This is to say that with further optimizations, I can reduce the total time to less than 1 second including the precalculations. And since the precalculations are done when the app starts, for example, coloring an image could then take just a few milliseconds.

Also, once I succeed in making the colorization method output a file formed by a kind of sample structure, all of this can be done in almost no time. All that is needed is to retrieve the pointers from an external file, make the necessary adjustments to those pointers, and voilà, you can convert a grayscale image into a colored one almost immediately.

If I manage to do this with images, the same process can be applied to video processing. For a video with, let's say, 50,000 frames, if colorizing each frame takes around 100 milliseconds, the whole video will be colorized in around 83 minutes (50,000 × 0.1 s = 5,000 s), rather than the years it would take with the original version, which needed around 45 minutes to colorize a single image.