Hi Guys
Sometime ago, JJ helped me on a function that can scan a array of Real8 values. "BinSearchReal8" as described at http://masm32.com/board/index.php?topic=7659.15. Now, i´m testing a function i created to work with a Array of Structures (of any size) that can search a given value from a Real8 member located anywhere inside each structure. The goal is to make it fully configurable so it can scan the exact value or from desired position (so, inside a certain range). The array of structures are ordered (from the smaller value of the desired member to the higher one).
Ex:
Each structure is formed by 2 Real 8 followed by one Dword. So, it´s size = 24 bytes long
On this example all the 1st real 8 is ascending ordered and this are the values i want my value to be scanned. So, each value (in red) is located at Position 0 inside each structure. Also we have a total amount of 13 structures on the array.
[MyStruct:
MyStruct0: R$ 2, R$ 5, D$ 128
MyStruct1: R$ 5, R$ 115, D$ 28
MyStruct2: R$ 10, R$ 556, D$ 25
MyStruct3: R$ 600, R$ 52536, D$ 1
MyStruct4: R$ 650, R$ 511, D$ 0
MyStruct5: R$ 700, R$ 5184, D$ 12
MyStruct6: R$ 800, R$ 51548, D$ 11128
MyStruct7: R$ 930, R$ 5767, D$ 1218
MyStruct8: R$ 1600, R$ 5556, D$ 12848
MyStruct9: R$ 1800, R$ 577, D$ 11454128
MyStruct10: R$ 12525, R$ 5344, D$ 11528
MyStruct11: R$ 75555, R$ 325, D$ 12118
MyStruct12: R$ 100000, R$ 3235, D$ 1]
The search can be done in 3 ways:
a) Exact Search match. Value to be searched = 700. It must return the address where it is located = MyStruct5. If the value is not found, it must return 0
b) Previous Search match. Value to be searched = 650. It must return the nearest address where it is located (the previous one). So, on this case it will return MyStruct4, since it´s value starts at 650 and end before 700 (MyStruct5)
Also, if the value is located at 1st or last pos, it also return those values. Ex: if we are searching for 4 (or even 0, 1), it will return MyStruct0, and if we are searching for 300000, it will return MyStruct12 and so on.
It will always return the corresponding address
c) Next Search match. Value to be searched = 650. It must return the nearest address where it is located (the next one). So, on this case it will return MyStruct5. So one position before the previous case.
Also, if the value is located at the 1st pos or last pos, on the same way as the previous case, but with one position always ahead. (Except if the value is located after the last position, that on this case will always return MyStruct12...or on the 1st position (MyStruct0) if the value is located before the 1st one, ex: we are searching for 0, 1 etc...so any value smaller then 2 at MyStruct0)
It will always return the corresponding address
d) Works on both directions, no matter if the values of the members to be found are ordered ascending or descending.
e) Works with whatever size of the structure. No matter if it contains Dwords, BYtes, etc...All it is looking for is a Real8 at the given relative position of the members of the structure. On this example the Real 8 to be found is at Position 0 , but it cold also be at Pos 8, Pos 30, 37, 187 or whatever. position it is relative to the each start of structure .
What i did was something like this (RosAsm syntax). The problem is that for the exact search match, whenever the value is located at the last position "MyStruct12" it always result on a error, because it will try to located the value outside the true ending address. I tried to fix that with a small hack at the beginning of the function, but it ends up slowing a bit the function, and also seems t not being worked correctly.
[STRUCT_SEARCH_EXACT 0]
[STRUCT_SEARCH_NEXT 2]
[STRUCT_SEARCH_PREV 4]
pArray - Pointer to a Array of Structures
Value - A pointer to the Real8 value to be scanned
ElementsCount - Total elements of the array. So, the total amount of structures that formes the array
StructureSize - The size of each structure
PosinStruct - The position where the member we want to scan. So, the 1st Real8 is located at 0, the 2nd at 8 etc etc
Flag - A flag representing the type of search we ant to perform. it can be one of the following values:
STRUCT_SEARCH_EXACT 0 ---- Search for the exact value inside each structure
STRUCT_SEARCH_NEXT 2 ----- Search for the closest value inside each structure but output the next adress where it belongs
STRUCT_SEARCH_PREV 4 ------Search for the closest value inside each structure but output the adress where where it is inside it´s own range.
Proc StructSearchReal8Test::
Arguments @pArray, @Value, @ElementsCount, @StructureSize, @PosinStruct, @Flag
Local @Pivot, @NextTblSize, @TrueEnd
Uses esi, edi, edx, ebx ; <-------Simple Macro to preserve those registers
mov edx D@Value
mov ebx D@PosinStruct | movq XMM1 X$edx+ebx ; Our value to be retrieved
mov esi D@pArray
; EndAddress = StartAdd*SizeStruct
mov edx D@ElementsCount
mov D@NextTblSize edx | imul edx D@StructureSize | lea eax D$esi+edx | mov D@Pivot eax | mov D@TrueEnd eax
; small hack for exact search flag at the last position
If D@Flag = STRUCT_SEARCH_EXACT ; special case of the last value
mov edx eax | sub edx D@StructureSize
movlps XMM0 X$edx+ebx
SSE_D_If XMM0 = XMM1 ; ----> Just an If macro, but used for SSE2 ---> COMISD XMM0 XMM1 | jne XXXXX
mov eax edx | ExitP ; found it at the last location
SSE_D_Else
dec D@NextTblSize
SSE_D_End
End_If
@DoLoop:
.SSE_D_Do
SSE_D_Do
.If esi < eax ; Cur Address < EndAddr ; same as in masm. Simples address comparisions.
mov edi D@Pivot;eax
; Pivot = (CurAddr-EndAddr)/16 (2 Real8)
shr D@NextTblSize 1 | mov edx D@NextTblSize | imul edx D@StructureSize | lea eax D$esi+edx
mov D@Pivot eax
movlps XMM0 X$eax+ebx
SSE_D_Loop_Until XMM0 <= XMM1 ; <----- A macro to compare SSD instruction. ===> COMISD XMM0 XMM1 | jnbe XXXX (Start of the loop at @DoLoop)
add eax D@StructureSize | mov esi eax
mov eax edi | mov D@Pivot eax; | jne @DoLoop
.SSE_D_Loop_Until XMM0 = XMM1
.End_If
L1:
mov edx D@Flag
.If edx = STRUCT_SEARCH_EXACT
xor eax eax
If esi < D@TrueEnd
SSE_D_If XMM1 = XMM0
mov eax esi | sub eax D@StructureSize
SSE_D_End
End_If
.Else_If edx = STRUCT_SEARCH_PREV
mov eax esi | sub eax D@StructureSize
.Else
mov eax esi
.End_If
EndP
Example of usage:
[ScanValue: R$ 100000]
call StructSearchReal8Test MyStruct, ScanValue, 13, 8, 0, STRUCT_SEARCH_EXACT
Can someone help ? Wha i´m doing wrong...or how can it be coded to make it even faster and also using the needed arguments/parameters ?
Hmm; pardon me if I'm missing something (and please let me know if I am), but this seems like a very trivial exercise in searching. Especially since it's a structure of known composition and size.
What I would do is something like this (in pseudocode, not actual ASM):
o Set pointer at starting position (beginning or end of structure)
o Do forever:
o Compare value at current position to search value
o If equal,
o Return current position
o else
o advance pointer to next (or previous) structure element
o If at end of structure
o BREAK (and return whatever indicates "No match")
The "if at end of structure" would evaluate the index you use to access each element and see if it exceeds the size of the structure.
I don't see why there should be a problem at the first or last element. Nothing special about them. You just have to make sure not to "fall off the end" of the structure and try to address possibly inaccessible memory.
I don't know anything about how you'd be doing the REAL8 comparisons, and it doesn't make any difference anyhow, so long as you don't do any illegal memory accesses.
The problem here is speed
The goal is make it search the fast as possible, that´s why it is choosing a pivot (elements count/2) to perform the search on each iteration. Doing it by half (as JJ proposed) can reduce the number of iterations a lot. Say, i have a file with 400 Mb to be scanned. This file is formed by something around 130.000 structures. The correct pointer can be retrieved in something around 12 iterations (or less), rather then performing a loop on each of the 130.000 structures.
The problem i´m facing is that when the search is made on the exact value to be scanned, whenever the value is at the end of the whole array, it goes outside the last address (And, thus will perform an illegal memory acess)
So it's like (or is) a binary search, where you split the list in half each time until you find what you're looking for?
Quote from: NoCforMe on January 15, 2022, 04:27:47 PM
So it's like (or is) a binary search, where you split the list in half each time until you find what you're looking for?
That makes sense if the file is sorted. Otherwise, linear search is less complicated and faster.
Quote from: guga on January 13, 2022, 07:13:48 PM
Each structure is formed by 2 Real 8 followed by one Dword. So, it´s size = 24 bytes long
On this example all the 1st real 8 is ascending ordered and this are the values i want my value to be scanned. So, each value (in red) is located at Position 0 inside each structure. Also we have a total amount of 13 structures on the array.
[MyStruct:
MyStruct0: R$ 2, R$ 5, D$ 128
MyStruct1: R$ 5, R$ 115, D$ 28
MyStruct2: R$ 10, R$ 556, D$ 25
MyStruct3: R$ 600, R$ 52536, D$ 1
MyStruct4: R$ 650, R$ 511, D$ 0
MyStruct5: R$ 700, R$ 5184, D$ 12
MyStruct6: R$ 800, R$ 51548, D$ 11128
MyStruct7: R$ 930, R$ 5767, D$ 1218
MyStruct8: R$ 1600, R$ 5556, D$ 12848
MyStruct9: R$ 1800, R$ 577, D$ 11454128
MyStruct10: R$ 12525, R$ 5344, D$ 11528
MyStruct11: R$ 75555, R$ 325, D$ 12118
MyStruct12: R$ 100000, R$ 3235, D$ 1]
Post an example of this file.
The data (Real8) is sorted. The sort is in ascending order no matter if the Real 8 is the 1st, 2nd, 3rd etc element.
I´ll try to port it to masm to you guys see what´s wrong.
Is it sorted by the element you are searching for? Otherwise the tree search makes no sense.
Two more aspects:
1. Exact search: you have REAL8 values, but 1.23456789 != 1.23456789 depending on precision set
2. What is your usage:
a) Sort once (very slow), then grab a Million times the values?
b) load database once, then grab one value?
A first test (it creates once a 32MB file with random REAL8 numbers):
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
+19 of 20 tests valid,
1487 cycles for 10 * FindData linear
8144 cycles for 10 * Fcmp
1259 cycles for 10 * FindData linear
8109 cycles for 10 * Fcmp
1199 cycles for 10 * FindData linear
8068 cycles for 10 * Fcmp
1206 cycles for 10 * FindData linear
8082 cycles for 10 * Fcmp
107 bytes for FindData linear
72 bytes for Fcmp
GugaData STRUCT
cost REAL8 ?
taxes REAL8 ?
price REAL8 ?
profit REAL8 ?
GugaData ENDS
FindDataL proc find8:REAL8, element
lea ecx, gugadata(0, cost) ; ArrayPtr
add ecx, element
push gugadata(?) ; #elements
push eax ; copy of #elements
mov eax, dword ptr find8
andX=11111111111111111111111111111100b ; tolerance mask for rounding errors
and eax, andX
.Repeat
mov edx, [ecx]
and edx, andX
.if eax==edx
mov edx, [ecx+4]
.Break .if edx==dword ptr find8[4]
.endif
add ecx, GugaData
dec stack
.Until Sign?
pop edx
pop eax
sub eax, edx
ret
FindDataL endp
...
invoke FindDataL, FP8(205.81415888176433), GugaData.price ; element #18
With 262.1826729473792738 it takes a bit longer: that is element 999990
Wait, wait (or should that be fwait, fwait?): You wrote:
FindDataL proc find8:REAL8, element
lea ecx, gugadata(0, cost) ; ArrayPtr
add ecx, element
push gugadata(?) ; #elements
push eax ; copy of #elements
mov eax, dword ptr find8
andX=11111111111111111111111111111100b ; tolerance mask for rounding errors
and eax, andX
.Repeat
mov edx, [ecx]
and edx, andX
.if eax==edx
mov edx, [ecx+4]
.Break .if edx==dword ptr find8[4]
.endif
add ecx, GugaData
dec stack
.Until Sign?
pop edx
pop eax
sub eax, edx
ret
FindDataL endp
...
invoke FindDataL, FP8(205.81415888176433), GugaData.price ; element #18
I see at least two problems:
1. You're pushing EAX, which is supposed to be the # of elements, before setting it. Or is the caller expected to set it?
2. You set ECX to point to the start of the structure (lea ecx, gugadata(0, cost)), then you're adding "element" to get to the nth element, but that's wrong, isn't it? because "element" would be an index, when what you want is an offset, so you'd be pointing into the middle of an element.
Sorry to pick nits here, but ...
Also, I don't understand the following things:
lea ecx, gugadata(0, cost) ; ArrayPtr
push gugadata(?) ; #elements
Are those MasmBasic operators? I see what the first one is doing, but not the second. What does gugadata(?) mean?
And of course nobody's yet answered the question Guga asked about running off the end of the structure ...
If you allocate much more memory and print last adress?
Make use of SIMD version of compare faster,without branch prediction?
Produces mask to AND pointer with ffffffffffffffffh or zero
Quote from: NoCforMe on January 16, 2022, 06:57:06 PM
I see at least two problems:
1. You're pushing EAX, which is supposed to be the # of elements, before setting it. Or is the caller expected to set it?
2. You set ECX to point to the start of the structure (lea ecx, gugadata(0, cost)), then you're adding "element" to get to the nth element, but that's wrong, isn't it? because "element" would be an index, when what you want is an offset, so you'd be pointing into the middle of an element.
This is element in the sense of structure element; and GugaData.price is indeed an offset:
GugaData STRUCT
cost REAL8 ?
taxes REAL8 ?
price REAL8 ?
profit REAL8 ?
GugaData ENDS
.code
MiscellaneousDeclarations: ; called from Init
Dim gugadata() As GugaData ; create an array of structures
retn
...
ArrayRead gugadata(), "GugaArray.dat" ; read data into the array
...
invoke FindDataL, FP8(262.1826729473792738), GugaData.price
Quote lea ecx, gugadata(0, cost) ; ArrayPtr
push gugadata(?) ; #elements
Are those MasmBasic operators?
Yes. The lea ecx, gugadata(0, cost) loads the address of the first array and first structure elements. So this is the start of the data.
QuoteAnd of course nobody's yet answered the question Guga asked about running off the end of the structure ...
The
push gugadata(?) in combination with
dec stack takes care of that.
Perhaps it becomes clearer when you see the *.asc source in RichMasm (as in the screenshot below) or in WordPad.
(http://www.jj2007.eu/pics/GugaFindStruct.png)
I attach a version with more comments. It searches for one element towards the end of the array.
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
-+18 of 20 tests valid,
66512 kCycles for 10 * FindData linear
316696 kCycles for 10 * Fcmp
66904 kCycles for 10 * FindData linear
316687 kCycles for 10 * Fcmp
66768 kCycles for 10 * FindData linear
316935 kCycles for 10 * Fcmp
66527 kCycles for 10 * FindData linear
317312 kCycles for 10 * Fcmp
107 bytes for FindData linear
72 bytes for Fcmp
999990 = eax FindData linear
999990 = eax Fcmp
Btw the second test uses Fcmp (https://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1201), which takes account of precision problems but is relatively slow:
xor ecx, ecx ; zero the index
dec gugadata(?) ; #elements of the array, minus 1
push eax ; #elements-1
.Repeat
Fcmp gugadata(ecx, price), FP8(262.18267294738), high ; note the approximate value!
.Break .if Zero?
inc ecx
dec stack
.Until Sign?
xchg eax, ecx ; return index
pop edx ; balance the stack
With
medium instead of
high precision, it would find a matching element already at index 19142. This is the problem with REAL8 precision: if you don't allow for some tolerance, you'll
find the element only if you pass the absolutely exact value to a search algo.
Hi JJ, many thanks..i´ll take a look. Here is a file that can be tested to search the data. About your questions, yes, it is a huge file loaded in memory. It is an array of structures ordered by a certain member. On this example, it contains 123800 structures (Ordered through the 1st and 2nd members - boith are Real8). Each structure is formed as below:
QuoteLCHUnorderIndex.Chroma: R$ 0
LCHUnorderIndex.Hue: R$ 0
LCHUnorderIndex.Luma: B$ 0
LCHUnorderIndex.Red: B$ 0
LCHUnorderIndex.Green: B$ 0
LCHUnorderIndex.Blue: B$ 0
The equates related to the structure are:
[LCHUnorderIndex.ChromaDis 0
LCHUnorderIndex.HueDis 8
LCHUnorderIndex.LumaDis 16
LCHUnorderIndex.RedDis 17
LCHUnorderIndex.GreenDis 18
LCHUnorderIndex.BlueDis 19]
[Size_Of_LCHUnorderIndex 20]
Btw: I´ll later rename the structure. It is actually ordered and not unordered
A example of what it contains is something like this:
LCHUnorderIndex.Chroma.Data3: R$ 18.5245
LCHUnorderIndex.Hue.Data3: R$ 15.5468
LCHUnorderIndex.Luma.Data3: B$ 2
LCHUnorderIndex.Red.Data3: B$ 0
LCHUnorderIndex.Green.Data3: B$ 0
LCHUnorderIndex.Blue.Data3: B$ 0
LCHUnorderIndex.Chroma.Data4: R$ 19.2558
LCHUnorderIndex.Hue.Data4: R$ 16.45868
LCHUnorderIndex.Luma.Data4: B$ 2
LCHUnorderIndex.Red.Data4: B$ 0
LCHUnorderIndex.Green.Data4: B$ 0
LCHUnorderIndex.Blue.Data4: B$ 0
(...)
LCHUnorderIndex.Chroma.Data255: R$ 89.2558
LCHUnorderIndex.Hue.Data255: R$ 96.45868
LCHUnorderIndex.Luma.Data255: B$ 2
LCHUnorderIndex.Red.Data255: B$ 0
LCHUnorderIndex.Green.Data255: B$ 0
LCHUnorderIndex.Blue.Data255: B$ 0
etc
So...if you choose the function to search from the 1st member (in red) = Chroma...it is always ordered in ascending order. Also if you intend to search from the 2nd Real8 (In green), it is also ordered in ascending order.
The filei´m posting is just a piece of he whole one. It contains all 16 million pixels containing all the values of the CieLCH color space, but ordered from Luma, Chroma and Hue. I choosed to put Luma in the 3rd member of the structure because it is a byte (from 0 to 255), to avoid some slow when the function uses the movsd to get the data from the Real8.
The exact match (STRUCT_SEARCH_EXACT flag) works like this...If i want to search for the value 89.2558, it wil then return the address at LCHUnorderIndex.Chroma.Data255. Otherwise it will return 0 (FALSE)
If we search the Previous one (STRUCT_SEARCH_PREV flag), it works like this. If i want to search for the value 19, it will return the adrress at LCHUnorderIndex.Chroma,Data3, because it´s range for Chroma goes from >= 18.5245 to < 19.2558
(located at the next structure). So the STRUCT_SEARCH_PREV will works for conditions of the number be smaller or equal to the one ou are searching. Also..it will return LCHUnorderIndex.Chroma.Data2 if the value is located at the 1st struture or (LCHUnorderIndex.Chroma.Data3 on this example) or the last one if it is located at the end (LCHUnorderIndex.Chroma.Data255 on this example). Or ehen it is outsided it´s boundaries, ex: Value to be searched = 0.5, since this does not exists, but is smaller then the 1st one, it will return LCHUnorderIndex.Chroma.Data3, because 0.5 is smaller then 18.5245 or return the last one if the value exceeeds the one found at the last structure.
If we search using the next one (STRUCT_SEARCH_NEXT flag), it will do the oposite as above. It will return the next structure always (and not the previous one). So, if we are looking for the value 19, it will return "LCHUnorderIndex.Chroma.Data5" etc etc. Also if we are looking for the value 0.5 (That doesn´t exists, but is smaller then the 1st one), it will return the address at LCHUnorderIndex.Chroma.Data3, because 0 is smaller then the 1st value (18.5245). The same situyation goes if we are looking for a value higher then the last one, it will always return the last structure.
Using the method you did on the binarysearchReal8 (dividing by 2) is way faster then do a direct search, specially considering we are dealing with image that contains lots of pixels to be analysed. So, performing a seraching using the division by 2 is much faster, then a direct scaan, and even more faster then if i have to calculate the value to be found.
You can get the file to be tested heree: https://www.4shared.com/s/fJbjB6Mg5iq
I couldn´t upload on the post due to it´s size that exceeded the limits.
Dividing the 2 (as you did before) is way faster, because since the values of the members are ordered we need a few loops until it is found.
Ex: Total amount of structures (elements of the array) = 123800
1st Loop, divide it by 2 and set the pivot (tempoirary end), that goes from
0 - 61900 (pivot = new end) = toal elements = 61900
61901 - 123800
If the number to be found is smallfer then the 61900th value (Located at the 1st or 2nd or whatever member position inside the structure), it will do the loop (and division) again.
2nd loop. Total elements = 61900
0 - 30950
30951 - 61900
If the value is smaller then the one found in 30950th element, it will do agaain.
And also the same goes if the value if higher. Ex:
1st Loop, divide it by 2 and set the pivot (tempoirary end), that goes from
0 - 61900 (pivot = new end) = toal elements = 61900
61901 - 123800
The value to be found is bigger then the 61901 element....so we need to search from 61901th to 123800 th. Do the loop again
New start = 61901
End = 123800
So.. on the next loop we will look for the value in the next 2 tables
61901 - 92850 1st new array of elements
92851 - 123800 2 nd new array of elements
untill it reaches thee condition defineed by the flaag (seraching equal, previous or next).
It´s teh same techique you used before. The problem i´m facing is on the exact search that is giving me the incorrect address, specially when the total amount of elementsto be scanned is odd. Or result in a odd amount after each loop.
Quote from: guga on January 17, 2022, 12:55:49 AMSo...if you choose the function to search from the 1st member (in red) = Chroma...it is always ordered in ascending order. Also if you intend to search from the 2nd Real8 (In green), it is also ordered in ascending order.
How can it be sorted by two elements simultaneously? That makes only sense if you have multiple identical entries for the first element.
How can it be sorted by two elements simultaneously? That makes only sense if you have multiple identical entries for the first element.
Not necessarily. I didn´t analysed yet all the contents of the generated data for the 2nd member, but it most likely the 2nd member is also ordered ascending since it is calculation depends on Chroma. So, it is a kind of ratio that results the Hue. The way i did, i created all Hue and Chroma to work on a percentage of a certain range of grey (Luma).
Can we have duplicated values for the 1st members and differents for the 2nd one ? Yes we can, but i didn´t analysed all the values on the 2nd member yet. What it seeems is that since they are related to each other, most likely the 2nd member is also sorted. (I´ll check it later once suceed to fix this search function).
Ex: We have a Luma (Gray) whose value is 128, then i knowfrom the table that this grey lavel ave specific minimum and maximum values for chroma and hue. All i did was normalize that range (from 0 to 100%). The resultant percentage on each grey level seems to be unique for hue and chroma and also have the property to obbey the same order. So, if hue is in ascending order, Chroma is doing that too and vice-versa.
The table can have these pixel contents
Pixel______Grey ________________Chroma Percentage_________Hue Percentage
1_________128_____________________38.15%_________________15.989% (Where the pixel is located on this given range)
2_________128_____________________45.5%__________________16.989% (Where the pixel is located on this given range)
etc
160125____200______________________54.88%________________14.98
etc
What are those percentages ? They indicates the position of chroma and hue for a specific Grey level (Luma). I created another table containing the minimum and maximum, so each grey may result of a combination of 120000 pixels, each one of them with his unique hue and chroma. But, on this other table is where it contains the actual value of minimum and maximum.
Ex for Grey = 128
Min Crhoma = 18.6
Max Chroma = 235
Min Hue = 65
Max Hue = 70
For Grey = 200
Min Crhoma = 66.6
Max Chroma = 300
Min Hue = 80
Max Hue = 85
etc etc
etc.....
So, rather then calculate the true value of hue and chroma, i created a RGbtoCieLab and CieLabtoRGB that uses the percentage of each hue/chroma relative to its´own Luma (Grey level). Upon this i found out that they Hue/Chroma are always related to each other whicch makes easier to create a better color convertion function, since i can now calculate the true properties of the generated Color Space.
I suceded to make the converter stays the hue within something around 90º always and correctly classify the colors. I fixed the CieLCH algorithm to behave like that. But the problem is that i´m trying to make it search faster, so i can go further and finishes the rest of the analysis of this Color Convertion techique.
This is the same thing as if we use ListView with sereral different data and sort them from the 1st column, then the second, then the 3rd and so on. (Or in excel, that works too)
Of course, since the RGBtoCieLCH is not 100% perfect, it may have some values that maches, and in fact, it do exists some RGB combinations whose Hue and Chroma percentage for a given grey level is the same....But...this happens on a very few situations, (due to some rounding error i presume).
I just want to test the search routines the same way you did previously, so i can advance and see those other details, to compare tyhe values for Hue or Chroma or analyse how they behave when i want to decrease the chromacity of aa given image aand also preserving it´s hue proportions, or decreaseing the hue at each leevel and preserving chroma etc.
What i did was simply classify all possible colors so, the total array starts with Luma 0 to 255, each one of them containing XXXXX pixels displayed in order of chroma and hue,
I´m not sure if i´m making it clear how this thing works....but What i found out after analysing the functionality of CieLab and CieLCH convertions is that what we call "luma" is nothing more than a certain range of grey.
So, from the 16 million colors we have a specific amouunt of combination of pixels that will results on a grey of 127, and other amount that will result on grey = 200 and so on. What i did was calculate the exact value for hue and chroma for each pixel, then i took the maximum and minimum values for hue/chroma for the whole colorspace (those 16 million colors) and saw what hapened.
What i figure it out is that for each grey level we have specific ranges for hue and Chroma and they seems to be unique for each level.
So, i can have more than one pixel with a chroma of let´s say 85%, but....they are related to other types of luma, because their true values will never match. This is why i used percentage for each luma level rather then their actual values.
ex:
Example, we have a pixel whose Grey = 200, then i know that the range of chroma/hue for this level is XXX and YYY. So if i have 160000 pixels attached to the grey=200, each one of them will have their unique value for chroma and hue, but displayed in percentage related to that specific level.
Pixel1 = Grey/Luma = 200, Hue = 85%, Chroma = 65% (True Ranges for hue/ chroma = XXX and YYY) . The percentage is related to the position of the values in the range of XXX to YYY
Pixel2 = Grey/Luma = 130, Hue = 85%, Chroma = 65% (True Ranges for hue/ chroma = ZZZ and KKK) The percentage is related to the position of the values in the range of ZZZ to KKK
Although they share the same percentage of Hue/Chroma, the two pixels refers to different levels of grey, each one of them with their certain ranges of chroma and hue. So, even if the percentages matches, this does not means that they have the same hue/chroma.
Doing this way i also found out that Grey is actually handled like any other color in the CieLab/CieLCH convertions. They also have their specific percentages and values for hue/chroma. What people is used to do is simply clip the values when foudn some grey color, but i didn´t limited it. I forced the function to calculate the precise values even if the pixel is grey, because, in fact, they do contains some values for hue and chroma.
This is to say, if i want to desaturate a image using the CieLCH colorspace, i don´t need to perform all the heavy computations. All i need to do is look on the proper tables for ther specific ranges.
So, if i know that a pixel is converted to grey (say: 124) when it have a hue = 15% and Chroma = 65%, all i have to do is use those values as a limit. So, if i want to desaturate a color, all i have to do is grab the pixel whose luma = 124 and decrease the values of Hue/Chroma until it reaches 15% of Hue and 65% of chroma if i want my pixel turns onto grey = 124
What i also found it that is interesting is the fact that certain colors are deeply attached to a grey level. Ex: The color red have specific ranges for grey while blue color have uses only others grey levels.
So, no matter how many pixels we have, all red will turn onto pixel 68, whereas light blue will always result in grey = 201 etc...We will never have a red resulting on grey of 201 or vice-versa. This is also handy for color classification, colorization or even color match techniques.
OK, let's assume that the dataset is sorted by two elements simultaneously. That means you can use the "binary" search, i.e. narrow down by halving the range until you got the element.
Now the problem is still speed: comparing REAL8 numbers is not very fast, because you do it in two steps. First the high dword, then the low one. Is there a specific reason why you are using REAL8, instead of (much faster) REAL4?
Hi JJ.
Indeed you were right about the order of the 2nd element. I converted the result to text to see it´s value. The 2nd member only makes sense to be sorted if the 1st one is duplicated. it won´t make much difference anyway, since i can simply create another table making it order by hue, if an extra table could be needed.
The only problems is that it do contains repeteaed Chroma (a very few ones, in fact. Less then 3% of the data or so, for some gray levels), but since the hue is sorted on those specific cases, then it wont affect the result i guess.
Luma: 127 - Chroma = 4.91902228406846 - Hue = 44.83984657131926
Luma: 127 - Chroma = 4.92021648723465 - Hue = 45.52085782555232
Luma: 127 - Chroma = 4.92106465082628 - Hue = 27.81011666729317
Luma: 127 - Chroma = 4.92175724228554 - Hue = 5.97894268127245
Luma: 127 - Chroma = 4.92217636717869 - Hue = 40.40063351810875
Luma: 127 - Chroma = 4.92270364859599 - Hue = 38.13873462681201
Luma: 127 - Chroma = 4.92317135437391 - Hue = 14.33676438731702
Luma: 127 - Chroma = 4.92400158699392 - Hue = 28.68843746495414
Luma: 127 - Chroma = 4.92409200831756 - Hue = 45.50254504647477
Luma: 127 - Chroma = 4.92513148351412 - Hue = 44.81169849491374
Luma: 127 - Chroma = 4.92617916980371 - Hue = 8.12017422689824
Luma: 127 - Chroma = 4.92629781474959 - Hue = 10.87254845163805
Luma: 127 - Chroma = 4.92720821669483 - Hue = 44.14771869516645
Luma: 127 - Chroma = 4.92728409414898 - Hue = 3.79537687982782
Luma: 127 - Chroma = 4.92797163083742 - Hue = 45.48423042700837
Luma: 127 - Chroma = 4.92913024597369 - Hue = 26.00484700886813
Luma: 127 - Chroma = 4.93045309288895 - Hue = 4.34774791311986
Luma: 127 - Chroma = 4.93051962287680 - Hue = 5.58126019413311
I made it in Real8 just for precision. Later, whern teh search is ok, i´ll analyse teh resultant values and also see if it will not change the image that much. So, maybe i can use Real4 insetad, but 1st it would be beetter fixing it so i can see if the usage of a Real4 can also result on accurated values.
About... " First the high dword," I don´t get it...the last time you didn´t used the high dword. You basically scanned the Real 8 using comisd before copying the value to XMM0. This is what you did previously:
Proc BinSearchReal8:
Arguments @pArray, @Value, @ElementsCount
Uses esi, edi, edx
mov edx D@Value
movq XMM1 X$edx
mov esi D@pArray
push esi
mov edx D@ElementsCount
lea eax D$esi+edx*8
@DoLoop:
cmp esi eax | jae @EndLoop
mov edi eax
sub eax esi
sar eax 04
lea eax D$esi+eax*8
movlps XMM0 X$eax
comisd XMM0 XMM1 | ja @DoLoop
lea esi D$eax+8
mov eax edi | jne @DoLoop
@EndLoop:
pop eax
comisd XMM0 XMM1
setne dl
movzx edx dl
sub esi eax
mov eax esi
sar eax 3
dec eax
EndP