Hi all,
; by Andrew Howe
outerloop: lea ebx,[edi+ecx*4]
mov eax,[edi]
cmploop: sub ebx,4
cmp eax,[ebx]
jle notyet
xchg eax,[ebx]
notyet: cmp ebx,edi
jnz cmploop
stosd
loop outerloop
I'am thinking of modifying this really interesting and old code from Andrew Howe to act on
a type of pointer sequence to arbitrary back-end data. Just before I re-invent
the wheel is there anything in MASM32 library that already does something like that ?
Envisioned :
Input: List of pointer sequences ->memory dword references (of known sequence length = n), that point to data structure with discreet element of interest to compare.
Process: Use bubble-sort code to swap pointer reference as mechanism of sort.
Output: List of pointer sequences with sorted underlying referenced data.
Thanks
Raistlin
:biggrin:
Make me wiser here, unless you are sorting very close to fully sorted data, there is not much a bubble sort can do that anything else will not do better.
@hutch--
Yes the underlying data is pretty much "generally" sorted - RE: APIC data as enumerated per CPU Core/Package.
However the sequence of data reported is specifically NOT guaranteed to be sorted by Intel/AMD. In practice
however they are pretty much already sorted. Sooooo... does this help to focus the application of such as choice?
Is the data numeric or string ? Next question, if the data is numeric, does it fit within a number range, say 0 to 10 million ? If so forget exchange sorts, do what is called a pidgeon hole sort, make an array big enough to handle the range then do one pass incrementing each value in the array as it is found in the data. Once that is done, do a single pass through the array writing the results in order.
@hutch--
Yes, the data for comparison is numeric in the range 0 - 255 held in dword memory type.
However the data is part of larger data structure with other data types included.
I was thinking of swapping the memory pointer to the data rather than the data;
based on the underlying data comparison per iteration. May-hap I'am confusing the issue?
Quote from: Raistlin on October 18, 2016, 07:23:53 PM
However the data is part of larger data structure with other data types included.
Fixed or variable size?
@jj2007
Fixed structure, determinant on number of CPU APIC's identified we heap allocate pointer's to each destination struct of type:
ApicID dw ?
CPUPackageNumber dw ?
CPUCoreNumber dw ?
CPUSMTNumber dw ?
Ideally I'd like to sort to each Package - then each Core within a package - then each hardware thread within a Core.
You always swap pointers unless you must write data in sorted order to disk as you do with some databases but the next question is, what is the count size that you must sort ? If its only a few hundred then anything will do.
Raistlin, I'll ignore the question whether what you want to do is the best way to do it. To modify the code is easy. Assume that edi points to the list of pointers to the data to be compared. (You'll have to create it if it doesn't already exist). Presumably edi points to the beginning of the record, but the datum you want to compare ("the data for comparison is numeric in the range 0 - 255 held in dword memory type") is at an offset, let's call it "offset_to_datum".
Ok, the only place in the code you have to change is "cmp eax, [ebx]". eax is a pointer, and ebx points to the next pointer in the the list. In this simple code he's comparing them directly but you want, instead, to compare the two "datums". So you could do the following. First "mov edx, offset_to_datum[eax]". That puts the datum of eax's record, into edx. Then two statements for ebx: "mov esi, [ebx]" , "mov esi, offset_to_datum[esi]". I.e. first put the pointer pointed to by ebx into esi, then get the associated datum. Now finally change the comparison to "cmp edx, esi".
Since I'm not testing this code there's probably something wrong with it, but it shows the basic idea. And of course there are other ways to do it, a bit more efficient, but this is the clearest way I could think of.
Finally to answer your question "is there anything in MASM32 library that already does something like that", I don't think so. But it's a lot easier to reinvent the wheel, than find it.
Thanks all - all done = might be able to generalize the outcome as a macro or proc call for include in sdk.
Worked out pretty much exactly how rrr described it.
@rrr314159 - you never cease to amaze :t
I see that this code use xchg. Have any benchmark about xchg? I tried to find on this board but no luck.
My test show that if xchg r32,r32 is more quickly than the next one, but if we insert memory values so next code appears more quickly.
xor eax,[ebx]
xor [ebx],eax
xor eax,[ebx]
xchg is slow with memory, that's been discussed on the board a few times. But in this case speed is not a big issue.