Managing data structures: lists

jj2007 · February 02, 2024, 07:10:10 AM

Quote from: NoCforMe on February 02, 2024, 06:41:58 AMWelll, regarding the Masm32 library, the only function I've ever used from that

I see you made good use of our libraries in your Lissajous project

Code Select

	INVOKE	atodw, ADDR buffer	; line 380
	INVOKE	atodw, ADDR buffer	; line 389
	INVOKE	FpuAtoFL, ADDR buffer, ADDR floatField1, DEST_MEM ; line 441
	INVOKE	FpuSin, ADDR floatField1, ADDR floatField2, SRC1_REAL
	INVOKE	FpuMul, ADDR floatField2, SinScaleValue, ADDR floatField1...
	INVOKE	FpuTrunc, ADDR floatField1, ADDR xPos, SRC1_REAL OR DEST_IMEM
	INVOKE	FpuAtoFL, ADDR buffer, ADDR floatField1, DEST_MEM
	INVOKE	FpuSin, ADDR floatField1, ADDR floatField2, SRC1_REAL
	INVOKE	FpuMul, ADDR floatField2, SinScaleValue, ADDR floatField1,...
	INVOKE	FpuTrunc, ADDR floatField1, ADDR yPos, SRC1_REAL OR DEST_IMEM

NoCforMe · February 02, 2024, 07:48:10 AM

Oh yeah, forgot about Raymond's indispensible FPU stuff. I've since learnt how to use the FPU directly, but yes, they definitely come in handy (if you're not that concerned about speed).

NoCforMe · February 04, 2024, 04:28:25 PM

Here's my first cut of the LM (list management) package. Testbed program included.

Functions:

LMinit(): Allocates heap memory for list
LMinsert(): Inserts a new list element into list
LMdelete(): Deletes a list element
LMmove(): Moves a list element up or down in list
LMreplace(): Replaces a list element's contents
LMdealloc(): Deallocates heap

There's a "master structure" (LMINFO) which contains data needed for all operations, created by the caller and passed to each function. The structure contains "bookkeeping" members which give the number of elements and bytes used and free.

The testbed program (LMtest) lets you exercise these functions (except for LMdealloc()). Try it out. It's been minimally tested, seems to work OK so far. To insert new elements, select the element you want to insert before in the listbox, type some identifying text into both edit fields, select LMinsert() in the combobox and hit the "Make It So" button. (Select "LMinsert($LM_InsAtEnd)" to insert at the end of the list.) The maximum # of elements is set to 10 so I could see if the code that checks for a full list works. You can change this in the source, of course.

Let me know what you think. One thing: the move function (LMmove()) moves one element up (towards the beginning of list) or down one "notch". This suits my purposes for this package. I thought of making this function able to move any element to any "slot", which I could do, but I'm not sure this would actually be useful. Let me know your thoughts on this.

Biterider · February 04, 2024, 05:33:38 PM

Hi NoCforMe
Really good stuff! Well written, documented and works well.

Nice that you wrote an application to test it.

Looking at the code, I see that you are using different methods to move the data.
LMinsert is where most of the fun happens. To open a gap you use a loop that moves bytes and in other cases you use "REP MOVSB".
Depending on the data size you are aiming for, there are some better and more efficient (speed-wise) methods to do this.

Well done!

Regards, Biterider

NoCforMe · February 04, 2024, 07:43:20 PM

Thanks. I should point out that for this code I really don't care that much about speed, at least whatever microseconds of difference there may be between using MOVs and using REP MOVSBs. The reason I used MOVs in LMinsert() is because I have to step backwards through the list, whereas in LMmove() and LMdelete() I can use the string instruction since I'm going forwards. (Yes, I realize I could set the direction flag to go backwards, but I like the way I'm doing it here better.)

With any uses of this package that I can think of, I can't see any good reason to be obsessed with speed here. I did try to be conscientious about error checking. If someone wants to use this and optimize it for speed, be my guest.

jj2007 · February 04, 2024, 08:57:05 PM

Quote from: NoCforMe on February 04, 2024, 04:28:25 PMLet me know your thoughts on this.

Solid work

Re speed, I'm using RtlMoveMemory for Insert(), which in turn uses rep movsd under the hood. The movdqa + movntdq combo is faster for huge allocations, but I wouldn't lose time over it ;-)

Quote from: jj2007 on December 14, 2021, 03:54:57 AMI changed my testbed to one big allocation (0.5GB), in order to minimise cache use. Now movntdq shines, of course, but rep movs is only about 15% slower:

Quote from: hutch-- on December 14, 2021, 09:01:47 AMIntel(R) Core(TM) i7-5820K CPU @ 3.30GHz (SSE4)
242232 kCycles for 1 * rep movsb
215571 kCycles for 1 * rep movsd
227473 kCycles for 1 * movlps qword ptr [esi+8*ecx]
230151 kCycles for 1 * movaps xmm0, oword ptr [esi]
146063 kCycles for 1 * movdqa + movntdq
146242 kCycles for 1 * movdqu + movntdq
146090 kCycles for 1 * movdqu + movntdq + mfence

NoCforMe · February 05, 2024, 08:48:18 AM

So here's the thing about my data-movement code: yes, I know that if speed were an issue here, I could make it go mo' faster by, say, moving DWORDs, or even QWORDs instead of measly bytes.

However, that would complicate things greatly; now before any move you have to figure out how many units of what size to move, and who knows? that code alone might reduce some of your speed advantage. (I suppose you could figure this out in advance, say in the initialization routine, set a flag and then branch to the appropriate move code when needed.)

And what if the "elements" you're working with are like this?

Code Select

SomeStruct    STRUC
   bigD        DD?
   lilW        DW?
SomeStruc    ENDS

Now you're limited to moving words. Worse yet:

Code Select

SomeStruct    STRUC
   bigD        DD?
   lilB        DB?
SomeStruc    ENDS

which puts you back to moving bytes (or some super-complex scheme where you move a DWORD and then a byte). Sure, you could say this is poor data design, but who knows? maybe this is just what the user needs.

sinsi · February 05, 2024, 10:23:12 AM

That's one reason I suggested you use a block of memory to hold pointers to your objects, it's easier to move a few DWORD pointers than arbitrarily-sized blocks. This way you could also store objects of different sizes, like strings.

NoCforMe · February 05, 2024, 10:52:44 AM

That is a really good idea: add another level of indirection to the list of structures, just a flat pointer list which would add 4 bytes per element, not a huge burden. Adding elements would be easy, and of course rearranging the list by moving pointers rather than structure elements.

However, element deletion would still be a problem: easy to delete from the pointer list, but you'd either have to waste structure space by "stranding" them, or else rearrange the structure list itself (as I'm doing now) to salvage that space and then go through the pointer list and update it. In any case, do-able.

sinsi · February 05, 2024, 11:14:53 AM

If you don't use any fancy switches for Windows, the high bit of an address is free to use since Windows will only allocate memory in the first 2GB. You could use that bit as a flag to indicate an empty slot and reuse it (and the old objects memory), or you could wait until you run out of slots and then do the garbage collection.

NoCforMe · February 05, 2024, 01:29:53 PM

One reason for this whole thread was that I wanted to show what I had to go through to figure out the mechanics of handling lists, specifically inserting new elements. (Deleting elements poses similar problems.) Every time I implement a scheme like this I have to sit down with paper and pencil and draw things out so I can figure out how to set up source and destination pointers and move counts. Which was kind of the motivation to code up a package that would do this and eliminate all that head-scratching.

So I'm "showing my work" here:

I'm wondering how many of you can solve problems like this without resorting to doodling on paper. To those who can, my hat is off to you. I guess I'm just a mere mortal in this regard; obviously I'm not about to win any Nobel Prizes for advanced physics.

Biterider · February 05, 2024, 06:33:12 PM

Hi NoCforMe
These data management constructs (lists, vectors, collections, ...) are the cornerstones of more advanced applications and are well worth the time to develop.

As a source from where you can take some code and ideas I can offer you
https://github.com/ObjAsm/ObjAsm-C.2/blob/master/Code/Objects/Collection.inc
https://github.com/ObjAsm/ObjAsm-C.2/blob/master/Code/Objects/DataCollection.inc

or the recent code template discussion about vectors here
https://masm32.com/board/index.php?topic=11659.0

A more advanced concept used in combination with IOCP
https://github.com/ObjAsm/ObjAsm-C.2/blob/master/Code/Objects/DataPool.inc

Regards, Biterider

jj2007 · February 05, 2024, 09:00:07 PM

Quote from: NoCforMe on February 05, 2024, 01:29:53 PMI'm wondering how many of you can solve problems like this without resorting to doodling on paper.

I can! But it costs me a lot of time to fumble code that would have been crystal clear with a piece of paper. Unfortunately I am too lazy to sit down and use a pencil.

NoCforMe · February 06, 2024, 07:06:13 AM

If that's the case, JJ, I heartily recommend next time that you overcome your laziness and take out paper and pencil. Amazing how visualizing things can help you zero in on the correct solution quickly.

NoCforMe · February 06, 2024, 02:55:05 PM

Continuing on, this is a part of my storage scheme that involves heap management, not list management, but since it's related I'm describing it here.

Here's what that list I've been playing with is used for: basically storing two strings, one a "display" string that shows on-screen, the other the "insert" string that gets inserted into the editor text when the user selects it. Pretty simple.

But now I'm adding a configuration editor to this scheme, which means that 1) strings can change length, becoming either shorter or longer, and 2) strings can be deleted. This poses problems.

If a string grows longer, it obviously can no longer live where it does in the heap, so the current heap entry will need to be abandoned (deleted) and a new one allocated. A shorter string can occupy the same space, perhaps with some left over (wasted).

So what do we do if a string grows or is deleted? Well, the easiest thing would be to do nothing, assuming we have enough room in the heap to waste space. But what if we want to recover that space?

One thing that occurred to me (actually last night in bed when I should have been sleeping!) was to use the string space in the heap to insert the string length; after all, we no longer need the string contents and can overwrite them, but we do need to know one crucial piece of info, namely how long the string is. So we can insert a 2-byte (WORD) length value there. Then when we need to allocate a new string (or when our garbage collector comes through the heap) we can use that value to see if there's enough space for the new string. (To make that work we'd have to "tag" the first byte of that word by setting the high bit to 1, and then be sure to strip off that bit when taking the value. This would work since no valid ASCII chararcter, at least in my scheme here, will have its high bit set.)

But there's a problem there (more lost sleep): We would have to ensure that there are at least 2 bytes available in each string. This isn't a problem for a non-NULL string, the shortest of which is 2 bytes, but what about a NULL string? That could just be a single byte (0). So we'd have to make sure that each string is at least 2 bytes long. That could work.

BTW, just to be clear, the length I'm talking about is not the string length obtained with strlen() but that value plus one to include the NULL terminator. (We're dealing with ASCII strings here, not Unicode.)

Just some more of the sticking points when designing storage strategies ...

The MASM Forum

News:

Managing data structures: lists

jj2007

NoCforMe

NoCforMe

Biterider

NoCforMe

jj2007

NoCforMe

sinsi

NoCforMe

sinsi

NoCforMe

Biterider

jj2007

NoCforMe

NoCforMe