MSVCRT aligned malloc

Started by MichaelW, July 28, 2012, 05:46:26 PM

TouEnMasm

I don't see where the problem is.
All the results seem perfectly predictable; perhaps a more telling sample would help?
If the alignment is larger than needed, that is good, no?
Quote
------- would have to base the alignment on the specified size ---------
The alignment depends only on the value of the address, that's all.

Fa is a musical note to play with CL

MichaelW

Quote from: ToutEnMasm on April 29, 2013, 11:35:27 PM
Quote
------- would have to base the alignment on the specified size ---------
The alignment depends only on the value of the address, that's all.

http://msdn.microsoft.com/en-us/library/vstudio/ycsb6wwf.aspx

Quote
malloc is guaranteed to return memory that's aligned on a boundary that's suitable for storing any object that could fit in the amount of memory that's allocated. For example, a four-byte allocation would be aligned on a boundary that supports any four-byte or smaller object. Memory alignment on a boundary that's suitable for a larger object than will fit in the allocation is not guaranteed.

I interpret this to mean that an 8-byte allocation would have an 8-byte alignment, a 16-byte allocation a 16-byte alignment, a 32-byte allocation a 32-byte alignment, and so on. Here are my results under Windows XP for malloc when linking with MSVCRT.lib:

Size    Alignment (bytes)
16      16
32      8
64      16
128     8
256     64
512     64
1024    8
2048    16
4096    8


And for malloc when linking with LIBC.lib:

Size    Alignment (bytes)
16      32
32      8
64      32
128     8
256     16
512     8
1024    16
2048    8
4096    64
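
A measurement like the ones above could be sketched roughly as follows. This is not the test program that produced those numbers; the helper names pointer_alignment and report_malloc_alignment are made up for illustration, and "alignment" is assumed to mean the largest power of two that divides the returned address.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Largest power of two that divides the address, i.e. the effective
   alignment of the pointer in bytes. Isolates the lowest set bit.
   Returns 0 for a null pointer. */
uintptr_t pointer_alignment(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a & (~a + (uintptr_t)1);
}

/* Allocate each power-of-two size from 16 to 4096 once and print
   "size alignment" pairs. The output depends on the C runtime and
   on the current state of the heap. */
void report_malloc_alignment(void)
{
    size_t size;
    for (size = 16; size <= 4096; size *= 2) {
        void *p = malloc(size);
        if (p == NULL)
            continue;
        printf("%-8lu%lu\n", (unsigned long)size,
               (unsigned long)pointer_alignment(p));
        free(p);
    }
}
```

Calling report_malloc_alignment() from main prints one line per size; the values will differ between runtimes and between runs, which is the point being made in this thread.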

Well Microsoft, here's another nice mess you've gotten us into.

TouEnMasm

I agree that Microsoft's explanation is unclear.
The alignment has nothing to do with the size of the allocated memory.
Perhaps Microsoft means that you need to choose an alignment for each size
of object you want to be aligned.
A 4-byte object needs a 4-byte alignment.
An 8-byte object needs an 8-byte alignment, and so on.


The _aligned_.. functions always give the right result.
If you use the plain (unaligned) malloc function, there is a minimum alignment of 8, but it is not guaranteed.
The alignment of malloc is >= 8 (a result established only by testing).
All alignments > 8 are just random and depend on what is in memory.

If you change the library, you change the contents of memory, and the random part of the alignment (> 8) changes with it.
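
For reference, requesting the alignment explicitly might look like the sketch below. The wrapper names alloc_aligned and free_aligned are hypothetical; on Windows they forward to the CRT's _aligned_malloc and _aligned_free, and on other systems they fall back to posix_memalign (an assumption made only so the example builds outside Windows). The alignment is assumed to be a power of two and at least sizeof(void *).

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>               /* _aligned_malloc, _aligned_free */
#endif

/* Return a block of `size` bytes whose address is a multiple of
   `alignment`, or NULL on failure. */
void *alloc_aligned(size_t size, size_t alignment)
{
#ifdef _WIN32
    return _aligned_malloc(size, alignment);
#else
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;
#endif
}

/* Blocks from alloc_aligned must be released with free_aligned;
   on Windows, passing them to free() is not valid. */
void free_aligned(void *p)
{
#ifdef _WIN32
    _aligned_free(p);
#else
    free(p);
#endif
}
```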

Fa is a musical note to play with CL

hutch--

 :biggrin:

Looking at a complicated mess like this is much the reason why I use an API memory allocation strategy and align it myself.  :P
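
A "align it myself" approach can be sketched in C along these lines. This is only an illustration, not hutch's actual code: malloc stands in for whatever API allocator is used (HeapAlloc, GlobalAlloc and so on), and the names manual_aligned_alloc and manual_aligned_free are made up. The idea is to over-allocate, round the address up, and keep the raw pointer just below the aligned block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Over-allocate, round the address up to `alignment`, and store the
   raw pointer in the slot just below the aligned block.
   Assumes alignment is a power of two and >= sizeof(void *);
   overflow of the size computation is not checked in this sketch. */
void *manual_aligned_alloc(size_t size, size_t alignment)
{
    void *raw;
    uintptr_t start, aligned;

    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);

    raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Leave room for the bookkeeping slot, then round up. */
    start = (uintptr_t)raw + sizeof(void *);
    aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    ((void **)aligned)[-1] = raw;   /* remember what to free */
    return (void *)aligned;
}

/* Release a block obtained from manual_aligned_alloc. */
void manual_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

The cost is up to alignment - 1 + sizeof(void *) extra bytes per block, in exchange for an alignment that no longer depends on the allocator.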

jj2007

Quote from: hutch-- on April 30, 2013, 03:20:03 PM
Looking at a complicated mess like this is much the reason why I use an API memory allocation strategy and align it myself.  :P

That is certainly a good strategy, Hutch, but it fails completely when you are using third party software (in this case: GNU Scientific Library).

By the way, we had an earlier thread about this.

Thanks to everybody, especially Michael for the tests :icon14:

japheth

Quote from: jj2007 on April 30, 2013, 04:25:26 PM
That is certainly a good strategy, Hutch, but it fails completely when you are using third party software (in this case: GNU Scientific Library).

Not necessarily. You can intercept the malloc/free memory allocation functions and replace them with your own ones.

However, I'm not sure whether you're ready for such slightly advanced stuff.

jj2007

Quote from: japheth on April 30, 2013, 05:33:57 PM
However, I'm not sure whether you're ready for such slightly advanced stuff.

O Greatest Coder of All Time, your unworthy pupil bows before you :eusa_boohoo:

Of course, I am not a professional coder like you, just a humble economist, and as such I will have to balance the effort of diving into the fascinating world of hooking HeapAlloc against the relative ease of checking the pointer and deciding whether to use movaps or movlps & movhps.
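
In C terms, the "check the pointer" option might look like the sketch below. It assumes an x86 target with SSE and uses the xmmintrin.h intrinsics _mm_load_ps, _mm_loadl_pi and _mm_loadh_pi, which normally map to movaps, movlps and movhps, although the compiler is free to pick other instructions. The names is_aligned16 and load4 are invented for the example, and a plain scalar copy is used when SSE is not available.

```c
#include <stdint.h>
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Non-zero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Copy four floats from src to dst, picking the load by alignment.
   dst may be unaligned. */
void load4(const float *src, float *dst)
{
#if HAVE_SSE
    __m128 v;
    if (is_aligned16(src)) {
        v = _mm_load_ps(src);                          /* aligned load */
    } else {
        v = _mm_setzero_ps();
        v = _mm_loadl_pi(v, (const __m64 *)src);       /* low two floats */
        v = _mm_loadh_pi(v, (const __m64 *)(src + 2)); /* high two floats */
    }
    _mm_storeu_ps(dst, v);
#else
    int i;
    for (i = 0; i < 4; i++)   /* scalar fallback without SSE */
        dst[i] = src[i];
#endif
}
```

The branch costs one test per pointer, so in a loop it is usually worth checking the base address once and then running either the aligned or the unaligned variant for the whole array.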

MichaelW

The older statement:

"The storage space pointed to by the return value is guaranteed to be suitably aligned for storage of any type of object."

With the assumption that the object must fit in the storage space, could also be interpreted in the same way, but AFAIK malloc has never behaved this way.

Well Microsoft, here's another nice mess you've gotten us into.

qWord

Quote from: jj2007 on April 30, 2013, 09:03:29 PM
the fascinating world of hooking HeapAlloc
The GNU GPL doesn't prohibit you from modifying the available source code.
MREAL macros - when you need floating point arithmetic while assembling!

nidud

#24
deleted

jj2007

Quote from: qWord on April 30, 2013, 11:42:00 PM
Quote from: jj2007 on April 30, 2013, 09:03:29 PM
the fascinating world of hooking HeapAlloc
The GNU GPL doesn't prohibit you from modifying the available source code.

Well, that's another option, but it means wasting a lot of time trying to find the exact configuration of whatever C compiler they are using. Besides, it's Linux...
The internet is full of desperate posts asking "what does that error message mean? where do I find the xyx header file?" etc etc - far too much trouble for saving a cycle on movaps vs movlps+movhps.

@nidud:
> Think it will be easier to align the pointer
Yes, I use the technique in Recall. But that's my own source, and it is assembler...

japheth

Quote from: jj2007 on April 30, 2013, 09:03:29 PM
Quote from: japheth on April 30, 2013, 05:33:57 PM
However, I'm not sure whether you're ready for such slightly advanced stuff.

O Greatest Coder of All Time, your unworthy pupil bows before you :eusa_boohoo:

Of course, I am not a professional coder like you, just a humble economist

What's your problem, jj? You've shown quite a few times that your skills in C are limited and that you're a bit biased.

Btw, I'm also an "economist" ( Diplom-Volkswirt ).


nidud

#27
deleted

jj2007

Quote from: nidud on May 01, 2013, 04:09:00 AM
> But that's my own source, and it is assembler...
So what is the problem then?

Recall is my own source, gsl_matrix_alloc isn't.

Quote
> In short: I need a 16-byte aligned matrix.

The functions in these libraries test the alignment of pointers

They don't. GSL just doesn't use SSE, that's why I need to make it faster:
tmp+=gsl_matrix_get(mxM, 0, i)*gsl_matrix_get(mxC, j, i);
Looks simple but when you see it through the eyes of Olly you'll get the creeps.

Re changing the GSL source code: Wow, nice idea.

There are 350+ examples in \Masm32\examples, and exactly one does not assemble "out of the box" (because Bill chose a strange name for rc files, so you need to handle that). That is what I call quality code, and in this respect, Hutch is a hero for me :t

In contrast, in my numerous attempts to get other coders' C proggies running, I have never ever experienced a case where it just compiled without throwing some obscure error messages. I am just not enough of a masochist to attack the GSL source code; it's a waste of time.

Besides, movaps and movlps/movhps show exactly the same cycle count, at least on my trusty old Celeron. In fact, the whole point of this thread was to demonstrate that Microsoft is lying when they write "malloc is required to return memory on a 16-byte boundary" (VS 2005, VS 2008).

nidud

#29
deleted