HeapAlloc alignment

Started by xanatose, January 31, 2015, 06:49:28 PM


xanatose

Assuming the OS is a 64 bit windows (not 32 bit)

Is it safe to assume that HeapAlloc will return memory aligned to 16 bytes?

jj2007

Maybe for 64-bit code, but for 32-bit code it's still 8 bytes.

MichaelW

In my quick test, coded as a 64-bit app using Pelles C and running under Windows7-64, the alignment was 32 bits, strangely enough, but for higher alignments there are the _aligned_malloc and _aligned_offset_malloc functions, which should be readily callable from assembly code.

Edit: After my liver has had time to process the bottle of wine I drank, make that 32 bytes, and in further testing sometimes 16.

alignment.asm:

;----------------------------------
; poasm /AAMD64 /Gr alignment.asm
;----------------------------------

.CODE
 
_alignment PROC PARMAREA = 40
    xor   rax, rax    ; prep for return zero
    bsf   rcx, rcx    ; scan passed pointer from bit 0 for first set bit
    jz    @F          ; return if no set bit
    mov   rax, 1      ; set bit 0
    shl   rax, cl     ; shift left by index of first set bit
  @@: 
    ret
_alignment ENDP
END


#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
#include <conio.h>
//#include <malloc.h> // Per Pelles IDE "Use <stdlib.h> instead of non-standard <malloc.h>"

int _cdecl main(void)
{
    int _alignment(void *);

    void *p1 = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 1000000);
   
    printf("%d\n", _alignment(p1));

    void *p2 = _aligned_malloc(1000000, 128);

    printf("%d\n", _alignment(p2));

    HeapFree(GetProcessHeap(), 0, p1);

    _aligned_free(p2);

    _getch();
    return(0);
}


32
256
Well Microsoft, here's another nice mess you've gotten us into.
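For readers without POASM, the same lowest-set-bit trick can be sketched in portable C. The names pointer_alignment and is_aligned are made up for this illustration, and the sketch assumes that a pointer converted to uintptr_t is a plain linear address, which holds on mainstream Windows and Linux targets but is not promised by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest power of two that divides the address, i.e. the value of its
   lowest set bit. Returns 0 for a null pointer, like the asm routine. */
static size_t pointer_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (size_t)(addr & (~addr + 1u));   /* isolate the lowest set bit */
}

/* Nonzero if p sits on an n-byte boundary; n must be a power of two. */
static int is_aligned(const void *p, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return ((uintptr_t)p & (n - 1)) == 0;
}
```

Note that pointer_alignment reports what one particular address happens to be aligned to, which can be more than the allocator promises. A single lucky result therefore proves nothing about the guarantee.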

rrr314159

Sorry, didn't notice this q. b4.

Agner Fog has this to say on p.120 of his manual "Optimizing C++", 2013 (u can google it)
Quote: 12.8 Aligning dynamically allocated memory

Memory allocated with new or malloc is typically aligned by 8 rather than by 16. This is a problem with vector operations when alignment by 16 is required. The Intel compiler has solved this problem by defining _mm_malloc and _mm_free.
This statement should apply to 64 bit Windows since the document always mentions any differences between 32 - 64 OS's. Of course he could have missed it; but he's saying it's NOT safe to make your assumption. (No doubt new / malloc is calling HeapAlloc.) MichaelW's test indicates it's always aligned to 16; I don't suppose Pelles is doing that under the hood?

There's also the famous document "How to use Pageheap.exe in Windows XP, Win 2000, and Server 2003", an MS support page. I assume you're familiar with it, it's referenced all over the place. It says:

Quote: The Windows heap managers (all versions) have always guaranteed that the heap allocations have a start address that is 8-byte aligned (on 64-bit platforms the alignment is 16-bytes).

It also points out that this alignment can be circumvented:

Quote: NOTE: Some programs make assumptions about 8-byte alignment and they stop working correctly with the /unaligned parameter. Microsoft Internet Explorer is one such program.

But this doc is not official MS dogma, and it's pretty old.

As long as I'm on the topic, my notes from a while ago say this:

Quote: ...MSDN GlobalAlloc is the ONLY function that mentions alignment:

"Memory allocated with this function is guaranteed to be aligned on an 8-byte boundary." (applies to 32 and 64)

GlobalAlloc is strongly related to HeapAlloc. On a related page:

Quote: "Starting with 32-bit Windows, the global and local functions are implemented as wrapper functions that call the corresponding heap functions using a handle to the process's default heap."

I should re-check this but 2 lazy; my notes are probably correct. And, somewhere I saw that the minimum memory that can be allocated with 32-bit Windows is 8 bytes, but with 64-bit it's 16; didn't note where. That might imply (if you're optimistic) that heap alignment also changed from 8 to 16.

Putting it all together MS definitely does NOT guarantee 16-byte alignment (64-bit Win) but it may well be provided. But to be safe I would use _aligned_malloc, as MichaelW suggests, or similar.
I am NaN ;)
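The "use _aligned_malloc or similar" advice can be sketched as a small wrapper in C. The names alloc_aligned and free_aligned are hypothetical. On Windows the sketch calls _aligned_malloc and _aligned_free; elsewhere it assumes a C11 library that provides aligned_alloc.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>   /* MSVC declares _aligned_malloc / _aligned_free here */
#endif

/* Hypothetical wrapper: returns `size` bytes on an `align`-byte boundary
   (align must be a power of two), or NULL on failure. */
static void *alloc_aligned(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
#ifdef _WIN32
    return _aligned_malloc(size, align);
#else
    /* C11 aligned_alloc wants size to be a multiple of align, so round up. */
    size_t rounded = (size + align - 1) & ~(align - 1);
    return aligned_alloc(align, rounded);
#endif
}

/* Release a block obtained from alloc_aligned. */
static void free_aligned(void *p)
{
#ifdef _WIN32
    _aligned_free(p);   /* must not be handed to free() or HeapFree() */
#else
    free(p);
#endif
}
```

The point of the wrapper is that the alignment is requested explicitly, so the code no longer depends on what HeapAlloc happens to return.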

hutch--

It's trivial to align memory, what's the big deal?

rrr314159

I'm just dumping everything my notes say on the subject - one of many subjects. If you're doing dynamic mem allocation with XMM it can be important, especially using lib functions u can't directly control (YMM less problem unaligned). As usual the solution is hand-code in assembler to get exactly what u want ... also don't use dynamic allocation! I was at first but no longer, use .data? with "align 16" (as appropriate) instead, use same space for multiple structs when they don't conflict, re-use mem, put stuff on the stack, whatever's convenient. Eliminated various hard-to-trace bugs that cropped up with my amateur alloc'ing. But what do I know? - some people need aligned alloc, I suppose. "What's the big deal?" - don't ask me, ask an expert - e.g. yourself  :biggrin:

hutch--


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
    include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    memalign MACRO reg, number
      add reg, number - 1
      and reg, -number
    ENDM

    alignby equ <64>

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL hMem  :DWORD
    LOCAL pAlg  :DWORD

    mov hMem, alloc(8192+alignby)

    mov eax, hMem
    memalign eax, alignby
    mov pAlg, eax               ; pointer to aligned memory

    print str$(pAlg),13,10

    free hMem

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
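The memalign macro above rounds an address up with "add reg, n-1" followed by "and reg, -n". A minimal C sketch of the same over-allocate-and-round-up idea follows. The helper names align_up and malloc_aligned are invented for this example, and malloc stands in for the MASM32 alloc macro.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of align (a power of two):
   the C equivalent of  add reg, n-1  /  and reg, -n. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Over-allocate by `align` bytes and return an aligned pointer inside the
   block. *raw receives the original pointer, which is the one to free(). */
static void *malloc_aligned(size_t size, size_t align, void **raw)
{
    *raw = malloc(size + align);
    if (*raw == NULL)
        return NULL;
    return (void *)align_up((uintptr_t)*raw, align);
}
```

As in the assembly version, the caller has to keep both pointers: the aligned one for use and the raw one for freeing.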

jj2007

There is also StackBuffer(), which is guaranteed to be aligned for use with SIMD. As the name says, it uses the stack, and up to half a megabyte it is faster than HeapAlloc.

sinsi

http://www.masmforum.com/board/index.php?topic=16837.msg140127#msg140127

jj2007

Quote from: sinsi on February 07, 2015, 10:51:19 PM
http://www.masmforum.com/board/index.php?topic=16837.msg140127#msg140127

http://support.microsoft.com/kb/286470
Quote: on 64-bit platforms the alignment is 16-bytes
(Redmond speaking)

include \masm32\include\masm32rt.inc

.code
start:
    xor ebx, ebx                        ; loop counter
    .Repeat
        print str$(ebx), 9
        invoke HeapAlloc, rv(GetProcessHeap), 0, 4   ; 4-byte block, never freed
        test al, 15                     ; low 4 bits clear = 16-byte aligned
        .if !Zero?
            print hex$(eax), 9, "FOUL", 13, 10
        .else
            print hex$(eax), 9, "OK", 13, 10
        .endif
        inc ebx
    .Until ebx>40
    exit
end start


Win7-64:


15      002B3B50        OK
16      002B3B60        OK
17      002D50A8        FOUL
18      002D50D0        OK
19      002D50E0        OK


So the correct quote for our friends in Redmond should be:
Quote: on 64-bit platforms the alignment is 16-bytes, most of the time 8)

dedndave

out of curiosity....

what happens if you allocate sizes that are multiples of page size (4 KB) ?
i guess the heap is a collection of pages already allocated

sinsi

I read that to mean 64-bit code...

jj2007

Quote from: sinsi on February 08, 2015, 06:39:09 AM
I read that to mean 64-bit code...

"platform" is indeed a little bit ambiguous. Maybe somebody can test it with 64-bit code?

sinsi

Allocating 55 bytes at a time

0000000000363D40
0000000000363D80
0000000000363DC0
0000000000363E00
0000000000363E40
0000000000363E80
0000000000363EC0
0000000000363F00
0000000000363F40
0000000000363F80
000000000039DE70
000000000039DEE0
000000000039DF20
000000000039DF60
000000000039DFA0
000000000039DFE0
000000000039E020
000000000039E060
000000000039E0A0
000000000039E0E0
000000000039E120
000000000039E160
000000000039E1A0
000000000039E1E0
000000000039E220
000000000039E260
000000000039E2A0
000000000039E2E0
000000000039E320
000000000039E360
000000000039E3A0
000000000039E3E0
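sinsi did not post the code behind this listing. A hypothetical sketch of such a test in C might look like the following. The function name count_misaligned is made up, and malloc is used as a portable stand-in so the example builds anywhere; on Windows one would call HeapAlloc(GetProcessHeap(), 0, size) at the marked line instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate `count` blocks of `size` bytes, print each address, and return
   how many are NOT on an `align`-byte boundary (-1 if setup fails). */
static int count_misaligned(size_t size, int count, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    void **blocks = malloc((size_t)count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    int misaligned = 0;
    for (int i = 0; i < count; i++) {
        blocks[i] = malloc(size);       /* stand-in for HeapAlloc */
        if (blocks[i] == NULL)
            continue;                   /* skip failed allocations */
        printf("%p\n", blocks[i]);
        if (((uintptr_t)blocks[i] & (align - 1)) != 0)
            misaligned++;
    }

    /* Blocks are held until the end so every address is distinct. */
    for (int i = 0; i < count; i++)
        free(blocks[i]);
    free(blocks);
    return misaligned;
}
```

A call such as count_misaligned(55, 32, 16) returning 0 only shows what that heap did on that run. It is an observation, not a documented guarantee.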

rrr314159

@sinsi, jj2007 - proving once again, long posts don't get read :P I quoted that doc above (found it on the thread sinsi ref'ed). As I said, it's 2003, and NOT official dogma - just a casual aside in a tutorial for XP. Doesn't apply to modern OS's. U can't trust such a ref.

@hutch - you're still wondering, who cares? when (as you show) it's a trivial problem. The word u may be looking for is "pedantic". When it comes to pedantry, I'm an oldbie