This is a slightly modified heap allocation routine that I found on the forum. I was curious: if HeapCreate succeeds and HeapAlloc fails,
is the HeapFree call still valid the way it is written? I would think that lp_HeapAlloc is invalid at that point.
Create_Buffer PROC
    mov dw_HeapFail, 0 ; no error yet
    invoke HeapCreate, 0, 0, 0
    mov lp_HeapCreate, eax
    .IF ( eax == NULL )
        mov dw_HeapFail, 1
        jmp Exit_CreateBuffer
    .ENDIF
    invoke HeapAlloc, lp_HeapCreate, HEAP_ZERO_MEMORY, MB2
    mov lp_HeapAlloc, eax
    .IF ( eax == NULL )
        mov dw_HeapFail, 1
        invoke HeapFree, lp_HeapCreate, 0, lp_HeapAlloc
    .ENDIF
Exit_CreateBuffer:
    ret
Create_Buffer ENDP
If HeapAlloc fails then nothing has been allocated, so there's nothing for you to free :t
Is there a reason you're creating a new heap? You can just call GetProcessHeap instead.
Also, all those global variables aren't the best...
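A minimal sketch of the routine along the lines suggested above, assuming the process heap is acceptable for this buffer. `hHeap` is a hypothetical global added for illustration; `dw_HeapFail`, `lp_HeapAlloc` and `MB2` are the names from the original routine.

```asm
; Sketch only: hHeap is a hypothetical global DWORD, not part of the original code.
Create_Buffer PROC
    mov dw_HeapFail, 0                  ; no error yet
    invoke GetProcessHeap               ; default heap: nothing to create or destroy
    mov hHeap, eax
    invoke HeapAlloc, eax, HEAP_ZERO_MEMORY, MB2
    mov lp_HeapAlloc, eax
    .IF ( eax == NULL )
        mov dw_HeapFail, 1              ; nothing was allocated, so there is nothing to free
    .ENDIF
    ret
Create_Buffer ENDP
```

The failure branch only records the error. HeapFree is called later, once, on a pointer that HeapAlloc actually returned.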
i use global variables :P
during program init, i GetProcessHeap and store the hHeap in a global
unless you plan on freeing the block in the same PROC as it was allocated, global storage is useful
Yeah, I use globals because I can access them from other procs.
Also, I am too used to the old days when stack overflow was a problem, and it is my understanding that local variables are stored on the stack.
I was going to use GlobalAlloc, but according to the documentation Microsoft considers that API to be deprecated and recommends the heap functions instead. This is my first experience with heap calls and I am unfamiliar with the size limitations on the process heap, so I figured that it was best just to create a new one.
Thanks for your input. It is appreciated.
i generally use HeapAlloc because it is simple and fast
there are cases where they tell you to use GlobalAlloc
(functions involving data streams, or the clipboard)
i wouldn't be too concerned about the "deprecated" message
i suspect GlobalAlloc is going to be around for a long time
for very large allocations, VirtualAlloc may be used (perhaps as a fall-back method)
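As a rough sketch of the VirtualAlloc route for very large blocks: the procedure names `Alloc_Big` and `Free_Big` are made up for illustration, and the constants are the usual ones from windows.inc.

```asm
; Hypothetical helper procs, 32-bit MASM32 style.
Alloc_Big PROC cbSize:DWORD
    ; reserve and commit in one call; the size is rounded up to whole pages
    invoke VirtualAlloc, NULL, cbSize, MEM_RESERVE or MEM_COMMIT, PAGE_READWRITE
    ret                                 ; eax = base address, or NULL on failure
Alloc_Big ENDP

Free_Big PROC lpMem:DWORD
    ; with MEM_RELEASE the size argument must be 0
    invoke VirtualFree, lpMem, 0, MEM_RELEASE
    ret
Free_Big ENDP
```

Memory committed this way is zero-filled by the system, so no extra clearing pass is needed.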
Thanks.
My old memory is failing again, but I think I used to use HeapCreate instead of GetProcessHeap because of the possibility of heap fragmentation. If it happens, I can always destroy my heap and create a new one??
James
I don't think it's something you need to worry about.
If you're in a position to be able to delete your private heap, then presumably you either have or can free all memory allocations that belong to it. In which case, there will be no fragmentation.
Each process has its own memory space, and only you can allocate memory from that. So the only fragmentation is caused by your allocations. If you free everything then there are no fragments left to cause a problem.
(Note: some memory is allocated for loading the exe & dlls, but this exists regardless and shouldn't increase fragmentation risk.)
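For what the "delete my heap and create a new one" idea might look like, here is a sketch that assumes the private heap handle lives in `lp_HeapCreate` as in the first post. `Reset_Heap` is a hypothetical name.

```asm
; Sketch only: every pointer obtained from the old heap is invalid after this call.
Reset_Heap PROC
    invoke HeapDestroy, lp_HeapCreate   ; frees every block in the private heap at once
    invoke HeapCreate, 0, 0, 0          ; new, empty, growable heap
    mov lp_HeapCreate, eax              ; NULL here means HeapCreate failed
    ret
Reset_Heap ENDP
```

HeapDestroy does not require the individual blocks to be freed first, which is the main convenience of a private heap.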
HeapCompact hasn't been mentioned yet - it can deal with fragmentation when the problem comes up.
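A short sketch of the call, assuming a global `hHeap` (hypothetical) that holds a valid heap handle. `Compact_Heap` is an illustrative name.

```asm
Compact_Heap PROC
    ; coalesces adjacent free blocks and decommits large free ones
    invoke HeapCompact, hHeap, 0
    ret                                 ; eax = size of the largest committed free block, 0 on failure
Compact_Heap ENDP
```

It does not move blocks that are in use, so it cannot close gaps between live allocations.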
@ Don57
Where did you find that routine? I searched the forum and could not find it...
I would like to see the full example.
Also, does anybody have a simple example of allocating heaps?
I am still not sure when it is better to use the stack with local variables or the heap with global (dynamic) allocation.
If I am using global variables, I would like to know when it is proper to allocate the blocks.
I am looking at my old example of animating regions (which got so slow with more frames),
and I suspect this has something to do with it!
I'm not sure that creating extra heaps is what the authors of the Windows memory manager had in mind - how is a new heap mapped into the flat memory space?!
It is a potential capability, and I suppose it is only meant for special versions with an extended system memory manager, or for servers. A process has one heap, returned by GetProcessHeap - the others are used with system additions.
Quote from: hfheatherfox07 on February 03, 2013, 06:11:45 AM
I am still not sure when it is better to use the stack with local variables or the heap with global...
There is a compromise: Use the .data? section. Nothing is faster ;-)
include \masm32\include\masm32rt.inc
FatBufLen= 10000000h ; one big buffer
ForLocals= 10000h ; plus some space for other Locals
.data?
align 16
TheBase LABEL byte
ORG $+FatBufLen+ForLocals-1
db ?
TheEnd LABEL byte
.code
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
start proc noargs
LOCAL somedword, rc:RECT, myR8:REAL8, FatBuffer[FatBufLen]:BYTE
mov ebp, offset TheEnd
mov somedword, 12345678h ; use the locals
mov rc.left, 12h ; we'll print them below
mov rc.top, 34h
mov rc.right, 56h
mov rc.bottom, 78h
lea esi, FatBuffer
mov ecx, FatBufLen
push ecx
print str$(ecx), " bytes allocated", 13, 10
pop ecx
sar ecx, 2 ; DWORDs needed for lodsd
.Repeat
lodsd
.if eax
INT 3 ; just a quick check whether the OS initialises the 256 MB properly (it does...)
.endif
mov [esi-4], ecx ; let's write something to our 256MB buffer
dec ecx
.Until Zero?
print hex$(rc.left), 9, "rc left", 13, 10
inkey hex$(rc.bottom), 9, "rc bottom", 13, 10
exit
start endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
end start
Output:
268435456 bytes allocated
00000012 rc left
00000078 rc bottom
as far as where the memory "comes from",
there is probably little difference in allocating from the stack or from the heap
in other words, your process only gets so much total RAM :P
if you allocate from the stack, you have to take care of detecting "insufficient memory" errors
whereas if you allocate from the heap, it tells you that there was too little memory available by returning 0
freeing stack allocated memory is a little simpler
and - the stack is a bit faster
note that you may have to probe the stack down for anything other than very small allocations
it can still be faster than HeapAlloc
heap allocation may be used globally in different PROC's
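To make the comparison concrete, a sketch of the same 256-byte buffer taken both ways. The procedure names are hypothetical, and `hHeap` is assumed to be a global holding the handle from GetProcessHeap.

```asm
; Stack: space is reserved by the PROC prologue and disappears at ret.
Use_Stack PROC
    LOCAL buf[256]:BYTE
    lea eax, buf
    mov BYTE PTR [eax], 0               ; buf is valid only until this PROC returns
    ret
Use_Stack ENDP

; Heap: the block outlives the PROC and can be shared between PROCs.
Use_Heap PROC
    invoke HeapAlloc, hHeap, HEAP_ZERO_MEMORY, 256
    ret                                 ; eax = pointer (free it later with HeapFree), or NULL
Use_Heap ENDP
```

A buffer this small needs no stack probing. That only matters once a single PROC takes more than about a page of locals.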
Considering the amount of memory that most recent systems have installed, instead of probing the stack and incurring an exception-handling delay each time you probe into the guard page, why not just tell the linker to increase the stack size, setting the reserve and commit values to whatever you anticipate needing?
i don't see a big delay for probing into the guard page
but - you're right - why probe at all if you know how much you will be needing
Quote from: MichaelW on February 03, 2013, 10:30:27 AM
Considering the amount of memory that most recent systems have installed, instead of probing the stack and incurring an exception-handling delay each time you probe into the guard page, why not just tell the linker to increase the stack size, setting the reserve and commit values to whatever you anticipate needing?
You mean put your buffers in the .data section? And initialize them?
no - he means alter the stack allocation with a linker switch
i do measure a delay of about 2 clock cycles per 4 kb when probing 256 kb
that is the first time you probe only, of course
if i re-use that stack space, i get 20 cycles or something - the probe loop overhead
still, that is WAY faster than HeapAlloc
it suffers the same anomaly - but many more cycles
the first time i allocate and free 256 kb with HeapAlloc, i get something over 90000 cycles
after that, it is more like 50000 cycles
for the benefit of newer coders, a little info about probing the stack...
Quote: By default, Windows reserves 1 meg of virtual memory for the stack. No page of stack memory is actually allocated
(committed) until the page is accessed. This is demand-allocation. The page beyond the top of the stack is the guard
page. If this page is accessed, memory will be allocated for it, and the guard page moved downward by 4K (one page).
Thus, the stack can grow beyond the initial 1MB. Windows will not, however, let you grow the stack by accessing
discontiguous pages of memory. Going beyond the guard page causes an exception. Stack-probing code prevents this.
as quoted from osdev.org
http://wiki.osdev.org/How_kernel,_compiler,_and_C_library_work_together
you can probe the stack down, 4 kb at a time until you get the amount you want
or, you can adjust the commit amount on the linker command line
the latter is most useful if you know how much will be needed in advance
Thanks dedndave ,
I would love to learn more about it.....
Is there an example somewhere of how to do that? What are the switches?
Is there a batch file for that ?
the linker switch is /STACK:reserve[,commit]
you can add that switch to the existing switches
so - just make a copy of the batch file you have and modify it
/STACK:1048576,16384
reserves 1 mb (the default under xp)
commits 16 kb of space
the behaviour varies a little on older operating systems
http://msdn.microsoft.com/en-us/library/8cxs58a6.aspx
Quote from: dedndave on February 03, 2013, 02:14:56 PM
the linker switch is /STACK:reserve[,commit]
you can add that switch to the existing switches
so - just make a copy of the batch file you have and modify it
/STACK:1048576,16384
reserves 1 mb (the default under xp)
commits 16 kb of space
the behaviour varies a little on older operating systems
http://msdn.microsoft.com/en-us/library/8cxs58a6.aspx
Did I do It right ?
I used an example from Vortex
Quote from: dedndave on February 03, 2013, 11:04:27 AM
i don't see a big delay for probing into the guard page
I guess it depends on what the meaning of "big" is. On my P3 Windows 2000 system I get roughly 9000 cycles for each probe into the guard page.
;==============================================================================
include \masm32\include\masm32rt.inc
.686
;==============================================================================
.data
count dq 0
.code
;==============================================================================
start:
;==============================================================================
invoke GetCurrentProcess
invoke SetProcessAffinityMask, eax, 1
invoke Sleep, 5000
ASSUME fs:NOTHING
mov ebx, fs:[8]
printf("%d\n", ebx )
sub ebx, 4096
mov DWORD PTR [ebx], 0
mov ebx, fs:[8]
printf("%d\n\n", ebx )
xor eax, eax
cpuid
rdtsc
push edx
push eax
mov ebx, fs:[8]
sub ebx, 4096
mov DWORD PTR [ebx], 0
xor eax, eax
cpuid
rdtsc
pop ecx
sub eax, ecx
pop ecx
sbb edx, ecx
mov DWORD PTR count, eax
mov DWORD PTR count+4, edx
printf("%I64d cycles\n\n", count)
mov ebx, fs:[8]
printf("%d\n\n", ebx )
inkey
exit
;==============================================================================
end start
1236992
1232896
9041 cycles
1228800
I get the same error trying to compile this :
Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997. All rights reserved.
Assembling: main.asm
main.asm(19) : error A2008: syntax error : (
main.asm(25) : error A2008: syntax error : (
main.asm(46) : error A2008: syntax error : printf
main.asm(49) : error A2008: syntax error : (
_
Assembly Error
Press any key to continue . . .
Is it my linker ?
here is the test code i used, Michael
i used your timing macros and set the loop count to 1
i used the same 3 lines of code at startup as you have - with a 5 second delay
then, i made 5 passes on the test
the first pass shows the effect of probing
the remaining passes show the time for the probe loop overhead
it is advisable to run the test program several times to see realistic results
ASSUME FS:Nothing
mov eax,esp
mov edx,esp
sub eax,262144
and al,-16 ;16-aligned
.repeat
push edx
mov esp,fs:[8]
.until eax>=esp
mov esp,edx
ASSUME FS:ERROR
about 64 passes are made on the loop (256 kb / 4 kb)
so, you can take the first pass result
subtract out the loop overhead
and divide by 64 to get the effects of probing 1 page
Quote: Is it my linker ?
As qword pointed out in another thread, the problem is probably your version of MASM32. The printf and related macros require MASM32 version 11.
oh - i can use printf - i just can't use parens inside of parens :P
i typically use masm v 6.15, but i can switch to newer versions (not 11)
Heather
i looked at your linker switch batch file - it looks correct
you can observe the result using EditBin, or write code to see where the guard page is, relative to ESP
or, you can just access what you are supposed to have and see if an exception is generated :biggrin:
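One way the "write code" option might look, as a sketch: it reads the same `fs:[8]` field (the current stack limit in the thread information block) that Michael's timing test uses, and prints how much committed stack lies below ESP. The guard page sits just under that limit.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    ASSUME fs:NOTHING
    mov edx, fs:[8]                     ; lowest committed stack address
    ASSUME fs:ERROR
    mov eax, esp
    sub eax, edx                        ; bytes of committed stack below ESP
    print str$(eax), " bytes committed below ESP", 13, 10
    inkey
    exit
end start
```

If the /STACK commit value took effect, the figure should change when the program is relinked with a different commit size. The exact number depends on the OS and on what the startup code has already pushed.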
Quote from: dedndave on February 03, 2013, 11:18:39 PM
oh - i can use printf - i just can't use parens inside of parens :P
i typically use masm v 6.15, but i can switch to newer versions (not 11)
I guess Michael meant Masm32 v11, not ML.exe 11.0 ;-)