i needed a little function to probe the stack down
i started out by getting the page size with GetSystemInfo
then, i found that there is no need to know the page size
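(i know - it's 4 KB on x86, but that could change on newer OSes)
the NT_TIB structure, which FS points to, starts with a pointer to an EXCEPTION_REGISTRATION_RECORD, and at offset 8 it has a member named "StackLimit"
StackLimit holds the lowest committed address of the stack, so the guard page sits just below it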
as it turned out, the code is small enough to use in-line
ASSUME FS:Nothing
mov eax,esp
sub eax,<bytes required>
and al,-16 ;must be at least 4-aligned, 16-aligned used in this case
@@: push eax
mov esp,fs:[8]
cmp eax,esp
jb @B
mov esp,eax
ASSUME FS:ERROR
or, if you prefer...
ASSUME FS:Nothing
mov eax,esp
sub eax,<bytes required>
and al,-4 ;must be at least 4-aligned
.repeat
push edx
mov esp,fs:[8]
.until eax>=esp
mov esp,eax
ASSUME FS:ERROR
any register may be used, really - doesn't have to be EAX
and - the register that is PUSH'ed can be any general register
doesn't have to be the one used for calculation
it may be slightly faster if a different register is PUSH'ed, because the PUSH then has no dependency on the register that was just calculated (as in the second example)
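The loop in both examples can be followed more easily with a small model. The sketch below is C rather than MASM so it can be run anywhere, and it does not touch the real stack: `StackModel`, `model_push` and `model_probe` are hypothetical names, and the rule "a write just below StackLimit commits one more page" is a simplified assumption about guard-page behaviour. The page size appears only inside the model of the OS. The probe loop itself never reads it, which is the point made above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* used only by the model of the OS, never by the probe loop */

/* Hypothetical model of one thread's stack. */
typedef struct {
    uint32_t stack_limit;  /* stands in for the value read from fs:[8] */
    unsigned guard_hits;   /* number of guard pages touched so far */
} StackModel;

/* Model of "push reg" executed with the given esp: the write lands at esp - 4. */
static void model_push(StackModel *s, uint32_t esp)
{
    uint32_t write_addr = esp - 4;
    if (write_addr < s->stack_limit) {  /* the write hit the guard page */
        s->stack_limit -= PAGE_SIZE;    /* assumed: OS commits it and moves the guard down */
        s->guard_hits++;
    }
}

/* Mirrors the assembly: push, reload esp from StackLimit, repeat until target >= esp. */
static uint32_t model_probe(StackModel *s, uint32_t esp, uint32_t nbytes, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);  /* power of 2 only */
    uint32_t target = (esp - nbytes) & ~(align - 1);   /* sub eax,nbytes / and eax,-align */
    do {
        model_push(s, esp);     /* push reg */
        esp = s->stack_limit;   /* mov esp,fs:[8] */
    } while (target < esp);     /* .until eax>=esp */
    return target;              /* the value that "mov esp,eax" would load */
}
```

With an assumed starting StackLimit of 0012D000h, ESP at 0012FFFCh and a request for 31245 bytes, `model_probe` returns 001285ECh after touching five guard pages, leaving the modelled StackLimit at 00128000h.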
it could be made into a macro rather easily
with a second optional arg for alignment
i am not very good with macros
maybe you guys can tell me if i did this right :P
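probe MACRO nbytes:REQ,alignment:VARARG
IFNB <alignment>
AlignMent=alignment
ELSE
AlignMent=4
ENDIF
ASSUME FS:Nothing
mov eax,esp
sub eax,<bytes required>
and eax,-AlignMent ;AND the full register so alignments above 128 also assemble
.repeat
push eax ;touch the stack; a write into the guard page lowers StackLimit
mov esp,fs:[8] ;ESP = current StackLimit
.until eax>=esp ;done once the target lies inside committed stack
ASSUME FS:ERROR
EXITM <eax>
ENDM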
Good idea, but shouldn't your alignment code be adding 3 or 15 to the required size to compensate for the bits lost in the and operation?
I don't have time to test your macro, but I think the VARARG is not necessary. And what is <bytes required>?
well - i can see the macro doesn't do what i want, yet - lol
<bytes required> is the amount of desired stack space
let's say it's 31245 bytes - or, perhaps you have calculated the requirement in ECX or something
after that value has been subtracted from the current ESP (in EAX),
it can be aligned by AND'ing out the lower bits
i am using the code in-line, at the moment, with a calculated value in register
(see the second example in the first post)
it works great :P
Quote from: dedndave on January 26, 2013, 09:20:45 PM
<bytes required> is the amount of desired stack space
let's say it's 31245 bytes
so, why don't you use a third parameter for that?
Quote from: dedndave on January 26, 2013, 09:20:45 PM
<bytes required> is the amount of desired stack space...
I should have worded my question differently. The macro parameter is nbytes, and I was asking why you refer to <bytes required> in the macro body, but I see the possibility now that <bytes required> was actually a sort of pseudo code.
This illustrates what I meant by compensating for the bits lost in the AND operation:
;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
;----------------------------------------
; Returns the maximum alignment of _ptr.
;----------------------------------------
alignment MACRO _ptr
push ecx
xor eax, eax
mov ecx, _ptr
bsf ecx, ecx
jz @F
mov eax, 1
shl eax, cl
@@:
pop ecx
EXITM <eax>
ENDM
;==============================================================================
.data
.code
;==============================================================================
start:
;==============================================================================
FOR A,<12,13,14,15,16>
mov ebx, A
and ebx, -16
printf("%d\t%d\t%d\n", A, ebx, alignment(ebx))
mov ebx, A
add ebx, A ; adds A itself; a strict round-up to 16 would add 15 (alignment - 1)
and ebx, -16
printf("%d\t%d\t%d\n\n", A, ebx, alignment(ebx))
ENDM
inkey
exit
;==============================================================================
end start
12	0	0
12	16	16

13	0	0
13	16	16

14	0	0
14	16	16

15	0	0
15	16	16

16	16	16
16	32	32
But I see now that it will work regardless because the only side effect of the aligned value being below what is required is that the size of the stack may end up being one page larger than necessary.
qWord,
the <bytes required> is replaced by "nbytes", the first parameter, which is required
it would be nice if the first parameter could be a reg32, mem32, or imm32
the second parameter is optional, allowing for alignment higher than 4 (default=4)
Michael,
if we were to adjust the bytes required before subtracting it from the current ESP,
we would want to ADD (align-1), then AND (-align)
however, we subtract first
now, all we need to do is AND off the lower bits
let's say the current ESP is 0012FFFCh (some value that is always 4-aligned)
we want to "allocate" 31245 bytes, or 00007A0Dh
0012FFFCh - 00007A0Dh = 001285EFh, a value that is not 4-aligned
001285EFh is the value we'd like to use as a new stack pointer to allocate our space
to 4-align it, all we have to do is knock off the lower 2 bits
and al,-4
now, we have 001285ECh
the end result is, we get 3 more bytes than requested, but it is 4-aligned
the most we ever over-allocate is 4092 bytes (page size - 4)
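The arithmetic above can be checked with a few lines of C. This is only a sketch of the calculation, not of the probe itself; `align_down`, `round_up` and `new_stack_pointer` are hypothetical helper names. It also shows the alternative that was asked about (ADD align-1, then AND -align on the byte count), which lands on the same address whenever ESP is already aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a multiple of align (align must be a power of 2). */
static uint32_t align_down(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);  /* same effect as AND with -align */
}

/* Round a byte count up to a multiple of align: ADD (align-1), then AND (-align). */
static uint32_t round_up(uint32_t nbytes, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (nbytes + align - 1) & ~(align - 1);
}

/* Subtract first, then knock off the low bits, as in the thread's code. */
static uint32_t new_stack_pointer(uint32_t esp, uint32_t nbytes, uint32_t align)
{
    return align_down(esp - nbytes, align);
}
```

For the numbers in the post, `new_stack_pointer(0x0012FFFC, 31245, 4)` gives 001285ECh, which is 3 bytes more than requested, and `0x0012FFFC - round_up(31245, 4)` gives the same address.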
Quote from: dedndave on January 27, 2013, 02:01:02 AM
the <bytes required> is replaced by "nbytes", the first parameter, which is required
it would be nice if the first parameter could be a reg32, mem32, or imm32
the second parameter is optional, allowing for alignment higher than 4 (default=4)
probe MACRO nbytes:REQ,alignment:=<4>
    IF (OPATTR nbytes) AND 11110y        ; memory, immediate, direct-memory or register operand
        IF (OPATTR nbytes) AND 100y      ; immediate (constant) operand
            IF nbytes GT 1000000         ; some limit for constants
                .err <invalid parameter>
                EXITM
            ENDIF
        ELSEIF (TYPE nbytes) NE 4        ; register or memory operand must be 32 bits wide
            .err <wrong size for 1. parameter>
            EXITM
        ENDIF
    ELSE
        .err <invalid parameter>
        EXITM
    ENDIF
    ;...
endm
i don't think there is any simple way to test the limit of the allocated amount
i was going to leave that open-ended, at the programmer's discretion
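it depends on how much stack was reserved for the thread (1 MB by default, set with the linker's /STACK option) and how much of it is already in use
i think the committed stack pages come out of the same pool of system memory as the process heap, so it is sort of a "shared resource"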
the way i am using it, it will never be greater than 32768
as for alignment,
any power of 2 that is equal-to or greater-than 4 and less-than or equal-to page size is acceptable
(4,8,16,32,64,128,256,512,1024,2048,4096 are practical values)
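The rule for acceptable alignments is easy to state as a check. A minimal sketch in C, assuming a 4096-byte page; `is_valid_alignment` is a hypothetical name and nothing like it appears in the macro as posted.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed x86 page size */

/* True for powers of 2 in the range 4..PAGE_SIZE, the values listed above. */
static bool is_valid_alignment(uint32_t align)
{
    bool power_of_two = align != 0 && (align & (align - 1)) == 0;
    return power_of_two && align >= 4 && align <= PAGE_SIZE;
}
```

Under these assumptions 4, 16 and 4096 pass, while 2, 12 and 8192 are rejected. A macro could apply the same test at assembly time when the alignment is a constant.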
my thought was to return the value from EAX, like this
mov esp,probe(edx,16)
or
mov esp,probe(ecx)
or
mov esp,probe(dwAllocAmount)
or
mov esp,probe(31245)
that way, it is obvious that you are adjusting ESP