Static RSP built in JWasm

Started by habran, June 14, 2013, 03:39:17 PM

japheth

Quote from: japheth on July 07, 2013, 11:08:50 AM
because that is what ML64 emits, it has obviously abandoned S_BPREL32

I wondered why ML64 has abandoned S_BPREL32; after a bit of testing it turned out that it apparently wasn't just for the joy of doing things differently. S_BPREL32 works correctly for register EBP only - that is, if the upper 32 bits of RBP aren't zero, S_BPREL32 isn't appropriate.

It's a bug, but usually it doesn't matter, because even if the base address of your 64-bit binary is beyond the 4 GB frontier, the stack will still reside in the first 4 GB.

Here's a test case:

ExitProcess proto :dword

includelib <kernel32.lib>

.data?

stack db 8000h dup (?)
eos  dq 4 dup (?)

.code

option frame:auto
option win64:3

p2 proc frame a1:dword, a2:qword
local l1:qword
local l2:qword
mov rax,123
mov l1, rax
mov rcx,234
mov l2, rcx
ret
p2 endp

start proc frame
mov rsp, offset eos
invoke p2, 1, 2
invoke ExitProcess, 0
start endp

end start


assemble: jwasm -Zi -win64 test.asm
link: link /debug /base:0x140000000 test.obj /libpath:\wininc\lib64
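
The effect the test case is meant to provoke can be modeled in a few lines of C. This is only a sketch of the behaviour described above, assuming a debugger resolves an S_BPREL32 local from the low 32 bits of the frame pointer; the function names and the sample addresses are hypothetical and are not taken from JWasm, ML64 or any debugger.

```c
#include <stdint.h>

/* Hypothetical model: resolve a local relative to EBP only, i.e. the
   upper 32 bits of RBP are dropped before the offset is applied. */
static uint64_t resolve_bprel32(uint64_t rbp, int32_t offset)
{
    uint32_t ebp = (uint32_t)rbp;              /* truncate to 32 bits */
    return (uint64_t)(ebp + (uint32_t)offset); /* 32-bit wraparound   */
}

/* The same lookup done with the full 64-bit frame pointer. */
static uint64_t resolve_full(uint64_t rbp, int32_t offset)
{
    return rbp + (uint64_t)(int64_t)offset;    /* sign-extended offset */
}
```

With an assumed frame pointer below 4 GB (say 0x12FF00) both functions return the same address. With a frame pointer inside an image based at 0x140000000, as in the test case where RSP points into .data?, the truncated version lands 4 GB too low.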

habran

Quote: It's a bug, but usually it doesn't matter, because even if the base address of your 64-bit binary is beyond the 4 GB frontier, the stack will still reside in the first 4 GB.
does it mean that we need to use RSP for a greater stack allocation?
is it an Intel bug?
Cod-Father

japheth

Quote from: habran on July 08, 2013, 07:15:36 PM
does it mean that we need to use RSP for a greater stack allocation?

Not at all. It just means what I already said: due to implementation details of current 64-bit Windows, the upper 32 bits of RSP (and of RBP, if it's used as frame pointer) are zero - and that's why S_BPREL32 works "by chance".

habran

so, if I understood correctly, in any case we cannot allocate a stack larger than 4 GB?
Cod-Father

habran

I have fixed one bug and improved saving of xmm registers

before we had this:

00000000000D1176  mov         qword ptr [rsp+8],rcx 
00000000000D117B  mov         qword ptr [rsp+10h],rdx 
00000000000D1180  sub         rsp,38h                        ;here we subtract rsp for locals xmm regs
00000000000D1184  movdqa      xmmword ptr [rsp],xmm1 
00000000000D1189  movdqa      xmmword ptr [rsp+10h],xmm2 
00000000000D118F  movdqa      xmmword ptr [aVar],xmm3 
00000000000D1195  sub         rsp,30h                       ;here we subtract rsp again for locals and shadows
00000000000D1199  mov         eax,dword ptr [val2] 
00000000000D119D  mov         dword ptr [bVar],eax 
00000000000D11A1  mov         qword ptr [val],21h 
00000000000D11AA  mov         rdx,qword ptr [val] 
00000000000D11AF  mov         rcx,0D4008h 
00000000000D11B9  call        printf (0D1292h) 
00000000000D11BE  mov         rax,22h   
00000000000D11C5  mov         qword ptr [aVar],rax 
00000000000D11CA  mov         rdx,qword ptr [aVar] 
00000000000D11CF  mov         rcx,0D400Fh 
00000000000D11D9  call        printf (0D1292h)   
00000000000D11DE  call        testproc2 (0D11FAh)   
00000000000D11E3  movdqa      xmm1,xmmword ptr [rsp+40h]  ;wrong displacement, should be 30h
00000000000D11E9  movdqa      xmm2,xmmword ptr [rsp+50h]  ;wrong displacement, should be 40h
00000000000D11EF  movdqa      xmm3,xmmword ptr [rsp+60h]  ;wrong displacement, should be 50h
00000000000D11F5  add         rsp,68h 
00000000000D11F9  ret   


after fix:

0000000000DE1176  mov         qword ptr [rsp+8],rcx 
0000000000DE117B  mov         qword ptr [rsp+10h],rdx 
0000000000DE1180  sub         rsp,68h                ;here we subtract at once the space for xmm and locals
0000000000DE1184  movdqa      xmmword ptr [rsp+30h],xmm1 
0000000000DE118A  movdqa      xmmword ptr [rsp+40h],xmm2 
0000000000DE1190  movdqa      xmmword ptr [rsp+50h],xmm3 
0000000000DE1196  mov         eax,dword ptr [val2] 
0000000000DE119A  mov         dword ptr [bVar],eax 
0000000000DE119E  mov         qword ptr [val],21h 
0000000000DE11A7  mov         rdx,qword ptr [val] 
0000000000DE11AC  mov         rcx,0DE4008h 
0000000000DE11B6  call        printf (0DE1292h) 
0000000000DE11BB  mov         rax,22h   
0000000000DE11C2  mov         qword ptr [aVar],rax 
0000000000DE11C7  mov         rdx,qword ptr [aVar] 
0000000000DE11CC  mov         rcx,0DE400Fh 
0000000000DE11D6  call        printf (0DE1292h)   
0000000000DE11DB  call        testproc2 (0DE11F7h)       
0000000000DE11E0  movdqa      xmm1,xmmword ptr [rsp+30h]  ;now location is correct
0000000000DE11E6  movdqa      xmm2,xmmword ptr [rsp+40h] 
0000000000DE11EC  movdqa      xmm3,xmmword ptr [rsp+50h] 
0000000000DE11F2  add         rsp,68h 
0000000000DE11F6  ret
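
The offsets in the two listings can be checked with a little arithmetic. The sketch below is not JWasm code; it only models the frame layout as read off the "after fix" listing (30h bytes of locals and shadow space at the bottom, then one 16-byte slot per saved XMM register), and the helper names are made up for illustration.

```c
#include <stdint.h>

enum { XMM_SLOT_SIZE = 0x10 };  /* movdqa stores 16 bytes per register */

/* Offset from RSP of the index-th saved XMM register, assuming the
   save area sits directly above the locals and shadow space. */
static uint32_t xmm_slot_offset(uint32_t locals_and_shadow, unsigned index)
{
    return locals_and_shadow + index * XMM_SLOT_SIZE;
}

/* Size of the single combined allocation that replaces the two
   separate SUB RSP instructions of the old prologue. */
static uint32_t combined_frame_size(uint32_t xmm_area, uint32_t locals_and_shadow)
{
    return xmm_area + locals_and_shadow;
}
```

With the values from the listings, `combined_frame_size(0x38, 0x30)` gives 68h, and the three slots come out at 30h, 40h and 50h, which matches the corrected epilogue rather than the 40h/50h/60h of the old one.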


Cod-Father

habran

I have found out that debugging depends on the linker
if you build with MSVC8 you can debug it with MSVC8 Debugger or WinDbg 6.12
If you build it with MSVC12 you can debug it with MSVC12 Debugger or WinDbg 6.2
Cod-Father

habran

sorry, mea culpa :icon_redface:
the last bug eliminated(hopefully) :bgrin:
Cod-Father

habran

the last bug was not really the last, it just pretended to be the one :icon_eek:
but this one was real one, just undercover :biggrin:
now everything glides :t
Cod-Father

japheth

Quote from: habran on July 18, 2013, 09:40:21 AM
but this one was real one, just undercover :biggrin:

Wow, great!

I always wonder, when releasing another jwasm version with a few dozen bug fixes, how any of the previous versions could ever have been regarded as "stable".  :bgrin:

habran

we realists always hope for the best and at the same time expect the worst :biggrin:
I now experience the responsibility that programmers have when publishing their code :icon_mrgreen:
but at the same time I am happy to be able to create that perfect tool  8)
Cod-Father

habran

one more small fix for debug info :biggrin:
Cod-Father

habran

I have replaced the folder with the better one ::)
If you are curious why I am insisting on this version of jwasm when Japheth has published a new version,
JWasm211, which does include option STACKBASE, the answer is:
1) JWasm211 doesn't align the first local to 16 bytes
2) JWasm211 doesn't use home space if it is free to store registers
3) JWasm211 doesn't have .for/.endfor

I appreciate Japheth and his precious JWasm, I just added a little bit more to it :t
Cod-Father

habran

I uploaded at the top a new, better version of JWasm.exe with some more fixes
as far as I tested it, now debug info works in all situations

I wanted to upload the complete source with the MSVC12 project and exe but it doesn't fit in the allowed forum size
even without exe it is not small enough when compressed with winzip
so, I will attach here main folder with .c extension and in new post H folder
I think that the upload size should be increased to 1 MB :biggrin:
Cod-Father

habran

here are headers, just decompress it and drop it in JWasm folder
Cod-Father

habran

Hi again :biggrin:

I have worked more on JWasm and added some more sophisticated features to it
Now it can decide by itself if there is a need for reserved stack
if a function has no invoke inside, there is no need for allocating the stack space
also, if there is a USES command and space for up to 4 registers in the home space
it will PUSH the last one for alignment instead of SUB RSP,8
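
The alignment arithmetic behind trading SUB RSP,8 for a PUSH can be sketched in C. This is not taken from the JWasm source; it only models the Win64 rule that RSP is 8 modulo 16 on entry to a function and has to be 16-byte aligned again before the next call. The helper name is hypothetical.

```c
#include <stdint.h>

/* RSP modulo 16 at function entry on Win64: the caller's CALL pushed an
   8-byte return address onto a 16-byte aligned stack. */
enum { ENTRY_MISALIGN = 8 };

/* Bytes of padding still needed after `pushed` 8-byte register pushes
   to bring RSP back to a 16-byte boundary (hypothetical helper). */
static unsigned align_padding(unsigned pushed)
{
    unsigned misalign = (ENTRY_MISALIGN + 8u * pushed) % 16u;
    return misalign ? 16u - misalign : 0u;
}
```

An even number of pushes leaves exactly 8 bytes to fill, which is the size of one more PUSH, so a register that has to be saved anyway can take the place of the padding.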

all together it makes it an intelligent tool
you no longer need to use:
option win64:0
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

it will realize that there is no need for  PROLOGUE

the binaries are at the top of this thread

it is now advanced so much that Germans would call it "Vorsprung durch Technik" ;)

here is the .c source
Cod-Father