Author Topic: Yet Another Invoke Macro ...  (Read 17672 times)

rrr314159

  • Member
  • *****
  • Posts: 1382
Yet Another Invoke Macro ...
« on: January 29, 2015, 07:44:42 PM »
I finally got my favorite project ("MathMovie") converted to 64-bit. It took so long because a lot of MM had to be rewritten with "advanced" (to me) techniques I learned here. Originally used 8088-era techniques - it worked but was a horrible mess (still is, actually). It was about 17,000 lines, 30 of them macros; now it's about 13,000, 600 of them in macros. One of the most useful is my invoke macro (nvk).

You're wondering: JWasm has invoke, so who wants yet another invoke macro? For one thing, I need to assemble under ML64 also (various good reasons). More important, JWasm invoke doesn't do what I want (see below).

Many issues are involved in 64-bit conversion, the major one being stack alignment / calling convention (handled by nvk). This huge issue has consumed many person-years of expert coders across the globe - so why am I able to get past it so easily? Not because I'm smarter; au contraire, I'm dumber than most of them. It's because the issue has only to do with Windows interfacing. The hardware doesn't force you to align the stack, or pass parameters in rcx, rdx, r8, r9; nor does it care about all the other arcana of Windows calling conventions: reals in xmm's (unless vararg etc), prologues, epilogues, stack frame pointers, SEH etc etc. If you simply want to get 32-bit code working in 64-bits, you can (almost) ignore all that stuff. Of course you lose a lot: no codeview, no symbolic debugging, can't create Windows-called routines, etc.

I still have to interface with many Windows functions; nvk takes care of that. It's tested thoroughly with 100 functions, the ones I need. Below is a partial list (leaves out minor ones like strlen, etc). Perhaps the trickiest involve threading, but the most demanding was good old MessageBox (probably because it's the only one that pops up its own window). There are some Windows functions nvk won't support, but I don't happen to know what they are.
Code: [Select]
some major Windows functions tested with nvk invoke macro, no particular order

GetModuleHandleA proto :LPSTR
GetCommandLineA  proto
ExitProcess      proto :DWORD
LoadIconA        proto :HINSTANCE, :LPSTR
LoadCursorA      proto :HINSTANCE, :LPSTR
RegisterClassExA proto :ptr WNDCLASSEXA
CreateWindowExA  proto :DWORD, :LPSTR, :LPSTR, :DWORD, :SDWORD, :SDWORD, :SDWORD, :SDWORD, :HWND, :HMENU, :HINSTANCE, :LPVOID
ShowWindow       proto :HWND, :SDWORD
UpdateWindow     proto :HWND
GetMessageA      proto :ptr MSG, :HWND, :SDWORD, :SDWORD
TranslateMessage proto :ptr MSG
DispatchMessageA proto :ptr MSG
PostQuitMessage  proto :SDWORD
DefWindowProcA   proto :HWND, :UINT, :WPARAM, :LPARAM
MoveWindow proto :HWND, :DWORD, :DWORD, :DWORD, :DWORD, :BOOL
SetWindowTextA proto :HWND, :LPSTR
InvalidateRect proto :HWND, :ptr RECT, :BOOL
BeginPaint proto :HWND, :LPPAINTSTRUCT
GetClientRect proto :HWND, :LPRECT
DrawTextA proto :HDC, :LPSTR, :DWORD, :LPRECT, :DWORD
EndPaint proto :HWND, :ptr PAINTSTRUCT
PostMessageA proto :HWND, :DWORD, :WPARAM, :LPARAM
CreateFileA proto :LPSTR, :DWORD, :DWORD, :LPSECURITY_ATTRIBUTES, :DWORD, :DWORD, :HANDLE
WriteFile proto :HANDLE, :LPCVOID, :DWORD, :LPDWORD, :LPOVERLAPPED
ReadFile proto :HANDLE, :LPVOID, :DWORD, :LPDWORD, :LPOVERLAPPED
CloseHandle proto :HANDLE
CreateThread proto :LPSECURITY_ATTRIBUTES, :SIZE_T, :LPTHREAD_START_ROUTINE, :LPVOID, :DWORD, :LPDWORD
ExitThread proto :DWORD
BitBlt proto ;:HDC, :DWORD, :DWORD, :DWORD, :DWORD, :HDC, :DWORD, :DWORD, :DWORD
StretchBlt proto :HDC, :DWORD, :DWORD, :DWORD, :DWORD, :HDC, :DWORD, :DWORD, :DWORD, :DWORD, :DWORD
printf proto :ptr SBYTE, :VARARG
sprintf proto :ptr SBYTE, :ptr SBYTE, :VARARG
sscanf proto :ptr SBYTE, :ptr SBYTE, :VARARG
CreateCompatibleDC proto :HDC
GetDC proto :HWND
CreateDIBSection proto :HDC, :ptr BITMAPINFO, :DWORD, :ptr ptr, :HANDLE, :DWORD
SelectObject proto :HDC, :HGDIOBJ
DeleteObject proto :HGDIOBJ
PeekMessageA proto :LPMSG, :HWND, :DWORD, :DWORD, :DWORD
DeleteDC proto :HDC
GetCurrentProcess proto ; all these in winbase.inc
SetProcessAffinityMask proto :HANDLE, :DWORD_PTR
GetPriorityClass proto :HANDLE
SetPriorityClass proto :HANDLE, :DWORD
GetCurrentThread proto
GetThreadPriority proto :HANDLE
SetThreadPriority proto :HANDLE, :DWORD
SetThreadAffinityMask proto :HANDLE, :DWORD_PTR
Sleep proto :DWORD ; this one works  ; in winbase
Beep proto :DWORD, :DWORD  ; winbase
QueryPerformanceCounter proto :ptr writeanythinghereLARGE_INTEGER
QueryPerformanceFrequency proto :ptr LARGE_INTEGER
GetStdHandle proto :DWORD
WriteConsoleA proto :HANDLE, :ptr , :DWORD, :LPDWORD, :LPVOID
MessageBoxA proto :HWND, :LPSTR, :LPSTR, :DWORD
DestroyWindow proto :HWND
IsZoomed proto :HWND
LoadMenuA proto :HINSTANCE, :LPSTR
GetMenu proto :HWND
SetMenu proto :HWND, :HMENU
GetSubMenu proto :HMENU, :DWORD
CheckMenuItem proto :HMENU, :DWORD, :DWORD
CheckMenuRadioItem proto :HMENU, :DWORD, :DWORD, :DWORD, :DWORD
SetMenuItemInfoA proto :HMENU, :DWORD, :BOOL, :LPCMENUITEMINFOA
TrackPopupMenu proto :HMENU, :DWORD, :DWORD, :DWORD, :DWORD, :HWND, :ptr RECT
GetClientRect proto :HWND, :LPRECT
GetWindowRect proto :HWND, :LPRECT
SetCursor proto :HCURSOR
GetWindowLongA proto :HWND, :DWORD
SetWindowLongA proto :HWND, :DWORD, :SDWORD
SendMessageA proto :HWND, :DWORD, :WPARAM, :LPARAM
SetFocus proto :HWND
SetProcessAffinityMask proto :HANDLE, :DWORD_PTR
GetCurrentProcessId proto
GetProcessAffinityMask proto :HANDLE, :ptr DWORD_PTR, :ptr DWORD_PTR
GetCurrentThread proto
GetCurrentThreadId proto
SetThreadAffinityMask proto :HANDLE, :DWORD_PTR
lstrcatA proto :LPSTR, :LPSTR ; and other minor ones like this
The other reason nvk can be n00b-written is that it's not very efficient: it follows the KISS principle (keep it simple, sailor). My project has two major loops running at approximately 30 iterations/second and 60 million iterations/second. Windows is never invoked from the inner loop, since there are only a few hundred instruction cycles to play with (per core, of course). The 30-ips loop uses about 10 million cycles per iteration, and averages fewer than 1000 Windows invocations: so speed simply isn't an issue.

nvk "features" include:

- Forget about stack alignment. No "and rsp, -16"; you can push/pop across invocations (64 or 16 bit). For testing, I even adjust the stack randomly by odd numbers throughout my code (sub rsp, 4321, followed later, after a dozen invocations, by add rsp, 4321). JWasm can't do that.
- nvk never complains "register value overwritten" like JWasm. (It only takes a few extra instructions to avoid this.) In fact, it has no error messages at all; when fed bad args it just blows up all over the place.
- it handles "aDdR", and real4, real10, bytes/words/dwords, structures, all odd-sized arguments correctly. Some of these JWasm invoke doesn't do right (altho undoubtedly there are some types I haven't run into that nvk doesn't handle, but JWasm does).
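
The "forget about stack alignment" claim can be checked on paper. Below is a rough Python sketch (not part of nvk) that models the six prologue instructions and the closing `pop rsp` of the nvk macro listed further down. The function name `nvk_align_model` and the dict used as memory are assumptions of the model, and overlapping unaligned stores are not modeled.

```python
def nvk_align_model(rsp, rbp):
    """Model nvk's align/restore wrapper; returns (aligned rsp, final rsp, final rbp)."""
    mem = {}                 # hypothetical memory: address -> qword
    rsp -= 8
    mem[rsp] = rbp           # push rbp
    rbp = rsp + 8            # lea rbp, [rsp][8]   (rbp = entering rsp)
    rsp -= 8                 # sub rsp, 8
    rsp &= -16               # and rsp, -10h
    mem[rsp] = rbp           # mov [rsp], rbp      (entering rsp saved on stack)
    rbp = mem[rbp - 8]       # mov rbp, [rbp-8]    (caller's rbp restored)
    aligned = rsp            # this is the rsp that nvk_noalign starts from
    rsp = mem[rsp]           # pop rsp
    return aligned, rsp, rbp


# Entering values: aligned, off by 8, and shifted by an odd amount such as 4321
for start in (0x6FF1B0, 0x6FF1B8, 0x6FF1B0 - 4321):
    aligned, out_rsp, out_rbp = nvk_align_model(start, 0xBEEF)
    print(hex(start), hex(aligned), out_rsp == start, out_rbp == 0xBEEF)
```

Whatever the entering rsp is, the model hands an rsp that is a multiple of 16 to the inner macro and returns to the entering value afterwards.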

This post is written for people, like me, who just want to get their 32-bit code up and running, and will worry about SEH (etc etc) later; not for experts, who of course already know this stuff. To them: please let me know what I'm doing wrong, if you've got nothing better to do at the moment. My macro technique is primitive, any tips to clean it up wld be welcome. Nvk is full of unknown (to me) bugs, so if you notice one pls inform.

There are comments in the code; here's a brief description. When called with a function and arguments, nvk first aligns rsp to 16 bytes and stores the original rsp on the stack for later recovery. All registers, including rbp, are preserved during this step. Then the args are counted and rounded up to an even number, which (times eight) is sub'ed from rsp. The arguments are put on the stack in reverse order, including the first four (always spilled to shadow space). Then the first (up to) four are read off the stack into rcx .. r9, which can of course appear as arguments. Finally the function is called; upon return, the stack is restored.
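
The layout described above can be sketched in Python as a sanity check. This is only a model of the description, not the macro itself: `nvk_layout` is a hypothetical name, `stack[i]` stands for the qword at `[rsp + i*8]`, and `None` marks a slot the macro never writes.

```python
def nvk_layout(args):
    """Return (stack slots, register contents) as described for nvk_noalign."""
    count = len(args)
    slots = max(count, 4)               # never less than the 4 shadow slots
    slots = ((slots + 1) // 2) * 2      # odd counts go up to the next even one
    stack = [None] * slots
    # The reversed list is walked, so the last argument is stored first
    for pos in range(count - 1, -1, -1):
        stack[pos] = args[pos]
    # The first four slots are then read back into the register arguments
    regs = dict(zip(("rcx", "rdx", "r8", "r9"), stack[:4]))
    return stack, regs


stack, regs = nvk_layout(["hdc", "text", "len", "rect", "flags"])
print(stack)   # five args occupy six slots
print(regs)
```

With five arguments the model reserves six slots (48 bytes), the fifth argument sits at `[rsp+20h]`, and the first four are duplicated in rcx, rdx, r8 and r9.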

The zip includes the JWasm sample Win64_2.asm, with minimal mods for ML64 compatibility, plus makeJ.bat and makeM.bat. Should be self-explanatory.
Code: [Select]
;***********************
; INVOKE MACROS for ML64 and JWasm (if you want to use it there), by rrr314159 2015/1/29
;***********************
IFNDEF __JWASM__                                    ;; actually I prefer nvk in JWasm as well
    invoke equ nvk
ENDIF
;***********************
;***********************
nvk MACRO thefun:REQ, args:VARARG  ; "invoke"
;;***********************
;; ALIGN stack to 16 bytes, call nvk_noalign, then restore rsp

    push rbp           
    lea  rbp, [rsp][8]                              ;; save entering rsp first into rbp then onto stack
    sub rsp, 8                                      ;; make room for entering rsp that was saved in rbp
    and rsp, -10h                                   ;; align 16
    mov [rsp], rbp                                  ;; put entering rsp onto stack
    mov rbp, [rbp-8]                                ;; restore rbp

    nvk_noalign thefun, args
   
    pop rsp                                         ;; right back where we started from
ENDM

;***********************
nvk_noalign MACRO thefun:REQ, args:VARARG           ;; "invoke" without aligning
;;***********************
LOCAL txt, cnt, stackadjust, cnttopass
;; Prepare stack for arguments, load stack, call function, restore rsp
;; called from nvk; also can call directly if u know stack is aligned

;; Count the arguments, prepare reversed arg list

    cnt = 0
    IFNB <args>                                     ;; if args blank skip most of the work
        txt equ <>
%       FOR arg, <args>
            txt CATSTR <arg> , <,>, txt
            cnt = cnt + 1
        ENDM
        txt SUBSTR txt, 1                           ;; force expression evaluation
        txt SUBSTR txt, 1, @SizeStr( %txt ) - 1

;; Adjust stack for args, rounded up to 16 bytes (necessary for some funs)

        IF cnt GT 4
            stackadjust = cnt
        ELSE
            stackadjust = 4
        ENDIF
        stackadjust = ((stackadjust+1)/2)*2 ; round up to 16
        sub rsp, stackadjust * 8

;; Load stack, saving rdx in home space to be restored after each arg

        mov [rsp], rdx                              ;; could also spill rcx, r8, r9 if needed, or whatever
        cnttopass = cnt                             ;; pass by ref, so don't send cnt (gets clobbered)
        nvk_loadstack txt, cnttopass
       
;; Load four regs from prepared args loaded on stack

        mov rcx, [rsp]
        mov rdx, [rsp+8]
        mov r8, [rsp+10h]
        mov r9, [rsp+18h]

;; Adjust by 20h, no other work needed, if arglist was blank

    ELSE
        sub rsp, 20h
    ENDIF

;; Call the function (finally), afterwards restore stack

    call thefun             
    IF cnt GT 0
        add rsp, stackadjust * 8
    ELSE
        add rsp, 20h
    ENDIF
ENDM

;***********************
nvk_loadstack MACRO args, posonstack
;;***********************
local leacmd
;; Load args on stack in reverse, restore rdx after each (it may be in the arg list)

%   FOR arg, <args>
        posonstack = posonstack - 1
        mov rdx, [rsp]                                  ;; rdx gets orig value each time

;; Check for "aDdR", if present prepare lea instruction and execute it

        leacmd equ <@afteraddr(arg)>                    ;; if addr, returns after-addr text
        IFNB leacmd
            leacmd CATSTR <lea rdx, >, leacmd
            &leacmd                                     ;; execute lea instruction
            mov [rsp + posonstack*8], rdx
        ELSE

;; Convert types as necessary; if real/integer not 8 bytes, convert to real8/qword

            IF TYPE(arg) EQ REAL4 OR TYPE(arg) EQ REAL10
                fld arg
                fstp REAL8 PTR [rsp + posonstack*8]
            ELSE
                IF TYPE(arg) EQ 1 OR TYPE(arg) EQ 2
                    movsx edx, arg
                ELSEIF TYPE(arg) EQ 4
                    mov edx, arg
                ELSE
                    mov rdx, arg
                ENDIF
                mov [rsp + posonstack*8], rdx
            ENDIF
        ENDIF
    ENDM
ENDM

;***********************
@afteraddr MACRO thetxt:=<>
;;***********************
LOCAL char, answer, iswhite, numchars

;; If argument starts with addr return rest of string, else blank

    answer equ <>
    numchars = 0
    FORC char, <&thetxt>
        IF numchars EQ 0
            iswhite INSTR 1,< >,<&char>       ;; trim leading spaces or tabs
            IFE iswhite
                answer CATSTR answer,<&char>
                numchars = 1
            ENDIF
        ELSEIF numchars LT 4
                answer CATSTR answer,<&char>
                numchars = numchars + 1
        ELSEIF numchars EQ 4
       
;; "answer" now holds first 4 chars after whitespace, is it "aDdR"?

            IFIDNI <addr>, answer
                answer equ <>                       ;; says addr; now get the latter part of arg
                numchars = 5                        ;; anything > 4 will do; no longer counting
            ELSE
                EXITM <>                            ;; not addr, return blank
            ENDIF
        ELSE                                        ;; numchars > 4 means get rest of string
            answer CATSTR answer,<&char>
        ENDIF
    ENDM

;; If, after trimming, arg was too short, it couldn't be addr

    IF numchars LT 5                           
        answer equ <>
    ENDIF       
EXITM answer
ENDM

;***********************

« Last Edit: January 30, 2015, 02:05:37 PM by rrr314159 »
I am NaN ;)

GoneFishing

  • Member
  • *****
  • Posts: 1071
  • Gone fishing
Re: Yet Another Invoke Macro ...
« Reply #1 on: January 29, 2015, 09:31:08 PM »
Hi rrr,
Interesting set of macros! Currently I'm testing my FCALL macro (i.e. 'FASTCALL' for JWASM on Linux). While thinking about the best way to align the stack to 16 bytes, I looked through your code and have one question:
from your "Adjust stack for args" routine:
Quote
stackadjust = ((stackadjust+1)/2)*2 ; round up to 16
 
Is the expression ((stackadjust+1)/2)*2 equal to (stackadjust+1)?
Do we need to perform an odd/even check on the number of stack args here? What if stackadjust=6, or say 8?

EDIT: I got it now - those expressions are not equal for the assembler, since the division is integer division.

rrr314159

Re: Yet Another Invoke Macro ...
« Reply #2 on: January 30, 2015, 02:19:08 AM »
No doubt u figured it out, Vertograd: this statement increases an odd number to the next even number. E.g. if you have 7 arguments this puts it up to 8. That way, when it's multiplied by 8 bytes, it's a multiple of 16, so the stack remains aligned to 16. There are a couple important points to know. This is necessary for Windows functions, NOT for the hardware (the Intel chip). So you need to consider how Linux does it - it may not be necessary there. The same goes for the other things I'm doing, they're not required by the hardware so may not be necessary in Linux - dunno, haven't studied it. If you want I could look at it. The other point is: these adjustments are NOT required for all Windows functions! Some are perfectly happy with an un-adjusted stack. So if you (or anyone) test whether it's necessary you may decide I'm wrong if you only look at a few functions. If anyone's interested I could discuss which functions are particularly picky - some are picky about one adjustment but not others, so it's complicated. Good luck with Linux, it's got to make more sense than Windows!
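
For anyone who wants to check the rounding by hand, here is a small Python sketch of the same arithmetic. The function names are made up for the illustration; `//` plays the role of the assembler's integer division.

```python
def round_up_even(n):
    """Same expression as in the macro: ((n+1)/2)*2 with integer division."""
    return ((n + 1) // 2) * 2


def stack_adjust_bytes(arg_count):
    """Bytes subtracted from rsp: at least 4 slots, an even number of 8-byte slots."""
    slots = max(arg_count, 4)
    return round_up_even(slots) * 8


for n in (4, 5, 6, 7, 8):
    print(n, round_up_even(n), n + 1, stack_adjust_bytes(n))
```

Even counts stay as they are (6 stays 6, 8 stays 8), odd counts go up by one (7 becomes 8), so the expression differs from `n + 1` exactly for even `n`, and the byte count is always a multiple of 16.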

sinsi

  • Guest
Re: Yet Another Invoke Macro ...
« Reply #3 on: January 30, 2015, 06:12:07 AM »
Simple function MessageBox called when unaligned :(
Code: [Select]
(13d4.e58): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\system32\LPK.dll -
LPK!LpkDrawTextEx+0x315:
000007fe`ff611775 440f29842450010000 movaps xmmword ptr [rsp+150h],xmm8 ss:00000000`0006f708=000007fefef5a8e40000000001c81320

GoneFishing

Re: Yet Another Invoke Macro ...
« Reply #4 on: January 30, 2015, 06:33:31 AM »
@sinsi: Interesting that the stack is not aligned to 16 bytes in the MessageBox sample in the Introduction to x64 Assembly:
Code: [Select]
; Sample x64 Assembly Program
; Chris Lomont 2009 www.lomont.org
extrn ExitProcess: PROC   ; external functions in system libraries
extrn MessageBoxA: PROC
.data
caption db '64-bit hello!', 0
message db 'Hello World!', 0
.code
Start PROC
  sub    rsp,28h      ; shadow space, aligns stack
  mov    rcx, 0       ; hWnd = HWND_DESKTOP
  lea    rdx, message ; LPCSTR lpText
  lea    r8,  caption ; LPCSTR lpCaption
  mov    r9d, 0       ; uType = MB_OK
  call   MessageBoxA  ; call MessageBox API function
  mov    ecx, eax     ; uExitCode = MessageBox(...)
  call ExitProcess
Start ENDP
End
It worked on Windows 8 ... not sure what version

@rrr:
  After some reading I clearly understand that I'll have to re-write my FCALL macro from scratch.
As of now it doesn't handle floating point arguments and doesn't align the stack on 16 bytes, which seems to be necessary on Linux too (at least in some cases).
I'm thinking about creating thin Platform Abstraction Layer for JWASM  - set of macros to make my self-educational programming process more comfortable on both Windows and Linux computers ... if only my laziness will allow me to do it  :biggrin:


sinsi

Re: Yet Another Invoke Macro ...
« Reply #5 on: January 30, 2015, 07:44:41 AM »
Code: [Select]
  sub    rsp,28h      ; shadow space, aligns stack
The stack is aligned: on entry the stack is always unaligned by 8; the sub rsp,28h aligns it and allows 32 bytes for the spill.
Try it with that line commented out.

rrr314159

Re: Yet Another Invoke Macro ...
« Reply #6 on: January 30, 2015, 01:36:31 PM »
Sinsi, of course you're right the stack (on the main or "start" program entry) is unaligned by 8 but there are potential gotchas for beginners, especially with Windows programs (i.e. subsystem:windows in the linker). You can see confused posters, going back for years, fall for these.

For one, on entry into the Windows callback function (usually called WndProc) the stack is aligned to 16 (I'm talking about 64-bit of course). I haven't actually read this anywhere but that's what I've found. WndProc is the main "entry" into a typical window program, so it's easy to get confused. Similarly when you call "WinMain" in a typical program it's aligned to 16, because it's called right after program entry, adding 8 to the initially unaligned stack. It's sometimes considered the "C" entry point, while the real "start" is the "Masm" entry point; which can also be called WinMainCRTStartup ... So it's easy for an assembler beginner, or a C programmer no matter how advanced, to be unsure what "on entry" really means. Typically both these other Windows "entry points" are aligned to 16, not unaligned by 8. The names "start" and "main" are used promiscuously, and they're affected by the linker settings SUBSYSTEM and ENTRY.

MessageBox is very picky; most other Windows calls don't care about the alignment, at least if it's 8 off (printf family generally will work when it's unaligned by other numbers, in fact). So you can go along happily thinking you've got it figured out; even MessageBox will work half the time (on average); but sooner or later it will explode.

Another gotcha: people put "and rsp, -10h" at the top of their program, thinking they're covered; but now they've changed the alignment. If you're unaligned by 8 you MUST use 28h with messagebox, but if you're aligned you MUST use 20h with MB (and some other picky functions).
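
The 28h-versus-20h rule is plain arithmetic, which the Python sketch below spells out. It assumes rsp is at least 8-byte aligned, and `reserve_for_call` is a hypothetical helper, not something from nvk or JWasm.

```python
SHADOW_SPACE = 0x20  # four 8-byte home slots for rcx, rdx, r8, r9


def reserve_for_call(rsp):
    """Amount to subtract so that rsp is a multiple of 16 at the call (rsp assumed 8-aligned)."""
    amount = SHADOW_SPACE
    if (rsp - amount) % 16 != 0:
        amount += 8          # extra 8 bytes of padding restore 16-byte alignment
    return amount


entry_rsp = 0x13F6F8                          # ends in 8: unaligned by 8, as on entry
print(hex(reserve_for_call(entry_rsp)))       # unaligned-by-8 case
print(hex(reserve_for_call(entry_rsp & -16))) # after "and rsp, -10h"
```

The first case needs 28h and the second 20h, so adding `and rsp, -10h` at the top of a program and keeping `sub rsp, 28h` afterwards leaves rsp off by 8 at the call.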

Then there's an odd number of args over 4, which normally should be rounded up to provide a 16-byte offset; BUT only if the function was called aligned. Many functions work unaligned, but then if you make sure their args offset is a multiple of 16, and they call a more picky function - boom.

JWasm invoke always adds 8 on a call, and so gives the right result when you follow the rule you're referring to: always align b4 a call, and always expect unaligned by 8 on entry. But you can still fall for the odd arg list gotcha; the rule doesn't work with WndProc; and if you're not very careful other things can go wrong.

I'm forgetting a couple other interesting gotchas, shld refer to my notes, but ... just use nvk and you're covered!

It's really amusing to read old postings and see people going around and around on these issues; they think they've got it nailed, then suddenly MB (or others) blow up. I might be in the same boat as those poor guys, but don't know it yet!
« Last Edit: January 31, 2015, 04:36:22 AM by rrr314159 »

sinsi

Re: Yet Another Invoke Macro ...
« Reply #7 on: January 30, 2015, 03:15:39 PM »
Entry point is where Windows jumps to after loading your program, the very first instruction of your code. Always aligned 8, not 16.
The window procedure, called whatever you want, is called by Windows during message processing. In all of my programs, is also always aligned 8, not 16.

The Windows ABI pretty much demands that on entry to a fastcall function the stack will be aligned 8, never 16, due to the call return address.

rrr314159

Re: Yet Another Invoke Macro ...
« Reply #8 on: January 31, 2015, 04:33:06 AM »
To determine that on entry into the Windows callback function (typically called WndProc - whatever) rsp is aligned to 16, I used this snippet. Hopefully it's enough; I can provide a complete sample prog if desired. For instance I just ran it and got 6ff1b0; it always ends with 0h. Probably doing something stupid - experience shows that happens at least a dozen times a day - but, what is it?

Code: [Select]
; »»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»
.data
    saveinitrsp dq 0
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM

    mov r11, rsp

cmp saveinitrsp, 0  ; can do this only once or else no window
jg @F
    mov saveinitrsp, r11
    invoke printf, cfm$("RSP coming in to WndProc was %x\n"), saveinitrsp
@@:

    cmp edx, WM_COMMAND
    jne @F

; etc, etc ...

BTW, in my above post the word "you" doesn't mean "you", you understand, rather it means "one". It only means "you" one time, the first use. "One", OTOH, always means "1". It sounds a bit like I'm giving you advice, but no, that's "one" I'm advising. I hope that's clear.

Anyway, regardless of rsp's alignment status on WndProc entry or anywhere else, if one uses nvk, one doesn't have to worry about it!

sinsi

Re: Yet Another Invoke Macro ...
« Reply #9 on: January 31, 2015, 06:12:05 AM »
That's odd.
Code: [Select]
wndproc:    cmp edx,WM_CREATE

Code: [Select]
rax=000000000013f760 rbx=0000000000000000 rcx=00000000000b02ca
rdx=0000000000000024 rsi=0000000000000001 rdi=0000000000000000
rip=000000013fc910a8 rsp=000000000013f6f8 rbp=0000000000000000
 r8=0000000000000000  r9=000000000013f8f0 r10=00000000000b02ca
r11=0000000000000000 r12=0000000000000000 r13=0000000000000024
r14=0000000000000000 r15=00000000000b02ca
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
image00000001_3fc90000+0x10a8:
00000001`3fc910a8 83fa01          cmp     edx,1

rrr314159

Re: Yet Another Invoke Macro ...
« Reply #10 on: January 31, 2015, 06:56:02 AM »
Ok - if WndProc is just a label, or you use proc with no arguments, it's aligned to 8. In other words you're right, on entry it's 8. But when you use proc with the standard four arguments, JWasm (or ML64) builds a frame in such a way that rsp gets bumped by an odd number of 8's, thus aligning it to 16 (low hex digit 0). So I was wrong, but I wouldn't call it a stupid mistake to make, particularly for a beginner. Thanks! Yet another gotcha ...

GoneFishing

Re: Yet Another Invoke Macro ...
« Reply #11 on: February 01, 2015, 12:22:33 AM »
Hi rrr,
Can your nvk macro work with arguments passed in XMM registers?
For two days now I've been fighting to the death with the printf function, trying to convince it to print the value of the XMM0 register, but all I get is a mysterious 7FFFFFE2 :icon_confused:


jj2007

  • Member
  • *****
  • Posts: 10636
  • Assembler is fun ;-)
    • MasmBasic
Re: Yet Another Invoke Macro ...
« Reply #12 on: February 01, 2015, 01:48:18 AM »
Quote
Can your nvk macro work with arguments passed in XMM registers?

deb prints xmm args just fine, as decimal, hex or binary, but that is 32-bit code.

It is not difficult to implement, the only tricky point is that opattr returns "it's a register", as if it was eax. Below a testbed showing workarounds. Note that ML 6.14 and 6.15 use the string XMM(0), therefore the somewhat clumsy version with ifidni (I know earlier ML versions are not relevant for 64-bit code).

Code: [Select]
include \masm32\include\masm32rt.inc
.686p
.xmm

GetType MACRO arg
LOCAL tmp$, opa, is
  opa = (opattr arg) AND 127
  tmp$ CATSTR <Myarg=>, <arg>, < with opattr=>, %opa
  % echo tmp$
  tmp$ CATSTR <arg>, <  > ; two blanks to make sure there are at least three chars
  tmp$ SUBSTR tmp$, 1, 3
  ifidni tmp$, <xmm>
echo ### xmm found ###
  else
echo ### something else...
  endif
  is INSTR <arg>, <xmm>
  if is eq 1
echo @@@ xmm found @@@
  else
echo @@@ arg = something else...
  endif

; all: Myarg=eax with opattr=48
; MLv10: Myarg=xmm0 with opattr=48
; JWasm: Myarg=xmm0 with opattr=48
; MLv615: Myarg=XMM(0) with opattr=48

ENDM

.code
x1 dd 123
start:
GetType x1 ; name is only 2 chars long
GetType eax
GetType xmm0
exit
.err ; don't build, just show the echos
end start

rrr314159

Re: Yet Another Invoke Macro ...
« Reply #13 on: February 01, 2015, 04:06:41 AM »
@jj2007,

You don't have to go to all that trouble, the type function distinguishes between xmm0 and rax: type(xmm0) = 10h. Old ML versions were broken - type returned 8 for xmm0, like rax, but it's been fixed since ver 8. BTW ML64 gives type(ymm0) as 20h, but Jwasm incorrectly says it's 8.

@vertograd,

the good news is, you don't need xmm0 for printf (or sprintf or any of that family). Instead, pass reals in the GPR's. The bad news is you just wasted x hours trying to get printf to read xmm0, which (AFAIK) it doesn't do.
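
To make "pass reals in the GPR's" concrete: under the Win64 convention that nvk targets, a variadic function like printf takes a floating-point argument as an 8-byte double sitting in an integer register or stack slot. The Python sketch below only shows the bit patterns involved. The helper names are made up for the illustration, and it says nothing about the Linux convention, which is different.

```python
import struct


def double_bits(x):
    """64-bit integer with the same bits as the IEEE-754 double x (what rdx etc. would hold)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def widen_real4(bits32):
    """Take a REAL4 bit pattern and return the REAL8 pattern, like the macro's fld / fstp REAL8 PTR pair."""
    (value,) = struct.unpack("<f", struct.pack("<I", bits32))
    return double_bits(value)


print(hex(double_bits(1.0)))
print(hex(widen_real4(0x3F800000)))   # REAL4 1.0 widened to REAL8
```

Both lines print the same 64-bit pattern for 1.0, which is the value that would go into the integer register for a `%f` argument.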

ps. I think I'll look into converting deb to 64 bits,  I need (at least some of) that capability.
« Last Edit: February 01, 2015, 05:32:25 AM by rrr314159 »

GoneFishing

Re: Yet Another Invoke Macro ...
« Reply #14 on: February 01, 2015, 04:58:46 AM »
Thanks , Jochen . GetType macro runs on Linux too  :t

@rrr:
       Well, thank you for the information. Honestly, I'd already started to suspect that something was wrong with that function itself. Now I must look for another function that can take its arguments from XMM registers ...
You said that I 'wasted x hours', but I confess a few hours don't matter at all when I've successfully wasted the best years of my life, about half of my lifetime ...
BTW I have good news for you too: my version of JWASM reports the correct value for XMM registers - 10h. What version do you use?
Mine:
Quote
JWasm v2.11, Oct 20 2013, Masm-compatible assembler.