hi fearless,
Wish I'd been here when you asked about declaring qwords in prototypes. As you found out, yes, almost every dword that holds a pointer or a handle has to become a qword. That was the only question I could have answered! I know next to nothing about DLLs, C++, even resource files, beyond how to use them. I focus on just writing assembler progs, mostly math stuff.
But for simply programming in 64-bit, here are a few tips you might find helpful.
No doubt you know the new ABI for Windows (note, Linux is different). First four args go in RCX, RDX, R8, R9, plus 32 bytes of shadow space reserved on the stack. After those, args are put on the stack. Every arg slot is 8 bytes. If smaller (e.g. a dword) it still takes a full 8-byte slot; if larger, like a string or a SIMD value, an 8-byte address is passed.
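Here's a minimal sketch of that layout, calling WriteFile (5 args) by hand instead of through invoke. The names WriteMsg, hOut, msg, msglen and written are made up for the example, and it assumes hOut was already filled in (say from GetStdHandle) and that WriteFile is declared elsewhere, e.g. by your includes.
;;*******************************
.data
hOut    qword ?                   ;; handles are qwords now
msg     byte "hello", 13, 10
msglen  equ $ - msg
written dword ?
.code
;;*******************************
WriteMsg PROC
sub rsp, 40                       ;; 32 bytes shadow space + 8 for the 5th arg
mov rcx, hOut                     ;; arg 1: file handle
lea rdx, msg                      ;; arg 2: buffer address
mov r8d, msglen                   ;; arg 3: byte count, a dword in an 8-byte slot
lea r9, written                   ;; arg 4: address of the dword result
mov qword ptr [rsp+32], 0         ;; arg 5: lpOverlapped = NULL, first slot above the shadow space
call WriteFile
add rsp, 40
ret
WriteMsg ENDP
;;*******************************
The 40 does double duty: on entry rsp is 8 off a 16-byte boundary because of the return address, so 32 + 8 puts it back on one before the call.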
But you may not have noticed the following wrinkles. You can read that floating-point args are passed in XMM0 - XMM3 (by position, so a float in the 2nd slot goes in XMM1, not XMM0). What's often not mentioned: for VARARG functions the value must also be put in the matching GP register, and that's where the callee reads it from, NOT from XMM. In particular this applies to the whole printf family (sprintf, etc.).
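A small sketch of the vararg case, passing a REAL8 to printf by hand. Microsoft's convention asks for the value in both the XMM register and the GP register of the same slot for vararg calls, and printf takes it from the GP one. ShowVal, fmt and val are names invented for the example; printf is assumed to come from msvcrt and to be declared elsewhere.
;;*******************************
.data
fmt byte "%f", 10, 0
val real8 3.25
.code
;;*******************************
ShowVal PROC
sub rsp, 40                       ;; shadow space, rsp back on a 16-byte boundary
lea rcx, fmt                      ;; slot 1: format string
movsd xmm1, val                   ;; slot 2 is a double, so XMM1 (not XMM0)
mov rdx, qword ptr val            ;; same bits in rdx, which is what a vararg callee reads
call printf
add rsp, 40
ret
ShowVal ENDP
;;*******************************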
Beyond those 4 GPs and 4 XMMs, also volatile are RAX, R10, R11, XMM4 and XMM5. Therefore non-volatile are rbx, rbp, rsi, rdi, r12, r13, r14, r15 (and rsp of course); XMM6..15. Again, these are Windows rules only.
One gotcha, obvious once you think about it: callback functions use the new ABI. If you're used to getting an arg off the stack, now you find it in rcx, rdx. So for instance when a thread is launched its parameter is in rcx, not on the stack.
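For example, a bare-bones thread routine might look like this. WorkerThread is a made-up name, and the sketch assumes the lpParameter you handed to CreateThread points at a qword counter.
;;*******************************
WorkerThread PROC
inc qword ptr [rcx]               ;; rcx = lpParameter, nothing to fetch from the stack
xor eax, eax                      ;; thread exit code 0, returned in rax
ret
WorkerThread ENDP
;;*******************************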
I'm sure you know about stack alignment being an issue; see my nvk post if not. If JWasm invoke ever gives you an error message for stack alignment or register overwrite, nvk fixes all such problems.
Many macros from masm32 still work in 64-bit, but one thing to look out for is the msvcrt routines. In masm32 they're, e.g., crt_printf, but now they're just printf. Often that one change is enough to make an old macro usable.
Another problem you're bound to run into: pushad and popad don't exist in 64-bit mode. So I wrote macros to replace them; it seems most useful to include only the volatile registers (note rsi and rdi are in there too, although Windows counts them as non-volatile). Often I don't want rax included since it sends back info (like an error code), so I made it optional:
;;*******************************
pushad MACRO raxalso:=<>
;;*******************************
IFNB <raxalso>
push rax
ENDIF
push rcx
push rdx
push rsi
push rdi
push r8
push r9
push r10
push r11
ENDM
;;*******************************
popad MACRO raxalso:=<>
;;*******************************
pop r11
pop r10
pop r9
pop r8
pop rdi
pop rsi
pop rdx
pop rcx
IFNB <raxalso>
pop rax
ENDIF
ENDM
Of course you can't push 32-bit registers any more, only 16 and 64 bit. So I convert some standard macros like this:
IF @WordSize EQ 4
rax equ eax
ENDIF
;;*******************************
;; 32 or 64 bits
;; works with 2, 4 or 8 bytes, assume dest and src match
m2m MACRO dest, src
;;*******************************
IF (type(dest) EQ @WordSize) OR (type(dest) EQ 2)
push src
pop dest
ELSEIF @WordSize EQ 8 ;; 64 bit m2m'ing 4 bytes
push rax
mov eax, src
mov dest, eax
pop rax
ELSE ;; 32 bit asked to m2m 8 bytes
push dword ptr src
pop dword ptr dest
push dword ptr src+4
pop dword ptr dest+4
ENDIF
ENDM
;;*******************************
Now m2m works with 16, 32 or 64 bit. This technique, checking for TYPE 4 vs. 8, is very often useful.
All methods for 32-bit now have 64-bit versions. For instance, absolute values in ax / eax / rax:
;***********************
absolute_ax MACRO
push rdx
cwd ; sign bit into dx
xor ax, dx ; invert if negative
sub ax, dx ; add 1 if neg, ax = abs
pop rdx
ENDM
;***********************
absolute_eax MACRO
push rdx
cdq ; sign bit into edx
xor eax, edx ; invert if negative
sub eax, edx ; add 1 if neg, eax = abs
pop rdx
ENDM
;***********************
absolute_rax MACRO
push rdx
cqo ; sign bit into rdx
xor rax, rdx ; invert if negative
sub rax, rdx ; add 1 if neg, rax = abs
pop rdx
ENDM
;***********************
Of course there are big advantages to 64-bit. One is floating-point bit manipulation. In 32-bit, there are tricks that are worthwhile only for single precision, since a REAL4 fits in a register; but a REAL8 takes 2 registers, too much trouble. Now all those tricks are good with REAL8. For instance:
;**************************************************************
; MACROs to manipulate float bits
;**************************************************************
signbit4 equ 80000000h
signbit8 equ 8000000000000000h
nonsign4 equ 7fffffffh
nonsign8 equ 7fffffffffffffffh
exponen4 equ 7f800000h ; 8 bits for exponent
exponen8 equ 7ff0000000000000h ; 11 bits
mantiss4 equ 007fffffh ; fraction bits, after the binary point; actual value is 1 + fraction
mantiss8 equ 000fffffffffffffh
thebias4 equ 3f800000h
thebias8 equ 3ff0000000000000h
IFNDEF asignbit4 ; immediates max out at 32 bits (only mov takes a 64-bit one), so long constants need data
.data
asignbit4 dword 80000000h
asignbit8 qword 8000000000000000h
anonsign4 dword 7fffffffh
anonsign8 qword 7fffffffffffffffh
aexponen4 dword 7f800000h
aexponen8 qword 7ff0000000000000h
amantiss4 dword 007fffffh
amantiss8 qword 000fffffffffffffh
athebias4 dword 3f800000h
athebias8 qword 3ff0000000000000h
.code
ENDIF
;***********************
Chs MACRO thereal ;; for memory or register reals
;***********************
IF (OPATTR thereal) EQ 30h
IF type (thereal) EQ 4
xor thereal, signbit4
ELSE
xor thereal, asignbit8
ENDIF
ELSE
IF type (thereal) EQ 4
xor thereal, signbit4
ELSE
push rax
mov rax, thereal
xor rax, asignbit8
mov thereal, rax
pop rax
ENDIF
ENDIF
ENDM
;***********************
; NOTE: this uses a "print" macro like masm32's; it's assumed to preserve r10/r11
; and to widen a REAL4 to REAL8 for %g. r12 is non-volatile, save it if the caller needs it.
prntparts MACRO realval ;; show signbit, exponent and mantissa
;***********************
if type(realval) EQ 4
mov r10d, realval
print "real value4 %.9g, hex %x\n", realval, r10d
mov r11d, r10d
mov r12d, r10d
and r10d, asignbit4
shr r10d, 31
and r11d, aexponen4
sub r11d, thebias4
sar r11d, 23 ;; arithmetic shift, exponent can be negative once the bias is removed
and r12d, amantiss4
print "sign %x, exp %d, mant %d\n", r10d, r11d, r12d
else
mov r10, realval
print "real value8 %.19g, hex %I64x\n", realval, r10 ;; I64 prefix for a 64-bit value
mov r11, r10
mov r12, r10
and r10, asignbit8
shr r10, 63
and r11, aexponen8
sub r11, athebias8
sar r11, 52 ;; arithmetic shift, as above
and r12, amantiss8
print "sign %x, exp %d, mant %I64d\n", r10, r11, r12
endif
ENDM
;***********************
... and so on, many others. These tricks are MUCH more useful now, not being limited to REAL4. Previously you often just used the same clumsier/slower method required for REAL8 for REAL4 also, instead of having 2 different techniques.
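One more in the same spirit, as a sketch: absolute value of a REAL8 in memory by clearing the sign bit in a GP register. Fabs8 is a name I'm making up here, it's not one of the macros above.
;***********************
Fabs8 MACRO thereal ;; thereal is a REAL8 in memory
;***********************
push rax
mov rax, qword ptr thereal
btr rax, 63 ;; clear bit 63, the sign bit
mov qword ptr thereal, rax
pop rax
ENDM
;***********************
Using btr avoids needing the 64-bit mask at all, so no data constant is involved.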
Another good one is RDTSC, which of course sends back a 64-bit answer. With 32-bit you had to deal with this in 2 registers, now it's easier:
;;*******************************
RDTSC64 MACRO ;; rdtsc cycles => rax
;;*******************************
push rdx
mfence ;; fence before the read; not a true serializing instruction (lfence is the usual pick for rdtsc)
rdtsc
shl rdx, 32
or rax, rdx ;; 64 bits accommodates 100 yrs of cycles
pop rdx
ENDM
;;*******************************
;; use like, e.g., mov rcx, @RDTSC64()
@RDTSC64 MACRO ;; rdtsc cycles returned, rax used
;;*******************************
push rdx
mfence
rdtsc
shl rdx, 32
or rax, rdx
pop rdx
EXITM <rax>
ENDM
;;*******************************
I have a lot of macros and techniques like these for just basic 64-bit programming. If any of this is useful let me know, more where that came from. For instance, macros that apply the floating point tricks to SIMD registers; in fact, a lot more SIMD stuff, not included here since it's not specifically about 64-bit. Also, timing macros using 64-bit registers, much more flexible than the old 32-bit ones. Printing macros using 64-bit msvcrt routines. Etc.
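To give the flavor of the timing side, here's a bare sketch built on the RDTSC64 macro above. tstart is a hypothetical qword variable; a real timing macro would also repeat the run and subtract the overhead of the two reads.
;;*******************************
.data
tstart qword ?
.code
;;*******************************
RDTSC64
mov tstart, rax ;; cycle count before
;; ... code being timed goes here ...
RDTSC64
sub rax, tstart ;; rax = elapsed cycles, one register, one subtract
;;*******************************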