Masm64 SDK ignores "uses"

Started by jj2007, August 28, 2023, 08:36:11 AM

jj2007

mytest proc uses rsi rdi rbx gets ignored, apparently this feature doesn't exist in the SDK:

include \masm64\include64\masm64rt.inc    ; *** test of the uses feature ***
.code
mytest proc uses rsi rdi rbx arg1, arg2, arg3, arg4, arg5
Local loc1, loc2, loc3
  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx
  ret
mytest endp
entry_point proc
Local pContent:QWORD, ticks:QWORD        ; OPT_Assembler ml64
  lea rax, entry_point
  conout "The entry point is at ", hex$(rax), lf
  INT 3
  invoke mytest, 11111111h, 22222222h, 33333333h, 44444444h, 55555555h
  invoke ExitProcess, 0                        ; terminate process
entry_point endp
end

000000013F8F107C | CC                        | int3                            |
000000013F8F107D | 48:C7C1 11111111          | mov rcx,11111111                |
000000013F8F1084 | 48:C7C2 22222222          | mov rdx,22222222                |
000000013F8F108B | 49:C7C0 33333333          | mov r8,33333333                 |
000000013F8F1092 | 49:C7C1 44444444          | mov r9,44444444                 |
000000013F8F1099 | 48:C74424 20 55555555     | mov [rsp+20],55555555           |
000000013F8F10A2 | E8 59FFFFFF               | call <sub_13F8F1000>            |
...
000000013F8F1000 | C8 8000 00                | enter 80,0                      |
000000013F8F1004 | 48:83EC 70                | sub rsp,70                      |
000000013F8F1008 | 48:894D 10                | mov [rbp+10],rcx                |
000000013F8F100C | 48:8955 18                | mov [rbp+18],rdx                |
000000013F8F1010 | 4C:8945 20                | mov [rbp+20],r8                 |
000000013F8F1014 | 4C:894D 28                | mov [rbp+28],r9                 |
000000013F8F1018 | 8B45 10                   | mov eax,[rbp+10]                |
000000013F8F101B | 8945 9C                   | mov [rbp-64],eax                |
000000013F8F101E | 8B55 30                   | mov edx,[rbp+30]                |
000000013F8F1021 | 8955 94                   | mov [rbp-6C],edx                |
000000013F8F1024 | C9                        | leave                           |
000000013F8F1025 | C3                        | ret                             |

If you feel that the | vertical | lines are a little bit misaligned: not my fault, it's a forum quirk. When posting, they look ok, but once you edit the post to correct a typo, they get misaligned.

Admin EDIT: I straightened it up for you  :smiley:

HSE

The SDK has a customized prologue and epilogue.

Hutch's design is to use macros:

mytest proc arg1, arg2, arg3, arg4, arg5
  USING rsi rdi rbx
  Local loc1, loc2, loc3

  SaveRegs

  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx

  RestoreRegs
  ret
mytest endp

But of course you can switch back to the default prologue and epilogue:

option PROLOGUE:Prologuedef
option EPILOGUE:Epiloguedef

mytest proc uses rsi rdi rbx arg1, arg2, arg3, arg4, arg5
  Local loc1, loc2, loc3
  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx
  ret
mytest endp

Equations in Assembly: SmplMath

zedd151

Search results for: "register preservation macros"

https://masm32.com/board/index.php?topic=7278.0
https://masm32.com/board/index.php?topic=7285.0  look especially here for explanation of "USING" versus "uses"

The search tool is your friend. :smiley:

If you look around, hutch was also experimenting with other similar variations to save and restore registers...

jj2007

Interesting, thanks :thumbsup:

However, when I use prologuedef, this is the outcome:
000000013F841000 | 55                        | push rbp                       |
000000013F841001 | 48:8BEC                   | mov rbp,rsp                    |
000000013F841004 | 48:83C4 F0                | add rsp,FFFFFFFFFFFFFFF0       |
000000013F841008 | 56                        | push rsi                       |
000000013F841009 | 57                        | push rdi                       |
000000013F84100A | 53                        | push rbx                       |
000000013F84100B | 8B45 10                   | mov eax,[rbp+10]               |
000000013F84100E | 8945 FC                   | mov [rbp-4],eax                |
000000013F841011 | 8B55 30                   | mov edx,[rbp+30]               |
000000013F841014 | 8955 F4                   | mov [rbp-C],edx                |
000000013F841017 | 5B                        | pop rbx                        |
000000013F841018 | 5F                        | pop rdi                        |
000000013F841019 | 5E                        | pop rsi                        |
000000013F84101A | C9                        | leave                          |
000000013F84101B | C3                        | ret                            |

push rbp etc is faster than enter, good, but after the three pushes rsp is aligned 8, not aligned 16 :rolleyes:
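For procs that do call other functions, the misalignment can be fixed by hand. The following is a minimal sketch, not the SDK's prologue macro: it assumes plain ml64 without the SDK include, a proc without arguments or LOCALs (so that ml64 should have no reason to emit a frame of its own), and a hypothetical name aligned_demo. It only spells out the arithmetic:

```asm
; Sketch only: hand-written frame, hypothetical proc name, no SDK macros.
; Assumption: no arguments and no LOCALs, so ml64 emits no prologue itself.
.code
aligned_demo proc
    push rbp            ; on entry rsp = 16n+8 (return address), now 16n
    mov  rbp, rsp
    push rsi            ; 16n-8
    push rdi            ; 16n-16
    push rbx            ; 16n-24, i.e. aligned 8 only
    sub  rsp, 28h       ; 32 bytes shadow space + 8 bytes padding: aligned 16 again
    ; ... calls to other functions are safe at this point ...
    add  rsp, 28h       ; undo the reservation
    pop  rbx            ; restore in reverse order
    pop  rdi
    pop  rsi
    pop  rbp
    ret
aligned_demo endp
end
```

With an odd number of pushed registers, the extra 8 bytes of padding are what restore the 16-byte alignment; with an even number, the padding would have to be dropped.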

I also tested USING rsi rdi rbx - no effect :sad:

P.S., got it:
000000013FFC1000 | C8 8000 00                 | enter 80,0                      |
000000013FFC1004 | 48:81EC 90000000           | sub rsp,90                      |
000000013FFC100B | 48:894D 10                 | mov [rbp+10],rcx                |
000000013FFC100F | 48:8975 80                 | mov [rbp-80],rsi                |
000000013FFC1013 | 48:897D 88                 | mov [rbp-78],rdi                |
000000013FFC1017 | 48:895D 90                 | mov [rbp-70],rbx                |
000000013FFC101B | 8B45 10                    | mov eax,[rbp+10]                |
000000013FFC101E | 8985 7CFFFFFF              | mov [rbp-84],eax                |
000000013FFC1024 | 8995 74FFFFFF              | mov [rbp-8C],edx                |
000000013FFC102A | 48:8B75 80                 | mov rsi,[rbp-80]                |
000000013FFC102E | 48:8B7D 88                 | mov rdi,[rbp-78]                |
000000013FFC1032 | 48:8B5D 90                 | mov rbx,[rbp-70]                |
000000013FFC1036 | C9                         | leave                           |
000000013FFC1037 | C3                         | ret                             |
Still the horribly slow enter 80, 0, but the rest is ok. However, you need to use three macros to achieve that!

mytest proc arg1
USING rsi, rdi, rbx
Local loc1, loc2, loc3
  SaveRegs
  mov eax, arg1
  mov loc1, eax
  mov loc3, edx
  RestoreRegs
  ret
mytest endp

zedd151

You have to use SaveRegs and RestoreRegs as well as USING, as in the first example that HSE posted. It doesn't work automatically by just declaring USING.

Probably not a perfect solution; the Masm64 SDK is still in beta, after all... but it does seem to work.

If you think this is an "issue", maybe start a "Masm64 SDK Known Issues" thread? If such a thread is created, we can pin it to the top of the board for any other issues that come up to be added to that thread.

HSE

The x64 ABI says that the stack pointer doesn't need to be 16-aligned in leaf functions. Naturally, what for? (I'm learning that now  :smiley: )

jj2007

Quote from: HSE on August 28, 2023, 12:00:16 PM
The x64 ABI says that the stack pointer doesn't need to be 16-aligned in leaf functions. Naturally, what for? (I'm learning that now  :smiley: )

That's correct. Not a problem for Hutch's version of the prologue, which is always aligned 16:
mytest proc arg1
USING rsi, rdi, rbx
Local loc1, loc2, loc3
  SaveRegs
  mov eax, arg1
  mov loc1, eax
  invoke MessageBoxA, 0, chr$("A message box"), chr$("Hi:"), MB_OK
  mov loc3, edx
  RestoreRegs
  ret
mytest endp

It's pretty inefficient, though:
000000013F3E1024 | 31C9                       | xor ecx,ecx                     |
000000013F3E1026 | 90                         | nop                             |
000000013F3E1027 | 90                         | nop                             |
000000013F3E1028 | 90                         | nop                             |
000000013F3E1029 | 90                         | nop                             |
000000013F3E102A | 90                         | nop                             |
000000013F3E102B | 48:8B15 38200000           | mov rdx,[13F3E306A]             | 000000013F3E306A:&"A message box"
000000013F3E1032 | 4C:8B05 3D200000           | mov r8,[13F3E3076]              | 000000013F3E3076:&"Hi:"
000000013F3E1039 | 45:31C9                    | xor r9d,r9d                     |
000000013F3E103C | 90                         | nop                             |
000000013F3E103D | 90                         | nop                             |
000000013F3E103E | 90                         | nop                             |
000000013F3E103F | 90                         | nop                             |
000000013F3E1040 | FF15 EA0F0000              | call [<&MessageBoxA>]           |
Nine bytes saved (the nops above) by using xor instead of mov reg, 0.
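The byte counts can be checked against the encodings. A short sketch with a hypothetical proc name zero_demo; the mov encodings follow the pattern visible in the first disassembly of this thread, while the exact xor opcode (33 or 31) depends on the assembler:

```asm
; Sketch only: hypothetical proc that zeroes rcx and r9 in both styles.
.code
zero_demo proc
    mov rcx, 0          ; 48 C7 C1 00 00 00 00   7 bytes
    xor ecx, ecx        ; 33 C9 (or 31 C9)       2 bytes, saves 5
    mov r9, 0           ; 49 C7 C1 00 00 00 00   7 bytes
    xor r9d, r9d        ; 45 33 C9 (or 45 31 C9) 3 bytes, saves 4
    ret
zero_demo endp
end
```

Writing to a 32-bit register clears the upper half of the 64-bit register, so xor ecx, ecx zeroes all of rcx. The one behavioural difference is that xor modifies the flags while mov does not.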

jj2007

Honestly, this is pretty awful - I attach a testbed:
- Masm64 SDK does not crash, but it needs three macros to emulate some proc uses rsi rdi rbx, and the encoding is inefficient (mov rcx, 0 is 5 bytes longer than the equivalent xor ecx, ecx)
- Masm64 default prologue does not take care of alignment and crashes, unless you put the right number and type of locals there

include \masm64\include64\masm64rt.inc    ; *** test of the uses feature ***
.code
mytestOk proc arg1                                        ; uses default Masm64 SDK prologue (Hutch)
USING rsi, rdi, rbx
Local loc1, loc2, loc3
  SaveRegs
  mov eax, arg1
  mov loc1, eax
  invoke MessageBoxA, 0, chr$("This message box works"), chr$("Hi:"), MB_OK
  mov loc3, edx
  RestoreRegs
  ret
mytestOk endp
OPTION PROLOGUE:PrologueDef        ; from here on, let MASM handle the stack frame
OPTION EPILOGUE:EpilogueDef
mytest proc uses rsi rdi rbx arg1
Local loc1, loc2, loc3
  mov eax, arg1
  mov loc1, eax
  invoke MessageBoxA, 0, chr$("You won't see this one"), chr$("Hi:"), MB_OK
  mov loc3, edx
  ret
mytest endp
entry_point proc
Local pContent:QWORD, ticks:QWORD    ; take away the locals to see a crash
  invoke mytestOk, 11111111h   ; this one works fine
  invoke mytest, 11111111h     ; this one crashes because of misalignment
  invoke ExitProcess, 0
entry_point endp
end

zedd151

Quote from: jj2007 on August 28, 2023, 07:06:00 PM
Masm64 SDK does not crash,
The Masm64 SDK cannot crash. It is a collection of include files, libraries, macros, examples, etc.  Unless you are talking about code from a specific macro that crashes.
If you are talking about ml64 crashing, no one here has any control over how ml64 operates, nor did hutch. For that, you need to register a complaint with Microsoft.

Hutch went to great lengths to make the Masm64 SDK as easy to use as the Masm32 SDK. For some of ml64's shortcomings hutch wrote macros, e.g. "USING", "SaveRegs", "RestoreRegs", and others.

Seems that you are trying to compare how ml64 operates versus uasm64. While uasm might be a great tool, it is not quite a drop-in replacement for ml64. There are too many differences. Uasm has options that ml64 does not. uasm uses prototypes, ml64 does not. That does not put the Masm64 SDK at fault though, as you seem to be implying but rather ml64 itself.

It appears that you had not read through hutch's posts regarding the efforts he had put into the Masm64 SDK, what he did to improve writing code for ml64, to make life easier for the programmer. Some macros might just be inefficient, but hutch did what he had to do to make it work. You had plenty of time while he was still alive to offer up suggestions while he was testing out different versions of the macros for saving and restoring the 64 bit registers and keeping the stack properly aligned and balanced, etc.

Saying that the Masm64 SDK is faulty for ml64's shortcomings is laughable. Imagine what hutch would say at that suggestion... besides, the Masm64 SDK is still in beta (i.e., a Work In Progress, and as yet unfinished)

Just my thoughts at the moment.  :azn:

jj2007

Quote from: zedd151 on August 29, 2023, 12:11:37 AM
Quote from: jj2007 on August 28, 2023, 07:06:00 PM
Masm64 SDK does not crash,
The Masm64 SDK cannot crash. It is a collection of include files...

I apologise for my sloppy wording. Of course, I meant "code written with the Masm64 SDK does not crash".

zedd151

Quote from: jj2007 on August 29, 2023, 12:38:28 AM
I apologise for my sloppy wording. Of course, I meant "code written with the Masm64 SDK does not crash".

I kind of figured that is what you meant, but thanks for the clarification. ml64 is not a nice tool to use; all we can do is write proper code for it. hutch did a pretty good job with what he achieved in the Masm64 SDK.
Once stoo23 is able to get hutch's Masm64 stuff, there will most likely be some updated macros, perhaps a more efficient set for preserving the registers. We can only hope.

I am looking into doing more 64 bit programming, btw. Using ml64 of course. First I will try some more 32->64 conversions, with multiple procedures as opposed to the more simple conversions that I had done in the past, which largely were very easily converted. :biggrin:

As a side note:
One thing that really bothers me about the SDK is ".if rax { 4" style of ".if" notation. Probably a macro that uses textequ might be able to restore using "<", ">" so the code looks more normal. But needs further investigation, in another thread.

Vortex

Hello,

Hutch did a great job to create and maintain the Masm64 package.

Here is a known method to preserve the non-volatile registers without specifying USES:

include     \masm32\include64\masm64rt.inc

.code

start PROC

    invoke  main,10,20,30,40
    invoke  ExitProcess,0

start ENDP

main PROC a:QWORD,b:QWORD,c:QWORD,d:QWORD

LOCAL .rsi:QWORD
LOCAL .rdi:QWORD
LOCAL .rbx:QWORD

    mov     .rsi,rsi
    mov     .rdi,rdi
    mov     .rbx,rbx   

    xor     rbx,rbx
    xor     rsi,rsi
    xor     rdi,rdi

    mov     rsi,.rsi
    mov     rdi,.rdi
    mov     rbx,.rbx   

    mov     rax,1   
    ret

main ENDP

END

jj2007

Yes, that's one option, Erol, but do you realise that you replaced sometest proc uses rsi rdi rbx args with 12 lines of additional code? IMHO the prologue macro can handle this, without any user intervention.

Vortex

Hi Jochen,

That's true. Hutch used this method in the Masm64 package. An example is \masm64\m64lib\stdout.asm. Maybe I am wrong but the direct register write might be faster than the push\pop pair.

jj2007

Quote from: Vortex on August 29, 2023, 05:01:59 AM
the direct register write might be faster than the push\pop pair.

That should be tested, see attachment (pure Masm64 SDK code). Both methods require writing to and reading from memory.