Title: Masm64 SDK ignores "uses"
Post by: jj2007 on August 28, 2023, 08:36:11 AM

mytest proc uses rsi rdi rbx gets ignored, apparently this feature doesn't exist in the SDK:

include \masm64\include64\masm64rt.inc    ; *** test of the uses feature ***
.code
mytest proc uses rsi rdi rbx arg1, arg2, arg3, arg4, arg5
Local loc1, loc2, loc3
  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx
  ret
mytest endp
entry_point proc
Local pContent:QWORD, ticks:QWORD        ; OPT_Assembler ml64
  lea rax, entry_point
  conout "The entry point is at ", hex$(rax), lf
  INT 3
  invoke mytest, 11111111h, 22222222h, 33333333h, 44444444h, 55555555h
  invoke ExitProcess, 0                        ; terminate process
entry_point endp
end

Code Select

000000013F8F107C | CC                        | int3                            |
000000013F8F107D | 48:C7C1 11111111          | mov rcx,11111111                |
000000013F8F1084 | 48:C7C2 22222222          | mov rdx,22222222                |
000000013F8F108B | 49:C7C0 33333333          | mov r8,33333333                 |
000000013F8F1092 | 49:C7C1 44444444          | mov r9,44444444                 |
000000013F8F1099 | 48:C74424 20 55555555     | mov [rsp+20],55555555           |
000000013F8F10A2 | E8 59FFFFFF               | call <sub_13F8F1000>            |
...
000000013F8F1000 | C8 8000 00                | enter 80,0                      |
000000013F8F1004 | 48:83EC 70                | sub rsp,70                      |
000000013F8F1008 | 48:894D 10                | mov [rbp+10],rcx                |
000000013F8F100C | 48:8955 18                | mov [rbp+18],rdx                |
000000013F8F1010 | 4C:8945 20                | mov [rbp+20],r8                 |
000000013F8F1014 | 4C:894D 28                | mov [rbp+28],r9                 |
000000013F8F1018 | 8B45 10                   | mov eax,[rbp+10]                |
000000013F8F101B | 8945 9C                   | mov [rbp-64],eax                |
000000013F8F101E | 8B55 30                   | mov edx,[rbp+30]                |
000000013F8F1021 | 8955 94                   | mov [rbp-6C],edx                |
000000013F8F1024 | C9                        | leave                           |
000000013F8F1025 | C3                        | ret                             |

If you feel that the | vertical | lines are a little bit misaligned: not my fault, it's a forum quirk. When posting, they look ok, but once you edit the post to correct a typo, they get misaligned.

Admin EDIT: I straightened it up for you :smiley:

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 28, 2023, 09:01:21 AM

The SDK has a customized prologue and epilogue.

Hutch's design is to use macros:

Code Select

mytest proc arg1, arg2, arg3, arg4, arg5
  USING rsi rdi rbx
  Local loc1, loc2, loc3

  SaveRegs

  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx

  RestoreRegs
  ret
mytest endp

But of course you can switch back to the MASM default prologue and epilogue:

Code Select

option PROLOGUE:Prologuedef
option EPILOGUE:Epiloguedef

mytest proc uses rsi rdi rbx arg1, arg2, arg3, arg4, arg5
  Local loc1, loc2, loc3
  mov eax, arg1
  mov loc1, eax
  mov edx, arg5
  mov loc3, edx
  ret
mytest endp

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 28, 2023, 10:16:18 AM

Search results for: "register preservation macros"

https://masm32.com/board/index.php?topic=7278.0
https://masm32.com/board/index.php?topic=7285.0 look especially here for explanation of "USING" versus "uses"

The search tool is your friend. :smiley:

If you look around, hutch was also experimenting with other similar variations to save and restore registers...

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 28, 2023, 10:19:35 AM

Interesting, thanks :thumbsup:

However, when I use prologuedef, this is the outcome:

Code Select

000000013F841000 | 55                        | push rbp                       |
000000013F841001 | 48:8BEC                   | mov rbp,rsp                    |
000000013F841004 | 48:83C4 F0                | add rsp,FFFFFFFFFFFFFFF0       |
000000013F841008 | 56                        | push rsi                       |
000000013F841009 | 57                        | push rdi                       |
000000013F84100A | 53                        | push rbx                       |
000000013F84100B | 8B45 10                   | mov eax,[rbp+10]               |
000000013F84100E | 8945 FC                   | mov [rbp-4],eax                |
000000013F841011 | 8B55 30                   | mov edx,[rbp+30]               |
000000013F841014 | 8955 F4                   | mov [rbp-C],edx                |
000000013F841017 | 5B                        | pop rbx                        |
000000013F841018 | 5F                        | pop rdi                        |
000000013F841019 | 5E                        | pop rsi                        |
000000013F84101A | C9                        | leave                          |
000000013F84101B | C3                        | ret                            |

push rbp etc is faster than enter, good, but after the three pushes rsp is aligned 8, not aligned 16 :rolleyes:
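The alignment arithmetic can be checked with a small model. This is only a sketch in Python, not SDK code: the function name is made up for illustration, and it assumes the caller had rsp aligned 16 before the call, so rsp is 8 mod 16 on entry.

```python
def rsp_mod16_after(steps, entry_mod16=8):
    """Track rsp modulo 16 through a list of prologue steps.

    Each step is the number of bytes subtracted from rsp.
    entry_mod16=8 assumes the caller had rsp aligned 16 before
    the call pushed the 8-byte return address.
    """
    rsp = entry_mod16
    for nbytes in steps:
        rsp = (rsp - nbytes) % 16
    return rsp

# PrologueDef output from the listing above:
# push rbp; add rsp,-16; push rsi; push rdi; push rbx
prologue_def = [8, 16, 8, 8, 8]
print(rsp_mod16_after(prologue_def))        # 8: misaligned for a nested call

# One extra 8-byte adjustment would bring rsp back to a multiple of 16
print(rsp_mod16_after(prologue_def + [8]))  # 0
```

With an odd number of 8-byte pushes after the frame setup, any call made from inside the proc starts from a misaligned stack.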

I also tested USING rsi rdi rbx - no effect :sad:

P.S., got it:

Code Select

000000013FFC1000 | C8 8000 00                 | enter 80,0                      |
000000013FFC1004 | 48:81EC 90000000           | sub rsp,90                      |
000000013FFC100B | 48:894D 10                 | mov [rbp+10],rcx                |
000000013FFC100F | 48:8975 80                 | mov [rbp-80],rsi                |
000000013FFC1013 | 48:897D 88                 | mov [rbp-78],rdi                |
000000013FFC1017 | 48:895D 90                 | mov [rbp-70],rbx                |
000000013FFC101B | 8B45 10                    | mov eax,[rbp+10]                |
000000013FFC101E | 8985 7CFFFFFF              | mov [rbp-84],eax                |
000000013FFC1024 | 8995 74FFFFFF              | mov [rbp-8C],edx                |
000000013FFC102A | 48:8B75 80                 | mov rsi,[rbp-80]                |
000000013FFC102E | 48:8B7D 88                 | mov rdi,[rbp-78]                |
000000013FFC1032 | 48:8B5D 90                 | mov rbx,[rbp-70]                |
000000013FFC1036 | C9                         | leave                           |
000000013FFC1037 | C3                         | ret

Still the horribly slow enter 80, 0, but the rest is ok. However, you need to use three macros to achieve that!

Code Select

mytest proc arg1
USING rsi, rdi, rbx 
Local loc1, loc2, loc3
  SaveRegs
  mov eax, arg1
  mov loc1, eax
  mov loc3, edx
  RestoreRegs
  ret
mytest endp

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 28, 2023, 10:28:35 AM

You have to use SaveRegs and RestoreRegs as well as USING, as in the first example that HSE posted. It doesn't work automatically by just declaring USING.

Probably not a perfect solution. The Masm64 SDK is still in beta, after all... but it does seem to work.

If you think this is an "issue", maybe start a "Masm64 SDK Known Issues" thread? If such a thread is created, we can pin it to the top of the board for any other issues that come up to be added to that thread.

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 28, 2023, 12:00:16 PM

The x64 ABI says that the stack pointer doesn't need to be 16-aligned in leaf functions. Naturally, what for? (I'm learning that now :smiley: )

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 28, 2023, 06:51:26 PM

Quote from: HSE on August 28, 2023, 12:00:16 PM
The x64 ABI says that the stack pointer doesn't need to be 16-aligned in leaf functions. Naturally, what for? (I'm learning that now :smiley: )

That's correct. Not a problem for Hutch's version of the prologue, which is always aligned 16:

Code Select

mytest proc arg1
USING rsi, rdi, rbx 
Local loc1, loc2, loc3
  SaveRegs
  mov eax, arg1
  mov loc1, eax
  invoke MessageBoxA, 0, chr$("A message box"), chr$("Hi:"), MB_OK
  mov loc3, edx
  RestoreRegs
  ret
mytest endp

It's pretty inefficient, though:

Code Select

000000013F3E1024 | 31C9                       | xor ecx,ecx                     |
000000013F3E1026 | 90                         | nop                             |
000000013F3E1027 | 90                         | nop                             |
000000013F3E1028 | 90                         | nop                             |
000000013F3E1029 | 90                         | nop                             |
000000013F3E102A | 90                         | nop                             |
000000013F3E102B | 48:8B15 38200000           | mov rdx,[13F3E306A]             | 000000013F3E306A:&"A message box"
000000013F3E1032 | 4C:8B05 3D200000           | mov r8,[13F3E3076]              | 000000013F3E3076:&"Hi:"
000000013F3E1039 | 45:31C9                    | xor r9d,r9d                     |
000000013F3E103C | 90                         | nop                             |
000000013F3E103D | 90                         | nop                             |
000000013F3E103E | 90                         | nop                             |
000000013F3E103F | 90                         | nop                             |
000000013F3E1040 | FF15 EA0F0000              | call [<&MessageBoxA>]           |

Nine bytes less by using xor instead of mov reg, 0.
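The nine bytes can be tallied from the encodings shown in the listings in this thread. A quick sketch in Python, used only as a calculator; it assumes the zero-immediate forms of the mov encodings seen in the first disassembly (48:C7C1 and 49:C7C1 followed by a 32-bit immediate).

```python
# Encodings taken from the disassembly listings in this thread (hex bytes).
mov_rcx_imm32 = bytes.fromhex("48C7C100000000")  # mov rcx, 0
mov_r9_imm32 = bytes.fromhex("49C7C100000000")   # mov r9, 0
xor_ecx = bytes.fromhex("31C9")                  # xor ecx, ecx
xor_r9d = bytes.fromhex("4531C9")                # xor r9d, r9d

saved_rcx = len(mov_rcx_imm32) - len(xor_ecx)
saved_r9 = len(mov_r9_imm32) - len(xor_r9d)
print(saved_rcx, saved_r9, saved_rcx + saved_r9)  # 5 4 9
```

The 5 and 4 match the runs of nop in the listing above.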

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 28, 2023, 07:06:00 PM

Honestly, this is pretty awful - I attach a testbed:
- Masm64 SDK does not crash, but it needs three macros to emulate a simple proc uses rsi rdi rbx, and the encoding is inefficient (mov rcx, 0 is 5 bytes longer than the equivalent xor ecx, ecx)
- the MASM default prologue (PrologueDef) does not take care of alignment and crashes
- unless you put the right number and type of locals there

include \masm64\include64\masm64rt.inc ; *** test of the uses feature ***
.code
mytestOk proc arg1 ; uses default Masm64 SDK prologue (Hutch)
USING rsi, rdi, rbx
Local loc1, loc2, loc3
SaveRegs
mov eax, arg1
mov loc1, eax
invoke MessageBoxA, 0, chr$("This message box works"), chr$("Hi:"), MB_OK
mov loc3, edx
RestoreRegs
ret
mytestOk endp
OPTION PROLOGUE:PrologueDef ; from here on, let MASM handle the stack frame
OPTION EPILOGUE:EpilogueDef
mytest proc uses rsi rdi rbx arg1
Local loc1, loc2, loc3
mov eax, arg1
mov loc1, eax
invoke MessageBoxA, 0, chr$("You won't see this one"), chr$("Hi:"), MB_OK
mov loc3, edx
ret
mytest endp
entry_point proc
Local pContent:QWORD, ticks:QWORD ; take away the locals to see a crash
invoke mytestOk, 11111111h ; this one works fine
invoke mytest, 11111111h ; this one crashes because of misalignment
invoke ExitProcess, 0
entry_point endp
end

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 12:11:37 AM

Quote from: jj2007 on August 28, 2023, 07:06:00 PM
Masm64 SDK does not crash,

The Masm64 SDK cannot crash. It is a collection of include files, libraries, macros, examples, etc. Unless you are talking about code from a specific macro that crashes.
If you are talking about ml64 crashing, no one here has any control over how ml64 operates, nor did hutch. For that, you need to register a complaint with Microsoft.

Hutch had gone to great lengths to make the Masm64 SDK as easy to use as the Masm32 SDK. For some of ml64's shortcomings hutch has written macros. i.e., "USING", "SaveRegs", "RestoreRegs", and others.

Seems that you are trying to compare how ml64 operates versus uasm64. While uasm might be a great tool, it is not quite a drop-in replacement for ml64. There are too many differences. Uasm has options that ml64 does not. uasm uses prototypes, ml64 does not. That does not put the Masm64 SDK at fault though, as you seem to be implying but rather ml64 itself.

It appears that you had not read through hutch's posts regarding the efforts he had put into the Masm64 SDK, what he did to improve writing code for ml64, to make life easier for the programmer. Some macros might just be inefficient, but hutch did what he had to do to make it work. You had plenty of time while he was still alive to offer up suggestions while he was testing out different versions of the macros for saving and restoring the 64 bit registers and keeping the stack properly aligned and balanced, etc.

Saying that the Masm64 SDK is faulty for ml64's shortcomings is laughable. Imagine what hutch would say at that suggestion... besides Masm64 SDK is still in beta (i.e., a Work In Progress, and yet unfinished)

Just my thoughts at the moment. :azn:

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 12:38:28 AM

Quote from: zedd151 on August 29, 2023, 12:11:37 AM
Quote from: jj2007 on August 28, 2023, 07:06:00 PM
Masm64 SDK does not crash,
The Masm64 SDK cannot crash. It is a collection of include files...

I apologise for my sloppy wording. Of course, I meant "code written with the Masm64 SDK does not crash".

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 12:59:01 AM

Quote from: jj2007 on August 29, 2023, 12:38:28 AM
I apologise for my sloppy wording. Of course, I meant "code written with the Masm64 SDK does not crash".

I kind of figured that is what you meant, but thanks for the clarification. ml64 is not a nice tool to use; all we can do is write proper code for it. hutch did a pretty good job in what he achieved with the Masm64 SDK.
Once stoo23 is able to get hutch's Masm64 stuff, there will most likely be some updated macros, perhaps a more efficient set for preserving the registers. We can only hope.

I am looking into doing more 64 bit programming, btw. Using ml64 of course. First I will try some more 32->64 conversions, with multiple procedures as opposed to the more simple conversions that I had done in the past, which largely were very easily converted. :biggrin:

As a side note:
One thing that really bothers me about the SDK is ".if rax { 4" style of ".if" notation. Probably a macro that uses textequ might be able to restore using "<", ">" so the code looks more normal. But needs further investigation, in another thread.

Title: Re: Masm64 SDK ignores "uses"
Post by: Vortex on August 29, 2023, 04:12:01 AM

Hello,

Hutch did a great job to create and maintain the Masm64 package.

Here is a known method to preserve the non-volatile registers without specifying USES:

Code Select

include     \masm32\include64\masm64rt.inc

.code

start PROC

    invoke  main,10,20,30,40
    invoke  ExitProcess,0

start ENDP

main PROC a:QWORD,b:QWORD,c:QWORD,d:QWORD

LOCAL .rsi:QWORD
LOCAL .rdi:QWORD
LOCAL .rbx:QWORD

    mov     .rsi,rsi
    mov     .rdi,rdi
    mov     .rbx,rbx    

    xor     rbx,rbx
    xor     rsi,rsi
    xor     rdi,rdi

    mov     rsi,.rsi
    mov     rdi,.rdi
    mov     rbx,.rbx    

    mov     rax,1    
    ret

main ENDP

END

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 04:28:27 AM

Yes, that's one option, Erol, but do you realise that you replaced sometest proc uses rsi rdi rbx args with 12 lines of additional code? IMHO the prologue macro can handle this, without any user intervention.

Title: Re: Masm64 SDK ignores "uses"
Post by: Vortex on August 29, 2023, 05:01:59 AM

Hi Jochen,

That's true. Hutch used this method in the Masm64 package. An example is \masm64\m64lib\stdout.asm. Maybe I am wrong but the direct register write might be faster than the push\pop pair.

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 05:19:50 AM

Quote from: Vortex on August 29, 2023, 05:01:59 AM
the direct register write might be faster than the push\pop pair.

That should be tested, see attachment (pure Masm64 SDK code). Both methods require writing to and reading from memory.

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 05:38:57 AM

Quote from: jj2007 on August 29, 2023, 05:19:50 AM
That should be tested

Code Select

pushing took 1887 ms
moving  took 1888 ms
pushing took 1622 ms ; <----
moving  took 2153 ms ; <----
pushing took 1607 ms
moving  took 1872 ms
pushing took 1607 ms
moving  took 1887 ms

second run

Code Select

pushing took 1872 ms
moving  took 1887 ms
pushing took 1607 ms ; <----
moving  took 2137 ms ; <----
pushing took 1623 ms
moving  took 1872 ms
pushing took 1607 ms
moving  took 1872 ms

third run

Code Select

pushing took 1888 ms
moving  took 1903 ms
pushing took 1623 ms ; <----
moving  took 2152 ms ; <----
pushing took 1623 ms
moving  took 1887 ms
pushing took 1607 ms
moving  took 1888 ms

Odd, always the second iteration...

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 29, 2023, 06:34:34 AM

Hi Vortex,

Quote from: Vortex on August 29, 2023, 04:12:01 AM
Here is a known method to preserve the non-volatile registers without specifying USES:

I remember we helped Hutch do the same thing with the macros. :thumbsup:

I think the idea behind this method is that you can trash the register in the first part of a procedure, and later use the original value without needing to store it twice. Registers stored by "uses" are a little hard to find from inside the procedure.

Title: Re: Masm64 SDK ignores "uses"
Post by: Vortex on August 29, 2023, 06:45:58 AM

Hi HSE,

You are right, registers stored by "uses" are not easy to find.

By the way, it looks like the push/pop pair is faster than the mov instruction; I tested Jochen's code.

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 29, 2023, 06:53:17 AM

Quote from: Vortex on August 29, 2023, 06:45:58 AM
it looks like the push/pop pair is faster than the mov instruction,

:biggrin: :biggrin:

Code Select

pushing took 1063 ms
moving  took 953 ms
pushing took 906 ms
moving  took 828 ms
pushing took 938 ms
moving  took 844 ms
pushing took 906 ms
moving  took 953 ms

Title: Re: Masm64 SDK ignores "uses"
Post by: Vortex on August 29, 2023, 06:55:17 AM

Here are my results :

Code Select

pushing took 1825 ms
moving  took 1810 ms
pushing took 1544 ms
moving  took 2059 ms
pushing took 1529 ms
moving  took 1857 ms
pushing took 1638 ms
moving  took 1809 ms

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 07:04:00 AM

Different processors, different results.

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 07:19:14 AM

Quote from: zedd151 on August 29, 2023, 05:38:57 AM
Odd, always the second iteration...

I added an align 16, now it looks stable on my machine:

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
pushing took 1592 ms
moving  took 1841 ms
pushing took 1575 ms
moving  took 1825 ms
pushing took 1591 ms
moving  took 1825 ms
pushing took 1592 ms
moving  took 1856 ms

I also added a Masm64 SDK-compatible PrintCpu macro for Héctor ;-)

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 07:22:29 AM

Code Select

Intel(R) Core(TM)2 Duo CPU    E8400  @ 3.00GHz
pushing took 1716 ms
moving  took 2091 ms
pushing took 1794 ms
moving  took 2090 ms
pushing took 1779 ms
moving  took 2090 ms
pushing took 1763 ms
moving  took 1934 ms
Press any key to continue . . .

Looks better
We need some AMD's

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 07:29:28 AM

Quote from: zedd151 on August 29, 2023, 07:22:29 AM
We need some AMD's

Your wish is my command:

Code Select

AMD Athlon Gold 3150U with Radeon Graphics
pushing took 1828 ms
moving  took 1844 ms
pushing took 1875 ms
moving  took 1984 ms
pushing took 1906 ms
moving  took 1875 ms
pushing took 1875 ms
moving  took 1906 ms

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd151 on August 29, 2023, 07:30:49 AM

Quote from: jj2007 on August 29, 2023, 07:29:28 AM
AMD Athlon Gold 3150U with Radeon Graphics

Not a big variance. A little flip-flopping, though. I would call it about even for your AMD.

Title: Re: Masm64 SDK ignores "uses"
Post by: TimoVJL on August 29, 2023, 06:29:27 PM

AMD Ryzen 5 3400G

Code Select

pushing took 1375 ms
moving  took 1390 ms
pushing took 1375 ms
moving  took 1407 ms
pushing took 1453 ms
moving  took 1437 ms
pushing took 1469 ms
moving  took 1641 ms

Code Select

pushing took 1453 ms
moving  took 1390 ms
pushing took 1391 ms
moving  took 1390 ms
pushing took 1391 ms
moving  took 1390 ms
pushing took 1391 ms
moving  took 1406 ms

Code Select

pushing took 1625 ms
moving  took 1359 ms
pushing took 1359 ms
moving  took 1375 ms
pushing took 1375 ms
moving  took 1375 ms
pushing took 1360 ms
moving  took 1359 ms

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 07:09:07 PM

So it seems that AMD CPUs take about the same time for both methods, while Intel CPUs are slightly faster with push & pop.
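To put numbers on that, here is a rough comparison of two of the runs posted earlier in this thread. It is a sketch in Python, used only to average the timings already shown; the helper name is made up, and four samples per CPU is far too few for a firm conclusion.

```python
from statistics import mean

# Timings in ms copied from two posts in this thread (runs with align 16).
runs = {
    "Intel i5-2450M": {
        "push": [1592, 1575, 1591, 1592],
        "mov": [1841, 1825, 1825, 1856],
    },
    "AMD Athlon Gold 3150U": {
        "push": [1828, 1875, 1906, 1875],
        "mov": [1844, 1984, 1875, 1906],
    },
}

def mov_over_push(timings):
    """Ratio of mean mov time to mean push/pop time (1.0 means no difference)."""
    return mean(timings["mov"]) / mean(timings["push"])

for cpu, timings in runs.items():
    print(f"{cpu}: mov/push = {mov_over_push(timings):.3f}")
```

On these samples the i5 comes out roughly 16% slower with mov, the Athlon under 2%. GetTickCount usually advances in steps of about 15 to 16 ms, so differences of that size between single runs are within timer granularity.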

This is remarkable, since the hype around the x64 ABI is based on the idea that moving stuff is faster than pushing :cool:

x64 Architecture (https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture) is an interesting read. Did you know that you can align the stack to 16 with a simple, short "and spl, 0F0h"?

Code Select

48:83E4 F0                 | and rsp,FFFFFFFFFFFFFFF0        | OK
83E4 F0                    | and esp,FFFFFFF0                | not recommended, clears upper dword
66:83E4 F0                 | and sp,FFF0                     | OK
40:32E4                    | xor spl,spl                     | align stack 256
40:80E4 F0                 | and spl,F0                      | OK
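What each variant does to the full 64-bit register can be modelled in a few lines. A sketch in Python, not assembler output: the function name and the sample rsp value are hypothetical, chosen so that the upper dword is non-zero.

```python
MASK64 = (1 << 64) - 1

def and_low_bits(rsp, imm, width):
    """Model `and <reg>, imm` on the low `width` bits of rsp.

    width=64: and rsp, imm   (imm already sign-extended)
    width=32: and esp, imm   (a 32-bit write zeroes bits 63..32)
    width=16: and sp, imm    (upper 48 bits preserved)
    width=8:  and spl, imm   (upper 56 bits preserved)
    """
    low_mask = (1 << width) - 1
    low = rsp & imm & low_mask
    if width == 32:
        return low                      # zero-extended into the full register
    return (rsp & ~low_mask & MASK64) | low

rsp = 0x000000013F8FFE38                # hypothetical stack pointer above 4 GB
for width, imm in ((64, 0xFFFFFFFFFFFFFFF0), (32, 0xFFFFFFF0),
                   (16, 0xFFF0), (8, 0xF0)):
    print(width, hex(and_low_bits(rsp, imm, width)))
```

Only the 32-bit form loses the upper half of the pointer; the 8-bit, 16-bit and 64-bit forms all give the same aligned address.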

Another interesting bit:

QuoteThe caller reserves space on the stack for arguments passed in registers

It doesn't say anything about our dear habit to put a sub rsp, 80h somewhere on top of the proc. It just says for arguments passed in registers, i.e. rcx, rdx, r8 and r9. At least that's what I read into this phrase - xmm0 is a register, right?
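For reference, the size of the outgoing argument area under the usual reading of the Windows x64 convention can be sketched as below. This is Python used as a calculator, not SDK code. The function name is hypothetical, and the sketch assumes that every argument takes one 8-byte slot, that the four register slots are always reserved even when fewer arguments are passed, and that the area is rounded up to a multiple of 16.

```python
def outgoing_area(nargs):
    """Bytes a caller reserves at [rsp] before a call (assumptions above)."""
    slots = max(4, nargs)          # the four register "home" slots are always there
    size = slots * 8               # one 8-byte slot per argument
    return (size + 15) & ~15       # keep rsp a multiple of 16 at the call

for n in (0, 1, 4, 5, 6, 7):
    print(n, hex(outgoing_area(n)))
```

For the five-argument invoke mytest in the first post this gives 30h: 20h for the register slots plus the fifth argument at [rsp+20], padded to the next multiple of 16.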

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 29, 2023, 09:44:40 PM

Quote from: jj2007 on August 29, 2023, 07:09:07 PM
while Intel CPUs are slightly faster with push & pop.

Not exactly. Here the results are the same number, 5.59 cycles, and the variance is so big (179 and 160 cycles^2) that it's not possible to say very much.

Picture is from mov, but pushpop is the same.

(https://i.postimg.cc/HVQ6SqXk/mov-stack.jpg) (https://postimg.cc/HVQ6SqXk)

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 10:57:57 PM

What's your actual code? Here is mine:

Code Select

method1:
  push rsi
  push rdi
  push rbx
  nop
  pop rbx
  pop rdi
  pop rsi
  ret
method2:
  mov [rbp+16], rsi
  mov [rbp+24], rdi
  mov [rbp+32], rbx
  nop
  mov rbx, [rbp+32]
  mov rdi, [rbp+24]    ; reload from the slot rdi was saved to
  mov rsi, [rbp+16]    ; reload from the slot rsi was saved to
  ret

...

Code Select

  REPEAT 4
  mov ticks, rv(GetTickCount)
  mov ecx, iterations
  align 16
@@:
  call method1
  dec ecx
  jns @B
  sub rv(GetTickCount), ticks
  invoke __imp__cprintf, cfm$("pushing took %i ms\n"), rax

  mov ticks, rv(GetTickCount)
  mov ecx, iterations
  align 16
@@:
  call method2
  dec ecx
  jns @B
  sub rv(GetTickCount), ticks
  invoke __imp__cprintf, cfm$("moving  took %i ms\n"), rax
  ENDM

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 29, 2023, 11:08:32 PM

Code Select

function_under_glass5 macro
  push rsi
  push rdi
  push rbx
  nop
  pop rbx
  pop rdi
  pop rsi
endm

function_under_glass6 macro
  mov _rsi, rsi
  mov _rdi, rdi
  mov _rbx, rbx
  nop
  mov rbx, _rbx
  mov rdi, _rdi
  mov rsi, _rsi
endm

There is no call in this test.

Code Select

      .while !ZERO?
        function_under_glass6
        dec ebx
      .endw

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on August 29, 2023, 11:27:19 PM

Quote from: HSE on August 29, 2023, 11:08:32 PM
There is no call in this test.

Interesting. "nc" stands for "no call":

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
*
pushing took 983 ms
moving  took 1139 ms
nc pushing took 1030 ms
nc moving  took 936 ms
pushing took 1045 ms
moving  took 1139 ms
nc pushing took 1014 ms
nc moving  took 952 ms
pushing took 998 ms
moving  took 1201 ms
nc pushing took 1014 ms
nc moving  took 936 ms
pushing took 967 ms
moving  took 1263 ms
nc pushing took 1030 ms
nc moving  took 951 ms

IMHO there should be a call, since we are talking about the best way to implement "uses rsi rdi". Anyway, it's an interesting result :cool:

Title: Re: Masm64 SDK ignores "uses"
Post by: HSE on August 29, 2023, 11:44:30 PM

Quote from: jj2007 on August 29, 2023, 11:27:19 PM
IMHO there should be a call, since we are talking about the best way to implement "uses rsi rdi".

Correct. There was too much variation in your test. I tried to split the problem to see where the variation is. It looks like access to the stack has that variation.

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd on April 21, 2025, 10:09:26 AM

Quote from: zedd151 on August 29, 2023, 07:22:29 AM
Code Select

Intel(R) Core(TM)2 Duo CPU    E8400  @ 3.00GHz
pushing took 1716 ms
moving  took 2091 ms
pushing took 1794 ms
moving  took 2090 ms
pushing took 1779 ms
moving  took 2090 ms
pushing took 1763 ms
moving  took 1934 ms

Code Select

Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
pushing took 969 ms
moving  took 985 ms
pushing took 953 ms
moving  took 1000 ms
pushing took 968 ms
moving  took 985 ms
pushing took 969 ms
moving  took 984 ms

I like beating my old, slow machine. :azn:

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd on April 21, 2025, 10:12:31 AM

Quote from: jj2007 on August 29, 2023, 11:27:19 PM
Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
*
pushing took 983 ms
moving  took 1139 ms
nc pushing took 1030 ms
nc moving  took 936 ms
pushing took 1045 ms
moving  took 1139 ms
nc pushing took 1014 ms
nc moving  took 952 ms
pushing took 998 ms
moving  took 1201 ms
nc pushing took 1014 ms
nc moving  took 936 ms
pushing took 967 ms
moving  took 1263 ms
nc pushing took 1030 ms
nc moving  took 951 ms

IMHO there should be a call, since we are talking about the best way to implement "uses rsi rdi". Anyway, it's an interesting result :cool:

Code Select

Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
*
pushing took 594 ms
moving  took 610 ms
nc pushing took 593 ms
nc moving  took 610 ms
pushing took 593 ms
moving  took 625 ms
nc pushing took 594 ms
nc moving  took 578 ms
pushing took 610 ms
moving  took 609 ms
nc pushing took 594 ms
nc moving  took 594 ms
pushing took 609 ms
moving  took 609 ms
nc pushing took 610 ms
nc moving  took 593 ms

:biggrin: I'm kinda bored atm.

Title: Re: Masm64 SDK ignores "uses"
Post by: jj2007 on April 21, 2025, 09:35:12 PM

With an AMD:

Code Select

AMD Athlon Gold 3150U with Radeon Graphics
*
pushing took 1109 ms
moving  took 1110 ms
nc pushing took 1140 ms
nc moving  took 1110 ms
pushing took 1125 ms
moving  took 1125 ms
nc pushing took 1140 ms
nc moving  took 1125 ms
pushing took 1125 ms
moving  took 1125 ms
nc pushing took 1140 ms
nc moving  took 1125 ms
pushing took 1125 ms
moving  took 1141 ms
nc pushing took 1140 ms
nc moving  took 1141 ms

In short: no difference, and the justification of this horrible ABI that "moving is faster than pushing" is bulls*it.

Title: Re: Masm64 SDK ignores "uses"
Post by: tenkey on April 28, 2025, 06:50:41 AM

From what I picked up from Stack Overflow, the notion that moves are faster than pushes is basically obsolete, due to Intel's addition of a "stack engine" to newer microarchitectures. It streamlines sequences of pushes.

Title: Re: Masm64 SDK ignores "uses"
Post by: zedd on April 28, 2025, 07:36:38 AM

Quote from: tenkey on April 28, 2025, 06:50:41 AM
the notion that moves are faster than pushes is basically obsolete, due to Intel's addition of a "stack engine" to newer microarchitectures.

Yes, with today's hardware, fast enough is generally fast enough. :biggrin:
btw, I was just bored the other day when I posted here... not that I really cared about any perceived speed difference between the two. :cool:

Quote from: zedd on April 21, 2025, 10:12:31 AM
:biggrin: I'm kinda bored atm.

The MASM Forum

Microsoft 64 bit MASM => MASM64 SDK => Topic started by: jj2007 on August 28, 2023, 08:36:11 AM