Bloated invoke or a macro?

Started by jj2007, February 25, 2024, 09:34:18 PM


jj2007

  jinvoke CreateWindowEx, 0, wc.lpszClassName, Chr$("Hello World"), wsStyle, 300+320*@64, 127, 300, 200, NULL, rv(LoadMenu, wc.hInstance, 100), wc.hInstance, NULL
  invoke CreateWindowEx, 0, wc.lpszClassName, Chr$("Hello World"), wsStyle, 300+320*@64, 127, 300, 200, NULL, rv(LoadMenu, wc.hInstance, 100), wc.hInstance, NULL

Not a big difference, right? But under the hood it does look different:
UAsm: 125 bytes
48:83EC 20                      | sub rsp,20                      |
48:8B4B 18                      | mov rcx,[rbx+18]                |
48:C7C2 64000000                | mov rdx,64                      | 64:'d'
FF15 33320000                   | call [<LoadMenuA>]              |
48:83C4 20                      | add rsp,20                      |
48:83EC 60                      | sub rsp,60                      |
33C9                            | xor ecx,ecx                     |
48:8B53 40                      | mov rdx,[rbx+40]                | [qword ptr ds:[rbx+40]]:"UAsmGUI"
49:B8 6440004001000000          | mov r8,winguijj64.140004064     | 140004064:"Hello World"
41:B9 0000CF12                  | mov r9d,12CF0000                |
C74424 20 58020000              | mov [rsp+20],258                |
C74424 28 7F000000              | mov [rsp+28],7F                 |
C74424 30 2C010000              | mov [rsp+30],12C                |
C74424 38 C8000000              | mov [rsp+38],C8                 |
48:C74424 40 00000000           | mov [rsp+40],0                  |
48:894424 48                    | mov [rsp+48],rax                |
48:8B43 18                      | mov rax,[rbx+18]                |
48:894424 50                    | mov [rsp+50],rax                |
48:C74424 58 00000000           | mov [rsp+58],0                  |
FF15 D7310000                   | call [<CreateWindowExA>]        |
48:83C4 60                      | add rsp,60                      |
FFC0                            | inc eax                         |

JBasic: 100 bytes
BA 64000000                     | mov edx,64                      | 64:'d'
48:8B4B 18                      | mov rcx,[rbx+18]                | qword ptr ds:[rbx+18]:jdebP+19BC
FF15 C9340000                   | call [<LoadMenuA>]              |
48:836424 58 00                 | and [rsp+58],0                  | [qword ptr ss:[rsp+58]]:sub_1400015D3+91
4C:8B53 18                      | mov r10,[rbx+18]                | qword ptr ds:[rbx+18]:jdebP+19BC
4C:895424 50                    | mov [rsp+50],r10                |
48:894424 48                    | mov [rsp+48],rax                | [qword ptr ss:[rsp+48]]:LoadLibraryA+3F
48:836424 40 00                 | and [rsp+40],0                  | [qword ptr ss:[rsp+40]]:"msvcrt"
48:C74424 38 C8000000           | mov [rsp+38],C8                 |
48:C74424 30 2C010000           | mov [rsp+30],12C                |
48:C74424 28 7F000000           | mov [rsp+28],7F                 | [qword ptr ss:[rsp+28]]:GetProcAddressForCaller+6C
48:C74424 20 6C020000           | mov [rsp+20],26C                | [qword ptr ss:[rsp+20]]:sub_1400015D3+91
41:B9 0000CF12                  | mov r9d,12CF0000                |
4C:8D05 AC1F0000                | lea r8,[1400030A2]              | ds:[00000001400030A2]:"Hello World"
48:8B53 40                      | mov rdx,[rbx+40]                | [qword ptr ds:[rbx+40]]:"JBasicGUI"
33C9                            | xor ecx,ecx                     |
FF15 7A340000                   | call [<&CreateWindowExA>]       |
FFC0                            | inc eax                         |

Questions to the UAsm & JWasm developers:
- How difficult is it to use xor ecx,ecx instead of mov ecx, 0? (UAsm does it, JWasm not:  3 bytes more)
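
For reference, the 3-byte difference can be checked from the encodings that appear in the listings in this thread (B9 00000000 for mov ecx,0 and 33C9 for xor ecx,ecx). A minimal Python sketch; the helper name saved_bytes is made up for illustration:

```python
# Encodings copied from the disassembly listings in this thread.
MOV_ECX_0 = bytes.fromhex("B900000000")   # mov ecx, 0    (opcode + imm32)
XOR_ECX_ECX = bytes.fromhex("33C9")       # xor ecx, ecx  (opcode + ModRM)


def saved_bytes(long_form: bytes, short_form: bytes) -> int:
    """Return how many bytes the shorter encoding saves."""
    return len(long_form) - len(short_form)


print(len(MOV_ECX_0), len(XOR_ECX_ECX), saved_bytes(MOV_ECX_0, XOR_ECX_ECX))
```

Both forms clear the full 64-bit rcx, because a write to a 32-bit register zero-extends. The difference is that xor also modifies the flags, so the substitution is only safe where the flags are dead, which is normally the case while loading arguments for a call.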

_japheth

Quote: Not a big difference, right? But under the hood it does look different:
UAsm: 125 bytes
<snip>
JBasic: 100 bytes

That's a bit cheating. You'd have to compare with "option win64:2" switch on. It's not the default, but actually it's necessary to comply with the Win64 ABI - because, IIRC, register RSP is regarded "non-volatile" in Win64, meaning you are NOT supposed to change it after the prologue.
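
To make the stack layout concrete: under the Win64 calling convention the first four arguments go in rcx, rdx, r8 and r9, the caller reserves 32 bytes of shadow space for them, and the remaining arguments follow at [rsp+20h] upward. A rough Python sketch of that arithmetic (the function names are hypothetical, and a real assembler may reserve more depending on alignment and locals):

```python
REG_ARGS = ["rcx", "rdx", "r8", "r9"]  # first four integer/pointer arguments
SLOT = 8                               # every argument slot is 8 bytes wide


def arg_locations(nargs: int) -> list:
    """Where each argument of an nargs-argument call is passed."""
    locations = []
    for i in range(nargs):
        if i < len(REG_ARGS):
            locations.append(REG_ARGS[i])
        else:
            # Slots 0..3 are the shadow space, so argument 5 lands at rsp+20h.
            locations.append(f"[rsp+{i * SLOT:X}h]")
    return locations


def frame_size(nargs: int) -> int:
    """Bytes to reserve for one call: at least the shadow space, 16-byte aligned."""
    size = max(nargs, len(REG_ARGS)) * SLOT
    return (size + 15) & ~15


print(arg_locations(12))
print(hex(frame_size(12)), hex(frame_size(3)))
```

For the 12-argument CreateWindowEx call this gives 60h bytes and for the 3-argument LoadMenu call 20h, matching the sub rsp,60 and sub rsp,20 in the UAsm listing. If the larger amount is reserved once in the prologue, the per-call sub/add pairs are no longer needed.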

Quote: - How difficult is it to use xor ecx,ecx instead of mov ecx, 0? (UAsm does it, JWasm not:  3 bytes more)

It's extremely difficult - and less readable. But the last thing I'd implement is a "and [rsp+x], 0" to move a 0 to a location - that's just ugly.

Stupidity, paired with audacity - a terrible power.

jj2007

Quote from: _japheth on February 26, 2024, 01:52:24 AM
option win64:2

I tried that, and indeed there is less fumbling with the stack but... CreateWindowEx fails with error 578h, invalid window handle :sad:

Re "xor ecx,ecx instead of mov ecx, 0" being "extremely difficult": kudos to the UAsm team who found a way to do that. I have no difficulties reading and understanding what xor ecx, ecx does btw.

_japheth

Quote from: jj2007 on February 26, 2024, 10:37:54 AM
kudos to the UAsm team who found a way to do that.

Absolutely!

Although, if I examine what current jwasm creates for "INVOKE CreateWindowEx" (OPTION WIN64:3):
000001B3      invoke __imp_CreateWindowExA, NULL, addr szClass, addr szWnd,WS_OVERLAPPEDWINDOW,CW_USEDEFAULT, CW_USEDEFAULT,CW_USEDEFAULT, CW_USEDEFAULT,0,0,hInstance,0
000001B3 B900000000         *    mov ecx, NULL
000001B8 488D1500000000     *    lea rdx, szClass
000001BF 4C8D0500000000     *    lea r8, szWnd
000001C6 41B90000CF00       *    mov r9d, WS_OVERLAPPEDWINDOW
000001CC C744242000000080   *    mov dword ptr [rsp+32], CW_USEDEFAULT
000001D4 C744242800000080   *    mov dword ptr [rsp+40], CW_USEDEFAULT
000001DC C744243000000080   *    mov dword ptr [rsp+48], CW_USEDEFAULT
000001E4 C744243800000080   *    mov dword ptr [rsp+56], CW_USEDEFAULT
000001EC 48C744244000000000 *    mov qword ptr [rsp+64], 0
000001F5 48C744244800000000 *    mov qword ptr [rsp+72], 0
000001FE 488B4510           *    mov rax, hInstance
00000202 4889442450         *    mov [rsp+80], rax
00000207 48C744245800000000 *    mov qword ptr [rsp+88], 0
00000210 FF1500000000       *    call __imp_CreateWindowExA

Here register R8 is loaded with the address of a string in .const using LEA, while UAsm uses MOV - which isn't the best thing to do, as your very smart JBasic also realized ...
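
The difference between the two forms shows up when the operand bytes from the listings in this thread are decoded. A small Python sketch, assuming the UAsm bytes 49 B8 64 40 00 40 01 00 00 00 and the jinvoke bytes 4C 8D 05 AC 1F 00 00 from the first post; the address of the instruction following the lea is derived from them here, not read from a debugger:

```python
# mov r8, imm64: REX.W+B prefix, opcode B8+r, then an 8-byte absolute address.
mov_r8 = bytes.fromhex("49B86440004001000000")
abs_addr = int.from_bytes(mov_r8[2:], "little")

# lea r8, [rip+disp32]: REX prefix, opcode 8D, ModRM 05, then a 4-byte displacement.
lea_r8 = bytes.fromhex("4C8D05AC1F0000")
disp = int.from_bytes(lea_r8[3:], "little", signed=True)

# RIP-relative rule: target = address of the next instruction + disp32.
target = 0x1400030A2          # address shown by the debugger for "Hello World"
next_ip = target - disp       # derived, not taken from the listing

print(hex(abs_addr), hex(disp), hex(next_ip), len(mov_r8) - len(lea_r8))
```

The MOV form embeds the absolute 64-bit address, so it is 3 bytes longer and needs a base relocation if the image is loaded at a different address. The RIP-relative LEA stores only a 32-bit displacement and stays valid wherever the image is loaded.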

jj2007

Quote from: _japheth on February 26, 2024, 03:04:56 PM
what current jwasm creates

So it seems that JWasm v2.17, Mar 24 2023 (which produces the code below) has been improved since. Google's top hit for "jwasm download" points to an 11-year-old version. Same for GitHub - where is the current one?

Btw why is option Win64:3 not the default option? It avoids, apparently, the useless fumbling of the stack.
JWasm Win64:3
48:8B4B 18                      | mov rcx,[rbx+18]                |
48:C7C2 64000000                | mov rdx,64                      | 64:'d'
FF15 3E120000                   | call [<LoadMenuA>]              |
B9 00000000                     | mov ecx,0                       |
48:8B53 40                      | mov rdx,[rbx+40]                | [qword ptr ds:[rbx+40]]:"UAsmGUI"
49:B8 6420004001000000          | mov r8,winguijj64.140002064     | 140002064:"Hello World"
41:B9 0000CF12                  | mov r9d,12CF0000                |
C74424 20 6C020000              | mov [rsp+20],26C                |
C74424 28 7F000000              | mov [rsp+28],7F                 |
C74424 30 2C010000              | mov [rsp+30],12C                |
C74424 38 C8000000              | mov [rsp+38],C8                 |
48:C74424 40 00000000           | mov [rsp+40],0                  |
48:894424 48                    | mov [rsp+48],rax                |
48:8B43 18                      | mov rax,[rbx+18]                |
48:894424 50                    | mov [rsp+50],rax                |
48:C74424 58 00000000           | mov [rsp+58],0                  |
FF15 E7110000                   | call [<CreateWindowExA>]        |
85C0                            | test eax,eax                    |
74 39                           | je 14000111E                    |
FFC0                            | inc eax                         |

Quote from: _japheth on February 26, 2024, 01:52:24 AM
the last thing I'd implement is a "and [rsp+x], 0" to move a 0 to a location - that's just ugly

It's not ugly, it's a very short encoding for mov mem, 0. It also costs a few cycles, which is irrelevant for the CreateWindowEx case (and 99% of all cases), but your comment inspired me indeed, so I will implement a xor r99, r99 plus mov mem, r99 solution for JBasic.
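
A quick size estimate for the three ways of zeroing stack slots, using encodings that appear in the listings in this thread: 9 bytes for mov qword ptr [rsp+x],0, 6 bytes for and qword ptr [rsp+x],0, 2 bytes for xor on a legacy 32-bit register, and 5 bytes for mov [rsp+x],reg. A Python sketch; zeroing_cost is a hypothetical helper, and it assumes a scratch register is free and every displacement fits in one byte:

```python
# Encodings copied from the listings in this thread.
MOV_MEM_IMM = bytes.fromhex("48C744245800000000")  # mov qword ptr [rsp+58h], 0
AND_MEM_0 = bytes.fromhex("488364245800")          # and qword ptr [rsp+58h], 0
XOR_REG = bytes.fromhex("33C9")                    # xor ecx, ecx
MOV_MEM_REG = bytes.fromhex("4889442448")          # mov [rsp+48h], rax


def zeroing_cost(slots: int) -> dict:
    """Code bytes needed to zero the given number of 8-byte stack slots."""
    return {
        "mov mem, 0": slots * len(MOV_MEM_IMM),
        "and mem, 0": slots * len(AND_MEM_0),
        # One xor to clear the register, then one register store per slot.
        "xor reg + mov mem, reg": len(XOR_REG) + slots * len(MOV_MEM_REG),
    }


for n in (1, 2, 3):
    print(n, zeroing_cost(n))
```

With the two NULL slots of the CreateWindowEx call, the xor variant ties with and at 12 bytes and only wins from three slots on, although it avoids the read-modify-write on memory. Using r8 to r15 as the scratch register adds a REX prefix to the xor, making it 3 bytes.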

HSE

https://github.com/Baron-von-Riedesel/JWasm

Equations in Assembly: SmplMath

_japheth

Quote from: jj2007 on February 26, 2024, 08:44:11 PM
So it seems that JWasm v2.17, Mar 24 2023 (which produces the code below) has been improved since.

No. Please show details how the variable is defined - and the invoke statement!

Quote: Btw why is option Win64:3 not the default option? It avoids, apparently, the useless fumbling of the stack.

Because it relies on a certain "smartness" of the assembler that simply was missing in the beginning.

jj2007

Quote from: _japheth on February 26, 2024, 10:27:05 PM
Quote: So it seems that JWasm v2.17, Mar 24 2023 (which produces the code below) has been improved since.

No. Please show details how the variable is defined - and the invoke statement!

Quote from: jj2007 on February 25, 2024, 09:34:18 PM
  jinvoke CreateWindowEx, 0, wc.lpszClassName, Chr$("Hello World"), wsStyle, 300+320*@64, 127, 300, 200, NULL, rv(LoadMenu, wc.hInstance, 100), wc.hInstance, NULL
  invoke CreateWindowEx, 0, wc.lpszClassName, Chr$("Hello World"), wsStyle, 300+320*@64, 127, 300, 200, NULL, rv(LoadMenu, wc.hInstance, 100), wc.hInstance, NULL

The variable Chr$("Hello World") is defined as follows:
Chr$ MACRO args:VARARG
Local NewString
  .DATA
  NewString db args, 0
  .CODE
  EXITM <offset NewString>
ENDM

The jinvoke macro translates the offset NewString into lea r8,[1400030A2]

TimoVJL

Pelles C 12
src\coff.c(89): error #2082: Invalid initialization type; expected 'unsigned char' but found 'char *'.
In coffspec.h, line 139, I changed the declaration for testing:
        char ShortName[8];  // Pelles C 12
        //uint_8 ShortName[8];
May the source be with you

jj2007

Quote from: TimoVJL on February 27, 2024, 11:15:45 PM
Pelles C 12
src\coff.c(89): error #2082: Invalid initialization type; expected 'unsigned char' but found 'char *'.

Did you try to build a C source?

Quote from: HSE on February 26, 2024, 09:54:12 PM
https://github.com/Baron-von-Riedesel/JWasm

TimoVJL

May the source be with you