Recent posts


July 04, 2025, 02:41:31 PM


The Laboratory / Re: Invoke, call, jump. Simple...

Last post by NoCforMe - Today at 03:12:16 AM

Quote from: jj2007 on July 03, 2025, 08:11:59 PM
Quote from: NoCforMe on July 03, 2025, 05:19:16 PM
I restrict my use of registers for passing parameters to the regular general-purpose ones (EAX/EBX/ECX/EDX), not FPU or XMM registers.

Why so restrictive?

I hardly ever use the FPU in my programs, and have never messed around w/XMM. Most of my code is in integer-land.

Nothing wrong with using either one of those register sets to pass parameters, of course.

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by zedd - Today at 12:42:19 AM

From the laptop


Intel(R) Celeron(R) N5105 @ 2.00GHz (SSE4)

549     cycles for 100 * proc aligned 16
484     cycles for 100 * proc aligned 16+3
550     cycles for 100 * aligned push+pop
482     cycles for 100 * aligned reg32

551     cycles for 100 * proc aligned 16
484     cycles for 100 * proc aligned 16+3
551     cycles for 100 * aligned push+pop
482     cycles for 100 * aligned reg32

550     cycles for 100 * proc aligned 16
485     cycles for 100 * proc aligned 16+3
552     cycles for 100 * aligned push+pop
482     cycles for 100 * aligned reg32

551     cycles for 100 * proc aligned 16
493     cycles for 100 * proc aligned 16+3
562     cycles for 100 * aligned push+pop
493     cycles for 100 * aligned reg32

564     cycles for 100 * proc aligned 16
496     cycles for 100 * proc aligned 16+3
561     cycles for 100 * aligned push+pop
485     cycles for 100 * aligned reg32

15      bytes for proc aligned 16
19      bytes for proc aligned 16+3
24      bytes for aligned push+pop
20      bytes for aligned reg32


--- ok ---

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by zedd - Today at 12:37:26 AM


Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz (SSE4)

344    cycles for 100 * proc aligned 16
256    cycles for 100 * proc aligned 16+3
391    cycles for 100 * aligned push+pop
387    cycles for 100 * aligned reg32

345    cycles for 100 * proc aligned 16
261    cycles for 100 * proc aligned 16+3
392    cycles for 100 * aligned push+pop
380    cycles for 100 * aligned reg32

345    cycles for 100 * proc aligned 16
265    cycles for 100 * proc aligned 16+3
403    cycles for 100 * aligned push+pop
381    cycles for 100 * aligned reg32

341    cycles for 100 * proc aligned 16
260    cycles for 100 * proc aligned 16+3
382    cycles for 100 * aligned push+pop
381    cycles for 100 * aligned reg32

382    cycles for 100 * proc aligned 16
260    cycles for 100 * proc aligned 16+3
374    cycles for 100 * aligned push+pop
389    cycles for 100 * aligned reg32

15      bytes for proc aligned 16
19      bytes for proc aligned 16+3
24      bytes for aligned push+pop
20      bytes for aligned reg32


--- ok ---

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by TimoVJL - Today at 12:27:20 AM

Vintage AMD


AMD Athlon(tm) II X2 220 Processor (SSE3)

505     cycles for 100 * proc aligned 16
402     cycles for 100 * proc aligned 16+3
502     cycles for 100 * aligned push+pop
403     cycles for 100 * aligned reg32

502     cycles for 100 * proc aligned 16
403     cycles for 100 * proc aligned 16+3
502     cycles for 100 * aligned push+pop
403     cycles for 100 * aligned reg32

503     cycles for 100 * proc aligned 16
402     cycles for 100 * proc aligned 16+3
502     cycles for 100 * aligned push+pop
403     cycles for 100 * aligned reg32

502     cycles for 100 * proc aligned 16
402     cycles for 100 * proc aligned 16+3
502     cycles for 100 * aligned push+pop
408     cycles for 100 * aligned reg32

502     cycles for 100 * proc aligned 16
402     cycles for 100 * proc aligned 16+3
503     cycles for 100 * aligned push+pop
403     cycles for 100 * aligned reg32

15      bytes for proc aligned 16
19      bytes for proc aligned 16+3
24      bytes for aligned push+pop
20      bytes for aligned reg32

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by jj2007 - July 03, 2025, 08:11:59 PM

Quote from: NoCforMe on July 03, 2025, 05:19:16 PM
I restrict my use of registers for passing parameters to the regular general-purpose ones (EAX/EBX/ECX/EDX), not FPU or XMM registers.

Why so restrictive?


    mov eax, 31416        ; you can mix xmm registers with FPU and ordinary 
    movd xmm0, eax        ; registers and directly print the result
    fldpi                 ; load 3.14159 onto the FPU
    mov ecx, 123          ; \n is CrLf, \t is tab in Str$()
    Print Str$("\nresult=\t%f", xmm0/ST(0)*ecx)     ; output: [newline] result=    1230003.0

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by NoCforMe - July 03, 2025, 05:19:16 PM

Quote from: daydreamer on July 03, 2025, 05:11:13 PM
@NoCforMe
Passing data through registers works best in your own code, if you prefer using FPU or XMM registers for your REAL4/REAL8 variables as the coding style for your own PROCs.

I restrict my use of registers for passing parameters to the regular general-purpose ones (EAX/EBX/ECX/EDX), not FPU or XMM registers.

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by daydreamer - July 03, 2025, 05:11:13 PM

@NoCforMe
Passing data through registers works best in your own code, if you prefer using FPU or XMM registers for your REAL4/REAL8 variables as the coding style for your own PROCs.


Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (SSE4)

322     cycles for 100 * proc aligned 16
267     cycles for 100 * proc aligned 16+3
392     cycles for 100 * aligned push+pop
392     cycles for 100 * aligned reg32

310     cycles for 100 * proc aligned 16
266     cycles for 100 * proc aligned 16+3
397     cycles for 100 * aligned push+pop
394     cycles for 100 * aligned reg32

308     cycles for 100 * proc aligned 16
269     cycles for 100 * proc aligned 16+3
408     cycles for 100 * aligned push+pop
392     cycles for 100 * aligned reg32

314     cycles for 100 * proc aligned 16
263     cycles for 100 * proc aligned 16+3
404     cycles for 100 * aligned push+pop
399     cycles for 100 * aligned reg32

308     cycles for 100 * proc aligned 16
267     cycles for 100 * proc aligned 16+3
395     cycles for 100 * aligned push+pop
391     cycles for 100 * aligned reg32

15      bytes for proc aligned 16
19      bytes for proc aligned 16+3
24      bytes for aligned push+pop
20      bytes for aligned reg32


-

MASM64 SDK / A couple minor additions to .i...

Last post by BobC - July 03, 2025, 10:01:38 AM

I needed a couple additions to two include files, so I am posting them here. I'm not sure if we have a place for MASM64 SDK improvements.

In kernel32.inc, I added GetUserDefaultLocaleName around line 2227.
In win64.inc, I added the full list of LOAD_LIBRARY constants around line 7444. The constants came directly from libloaderapi.h.

Bob
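
For anyone wanting to make the same kind of change, additions like these might look roughly like the fragment below. This is only a sketch, not a copy of Bob's actual edits: it assumes the externdef/equ import pattern that the MASM64 SDK include files normally use, and it shows just a few of the LOAD_LIBRARY constants, with values as defined in libloaderapi.h.

```masm
; kernel32.inc (sketch): import declaration in the style the
; MASM64 SDK uses for other kernel32 functions (assumed pattern)
externdef __imp_GetUserDefaultLocaleName:PPROC
GetUserDefaultLocaleName equ <__imp_GetUserDefaultLocaleName>

; win64.inc (sketch): a subset of the LoadLibraryEx flags
; from libloaderapi.h
LOAD_LIBRARY_AS_DATAFILE            equ 00000002h
LOAD_WITH_ALTERED_SEARCH_PATH       equ 00000008h
LOAD_LIBRARY_AS_IMAGE_RESOURCE      equ 00000020h
LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR    equ 00000100h
LOAD_LIBRARY_SEARCH_APPLICATION_DIR equ 00000200h
LOAD_LIBRARY_SEARCH_USER_DIRS       equ 00000400h
LOAD_LIBRARY_SEARCH_SYSTEM32        equ 00000800h
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS    equ 00001000h
```

If a constant or function is already defined elsewhere in the includes, the assembler will report a redefinition, so it is worth searching the files before adding anything.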

The Laboratory / Re: Invoke, call, jump. Simple...

Last post by NoCforMe - July 03, 2025, 07:47:50 AM

Quote from: daydreamer on July 02, 2025, 06:58:12 PM
32-bit fastcall (passing data in registers) vs. 32-bit invoke (pushing data) would be the fairest test.

I transfer arguments to (and from) subroutines in registers all the time in my own code.
No need to follow someone else's ABI when it's your own personal code that you can do whatever the hell you like with.

Of course, I do follow the part of the Win32 ABI that requires you to respect the "sacred" registers (EBX, ESI, EDI).
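
As an illustration of the difference being discussed, here is a minimal MASM32 sketch of the same addition done once through the stack with invoke and once through registers. The procedure names AddStack and AddRegs and the private register contract (inputs in EAX and ECX, result in EAX) are hypothetical, chosen only for this example. The include paths assume a default MASM32 SDK install.

```masm
.686
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code

; Stack-based version: arguments are pushed by the caller,
; MASM generates the EBP frame and a "ret 8" for us (stdcall).
AddStack proc a:DWORD, b:DWORD
    mov eax, a
    add eax, b
    ret
AddStack endp

; Register-based version (hypothetical private convention):
; in:  EAX, ECX    out: EAX
; EBX, ESI, EDI and EBP are not touched, so nothing needs saving.
AddRegs proc
    add eax, ecx
    ret
AddRegs endp

start:
    invoke AddStack, 2, 3   ; assembler emits push 3 / push 2 / call
    mov eax, 2
    mov ecx, 3
    call AddRegs            ; no pushes, no stack frame
    invoke ExitProcess, eax ; exit code is the sum from AddRegs
end start
```

Because AddRegs has no parameters or locals, MASM emits no prologue or epilogue for it, which is where the saving over the invoke version comes from. If a register-based routine did need EBX, ESI or EDI as scratch, it would have to push and pop them itself to keep the "sacred" registers intact.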


The Laboratory / Re: Invoke, call, jump. Simple...

Last post by jj2007 - July 03, 2025, 12:41:36 AM

Quote from: daydreamer on July 02, 2025, 06:58:12 PM
32-bit fastcall (passing data in registers) vs. 32-bit invoke (pushing data) would be the fairest test.

Yep, you can save a cycle


AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

400    cycles for 100 * proc aligned 16
400    cycles for 100 * proc aligned 16+3
417    cycles for 100 * aligned push+pop
273    cycles for 100 * aligned reg32

405    cycles for 100 * proc aligned 16
409    cycles for 100 * proc aligned 16+3
426    cycles for 100 * aligned push+pop
276    cycles for 100 * aligned reg32

409    cycles for 100 * proc aligned 16
402    cycles for 100 * proc aligned 16+3
422    cycles for 100 * aligned push+pop
290    cycles for 100 * aligned reg32

403    cycles for 100 * proc aligned 16
406    cycles for 100 * proc aligned 16+3
426    cycles for 100 * aligned push+pop
278    cycles for 100 * aligned reg32

406    cycles for 100 * proc aligned 16
416    cycles for 100 * proc aligned 16+3
421    cycles for 100 * aligned push+pop
281    cycles for 100 * aligned reg32

15      bytes for proc aligned 16
19      bytes for proc aligned 16+3
24      bytes for aligned push+pop
20      bytes for aligned reg32
