Author Topic: Switch macro timings  (Read 10033 times)

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Switch macro timings
« on: April 24, 2016, 07:34:02 PM »
Testing Nidud's new .Switch macro and MasmBasic's version (of today, download here). Some timings on other processors would be nice. No inkey - run from a DOS prompt.

Note the tradeoff between size and speed for the MasmBasic "ret" and "jmp" versions; the latter produces code that is almost identical to AsmC's .Switch macro (I still have to find out where the 8 or so extra bytes come from  :( ). What is remarkable is that there is already a speed gain for more than 4 cases, compared to the Masm32 switch macro.

Note also (quote from \Masm32\MasmBasic\MbGuide.rtf but valid also for AsmC .Switch)
Code: [Select]
- avoid using Switch_ with few cases that are far apart, e.g. case -1000, case 1, case 1000, as this
  creates a huge jump table with many default entries; the Masm32 switch macro will be more size-efficient

Code: [Select]
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Assembled with AsmC
11 ms   case 260, MB Switch_ ret
8 ms    case 260, MB Switch_ jmp
370 ms  case 260, Masm32 switch
5 ms    case 260, AsmC .Switch

7 ms    case 196, MB Switch_ ret
5 ms    case 196, MB Switch_ jmp
286 ms  case 196, Masm32 switch
5 ms    case 196, AsmC .Switch

7 ms    case 132, MB Switch_ ret
5 ms    case 132, MB Switch_ jmp
189 ms  case 132, Masm32 switch
5 ms    case 132, AsmC .Switch

7 ms    case 68, MB Switch_ ret
5 ms    case 68, MB Switch_ jmp
98 ms   case 68, Masm32 switch
5 ms    case 68, AsmC .Switch

8 ms    case 4, MB Switch_ ret
5 ms    case 4, MB Switch_ jmp
10 ms   case 4, Masm32 switch
5 ms    case 4, AsmC .Switch

2999    bytes for MBSret
4211    bytes for MBSjmp
4799    bytes for Masm32
4200    bytes for asmc

MB Switch_ in jmp mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

MB Switch_ in ret mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

Source and exe attached; for building your own version, RichMasm works best, as it creates the AsmUsed$() macro (see above, "Assembled with AsmC"), which allows selective inclusion of Nidud's .Switch based on the assembler used. But the *.asc source opens in Wordpad, too.

P.S.: The macro assembles fine with everything from ML 6.15 .. 10, JWasm, HJWasm and AsmC; the ML versions build about half as fast, though, due to the excessive number of cases tested (300+).

Note also that there is cases=... in line 21; ML chokes above ca. 920 cases, AsmC seems not so impressed, but prepare for fat executables, e.g. for 10,000 cases:
Code: [Select]
109663  bytes for MBSret
149663  bytes for MBSjmp
179345  bytes for Masm32
149658  bytes for asmc
8)

sinsi

  • Guest
Re: Switch macro timings
« Reply #1 on: April 24, 2016, 08:04:20 PM »
Code: [Select]
Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Assembled with AsmC
5 ms    case 260, MB Switch_ ret
4 ms    case 260, MB Switch_ jmp
294 ms  case 260, Masm32 switch
4 ms    case 260, AsmC .Switch

5 ms    case 196, MB Switch_ ret
4 ms    case 196, MB Switch_ jmp
226 ms  case 196, Masm32 switch
3 ms    case 196, AsmC .Switch

5 ms    case 132, MB Switch_ ret
4 ms    case 132, MB Switch_ jmp
149 ms  case 132, Masm32 switch
4 ms    case 132, AsmC .Switch

6 ms    case 68, MB Switch_ ret
4 ms    case 68, MB Switch_ jmp
81 ms   case 68, Masm32 switch
3 ms    case 68, AsmC .Switch

6 ms    case 4, MB Switch_ ret
4 ms    case 4, MB Switch_ jmp
4 ms    case 4, Masm32 switch
3 ms    case 4, AsmC .Switch

2999    bytes for MBSret
4211    bytes for MBSjmp
4799    bytes for Masm32
4200    bytes for asmc

MB Switch_ in jmp mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

MB Switch_ in ret mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

sinsi

  • Guest
Re: Switch macro timings
« Reply #2 on: April 24, 2016, 08:07:59 PM »
Code: [Select]
AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
Assembled with AsmC
4 ms    case 260, MB Switch_ ret
4 ms    case 260, MB Switch_ jmp
313 ms  case 260, Masm32 switch
4 ms    case 260, AsmC .Switch

4 ms    case 196, MB Switch_ ret
3 ms    case 196, MB Switch_ jmp
218 ms  case 196, Masm32 switch
6 ms    case 196, AsmC .Switch

4 ms    case 132, MB Switch_ ret
3 ms    case 132, MB Switch_ jmp
147 ms  case 132, Masm32 switch
7 ms    case 132, AsmC .Switch

5 ms    case 68, MB Switch_ ret
4 ms    case 68, MB Switch_ jmp
77 ms   case 68, Masm32 switch
4 ms    case 68, AsmC .Switch

5 ms    case 4, MB Switch_ ret
6 ms    case 4, MB Switch_ jmp
6 ms    case 4, Masm32 switch
4 ms    case 4, AsmC .Switch

2999    bytes for MBSret
4211    bytes for MBSjmp
4799    bytes for Masm32
4200    bytes for asmc

MB Switch_ in jmp mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

MB Switch_ in ret mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

nidud

  • Member
  • *****
  • Posts: 2388
    • https://github.com/nidud/asmc
Re: Switch macro timings
« Reply #3 on: April 24, 2016, 08:38:17 PM »
deleted
« Last Edit: February 25, 2022, 11:35:20 AM by nidud »

nidud

  • Member
  • *****
  • Posts: 2388
    • https://github.com/nidud/asmc
Re: Switch macro timings
« Reply #4 on: April 24, 2016, 08:45:03 PM »
deleted
« Last Edit: February 25, 2022, 11:35:29 AM by nidud »

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Re: Switch macro timings
« Reply #5 on: April 24, 2016, 10:26:21 PM »
It’s a slick solution but impossible to implement given the stack is uneven at the case entry

That could indeed be difficult in C, but the Switch_ macro works just fine, and size-wise it's up to 27% or so less, due to the retn instead of the jmp. I'll put a warning in the guide, though, that inside the Case the stack is 4 bytes off. Normally, it should be an issue only for exotic code that picks args manually from the stack.

nidud

  • Member
  • *****
  • Posts: 2388
    • https://github.com/nidud/asmc
Re: Switch macro timings
« Reply #6 on: April 24, 2016, 11:01:25 PM »
deleted
« Last Edit: February 25, 2022, 11:35:38 AM by nidud »

Siekmanski

  • Member
  • *****
  • Posts: 2622
Re: Switch macro timings
« Reply #7 on: April 25, 2016, 03:58:00 AM »
Code: [Select]
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
Assembled with AsmC
6 ms    case 260, MB Switch_ ret
4 ms    case 260, MB Switch_ jmp
334 ms  case 260, Masm32 switch
4 ms    case 260, AsmC .Switch

6 ms    case 196, MB Switch_ ret
4 ms    case 196, MB Switch_ jmp
252 ms  case 196, Masm32 switch
5 ms    case 196, AsmC .Switch

6 ms    case 132, MB Switch_ ret
4 ms    case 132, MB Switch_ jmp
169 ms  case 132, Masm32 switch
4 ms    case 132, AsmC .Switch

6 ms    case 68, MB Switch_ ret
5 ms    case 68, MB Switch_ jmp
89 ms   case 68, Masm32 switch
4 ms    case 68, AsmC .Switch

6 ms    case 4, MB Switch_ ret
5 ms    case 4, MB Switch_ jmp
4 ms    case 4, Masm32 switch
4 ms    case 4, AsmC .Switch

2999    bytes for MBSret
4211    bytes for MBSjmp
4799    bytes for Masm32
4200    bytes for asmc

MB Switch_ in jmp mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7

MB Switch_ in ret mode:
ct=-2   Case -2
ct=-1   Default
ct=0    Default
ct=1    Case 1 .. 4
ct=2    ecx=2
ct=3    Case 1 .. 4
ct=4    edx=4
ct=5    Default
ct=6    var1=6
ct=7    Case 7
Creative coders use backward thinking techniques as a strategy.

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Re: Switch macro timings
« Reply #8 on: April 25, 2016, 06:04:56 AM »
Code: [Select]
Switch_ eax
  Case_ 1
jmp toend
...
Endsw_
toend:
ret

Not sure what you want to demonstrate here... normally, you should never jmp out of a Switch.

Code: [Select]
Switch_ eax
  Case_ 1
; no jmp toend
...  ; will arrive after Endsw_ anyway
Endsw_

If you really want to "jump out", use this:
Code: [Select]
Switch_ eax
  Case_ 1
.if !somecondition
MsgBox "whatever", "title", MB_OK
.endif
Endsw_

nidud

  • Member
  • *****
  • Posts: 2388
    • https://github.com/nidud/asmc
Re: Switch macro timings
« Reply #9 on: April 25, 2016, 06:58:50 AM »
deleted
« Last Edit: February 25, 2022, 11:35:50 AM by nidud »

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Re: Switch macro timings
« Reply #10 on: April 25, 2016, 08:03:17 AM »
Here’s a "jump in" from MasmBasic:
Code: [Select]
Event Key
  cmp VKey, -VK_F1
  je @Open
  ...
  Case 1, TB0+1 ; 2nd toolbar button or 2nd menu entry
  @Open:

No problem, stack is fine.

Quote
However what I actually meant was this:
Code: [Select]
mov ecx,4
Switch_ ecx, ret
  Case_ 1
mov eax,1
  Case_ 2
mov eax,2
  Case_ 3
mov eax,3
  Case_ 4
mov eax,4
Endsw_

The last case jumps to end.

GOOD CATCH! Corrected with version 25 April :t

Mikl__

  • Member
  • *****
  • Posts: 1304
Re: Switch macro timings
« Reply #11 on: April 25, 2016, 12:01:06 PM »
Hi, jj2007!
Code: [Select]
Intel(R) Pentium(R) CPU G860 @ 3.00GHz
Assembled with AsmC
7 ms case 260, MB Switch_ ret
5 ms case 260, MB Switch_ jmp
410 ms case 260, Masm32 switch
5 ms case 260, AsmC .Switch

7 ms case 196, MB Switch_ ret
5 ms case 196, MB Switch_ jmp
301 ms case 196, Masm32 switch
5 ms case 196, AsmC .Switch

7 ms case 132, MB Switch_ ret
6 ms case 132, MB Switch_ jmp
204 ms case 132, Masm32 switch
5 ms case 132, AsmC .Switch

7 ms case 68, MB Switch_ ret
5 ms case 68, MB Switch_ jmp
109 ms case 68, Masm32 switch
6 ms case 68, AsmC .Switch

8 ms case 4, MB Switch_ ret
6 ms case 4, MB Switch_ jmp
11 ms case 4, Masm32 switch
5 ms case 4, AsmC .Switch

2999 bytes for MBSret
4211 bytes for MBSjmp
4799 bytes for Masm32
4200 bytes for asmc

MB Switch_ in jmp mode:
ct=-2 Case -2
ct=-1 Default
ct=0 Default
ct=1 Case 1 .. 4
ct=2 ecx=2
ct=3 Case 1 .. 4
ct=4 edx=4
ct=5 Default
ct=6 var1=6
ct=7 Case 7

MB Switch_ in ret mode:
ct=-2 Case -2
ct=-1 Default
ct=0 Default
ct=1 Case 1 .. 4
ct=2 ecx=2
ct=3 Case 1 .. 4
ct=4 edx=4
ct=5 Default
ct=6 var1=6
ct=7 Case 7

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Re: Switch macro timings
« Reply #12 on: April 25, 2016, 12:31:58 PM »
Thanks to all of you :icon14:

The problem is to find an application for jump table switches. My editor's help menu, for example, has over 30 entries, so that would be a candidate; but the number of entries can vary, so this is not an option.

WndProc?
Code: [Select]
WM_NULL                              equ 0h
WM_CREATE                            equ 1h
WM_DESTROY                           equ 2h
WM_MOVE                              equ 3h
WM_SIZE                              equ 5h
WM_ACTIVATE                          equ 6h

Looks like a great candidate, but...
Code: [Select]
WM_CUT                               equ 300h
WM_COPY                              equ 301h
WM_PASTE                             equ 302h

No good for a jump table ::)

nidud

  • Member
  • *****
  • Posts: 2388
    • https://github.com/nidud/asmc
Re: Switch macro timings
« Reply #13 on: April 26, 2016, 12:45:54 AM »
deleted
« Last Edit: February 25, 2022, 11:36:02 AM by nidud »

jj2007

  • Member
  • *****
  • Posts: 13651
  • Assembly is fun ;-)
    • MasmBasic
Re: Switch macro timings
« Reply #14 on: April 29, 2016, 09:37:58 PM »
This is a test with 2,000 cases:
Code: [Select]
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Result MB:      0, 26 ms
Result Masm32:  0, 1795 ms
Result AsmC:    0, 29 ms

14084   bytes for swMB
27540   bytes for swMasm32
22111   bytes for swAsmC

But if you increase to 10,000 cases, the timings for Masm32 switch jump up dramatically:
Code: [Select]
Result MB:      0, 157 ms
Result Masm32:  0, 202547 ms
Result AsmC:    0, 185 ms

I wonder if AMD cpus behave differently...