The MASM Forum

General => The Campus => Topic started by: gelatine1 on November 12, 2014, 01:11:50 AM

Title: quickest way to retrieve HO and LO words of a dword ?
Post by: gelatine1 on November 12, 2014, 01:11:50 AM
I implemented this piece of code and I was wondering whether there is a quicker/better way to do this:

;variables used in code below
;x2 dd ?
;y2 dd ?

mov eax,lParam ;Get coords from lParam and store in x2 and y2
and eax,0FFFFh ;retrieve LO word
mov x2,eax
mov eax,lParam
and eax,0FFFF0000h ;retrieve HO word
ror eax,16 ;put HO word as LO word
mov y2,eax


Thanks in advance,
Jannes
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: hutch-- on November 12, 2014, 01:13:34 AM

; DWORD in EAX
mov CX, AX      ; CX = low word
rol EAX, 16     ; swap the two halves of EAX
mov DX, AX      ; DX = high word
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: qWord on November 12, 2014, 01:17:09 AM
low word = WORD ptr lParam
high word = WORD ptr lParam[2]
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: gelatine1 on November 12, 2014, 01:20:03 AM
Thanks, but do those work? I need the low word and the high word each stored in a dword, not just in a word as your code implies.
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: qWord on November 12, 2014, 01:26:27 AM
Use movzx (zero-extend, for unsigned values) or movsx (sign-extend, for signed values) according to the type.
e.g.:
movsx eax,WORD ptr lParam
movsx edx,WORD ptr lParam[2]
mov x,eax
mov y,edx
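
To illustrate why the choice matters, here is a minimal sketch rather than a definitive implementation. It assumes the MASM32 SDK is installed at \masm32 and that the file is built as a console program. The variable packed is a hypothetical stand-in for lParam, holding a high word of 0FFF6h (-10 as a signed word), which can happen with mouse coordinates on a multi-monitor desktop. The print and uhex$ macros are the ones used elsewhere in this thread.

```asm
; Sketch only: assumes the MASM32 SDK at \masm32, built as a console app.
include \masm32\include\masm32rt.inc

.data
packed  dd 0FFF60014h       ; hypothetical lParam: high word 0FFF6h (-10), low word 0014h (20)

.data?
ys      dd ?                ; high word, sign-extended
yz      dd ?                ; high word, zero-extended

.code
start:
    movsx eax, WORD PTR [packed+2]   ; 0FFF6h -> 0FFFFFFF6h (-10 as a signed dword)
    movzx edx, WORD PTR [packed+2]   ; 0FFF6h -> 0000FFF6h (65526)
    mov ys, eax                      ; save before print, which calls into the library
    mov yz, edx
    print uhex$(ys),"h  hiword via movsx",13,10
    print uhex$(yz),"h  hiword via movzx",13,10
    exit
end start
```

Only the movsx result is still -10 when treated as a signed dword, so movsx is the safer pick for coordinates that may be negative.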
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: gelatine1 on November 12, 2014, 02:29:10 AM
Thank you!
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Gunther on November 12, 2014, 04:42:49 AM
Hi gelatine1,

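Both solutions (Hutch's and qWord's) should work fine. It could be that Hutch's code incurs a few penalty cycles inside a 32-bit segment, since 16-bit register moves need an operand-size prefix there and write only part of the register. But that's all.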

Gunther
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: hutch-- on November 12, 2014, 06:35:29 AM
You got it in 16-bit form because that is what you asked for: the HI and LO WORDs of a DWORD.
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Tedd on November 12, 2014, 06:42:57 AM
What's wrong with the classic:

    mov eax,lParam
    mov edx,eax
    shr eax,16      ;high word
    and edx,0FFFFh  ;low word
    mov y2,eax
    mov x2,edx
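
The shr/and form above yields unsigned values. If the words are signed coordinates, a register-only variant might look like the sketch below. It is an illustration under stated assumptions, not code posted in this thread: it assumes the MASM32 SDK at \masm32 and a console build, and packed is a hypothetical stand-in for lParam.

```asm
; Sketch only: assumes the MASM32 SDK at \masm32, built as a console app.
include \masm32\include\masm32rt.inc

.data
packed  dd 0FFF60014h       ; hypothetical lParam: y = -10 (0FFF6h), x = 20 (0014h)

.data?
x2      dd ?
y2      dd ?

.code
start:
    mov eax, packed
    movsx edx, ax           ; low word, sign-extended  -> 00000014h
    sar eax, 16             ; arithmetic shift keeps the sign -> 0FFFFFFF6h
    mov x2, edx
    mov y2, eax
    print uhex$(x2),"h  loword",13,10
    print uhex$(y2),"h  hiword",13,10
    exit
end start
```

Replacing movsx with movzx and sar with shr gives the unsigned behaviour of the classic version.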

Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: jj2007 on November 12, 2014, 07:09:56 AM
Quote from: Tedd on November 12, 2014, 06:42:57 AM
What's wrong with the classic...

42% longer maybe?
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: dedndave on November 12, 2014, 07:19:21 AM
undoubtedly faster, too
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: hutch-- on November 12, 2014, 08:09:38 AM
These are my 2, I would normally use the first.


    LOCAL tx    :DWORD
    LOCAL ty    :DWORD

  ; ------------------------------------- 1

    movsx eax, WORD PTR [lParam]
    movsx ecx, WORD PTR [lParam+2]
    mov tx, eax
    mov ty, ecx

    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; ------------------------------------- 2

    mov edx, lParam
    movzx eax, dx
    rol edx, 16
    movzx ecx, dx
    mov tx, eax
    mov ty, ecx

    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; -------------------------------------
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Gunther on November 12, 2014, 08:29:42 AM
Quote from: hutch-- on November 12, 2014, 08:09:38 AM
These are my 2, I would normally use the first.

Yes, me too. It's a fast and clear solution.

Gunther
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: jj2007 on November 12, 2014, 02:31:30 PM
Speed-wise I can't see much of a difference. Anyway, in a WndProc it won't matter if it's 2 cycles or 1.8 ( ::) )

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

997     cycles for 500 * movzx
879     cycles for 500 * shr & and FFFFh
899     cycles for 500 * movsx

994     cycles for 500 * movzx
874     cycles for 500 * shr & and FFFFh
897     cycles for 500 * movsx

970     cycles for 500 * movzx
871     cycles for 500 * shr & and FFFFh
898     cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: dedndave on November 13, 2014, 12:16:04 AM
a bit disappointed with the movzx/movsx speed
I wonder how much of that may be attributed to the unaligned word
my sister's machine: (I'm in Michigan this week)

Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz (SSE4)

2035    cycles for 500 * movzx
1042    cycles for 500 * shr & and FFFFh
2042    cycles for 500 * movsx

2040    cycles for 500 * movzx
1042    cycles for 500 * shr & and FFFFh
2115    cycles for 500 * movsx

2034    cycles for 500 * movzx
1052    cycles for 500 * shr & and FFFFh
2041    cycles for 500 * movsx
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: hutch-- on November 13, 2014, 12:22:38 AM
Dave,

You can expect those types of variations on different hardware. My old Core2 quad behaves much more like the i7 I am using, but some of the earlier Core2 chips were internally different.

These are the results on my i7.


1442    cycles for 500 * movzx
1453    cycles for 500 * shr & and FFFFh
1450    cycles for 500 * movsx

1443    cycles for 500 * movzx
1453    cycles for 500 * shr & and FFFFh
1451    cycles for 500 * movsx

1443    cycles for 500 * movzx
1453    cycles for 500 * shr & and FFFFh
1451    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx


This is the result on an older Core2 quad.


Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz (SSE4)

2098    cycles for 500 * movzx
1030    cycles for 500 * shr & and FFFFh
2128    cycles for 500 * movsx

2097    cycles for 500 * movzx
1046    cycles for 500 * shr & and FFFFh
2025    cycles for 500 * movsx

2108    cycles for 500 * movzx
1028    cycles for 500 * shr & and FFFFh
2020    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx


This is the later quad.


Intel(R) Core(TM)2 Quad CPU    Q9650  @ 3.00GHz (SSE4)

2103    cycles for 500 * movzx
1030    cycles for 500 * shr & and FFFFh
2015    cycles for 500 * movsx

2100    cycles for 500 * movzx
1042    cycles for 500 * shr & and FFFFh
2009    cycles for 500 * movsx

2100    cycles for 500 * movzx
1028    cycles for 500 * shr & and FFFFh
2013    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx


While I am at it, this is my old PIV.


Genuine Intel(R) CPU 3.80GHz (SSE3)

1861    cycles for 500 * movzx
2103    cycles for 500 * shr & and FFFFh
1777    cycles for 500 * movsx

1823    cycles for 500 * movzx
2058    cycles for 500 * shr & and FFFFh
1864    cycles for 500 * movsx

1814    cycles for 500 * movzx
2140    cycles for 500 * shr & and FFFFh
2091    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx



Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: TouEnMasm on November 13, 2014, 12:32:45 AM
Without comments
Quote
Intel(R) Celeron(R) CPU 2.80GHz (SSE3)

1771    cycles for 500 * movzx
2087    cycles for 500 * shr & and FFFFh
1792    cycles for 500 * movsx

1804    cycles for 500 * movzx
2083    cycles for 500 * shr & and FFFFh
1796    cycles for 500 * movsx

1772    cycles for 500 * movzx
2085    cycles for 500 * shr & and FFFFh
1792    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: hutch-- on November 13, 2014, 12:59:30 AM
With some humour, on my i7 the following 4 algos clock almost identically.


  ; ------------------------------------- 1

    movzx eax, WORD PTR [lParam]
    movzx ecx, WORD PTR [lParam+2]

    mov tx, eax
    mov ty, ecx
    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; ------------------------------------- 2

    mov edx, lParam

    movzx eax, dx
    rol edx, 16
    movzx ecx, dx

    mov tx, eax
    mov ty, ecx
    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; ------------------------------------- 3

    mov edx, lParam

    mov ecx, edx
    and edx, 0000FFFFh
    rol ecx, 16
    and ecx, 0000FFFFh

    mov tx, edx
    mov ty, ecx
    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; ------------------------------------- 4

    mov edx, lParam

    and edx, 0000FFFFh
    movzx ecx, WORD PTR [lParam+2]

    mov tx, edx
    mov ty, ecx
    print uhex$(tx),"h  loword",13,10
    print uhex$(ty),"h  hiword",13,10

  ; -------------------------------------
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: FORTRANS on November 13, 2014, 01:36:37 AM
pre-P4 (SSE1)

2027   cycles for 500 * movzx
2532   cycles for 500 * shr & and FFFFh
2043   cycles for 500 * movsx

2028   cycles for 500 * movzx
2534   cycles for 500 * shr & and FFFFh
2027   cycles for 500 * movsx

2026   cycles for 500 * movzx
2535   cycles for 500 * shr & and FFFFh
2026   cycles for 500 * movsx

17   bytes for movzx
23   bytes for shr & and FFFFh
17   bytes for movsx


--- ok ---

pre-P4

4164   cycles for 500 * movzx
2577   cycles for 500 * shr & and FFFFh
4181   cycles for 500 * movsx

4138   cycles for 500 * movzx
2655   cycles for 500 * shr & and FFFFh
4188   cycles for 500 * movsx

4146   cycles for 500 * movzx
2605   cycles for 500 * shr & and FFFFh
4169   cycles for 500 * movsx

17   bytes for movzx
23   bytes for shr & and FFFFh
17   bytes for movsx


--- ok ---

Intel(R) Pentium(R) M processor 1.70GHz (SSE2)

1530   cycles for 500 * movzx
2035   cycles for 500 * shr & and FFFFh
1527   cycles for 500 * movsx

1533   cycles for 500 * movzx
2036   cycles for 500 * shr & and FFFFh
1529   cycles for 500 * movsx

1529   cycles for 500 * movzx
2039   cycles for 500 * shr & and FFFFh
1535   cycles for 500 * movsx

17   bytes for movzx
23   bytes for shr & and FFFFh
17   bytes for movsx


--- ok ---

Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz (SSE4)

1004   cycles for 500 * movzx
1378   cycles for 500 * shr & and FFFFh
1013   cycles for 500 * movsx

1003   cycles for 500 * movzx
1379   cycles for 500 * shr & and FFFFh
1004   cycles for 500 * movsx

1020   cycles for 500 * movzx
1236   cycles for 500 * shr & and FFFFh
1363   cycles for 500 * movsx

17   bytes for movzx
23   bytes for shr & and FFFFh
17   bytes for movsx


--- ok ---
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Gunther on November 13, 2014, 04:00:36 AM
Jochen,

the results:
Quote
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)

898     cycles for 500 * movzx
919     cycles for 500 * shr & and FFFFh
918     cycles for 500 * movsx

901     cycles for 500 * movzx
917     cycles for 500 * shr & and FFFFh
918     cycles for 500 * movsx

949     cycles for 500 * movzx
931     cycles for 500 * shr & and FFFFh
924     cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx

--- ok ---

Gunther
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: MichaelW on November 13, 2014, 04:26:11 AM
Core2-i3 G3220 under Windows7-64:

Intel(R) Pentium(R) CPU G3220 @ 3.00GHz (SSE4)

1001    cycles for 500 * movzx
1017    cycles for 500 * shr & and FFFFh
1005    cycles for 500 * movsx

1002    cycles for 500 * movzx
1018    cycles for 500 * shr & and FFFFh
1001    cycles for 500 * movsx

1006    cycles for 500 * movzx
1017    cycles for 500 * shr & and FFFFh
1002    cycles for 500 * movsx
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Siekmanski on November 13, 2014, 09:18:33 AM
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4) Windows 8.1 64 bit

1029    cycles for 500 * movzx
1057    cycles for 500 * shr & and FFFFh
1059    cycles for 500 * movsx

1039    cycles for 500 * movzx
1058    cycles for 500 * shr & and FFFFh
1059    cycles for 500 * movsx

1038    cycles for 500 * movzx
1056    cycles for 500 * shr & and FFFFh
1059    cycles for 500 * movsx

17      bytes for movzx
23      bytes for shr & and FFFFh
17      bytes for movsx
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: jj2007 on November 13, 2014, 12:11:14 PM
.if OldIntelQuadOrSimilar
   shr ...
.else
   movsx ...
.endif

One cycle more for the extra jump, of course - shall I post timings?
;)
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: nidud on November 13, 2014, 09:50:19 PM
deleted
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: dedndave on November 13, 2014, 10:52:47 PM
I like that nidud   :t
Title: Re: quickest way to retrieve HO and LO words of a dword ?
Post by: Gunther on November 14, 2014, 04:09:53 AM
Jochen,

Quote from: jj2007 on November 13, 2014, 12:11:14 PM
.if OldIntelQuadOrSimilar
   shr ...
.else
   movsx ...
.endif

One cycle more for the extra jump, of course - shall I post timings?
;)

yes.

Gunther