Multithreaded apps in 64 bit assembler

Started by AKRichard, August 08, 2012, 09:06:45 AM


dedndave

i don't know if CMPXCHG implies lock or not
i am sure it is mentioned in the intel docs if it does
i prefer the simple XCHG
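
Editor's note: per the Intel instruction reference, XCHG with a memory operand is always locked, while CMPXCHG is only atomic across processors when given an explicit LOCK prefix. The following is a minimal single-threaded sketch of both as a try-acquire on a lock word. The name lockvar and the 0 = free / 1 = owned convention are assumptions made for illustration, not something taken from the thread.

```asm
    include \masm32\include\masm32rt.inc

    .data
        lockvar dd 0                    ; hypothetical lock word: 0 = free, 1 = owned

    .code
start:
    ; Try-acquire with XCHG: a memory operand makes XCHG locked implicitly.
    mov eax, 1
    xchg eax, lockvar                   ; store 1, EAX = previous value
    .if eax == 0                        ; previous value 0: the lock was free
        print "XCHG: acquired",13,10
    .endif

    mov lockvar, 0                      ; release with a plain store

    ; Try-acquire with CMPXCHG: the LOCK prefix must be written explicitly.
    xor eax, eax                        ; EAX = expected value (0 = free)
    mov ecx, 1                          ; ECX = new value (1 = owned)
    lock cmpxchg lockvar, ecx           ; ZF = 1 if lockvar was 0 and is now 1
    .if ZERO?
        print "LOCK CMPXCHG: acquired",13,10
    .endif

    mov lockvar, 0                      ; release
    inkey
    exit
end start
```

With XCHG the old value comes back in the register, so the test is on EAX. With CMPXCHG the result is reported in the zero flag, and on failure EAX holds the value that was found in memory.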

MichaelW

Unless you change the minimum timer resolution, a Sleep 1 will suspend the current thread for the same interval as a Sleep 10 would, or on recent systems a Sleep 15. So instead of a Sleep 1, why not use Sleep 0?
Well Microsoft, here's another nice mess you've gotten us into.

dedndave

i think we've been over that before, Michael - lol (a few times)
i don't believe the system timer resolution affects Sleep

jj2007

Sleep 0 is definitely a lot faster:

Sleep 1: 1562662 µs
Sleep 0: 2696 µs

Sleep 1: 1562470 µs
Sleep 0: 1225 µs

Sleep 1: 1562516 µs
Sleep 0: 769 µs

include \masm32\MasmBasic\MasmBasic.inc   ; download
  Init
  REPEAT 3
   NanoTimer()
   mov ebx, 100
   .Repeat
      invoke Sleep, 1
      dec ebx
   .Until Zero?
   Print Str$("Sleep 1: %i µs\n", NanoTimer(µs))
   NanoTimer()
   mov ebx, 100
   .Repeat
      invoke Sleep, 0
      dec ebx
   .Until Zero?
   Print Str$("Sleep 0: %i µs\n\n", NanoTimer(µs))
  ENDM
  Inkey
  Exit
end start

dedndave

well - faster, how ?
if a thread uses Sleep to allow other threads to run...
then using a smaller elapse time means the thread is consuming more time checking the semaphore   :P

it is a trade-off - you must decide what is appropriate for each individual case
fast response vs use of fewer cycles
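
Editor's note: one way to sketch this trade-off is a wait loop that retries with Sleep 0 a limited number of times (fast response) and then falls back to Sleep 1 (fewer cycles burned). This is only an illustration under assumptions: lockvar, WaitForLock and the SPIN_TRIES value are hypothetical names and numbers, not from any poster's code.

```asm
    include \masm32\include\masm32rt.inc

    SPIN_TRIES equ 100                  ; hypothetical tuning value, adjust per case

    .data
        lockvar dd 0                    ; hypothetical lock word: 0 = free, 1 = owned

    .code
; Returns once the calling thread owns lockvar.
WaitForLock proc uses ebx
    mov ebx, SPIN_TRIES                 ; remaining Sleep 0 retries
  retry:
    mov eax, 1
    xchg eax, lockvar                   ; atomically store 1, fetch old value
    test eax, eax
    jz done                             ; old value was 0: lock acquired
    .if ebx != 0
        dec ebx
        invoke Sleep, 0                 ; give up the rest of the timeslice, stay ready
    .else
        invoke Sleep, 1                 ; owner is taking a while: really sleep
    .endif
    jmp retry
  done:
    ret
WaitForLock endp

start:
    call WaitForLock
    ; ... work protected by the lock would go here ...
    mov lockvar, 0                      ; release
    exit
end start
```

A larger SPIN_TRIES favours response time; a smaller one favours leaving the CPU to the thread that owns the lock.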

sinsi

Quote from: dedndave on August 19, 2012, 08:33:43 AM
i think we've been over that before, Michael - lol (a few times)
i don't believe the system timer resolution affects Sleep
timeBeginPeriod affects the system timer for all processes.

Using "invoke timeBeginPeriod,1" in jj's code gives me:
Sleep 1: 100010 µs
Sleep 0: 15 µs

Sleep 1: 100011 µs
Sleep 0: 15 µs

Sleep 1: 100005 µs
Sleep 0: 15 µs

http://masm32.com/board/index.php?topic=322.0

MichaelW

Quote from: dedndave on August 19, 2012, 08:33:43 AM
i think we've been over that before, Michael - lol (a few times)
i don't believe the system timer resolution affects Sleep
Yep, it's like déjà vu all over again.

Instead of depending on my memory of the performance counter not being affected by the minimum timer resolution, I coded a (not very accurate) timer that uses TSC as a time reference.

;==============================================================================
    include \masm32\include\masm32rt.inc
    include \masm32\include\winmm.inc
    includelib \masm32\lib\winmm.lib
    .686
;==============================================================================
    .data
        clkhz dq    0
        r8    REAL8 ?
    .code
;==============================================================================
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
TscTimer proc
    .IF DWORD PTR clkhz == 0
        invoke Sleep, 3000
        rdtsc
        push edx
        push eax
        invoke Sleep, 1000
        rdtsc
        pop ecx
        sub eax, ecx
        pop ecx
        sbb edx, ecx
        mov DWORD PTR clkhz, eax
        mov DWORD PTR clkhz+4, edx
        printf("%I64dHz\n\n", clkhz)
    .ENDIF
    rdtsc
    push edx
    push eax
    fild QWORD PTR [esp]
    add esp, 8
    fild clkhz
    fdiv
    ret
TscTimer endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
;==============================================================================
start:
;==============================================================================

    invoke TscTimer
    fstp r8
    invoke TscTimer
    fstp r8
    invoke Sleep, 1000
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n\n", r8)

    invoke TscTimer
    fstp r8
    REPEAT 1000
        invoke Sleep, 1
    ENDM
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n", r8)

    invoke timeBeginPeriod, 1

    invoke TscTimer
    fstp r8
    REPEAT 1000
        invoke Sleep, 1
    ENDM
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n\n", r8)

    invoke timeEndPeriod, 1

    inkey
    exit
;==============================================================================
END start


Running on my 500MHz P3 Windows 2000 system:

504248079Hz

0.999615s

9.999756s
1.011535s


dedndave

Prescott w/htt, XP MCE2005 SP3
3000103515Hz

0.999860s

1.955947s
1.974502s


although, i am not really sure how to interpret the results, Michael - lol

i have attached an assembled copy for others to try...

MichaelW

P4 Northwood, Windows XP SP3:

2992338536Hz

0.986793s

15.625224s
1.965253s


The point was to demonstrate that the Sleep period does depend on the minimum timer resolution. Perhaps on XP MCE it doesn't, or the timer resolution there is stuck at 1 ms.

MichaelW

I changed the source so it reports the minimum and maximum timer resolutions:

;==============================================================================
    include \masm32\include\masm32rt.inc
    include \masm32\include\winmm.inc
    includelib \masm32\lib\winmm.lib
    .686
;==============================================================================
    .data
        clkhz dq    0
        r8    REAL8 ?
        tc    TIMECAPS <>
    .code
;==============================================================================
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
TscTimer proc
    .IF DWORD PTR clkhz == 0
        invoke Sleep, 3000
        rdtsc
        push edx
        push eax
        invoke Sleep, 1000
        rdtsc
        pop ecx
        sub eax, ecx
        pop ecx
        sbb edx, ecx
        mov DWORD PTR clkhz, eax
        mov DWORD PTR clkhz+4, edx
        printf("%I64dHz\n\n", clkhz)
    .ENDIF
    rdtsc
    push edx
    push eax
    fild QWORD PTR [esp]
    add esp, 8
    fild clkhz
    fdiv
    ret
TscTimer endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
;==============================================================================
start:
;==============================================================================
    invoke timeGetDevCaps, ADDR tc, SIZEOF tc
    printf("min %d\tmax %d\n\n", tc.wPeriodMin, tc.wPeriodMax)

    invoke TscTimer
    fstp r8
    invoke TscTimer
    fstp r8
    invoke Sleep, 1000
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n\n", r8)

    invoke TscTimer
    fstp r8
    REPEAT 1000
        invoke Sleep, 1
    ENDM
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n", r8)

    invoke timeBeginPeriod, 1

    invoke TscTimer
    fstp r8
    REPEAT 1000
        invoke Sleep, 1
    ENDM
    invoke TscTimer
    fld r8
    fsub
    fstp r8
    printf("%fs\n\n", r8)

    invoke timeEndPeriod, 1

    inkey
    exit
;==============================================================================
END start


And on both systems I get:

min 1    max 1000000

jj2007

The question is really "what does a sleep 0 effectively do?". My code calls it 100 times inside a loop, and the response times show clearly that it does not wait a lot. But does it wait at all?

Measuring with rdtsc, on my Celeron it seems that a Sleep 0 is roughly 1600 cycles:
1593372 cycles for 1000*Sleep 0: 1061 µs
So it does hang around somewhere and wait for something, e.g. a new timeslice...
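
Editor's note: the measurement above was made with MasmBasic. A rough plain MASM32 equivalent, assuming the same printf macro and %I64d format used elsewhere in this thread, might look like the sketch below. The variable name cycles is hypothetical, and the count will vary with the CPU and with whatever else is ready to run.

```asm
    include \masm32\include\masm32rt.inc
    .686                                ; rdtsc needs .586 or later

    .data
        cycles dq 0                     ; hypothetical: elapsed TSC count

    .code
start:
    rdtsc                               ; EDX:EAX = starting TSC
    push edx
    push eax

    mov ebx, 1000                       ; EBX survives API calls
  @@:
    invoke Sleep, 0
    dec ebx
    jnz @B

    rdtsc                               ; EDX:EAX = ending TSC
    pop ecx                             ; low dword of start
    sub eax, ecx
    pop ecx                             ; high dword of start
    sbb edx, ecx
    mov DWORD PTR cycles, eax
    mov DWORD PTR cycles+4, edx

    printf("%I64d cycles for 1000*Sleep 0\n", cycles)
    inkey
    exit
end start
```

Dividing the printed count by 1000 gives the average cost of one Sleep 0 call in cycles on the machine at hand.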

hutch--

JJ,

The spec on SleepEx() says that if you set the delay to zero, it will return immediately if there is no other thread to run. Set it to SleepEx,1,0 and you will force a yield even if there is no other thread waiting on that core.

MichaelW

Quote from: dedndave on August 19, 2012, 09:45:14 AM
if a thread uses Sleep to allow other threads to run...
then using a smaller elapse time means the thread is consuming more time checking the semaphore   :P

Regarding the use of Sleep 0, if there is no other thread ready to run then what is the problem with consuming time checking the semaphore? And if there is some other thread ready to run, then it will receive whatever is left of the current timeslice, which I think would typically be most of the timeslice.

dedndave

the assumption there is that if some other thread owns the semaphore, it must be doing something   :P
you want to allow it to finish its work so it will release the semaphore

jj2007

MSDN: If you specify 0 milliseconds, the thread will relinquish the remainder of its time slice but remain ready. Note that a ready thread is not guaranteed to run immediately

My interpretation is that
- there is always another thread ready to run;
- the next action of the current thread happens at precisely the moment when it's been given a crispy fresh timeslice, which might not be "immediately".