Author Topic: SwitchToThread vs Sleep(0)  (Read 42563 times)

qWord

  • Member
  • *****
  • Posts: 1476
  • The base type of a type is the type itself
    • SmplMath macros
Re: SwitchToThread vs Sleep(0)
« Reply #45 on: May 15, 2013, 12:54:26 PM »
Hmm... Interesting. Maybe some systems return not the CPU freq but some scaled value derived from it, thank you, qWord! :t Changed the code now to calculate the CPU freq at runtime, probably it will work better than relying on an API ::)
Windows' performance counters are implemented with the APIC (Advanced Programmable Interrupt Controller) - the frequency is independent of the CPU's frequency.
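On calculating the CPU frequency at runtime, as mentioned in the quote: one common approach is to read the TSC and the performance counter at both ends of an interval and scale the TSC delta by the counter frequency. The following is only a sketch of that arithmetic, written in C rather than assembly; it is not the code referred to in the quoted post. The name `estimate_tsc_hz` and its parameters are hypothetical. On Windows the counter readings would come from QueryPerformanceCounter and the counter rate from QueryPerformanceFrequency.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimate TSC ticks per second from paired readings
   taken at the start and end of one interval. The readings are plain
   parameters here so the arithmetic stands alone. */
uint64_t estimate_tsc_hz(uint64_t tsc_start, uint64_t tsc_end,
                         uint64_t qpc_start, uint64_t qpc_end,
                         uint64_t counter_hz)
{
    assert(counter_hz != 0);            /* a zero counter rate is meaningless */

    uint64_t tsc_delta = tsc_end - tsc_start;
    uint64_t qpc_delta = qpc_end - qpc_start;
    if (qpc_delta == 0)
        return 0;                       /* interval too short to measure */

    /* Work in floating point so tsc_delta * counter_hz cannot overflow
       64 bits on long intervals. */
    double seconds = (double)qpc_delta / (double)counter_hz;
    return (uint64_t)((double)tsc_delta / seconds + 0.5);
}
```

With made-up readings of 300,000,000 TSC ticks over 125,000 counter ticks at a 1 MHz counter rate (0.125 s), this gives 2.4 GHz. The result is only as stable as the interval is long, so a longer interval should reduce the influence of the latency of the two reads.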
MREAL macros - when you need floating point arithmetic while assembling!

Antariy

  • Member
  • ****
  • Posts: 551
Re: SwitchToThread vs Sleep(0)
« Reply #46 on: May 15, 2013, 01:09:18 PM »
It would be interesting to check how it behaves - which values we get - before and after changing the timer resolution.

Edited.

Hmm... Interesting. Maybe some systems return not the CPU freq but some scaled value derived from it, thank you, qWord! :t Changed the code now to calculate the CPU freq at runtime, probably it will work better than relying on an API ::)
Windows' performance counters are implemented with the APIC (Advanced Programmable Interrupt Controller) - the frequency is independent of the CPU's frequency.

Not always independent. Well, yes, my info is rusty :lol: Probably it is because of the standard kernel (by kernel I mean the "HAL" and the "kernel" together) on a single-core machine ::)


At least we can probably assume that the "default" 15.625 ms on NT systems is more or less the maximum, so it is better to fit the timing measurement to something like 10 ms.

jj2007

  • Member
  • *****
  • Posts: 10657
  • Assembler is fun ;-)
    • MasmBasic
Re: SwitchToThread vs Sleep(0)
« Reply #47 on: May 15, 2013, 03:45:39 PM »
Hi Alex,
mw blocked my one-core PC completely - I had to press power off for five seconds... :(

MichaelW

  • Global Moderator
  • Member
  • *****
  • Posts: 1209
Re: SwitchToThread vs Sleep(0)
« Reply #48 on: May 15, 2013, 07:41:31 PM »
Hi Alex,
mw blocked my one-core PC completely - I had to press power off for five seconds... :(

I thought I included a warning. For my P3 I use HIGH_PRIORITY_CLASS and THREAD_PRIORITY_NORMAL or THREAD_PRIORITY_ABOVE_NORMAL, but for my P4 with HT I can max out the priority, no problems.
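To put numbers on those combinations: the priority table in the dTime source posted later in this thread (reply #58) maps each class/level pair to a base priority. The C sketch below just encodes that table as a lookup. The enum names are hypothetical and made up for this illustration; they are not the Win32 constants, which have different numeric values and live in windows.h.

```c
#include <assert.h>

/* Hypothetical enums used as table indices only. */
enum pri_class { PC_IDLE, PC_BELOW_NORMAL, PC_NORMAL, PC_ABOVE_NORMAL,
                 PC_HIGH, PC_REALTIME, PC_COUNT };
enum pri_level { PL_IDLE, PL_LOWEST, PL_BELOW_NORMAL, PL_NORMAL,
                 PL_ABOVE_NORMAL, PL_HIGHEST, PL_TIME_CRITICAL, PL_COUNT };

/* Rows are priority classes, columns are thread priority levels,
   values are the base priorities from the table in reply #58. */
static const int base_priority_table[PC_COUNT][PL_COUNT] = {
    {  1,  2,  3,  4,  5,  6, 15 },   /* idle class         */
    {  1,  4,  5,  6,  7,  8, 15 },   /* below normal class */
    {  1,  6,  7,  8,  9, 10, 15 },   /* normal class       */
    {  1,  8,  9, 10, 11, 12, 15 },   /* above normal class */
    {  1, 11, 12, 13, 14, 15, 15 },   /* high class         */
    { 16, 22, 23, 24, 25, 26, 31 },   /* realtime class     */
};

int base_priority(enum pri_class c, enum pri_level l)
{
    assert((unsigned)c < (unsigned)PC_COUNT);
    assert((unsigned)l < (unsigned)PL_COUNT);
    return base_priority_table[c][l];
}
```

By that table, HIGH_PRIORITY_CLASS with THREAD_PRIORITY_NORMAL or THREAD_PRIORITY_ABOVE_NORMAL gives base priority 13 or 14, while REALTIME_PRIORITY_CLASS with THREAD_PRIORITY_TIME_CRITICAL gives 31. Assuming "max out" means that last combination, a busy thread at that level on a single core can plausibly starve even keyboard and mouse handling, which would match the lock-up described above.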
Well Microsoft, here’s another nice mess you’ve gotten us into.

Antariy

  • Member
  • ****
  • Posts: 551
Re: SwitchToThread vs Sleep(0)
« Reply #49 on: May 15, 2013, 10:47:09 PM »
Hi Alex,
mw blocked my one-core PC completely - I had to press power off for five seconds... :(

Sorry, Jochen :(

Antariy

  • Member
  • ****
  • Posts: 551
Re: SwitchToThread vs Sleep(0)
« Reply #50 on: May 15, 2013, 10:57:56 PM »
Jochen, in such a circumstance did you try pressing and holding [Ctrl]+[C]? This should terminate the program within some seconds (10~20). It helps if the console program hangs, but if some OS service or the like has crashed or hung, then nothing will help. (This may sound strange or funny, but I once got a freeze while holding the left mouse button on a page in Acrobat Reader (version 7), slowly scrolling the page by dragging it with the "hand". The CPU usage was very high at that moment - something strange happened to the OS.)

jj2007

  • Member
  • *****
  • Posts: 10657
  • Assembler is fun ;-)
    • MasmBasic
Re: SwitchToThread vs Sleep(0)
« Reply #51 on: May 15, 2013, 11:02:02 PM »
Jochen, did you try in such a circumstance to press and hold [Ctrl]+[C]...

Don't worry, Alex, I had no files open, and no data were lost.
Ctrl C was what I tried first, but it was really blocked completely - no mouse, no keyboard reaction.

dedndave

  • Member
  • *****
  • Posts: 8827
  • Still using Abacus 2.0
    • DednDave
Re: SwitchToThread vs Sleep(0)
« Reply #52 on: May 17, 2013, 12:10:36 AM »
well, it probably needs some fine-tuning, but this demonstrates my serialization concept

it's based on a "natural" serialization of the code stream
i.e., rather than using one of the serializing instructions (per intel),
force the code stream to serialize based on register content
Code: [Select]
    _serializ MACRO
        pushad
        pop     edi
        pop     esi
        pop     ebp
        pop     eax
        pop     ebx
        pop     edx
        pop     eax
        pop     ecx
        xchg    eax,ecx
    ENDM
the CPU can't perform out-of-order execution if it has to wait for the sequence to finish   :biggrin:
all registers are involved, so it has to wait
as a matter of coincidence, the sequence is completely benign - no registers or flags are modified
hopefully, the time it takes to execute our sequence is more stable/repeatable than CPUID
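
The "benign" claim can be checked on paper by following each value through the stack. The C sketch below does that bookkeeping: it models PUSHAD, the eight POPs and the XCHG on an array of register values and reports whether everything ends up where it started. It says nothing about whether the sequence actually serializes execution; it only checks register contents. The function name `serializ_is_benign` and the register indexing are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Register indices, in an order chosen for this sketch only. */
enum { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NREGS };

/* Returns 1 if every register holds its starting value after the
   PUSHAD / POP x8 / XCHG sequence of the _serializ macro. */
int serializ_is_benign(const uint32_t start[NREGS])
{
    /* PUSHAD pushes EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI */
    static const int pushad_order[8] = { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };
    /* Destinations of the eight POPs, in the order the macro issues them */
    static const int pop_order[8] = { EDI, ESI, EBP, EAX, EBX, EDX, EAX, ECX };

    uint32_t r[NREGS];
    uint32_t stack[8];
    int top = 0;
    memcpy(r, start, sizeof r);

    for (int i = 0; i < 8; i++)
        stack[top++] = r[pushad_order[i]];
    assert(top == 8);                   /* PUSHAD stores eight dwords */

    for (int i = 0; i < 8; i++)
        r[pop_order[i]] = stack[--top];

    /* XCHG EAX,ECX undoes the swap left behind by the last two POPs */
    uint32_t t = r[EAX];
    r[EAX] = r[ECX];
    r[ECX] = t;

    /* top == 0 stands in for ESP being back at its original value */
    return top == 0 && memcmp(r, start, sizeof r) == 0;
}
```

In this model the fourth POP loads the saved ESP image into EAX, which is then overwritten by the seventh POP, so no register keeps a foreign value. PUSHAD, POP and XCHG also leave the flags untouched.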

as i said, we need to do some fine-tuning
but, here is a sample run
i do still get outliers, in spite of the single-quantum execution
Code: [Select]
11 8 5 7 8 7 8 10 8 8
24 25 22 22 23 21 24 21 24 26
56 55 57 56 55 55 56 57 53 56

but - i think we have a nice starting place

EDIT: attachment updated - see reply #57
« Last Edit: May 17, 2013, 06:35:38 AM by dedndave »

FORTRANS

  • Member
  • *****
  • Posts: 1078
Re: SwitchToThread vs Sleep(0)
« Reply #53 on: May 17, 2013, 02:44:17 AM »
Hi,

   Just looked at the Intel manuals I have.  Not in the serializing
section, but in the atomic operations section, it says that those
will serialize things as well.  Have you considered using an XCHG
Reg,Mem or the like to serialize things?  Just curious.  It seems
that it should work better than a CPUID.

Regards,

Steve N.

jj2007

  • Member
  • *****
  • Posts: 10657
  • Assembler is fun ;-)
    • MasmBasic
Re: SwitchToThread vs Sleep(0)
« Reply #54 on: May 17, 2013, 04:07:14 AM »
Have you considered using an XCHG Reg,Mem or the like to serialize things?

Steve,
Thanks, indeed that was one of the first things I tested, but no real difference. Besides, cpuid seems to be the "official" way to serialise.

FORTRANS

  • Member
  • *****
  • Posts: 1078
Re: SwitchToThread vs Sleep(0)
« Reply #55 on: May 17, 2013, 04:13:27 AM »
Hi,

   Okay, I had not noticed that you had looked at that.  And
yes, CPUID seems to be the code of choice in the samples I
have seen.

Thanks,

Steve N.

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: SwitchToThread vs Sleep(0)
« Reply #56 on: May 17, 2013, 04:29:57 AM »
Hi Steve,

And yes, CPUID seems to be the code of choice in the samples I have seen.

yes, CPUID seems to be the best choice. Agner Fog recommends it, too.

Gunther
Get your facts first, and then you can distort them.

dedndave

  • Member
  • *****
  • Posts: 8827
  • Still using Abacus 2.0
    • DednDave
Re: SwitchToThread vs Sleep(0)
« Reply #57 on: May 17, 2013, 05:43:59 AM »
it would seem that CALL/RET does a fair job
Code: [Select]
C:\Masm32\Asm32 => dTime2
4 5 4 6 4 5 4 4 4 5
277 276 280 277 278 275 277 279 277 278
83 82 83 82 83 82 78 83 84 81
Press any key to continue ...

C:\Masm32\Asm32 => dTime2
3 3 4 3 3 3 6 4 4 3
275 279 280 277 277 276 279 275 281 275
82 83 83 83 83 84 82 84 83 81
Press any key to continue ...

dedndave

  • Member
  • *****
  • Posts: 8827
  • Still using Abacus 2.0
    • DednDave
Re: SwitchToThread vs Sleep(0)
« Reply #58 on: May 17, 2013, 10:16:25 AM »
Code: [Select]
        INVOKE  dTime,CodeToMeasure1,HIGH_PRIORITY_CLASS,THREAD_PRIORITY_ABOVE_NORMAL
Code: [Select]
;***********************************************************************************************

dTime   PROC USES EBX ESI EDI lpfnProc:LPVOID,dwPriClass:DWORD,dwPriLevel:DWORD

;Code Timing Function, David R. Sheldon - DednDave, Ver 1.1, May 2013

;--------------------------------------------------

;Call With: lpfnProc   = address of function to be timed
;           dwPriClass = process priority class
;           dwPriLevel = thread priority level
;
;  Returns: EAX        = clock cycles (not including function CALL/RET overhead)
;
;Also Uses: ECX, EDX, all other registers are preserved
;
;    Notes: 1) The function referenced by lpfnProc may not have any arguments.
;              It may, however, contain local variables. The time consumed creating and
;              destroying the stack frame will be included in the timing measurement.
;           2) The function referenced by lpfnProc may destroy any register contents,
;              but must balance the stack (ESP) before RETurn, of course.

;--------------------------------------------------

;Process priority class       Thread priority level     Base priority
;
;IDLE_PRIORITY_CLASS          THREAD_PRIORITY_IDLE            1
;                             THREAD_PRIORITY_LOWEST          2
;                             THREAD_PRIORITY_BELOW_NORMAL    3
;                             THREAD_PRIORITY_NORMAL          4
;                             THREAD_PRIORITY_ABOVE_NORMAL    5
;                             THREAD_PRIORITY_HIGHEST         6
;                             THREAD_PRIORITY_TIME_CRITICAL  15
;
;BELOW_NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_IDLE            1
;                             THREAD_PRIORITY_LOWEST          4
;                             THREAD_PRIORITY_BELOW_NORMAL    5
;                             THREAD_PRIORITY_NORMAL          6
;                             THREAD_PRIORITY_ABOVE_NORMAL    7
;                             THREAD_PRIORITY_HIGHEST         8
;                             THREAD_PRIORITY_TIME_CRITICAL  15
;
;NORMAL_PRIORITY_CLASS        THREAD_PRIORITY_IDLE            1
;                             THREAD_PRIORITY_LOWEST          6
;                             THREAD_PRIORITY_BELOW_NORMAL    7
;                             THREAD_PRIORITY_NORMAL          8
;                             THREAD_PRIORITY_ABOVE_NORMAL    9
;                             THREAD_PRIORITY_HIGHEST        10
;                             THREAD_PRIORITY_TIME_CRITICAL  15
;
;ABOVE_NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_IDLE            1
;                             THREAD_PRIORITY_LOWEST          8
;                             THREAD_PRIORITY_BELOW_NORMAL    9
;                             THREAD_PRIORITY_NORMAL         10
;                             THREAD_PRIORITY_ABOVE_NORMAL   11
;                             THREAD_PRIORITY_HIGHEST        12
;                             THREAD_PRIORITY_TIME_CRITICAL  15
;
;HIGH_PRIORITY_CLASS          THREAD_PRIORITY_IDLE            1
;                             THREAD_PRIORITY_LOWEST         11
;                             THREAD_PRIORITY_BELOW_NORMAL   12
;                             THREAD_PRIORITY_NORMAL         13
;                             THREAD_PRIORITY_ABOVE_NORMAL   14
;                             THREAD_PRIORITY_HIGHEST        15
;                             THREAD_PRIORITY_TIME_CRITICAL  15
;
;REALTIME_PRIORITY_CLASS      THREAD_PRIORITY_IDLE           16
;                             THREAD_PRIORITY_LOWEST         22
;                             THREAD_PRIORITY_BELOW_NORMAL   23
;                             THREAD_PRIORITY_NORMAL         24
;                             THREAD_PRIORITY_ABOVE_NORMAL   25
;                             THREAD_PRIORITY_HIGHEST        26
;                             THREAD_PRIORITY_TIME_CRITICAL  31

;**************************************************

;local variables

;--------------------------------------------------

        LOCAL   _hProcess     :HANDLE
        LOCAL   _hThread      :HANDLE
        LOCAL   _dwPriClass   :DWORD
        LOCAL   _dwPriLevel   :DWORD
        LOCAL   _dwAffinity   :DWORD
        LOCAL   _dwTerminalHi :DWORD
        LOCAL   _dwTerminalLo :DWORD
        LOCAL   _dwTallyHi    :DWORD
        LOCAL   _dwTallyLo    :DWORD
        LOCAL   _dwPassCount  :DWORD

;**************************************************

;initialization

;--------------------------------------------------

        INVOKE  GetCurrentProcess
        mov     _hProcess,eax
        INVOKE  GetPriorityClass,eax
        mov     _dwPriClass,eax
        INVOKE  GetCurrentThread
        mov     _hThread,eax
        INVOKE  GetThreadPriority,eax
        mov     _dwPriLevel,eax
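;save the process affinity mask, then pin the process to the first CPU
        INVOKE  GetProcessAffinityMask,_hProcess,addr _dwAffinity,addr _dwPassCount
        INVOKE  SetProcessAffinityMask,_hProcess,1
        rdtsc
        mov     esi,eax
        mov     edi,edx               ;EDI:ESI = tsc before the sleep
        INVOKE  Sleep,125
        rdtsc
        xor     ecx,ecx
        mov     _dwTerminalLo,eax
        mov     _dwTerminalHi,edx     ;terminal = current tsc
        mov     _dwPassCount,ecx      ;pass count = 0
        sub     eax,esi
        sbb     edx,edi               ;EDX:EAX = cycles in ~125 ms
        mov     _dwTallyLo,ecx
        shld    edx,eax,2
        shl     eax,2                 ;EDX:EAX = cycles in ~500 ms
        mov     _dwTallyHi,ecx        ;tally = 0
        add     _dwTerminalLo,eax
        adc     _dwTerminalHi,edx     ;terminal = tsc at which the loop stops
        INVOKE  Sleep,ecx             ;Sleep(0)
;three warm-up passes on the empty proc, results discarded
        mov     edi,offset DummyProc
        call    SinglePass
        mov     edi,offset DummyProc
        call    SinglePass
        mov     edi,offset DummyProc
        call    SinglePass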
        jmp short TopOfLoop

;**************************************************

;measurement single pass

;--------------------------------------------------

;EDI = proc address

        ALIGN   16

SinglePass:
        INVOKE  SetPriorityClass,_hProcess,dwPriClass
        INVOKE  Sleep,0               ;bind new priority
        INVOKE  SetThreadPriority,_hThread,dwPriLevel
        INVOKE  Sleep,0               ;bind new level
        INVOKE  Sleep,0               ;fresh slice
        rdtsc
        push    edx                   ;Ta
        push    eax
        push    ebp
        call    edi                   ;proc to be measured
        rdtsc
        pop     ebp
        push    edx                   ;Tb
        push    eax
        INVOKE  SetPriorityClass,_hProcess,_dwPriClass
        INVOKE  Sleep,0               ;bind new priority
        INVOKE  SetThreadPriority,_hThread,_dwPriLevel
        INVOKE  Sleep,0               ;bind new level
        pop     eax
        pop     edx
        pop     esi
        pop     edi
        mov     ecx,eax
        mov     ebx,edx               ;EBX:ECX = last tsc reading
        sub     eax,esi
        sbb     edx,edi               ;EDX:EAX = measured time
        retn

;**************************************************

;empty proc

;--------------------------------------------------

        ALIGN   16

DummyProc:
        retn

;**************************************************

;measurement loop

;--------------------------------------------------

TopOfLoop:
        mov     edi,offset DummyProc  ;empty proc for reference
        call    SinglePass
        push    edx
        push    eax
        mov     edi,lpfnProc          ;code to be measured
        call    SinglePass
        pop     esi
        pop     edi
        sub     eax,esi
        sbb     edx,edi
        inc dword ptr _dwPassCount
        add     _dwTallyLo,eax
        adc     _dwTallyHi,edx
        cmp     ebx,_dwTerminalHi
        jb      TopOfLoop

        ja      dTally

        cmp     ecx,_dwTerminalLo
        jb      TopOfLoop

;**************************************************

;tally results and exit

;--------------------------------------------------

dTally: INVOKE  SetProcessAffinityMask,_hProcess,_dwAffinity
        mov     edx,_dwTallyHi
        mov     eax,_dwTallyLo
        or      edx,edx
        mov     ecx,_dwPassCount
        jns     Tdivis

        xor     eax,eax
        jmp short dTExit

Tdivis: cmp     ecx,1
        jbe     dTExit

        div     ecx
        shl     edx,1
        cmp     edx,ecx
        sbb     eax,-1

dTExit: ret

dTime   ENDP

;***********************************************************************************************
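
For readers who do not want to trace the DIV/SHL/CMP/SBB sequence at the end of dTime, here is the same tally arithmetic as a C sketch. It is a restatement for illustration, not part of the posted source; the name `average_cycles` is hypothetical. `tally` stands for the 64-bit sum of (measured minus reference) cycle counts and `passes` for the pass count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical restatement of the "tally results and exit" step. */
uint32_t average_cycles(int64_t tally, uint32_t passes)
{
    if (tally < 0)
        return 0;                   /* sign bit set: report 0 cycles */
    if (passes <= 1)
        return (uint32_t)tally;     /* nothing to average, low dword as in EAX */

    uint64_t q = (uint64_t)tally / passes;
    uint64_t r = (uint64_t)tally % passes;

    /* The 32-bit DIV would fault if the quotient did not fit in EAX */
    assert(q <= UINT32_MAX);

    if (2 * r >= passes)            /* round half up, as SHL/CMP/SBB -1 does */
        q++;
    return (uint32_t)q;
}
```

For example, a tally of 25 over 10 passes gives 3 (2 remainder 5, rounded up), while 24 over 10 gives 2. A negative tally, where the empty reference proc measured slower on balance than the code under test, is clamped to 0 just as in the assembly.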
« Last Edit: May 17, 2013, 11:33:05 AM by dedndave »

Antariy

  • Member
  • ****
  • Posts: 551
Re: SwitchToThread vs Sleep(0)
« Reply #59 on: May 17, 2013, 11:52:05 AM »
Hi Dave :t
For dtime2:
Code: [Select]
44 36 30 28 4 7 24 35 0 28
356 468 415 386 405 354 384 431 384 376
48 58 77 144 76 84 103 101 77 79
Press any key to continue ...

Can you build and post the full code of your previous post?