Author Topic: rdtsc  (Read 1048 times)

NoCforMe

  • Member
  • *****
  • Posts: 1124
Re: rdtsc
« Reply #15 on: January 24, 2023, 12:25:06 PM »
Speaking as a definite non-expert here, I would say that the answer to the question "Is xxxx good enough for my timing needs?" depends on just what those timing needs are, wouldn't you say? It comes down to whether you need absolutely accurate, reproducible timings on a small section of critical code, or just a ballpark look at how fast or slow a routine you're working on is, without accuracy down to the microsecond. Am I right?

NoCforMe

Re: rdtsc
« Reply #16 on: January 24, 2023, 12:28:12 PM »
I also have a code mountain in front of me bigger than Mount Everest.

OK, you're excused. For now ...

jj2007

  • Member
  • *****
  • Posts: 13944
  • Assembly is fun ;-)
    • MasmBasic
Re: rdtsc
« Reply #17 on: January 24, 2023, 12:35:34 PM »
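Code:
  include \masm32\MasmBasic\MasmBasic.inc
  Init
  CyCtInit
  CyCtStart
  fldpi                 ; push PI onto the FPU stack
  fmul FP8(100.0)       ; ST(0) = PI * 100.0 (REAL8 constant)
  fdiv FP4(10.0)        ; ST(0) = PI * 100.0 / 10.0 (REAL4 constant)
  fstp st               ; pop the FPU stack, discarding the result
  CyCtEnd PI*100/10 ; describe what the code does
  EndOfCode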

Results for repeated runs:
Code:
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10
+18      Cycles for PI*100/10
+15      Cycles for PI*100/10
+18      Cycles for PI*100/10
+18      Cycles for PI*100/10
+16      Cycles for PI*100/10
+18      Cycles for PI*100/10
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10
+18      Cycles for PI*100/10
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10
+17      Cycles for PI*100/10

There is a lot of statistical analysis, outlier elimination etc. under the hood, and yet I've seen runs on other people's machines that showed negative values. Timing is, ehm, challenging. The Lab is full of heroic attempts to get higher precision.
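
To make "statistical analysis, outlier elimination" a bit more concrete, here is a minimal sketch in C of two common ways to boil repeated cycle counts down to one number. This is not what the MasmBasic CyCt macros do internally (that is not shown in this thread); the function names `median_cycles` and `trimmed_mean_cycles` are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for long values */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Returns a sorted heap copy of the samples; the caller frees it. */
static long *sorted_copy(const long *samples, size_t n)
{
    long *copy = malloc(n * sizeof *copy);
    if (copy == NULL)
        abort();
    memcpy(copy, samples, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_long);
    return copy;
}

/* Hypothetical helper: median of n cycle counts (robust against outliers). */
double median_cycles(const long *samples, size_t n)
{
    assert(n > 0);
    long *s = sorted_copy(samples, n);
    double m = (n % 2) ? (double)s[n / 2]
                       : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    free(s);
    return m;
}

/* Hypothetical helper: mean after dropping the `trim` lowest and
   the `trim` highest samples. */
double trimmed_mean_cycles(const long *samples, size_t n, size_t trim)
{
    assert(n > 2 * trim);
    long *s = sorted_copy(samples, n);
    long sum = 0;
    for (size_t i = trim; i < n - trim; i++)
        sum += s[i];
    free(s);
    return (double)sum / (double)(n - 2 * trim);
}
```

Fed with the sixteen results listed above, `median_cycles` gives 17 and `trimmed_mean_cycles` with `trim = 2` gives 17.25.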

For my daily needs, I use NanoTimer(), alias QueryPerformanceCounter. It's not really useful for very short code like the one above, but it is easy to use and much more precise than GetTickCount(), whose resolution is typically in the range of 10 to 16 milliseconds.
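
For readers who have not used it directly, a rough sketch in C of timing with QueryPerformanceCounter follows. How NanoTimer() wraps the call is not shown in this thread, so the helper name `now_ns` and the conversion to nanoseconds are assumptions, and the `clock_gettime` branch is only there so the sketch also builds outside Windows.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime in strict C modes */
#endif
#include <assert.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: monotonic timestamp in nanoseconds. */
int64_t now_ns(void)
{
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);   /* counter ticks per second */
    QueryPerformanceCounter(&count);    /* current counter value */
    /* Convert whole seconds and the remainder separately to avoid overflow. */
    return (count.QuadPart / freq.QuadPart) * 1000000000LL
         + (count.QuadPart % freq.QuadPart) * 1000000000LL / freq.QuadPart;
#else
    /* Fallback for non-Windows builds (not part of the original discussion). */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
#endif
}

/* Hypothetical helper: nanoseconds elapsed since an earlier now_ns() value. */
int64_t elapsed_ns(int64_t start)
{
    return now_ns() - start;
}
```

Typical use is `int64_t t0 = now_ns();`, then the code under test, then `elapsed_ns(t0)`. For a four-instruction snippet the overhead of the call itself dominates, which is why this suits longer-running code.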

NoCforMe

Re: rdtsc
« Reply #18 on: January 24, 2023, 01:55:08 PM »
Another related question: Since it's hard to get accurate timings because we're not in total control of the CPU on account of preemptive multitasking, would it be possible to get more CPU cycles to ourselves, basically hogging the CPU, in order to get a better count? Could this be done by, say, changing our process's priority level under Windows?

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: rdtsc
« Reply #19 on: January 24, 2023, 05:40:25 PM »
David,

You normally do that when you time something. Have nothing else running, raise the priority, then run the timing on the algo. I have a toy that times long-duration tasks (minutes) that I use for mass video processing, but I have never seen a decent timing technique for very short-duration tasks.
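
A minimal sketch in C of the "up the priority, then time" step, using the Win32 priority calls. The wrapper `run_at_high_priority` and the stand-in workload `fpu_workload` are hypothetical names; HIGH_PRIORITY_CLASS is chosen here as an assumption, as it is less likely than REALTIME_PRIORITY_CLASS to starve the rest of the system. Outside Windows the wrapper just runs the code without touching priority.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Stand-in workload (hypothetical): the PI*100/10 calculation from the thread. */
static volatile double sink;
void fpu_workload(void)
{
    sink = 3.141592653589793 * 100.0 / 10.0;
}

/* Hypothetical wrapper: raise priority, run fn, restore the old priority.
   Returns 1 if the priority was actually raised, 0 otherwise. */
int run_at_high_priority(void (*fn)(void))
{
    int raised = 0;
    assert(fn != NULL);
#ifdef _WIN32
    HANDLE proc = GetCurrentProcess();
    HANDLE thread = GetCurrentThread();
    DWORD old_class = GetPriorityClass(proc);     /* 0 on failure */
    int old_prio = GetThreadPriority(thread);
    raised = SetPriorityClass(proc, HIGH_PRIORITY_CLASS) &&
             SetThreadPriority(thread, THREAD_PRIORITY_HIGHEST);
#endif
    fn();   /* the code to be timed goes here */
#ifdef _WIN32
    /* Put things back so the rest of the program runs normally. */
    if (old_class != 0)
        SetPriorityClass(proc, old_class);
    if (old_prio != THREAD_PRIORITY_ERROR_RETURN)
        SetThreadPriority(thread, old_prio);
#endif
    return raised;
}
```

Calling `run_at_high_priority(fpu_workload)` runs the workload once at raised priority. Higher priority only makes preemption less likely; interrupts and other ring 0 activity still get in.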

Agner Fog designed a technique that has to be done at boot without loading the OS, but it does not represent how an algo performs when running in ring 3 while the OS has ring 0 priority.
hutch at movsd dot com
http://www.masm32.com

HSE

  • Member
  • *****
  • Posts: 2494
  • AMD 7-32 / i3 10-64
Re: rdtsc
« Reply #20 on: January 25, 2023, 01:11:36 AM »
Quote from hutch--: "Agner Fog designed a technique that has to be done at boot without loading the OS"

That is exactly what I'm doing with BenchOS.

Code:
fldpi
fmul FP8(100.0)
fdiv FP4(10.0)
fstp st

It looks like that code, which produces no result, is not even executed on this machine. The measured cycles look like the cycles the opcodes take to pass through the pipeline   :rolleyes:
Equations in Assembly: SmplMath

NoCforMe

Re: rdtsc
« Reply #21 on: January 25, 2023, 12:10:21 PM »
Just for fun I went ahead and added timestamp logging to my LogBuddy debugging aid. Might be useful to someone ...