Topic: Count clock cycles with different methods

Gunther

  • Member
  • *****
  • Posts: 4119
  • Forgive your enemies, but never forget their names
Count clock cycles with different methods
« on: July 07, 2022, 02:15:50 AM »
Counting clock cycles isn't easy. To obtain reliable results, many conditions must be taken into account. This has already been described in detail in several places.

The archive CYC.ZIP contains the source code and the program CYC.EXE. It runs under plain DOS in ring 0. The program uses one of four different methods, depending
on which instructions are available; the matching code path is selected at runtime. If a very old CPU is present, the application terminates with an
error message. This is the case, for example, under DOSBox 0.74, because only an 80486 is simulated there:
Code:
Sorry!
Your CPU doesn't support the CPUID instruction.
No chance to count cycles.
Program ends now.
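
For readers who want to see the idea of runtime path selection in compact form, here is a minimal C sketch. It is not taken from CYC.ZIP: the enum names and the function `choose_method` are hypothetical, and the priority order (SSE2 first, then RDTSCP, then plain RDTSC) is an assumption based only on the outputs shown in this thread. The CPUID bit positions are the documented ones. Only three paths are sketched.

```c
#include <stdint.h>

/* Documented CPUID feature bits. */
#define CPUID1_EDX_TSC        (1u << 4)   /* leaf 1, EDX bit 4: RDTSC */
#define CPUID1_EDX_SSE2       (1u << 26)  /* leaf 1, EDX bit 26: SSE2 (includes LFENCE) */
#define CPUID_EXT_EDX_RDTSCP  (1u << 27)  /* leaf 0x80000001, EDX bit 27: RDTSCP */

/* Hypothetical names, not the ones used in the CYC sources. */
typedef enum {
    METHOD_NONE,           /* no time stamp counter at all */
    METHOD_CLASSIC_RDTSC,  /* plain RDTSC */
    METHOD_RDTSCP,         /* RDTSCP */
    METHOD_SSE2_FENCED     /* RDTSC combined with SSE2 fences */
} timing_method;

/* Pick a timing path from raw CPUID register values.
 * Assumed priority: SSE2 > RDTSCP > RDTSC. */
timing_method choose_method(uint32_t leaf1_edx, uint32_t ext_edx)
{
    if (!(leaf1_edx & CPUID1_EDX_TSC))
        return METHOD_NONE;
    if (leaf1_edx & CPUID1_EDX_SSE2)
        return METHOD_SSE2_FENCED;
    if (ext_edx & CPUID_EXT_EDX_RDTSCP)
        return METHOD_RDTSCP;
    return METHOD_CLASSIC_RDTSC;
}
```

On real hardware the two arguments would be the EDX values returned by CPUID leaf 1 and leaf 0x80000001. Where CPUID itself is missing, as reported under DOSBox 0.74, the check has to stop before this point.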

Under FreeDOS 1.3 without drivers, the following is displayed:
Code:
Flag Values
===========

SSE2   = 7
RDTSCP = 5
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.
Cycles used for 100 times FSAVE:  16726


Please, press any key to end the application...

The application works in Windows XP mode as well:
Code:
Flag Values
===========

SSE2   = 7
RDTSCP = 0
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.
Cycles used for 100 times FSAVE:  40640


Please, press any key to end the application...
Note, however, that this result comes from an emulated environment running in ring 3.

This program is written with PB 3.5 and JWASM and is still under development. Given the variety of possible configurations under which the software can run, further testing is necessary.
I'm thankful for test reports and hints for improvement.
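
The program output above does not spell out what the "modern SSE2 technique" is. A common reading, and an assumption here, is that RDTSC is bracketed by LFENCE (an SSE2 instruction) so that the counter is not read out of order. Below is a minimal C sketch of that idea. It uses GCC/Clang intrinsics instead of the JWASM code in the archive. The names `read_counter`, `measure_cycles` and `spin_workload` are hypothetical, and the workload is a placeholder, not FSAVE.

```c
#include <stdint.h>
#include <time.h>

#if defined(__SSE2__)
#include <x86intrin.h>   /* __rdtsc() and _mm_lfence() on GCC and Clang */
#endif

/* Read a monotonically increasing counter.
 * With SSE2: LFENCE; RDTSC; LFENCE. Otherwise fall back to clock(),
 * which counts clock ticks, not CPU cycles. */
uint64_t read_counter(void)
{
#if defined(__SSE2__)
    _mm_lfence();                       /* earlier instructions finish first */
    uint64_t t = (uint64_t)__rdtsc();
    _mm_lfence();                       /* later instructions start after the read */
    return t;
#else
    return (uint64_t)clock();
#endif
}

/* Placeholder workload; the real program times FSAVE instead. */
void spin_workload(void)
{
    volatile unsigned x = 0;
    for (unsigned i = 0; i < 1000; ++i)
        x += i;
}

/* Total counter ticks spent on `reps` calls of `work`,
 * including loop and call overhead. */
uint64_t measure_cycles(void (*work)(void), unsigned reps)
{
    uint64_t start = read_counter();
    for (unsigned i = 0; i < reps; ++i)
        work();
    return read_counter() - start;
}
```

A call such as `measure_cycles(spin_workload, 100)` mirrors the "100 times" pattern of the output. In ring 3 and under an emulator the number still includes interrupts and task switches, so it is an upper bound at best.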
You have to know the facts before you can distort them.

FORTRANS

  • Member
  • *****
  • Posts: 1217
Re: Count clock cycles with different methods
« Reply #1 on: July 07, 2022, 06:31:42 AM »
Hi Gunther,

   Some results.

Code:
++ P-III, Windows 2000 ++

Flag Values
===========

SSE2   = 0
RDTSCP = 0
RDTSC  = 3

Counting cycles with the classic technique:
-------------------------------------------

That is the state before 2010.


Cycles used for 100 times FSAVE:  12497



Please, press any key to end the application...

++ P-III OS/2 VDM ++

Flag Values
===========

SSE2   = 0
RDTSCP = 0
RDTSC  = 3

Counting cycles with the classic technique:
-------------------------------------------

That is the state before 2010.


Cycles used for 100 times FSAVE:  12926



Please, press any key to end the application...

++ i3-4005U, Oracle VirtualBox, OS/2 Command Prompt, two runs,
 as the results seemed inconsistent. ++

Flag Values
===========

SSE2   = 7
RDTSCP = 5
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.


Cycles used for 100 times FSAVE:  55433



Please, press any key to end the application...

++ i3-4005U, Oracle VirtualBox, OS/2 VDM, three runs, but seems more consistent. ++

Flag Values
===========

SSE2   = 7
RDTSCP = 5
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.


Cycles used for 100 times FSAVE:  20590



Please, press any key to end the application...

Flag Values
===========

SSE2   = 7
RDTSCP = 5
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.


Cycles used for 100 times FSAVE:  19664



Please, press any key to end the application...

Flag Values
===========

SSE2   = 7
RDTSCP = 5
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.


Cycles used for 100 times FSAVE:  17796



Please, press any key to end the application...

++ Pentium M, Windows XP ++

Flag Values
===========

SSE2   = 7
RDTSCP = 0
RDTSC  = 3

Counting cycles with modern SSE2 technique:
-------------------------------------------

That's recommended by Intel since 2020 and probably the most reliable method.


Cycles used for 100 times FSAVE:  13375



Please, press any key to end the application...

++

Regards,

Steve N.

Gunther

Re: Count clock cycles with different methods
« Reply #2 on: July 07, 2022, 08:12:45 AM »
Steve,

thank you for your report.  :thumbsup: Does the program also run under OS/2?

FORTRANS

Re: Count clock cycles with different methods
« Reply #3 on: July 07, 2022, 11:08:49 PM »
Hi,

Quote from: Gunther on July 07, 2022, 08:12:45 AM
Steve,

thank you for your report.  :thumbsup: Does the program also run under OS/2?

   Well, sort of.  Testing on the older system shows that at an OS/2 command
prompt, typing the program name runs the program.  And it looks completely
normal.  But you cannot pipe the output to a file, and the video is reset along
with a minimal delay on exiting.  So it seems that OS/2 can load the VDM and
run the program.  But with caveats.  Running at a DOS prompt (VDM) you can
pipe the results and there are fewer artifacts seen.  In the VirtualBox system
it seemed to affect the timing as well.  Not seen in the native system though.
Helpful?

Cheers,

Steve N.

Gunther

Re: Count clock cycles with different methods
« Reply #4 on: July 10, 2022, 01:46:20 AM »
Steve,

Quote from: FORTRANS on July 07, 2022, 11:08:49 PM
   Well, sort of.  Testing on the older system shows that at an OS/2 command
prompt, typing the program name runs the program.  And it looks completely
normal.  But you cannot pipe the output to a file, and the video is reset along
with a minimal delay on exiting.

that sounds strange. I think that there is a bug in the DOS emulation of OS/2.

Quote from: FORTRANS on July 07, 2022, 11:08:49 PM
So it seems that OS/2 can load the VDM and
run the program.  But with caveats.  Running at a DOS prompt (VDM) you can
pipe the results and there are fewer artifacts seen.

That's good to know.

Quote from: FORTRANS on July 07, 2022, 11:08:49 PM
In the VirtualBox system
it seemed to affect the timing as well.  Not seen in the native system though.

That's clear. Timings inside virtual machines are very unreliable. Too much is happening there that we have no control over.
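
To put numbers on this, the three OS/2 VDM runs reported by Steve for the i3-4005U under VirtualBox can be compared directly. The following is a minimal C sketch with hypothetical helper names. Taking the minimum of several runs is a common rule of thumb, not something the program itself does.

```c
#include <stddef.h>
#include <stdint.h>

/* Totals for 100 x FSAVE, copied from the three VDM runs in this thread. */
const uint32_t vdm_runs[3] = { 20590, 19664, 17796 };

/* Smallest total; n must be at least 1. */
uint32_t min_cycles(const uint32_t *runs, size_t n)
{
    uint32_t best = runs[0];
    for (size_t i = 1; i < n; ++i)
        if (runs[i] < best)
            best = runs[i];
    return best;
}

/* Largest total; n must be at least 1. */
uint32_t max_cycles(const uint32_t *runs, size_t n)
{
    uint32_t worst = runs[0];
    for (size_t i = 1; i < n; ++i)
        if (runs[i] > worst)
            worst = runs[i];
    return worst;
}

/* Spread between slowest and fastest run, in permille of the fastest.
 * Assumes the fastest run is not zero. */
uint32_t spread_permille(const uint32_t *runs, size_t n)
{
    uint32_t lo = min_cycles(runs, n);
    uint32_t hi = max_cycles(runs, n);
    return (uint32_t)(((uint64_t)(hi - lo) * 1000u) / lo);
}
```

For these three runs the fastest total is 17796 cycles, about 178 per FSAVE, and the slowest run is about 15.7 % above it.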

Quote from: FORTRANS on July 07, 2022, 11:08:49 PM
Helpful?

Yes, of course.  :thumbsup: