Count cycles: test reports needed

Counting cycles isn't an easy thing to do. We have a fundamental problem with the usual macros: as a normal application, we are running in ring 3, which means there is no direct hardware access for us. We know little about task switches, micro-operations, cache misses or branch mispredictions. In principle we could determine them, but that requires write access to certain control registers, and that's not possible in ring 3.

What can we do? Under Linux we could use a kernel module, which would guarantee exclusive access to the CPU. Under BSD or Windows a driver would have to be used. Only then would the measured values be reliable and meaningful. This path is tedious and costs a lot of time.

But there's another option. Under plain DOS there are no task switches, and exclusive access to the CPU is guaranteed. I wrote a small program that counts the cycles for a short code sequence. The measurement is repeated 20 times so that cache warm-up effects can settle. All 20 results are printed at the end, purely to show from which iteration the values stabilize. Of course, the median, arithmetic mean or variance could be calculated, but that would only be statistical smoothing without any factual basis.
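The measuring loop described above can be sketched in C roughly as follows. This is not the actual PowerBASIC/JWASM program, just an illustration under a few assumptions: GCC or Clang on an x86 target (for `__rdtsc()` from `<x86intrin.h>`), and the names `read_counter`, `sequence_under_test`, `measure_runs` and `print_runs` are made up for this post. The compiler decides which instructions are emitted, so the stand-in sequence doesn't reproduce the original FPU code exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>                 /* __rdtsc() on GCC and Clang */
static uint64_t read_counter(void) { return __rdtsc(); }
#else
/* Not an x86 target: fake counter so the sketch still compiles. */
static uint64_t read_counter(void) { static uint64_t fake; return fake += 100; }
#endif

#define RUNS 20                        /* same number of repetitions as in the post */

volatile double sequence_input = 1234.5678;
volatile double sequence_result;

/* Hypothetical stand-in for the tested code:
   load a double, divide four times, store the double. */
void sequence_under_test(void)
{
    double x = sequence_input;
    x /= 3.0;
    x /= 5.0;
    x /= 7.0;
    x /= 11.0;
    sequence_result = x;
}

/* Times the sequence `runs` times and keeps every raw difference. */
void measure_runs(uint64_t *cycles, int runs)
{
    assert(cycles != NULL && runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t start = read_counter();
        sequence_under_test();
        uint64_t stop = read_counter();
        cycles[i] = stop - start;      /* includes the cost of reading the counter */
    }
}

/* Prints all values unsmoothed, so one can see where they stabilize. */
void print_runs(const uint64_t *cycles, int runs)
{
    for (int i = 0; i < runs; i++)
        printf("Iteration %d:  %llu Cycles\n", i + 1,
               (unsigned long long)cycles[i]);
}
```

To try it, call `measure_runs(buf, RUNS)` and then `print_runs(buf, RUNS)` from `main` with a `uint64_t buf[RUNS]`. Under a multitasking OS the numbers will of course contain all the noise this thread is trying to avoid.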

The application is written with PowerBASIC and JWASM. This is the first step; I'm working on a version that's written completely in assembly language. A short sequence of FPU instructions is tested: load double, 4 floating point divisions, save double. The program has been tested and can't do any harm. Here is the output under DOSBox 0.74-3:

--- Code: ---Sorry!
The usage of the Time Stamp Counter isn't possible
with the available CPU.
Program ends now.

--- End code ---

This is correct, because DOSBox only emulates an 80486. The Time Stamp Counter came with the Pentium. Here is the output under FreeDOS running under VirtualBox:
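For readers wondering how such a check can look: the presence of the Time Stamp Counter is reported by CPUID leaf 1 in EDX bit 4. Below is a minimal C sketch of that test, assuming GCC or Clang (for `__get_cpuid()` from `<cpuid.h>`). The function names are invented for this example; the real program does the check in assembly, and a 16 bit DOS program must additionally verify that CPUID itself exists before executing it.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>                     /* __get_cpuid() helper of GCC and Clang */
#endif

#define CPUID_EDX_TSC (1u << 4)        /* CPUID leaf 1, EDX bit 4: RDTSC available */

/* Pure decoding step: does this EDX value announce a TSC? */
int edx_reports_tsc(uint32_t edx)
{
    return (edx & CPUID_EDX_TSC) != 0;
}

/* Returns 1 if the CPU reports a Time Stamp Counter, otherwise 0. */
int cpu_has_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      /* leaf 1 (or CPUID itself) not available */
    return edx_reports_tsc(edx);
#else
    return 0;                          /* not an x86 target */
#endif
}
```

If `cpu_has_tsc()` returns 0, the program should refuse to run the measurement, which is what the message above does.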

--- Code: ---Iteration 1:  104 Cycles
Iteration 2:  106 Cycles
Iteration 3:  104 Cycles
Iteration 4:  106 Cycles
Iteration 5:  104 Cycles
Iteration 6:  108 Cycles
Iteration 7:  100 Cycles
Iteration 8:  102 Cycles
Iteration 9:  98 Cycles
Iteration 10:  108 Cycles
Iteration 11:  104 Cycles
Iteration 12:  106 Cycles
Iteration 13:  100 Cycles
Iteration 14:  108 Cycles
Iteration 15:  104 Cycles
Iteration 16:  108 Cycles
Iteration 17:  102 Cycles
Iteration 18:  102 Cycles
Iteration 19:  100 Cycles
Iteration 20:  106 Cycles

Please, press any key to end the application...

--- End code ---

The same machine, application started under plain FreeDOS without any drivers:

--- Code: ---Iteration 1:  82 Cycles
Iteration 2:  82 Cycles
Iteration 3:  84 Cycles
Iteration 4:  84 Cycles
Iteration 5:  86 Cycles
Iteration 6:  82 Cycles
Iteration 7:  82 Cycles
Iteration 8:  82 Cycles
Iteration 9:  84 Cycles
Iteration 10:  82 Cycles
Iteration 11:  82 Cycles
Iteration 12:  82 Cycles
Iteration 13:  82 Cycles
Iteration 14:  82 Cycles
Iteration 15:  82 Cycles
Iteration 16:  82 Cycles
Iteration 17:  82 Cycles
Iteration 18:  82 Cycles
Iteration 19:  82 Cycles
Iteration 20:  82 Cycles

Please, press any key to end the application...

--- End code ---

Where does the difference come from? Why is the program slower under VirtualBox? Well, as mentioned at the beginning: we are an application in ring 3 and are additionally emulated.

Nevertheless, I would be happy about test runs and reports in other environments.


   P-III, Windows 2000 displays:

Command Prompt - cc
The NTVDM CPU has encountered an illegal instruction.
CS:0b33 IP:0044 OP:0f01f966a3 Choose 'Close' to terminate the application.

   Windows XP displays a similar message.  OS/2 displays a fancier
message that says the same sort of thing.  Address 44 in all three.




--- Quote from: FORTRANS on May 24, 2022, 07:20:44 AM ---   Windows XP displays a similar message.  OS/2 displays a fancier
message that says the same sort of thing.  Address 44 in all three.

--- End quote ---

Yes, I've checked it with XP Mode (Windows Virtual PC) running under Win 7, and I see the same effect here. Apparently these emulations don't like the rdtsc instruction.

Hi Gunther!

In VirtualBox with FreeDOS 1.3 the mean is 3000 cycles when running from an emulated HDD on a FAT32 USB drive, but 20 cycles if I copy the same file to the virtual drive A: (a little difference  :biggrin:)


This doesn't make much sense; everything depends on the number of cores in the virtual machine.
With more than one core, USB seems to run in some slow thread? Values are between 3000 and 6000 cycles.
With only one core, the original CC result is 88-96 cycles, rdtsc alone is 18-24 cycles, and replacing rdtscp with cpuid-rdtsc gives 2846-4464 cycles.
Something doesn't work as expected in FreeDOS, VirtualBox or both.
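Since the figures above separate the cost of the timing instruction itself from the full result, here is a small C sketch of how such a baseline can be taken and subtracted. It assumes GCC or Clang on x86 (`__rdtsc()` from `<x86intrin.h>`); all function names are made up for illustration. One plausible reason for the large cpuid-rdtsc values (an assumption, not verified here) is that CPUID is typically intercepted by the hypervisor, so each execution costs a round trip out of the guest.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>                 /* __rdtsc() on GCC and Clang */
static uint64_t tsc_now(void) { return __rdtsc(); }
#else
/* Not an x86 target: fake counter so the sketch still compiles. */
static uint64_t tsc_now(void) { static uint64_t fake; return fake += 20; }
#endif

#define SAMPLES 20

/* Smallest value in a set of readings. */
uint64_t smallest_reading(const uint64_t *v, int n)
{
    assert(v != NULL && n > 0);
    uint64_t m = v[0];
    for (int i = 1; i < n; i++)
        if (v[i] < m)
            m = v[i];
    return m;
}

/* Cost of two back-to-back counter reads with nothing in between. */
uint64_t timer_overhead(void)
{
    uint64_t d[SAMPLES];
    for (int i = 0; i < SAMPLES; i++) {
        uint64_t a = tsc_now();
        uint64_t b = tsc_now();
        d[i] = b - a;
    }
    return smallest_reading(d, SAMPLES);
}

/* Raw measurement minus the timer's own cost, clamped at zero. */
uint64_t net_cycles(uint64_t raw, uint64_t overhead)
{
    return raw > overhead ? raw - overhead : 0;
}
```

Taking the minimum of the empty measurements is a design choice: it is the value least disturbed by interrupts or the hypervisor. Whether subtracting it gives a meaningful number inside a virtual machine is a separate question.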

Hi Fortrans!

--- Quote from: FORTRANS on May 24, 2022, 07:20:44 AM ---   P-III, Windows 2000 displays:

--- End quote ---

I guess that your machine can't deal with rdtscp; it must be replaced with the older rdtsc (if I understand correctly  :biggrin:)
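The opcode bytes in the NTVDM message fit that guess: 0F 01 F9 is the encoding of rdtscp, while plain rdtsc is 0F 31. Whether a CPU supports rdtscp is reported by CPUID leaf 80000001h in EDX bit 27. The following C sketch shows both checks and a fallback choice; the type and function names are invented for this example and are not taken from the program under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_EXT_EDX_RDTSCP (1u << 27)   /* leaf 80000001h, EDX bit 27 */

typedef enum { TS_NONE, TS_RDTSC, TS_RDTSCP } ts_insn;

/* Identifies the time stamp instruction at the start of an opcode dump. */
ts_insn classify_opcode(const uint8_t *op, size_t len)
{
    assert(op != NULL);
    if (len >= 3 && op[0] == 0x0F && op[1] == 0x01 && op[2] == 0xF9)
        return TS_RDTSCP;                 /* 0F 01 F9 */
    if (len >= 2 && op[0] == 0x0F && op[1] == 0x31)
        return TS_RDTSC;                  /* 0F 31 */
    return TS_NONE;
}

/* Pure decoding step for the extended feature word. */
int ext_edx_reports_rdtscp(uint32_t ext_edx)
{
    return (ext_edx & CPUID_EXT_EDX_RDTSCP) != 0;
}

/* Prefer rdtscp, fall back to rdtsc, give up if there is no TSC at all. */
ts_insn choose_timer(int has_tsc, int has_rdtscp)
{
    if (!has_tsc)
        return TS_NONE;
    return has_rdtscp ? TS_RDTSCP : TS_RDTSC;
}
```

With such a check the program could pick rdtsc on older CPUs like the P-III instead of faulting. It would not help if the emulation rejects the instruction for other reasons.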

