Counting cycles isn't easy. There is a fundamental problem with the usual macros: as a normal application, we run in ring 3, which means we have no direct hardware access.

We know little about task switches, micro-operations, cache misses or branch mispredictions. In principle, we could determine these, but doing so requires write access to certain control registers. That's not possible in ring 3.

What can we do? Under Linux we could use a kernel module, which would give us exclusive access to the CPU. Under BSD or Windows a driver would have to be used instead. Only then would the measured values be reliable and meaningful. This path is tedious and would cost a lot of time.

But there's another option. Under plain DOS there are no task switches, and exclusive access to the CPU is guaranteed as well. I wrote a small program that counts the cycles for a short code sequence. The measurement is repeated 20 times so that cache warm-up effects can settle. All 20 results are printed at the end; they only serve to show from which iteration on the values stabilize. Of course, the median, the arithmetic mean or the variance could be calculated. But that would only be statistical smoothing without any factual basis.

The application is written with PowerBASIC and JWASM. This is the first step; I'm working on a version that's written completely in assembly language. A short sequence of FPU instructions is tested: load double, 4 floating point divisions, save double. Running the program for a test can't do any harm. Here is the output under DOSBox 0.74-3:

```
Sorry!
The usage of the Time Stamp Counter isn't possible
with the available CPU.
Program ends now.
```

This is correct, because DOSBox only emulates an 80486. The Time Stamp Counter came with the Pentium. Here is the output under FreeDOS running under VirtualBox:

```
Iteration 1: 104 Cycles
Iteration 2: 106 Cycles
Iteration 3: 104 Cycles
Iteration 4: 106 Cycles
Iteration 5: 104 Cycles
Iteration 6: 108 Cycles
Iteration 7: 100 Cycles
Iteration 8: 102 Cycles
Iteration 9: 98 Cycles
Iteration 10: 108 Cycles
Iteration 11: 104 Cycles
Iteration 12: 106 Cycles
Iteration 13: 100 Cycles
Iteration 14: 108 Cycles
Iteration 15: 104 Cycles
Iteration 16: 108 Cycles
Iteration 17: 102 Cycles
Iteration 18: 102 Cycles
Iteration 19: 100 Cycles
Iteration 20: 106 Cycles
Please, press any key to end the application...
```

The same machine, with the application started under plain FreeDOS without any drivers:

```
Iteration 1: 82 Cycles
Iteration 2: 82 Cycles
Iteration 3: 84 Cycles
Iteration 4: 84 Cycles
Iteration 5: 86 Cycles
Iteration 6: 82 Cycles
Iteration 7: 82 Cycles
Iteration 8: 82 Cycles
Iteration 9: 84 Cycles
Iteration 10: 82 Cycles
Iteration 11: 82 Cycles
Iteration 12: 82 Cycles
Iteration 13: 82 Cycles
Iteration 14: 82 Cycles
Iteration 15: 82 Cycles
Iteration 16: 82 Cycles
Iteration 17: 82 Cycles
Iteration 18: 82 Cycles
Iteration 19: 82 Cycles
Iteration 20: 82 Cycles
Please, press any key to end the application...
```
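For anyone who wants the two listings condensed anyway, here is a minimal Python sketch. It is not part of the PowerBASIC/JWASM program; it only post-processes the cycle counts copied from the two FreeDOS runs above. The helper names `settles_at` and `summarize` are made up for this illustration, and treating "stabilized" as "equal to the median from that iteration on" is an assumption, not something the program itself defines.

```python
from statistics import mean, median, pvariance

# Cycle counts copied from the two FreeDOS listings in this post.
virtualbox = [104, 106, 104, 106, 104, 108, 100, 102, 98, 108,
              104, 106, 100, 108, 104, 108, 102, 102, 100, 106]
plain_dos = [82, 82, 84, 84, 86, 82, 82, 82, 84, 82,
             82, 82, 82, 82, 82, 82, 82, 82, 82, 82]


def settles_at(values, tolerance=0):
    """Return the 1-based iteration from which every remaining value stays
    within `tolerance` cycles of the median, or None if the run ends
    outside that band. (Assumed definition of 'stabilized'.)"""
    mid = median(values)
    start = None
    for i, v in enumerate(values, start=1):
        if abs(v - mid) <= tolerance:
            if start is None:
                start = i      # a candidate stable stretch begins here
        else:
            start = None       # an outlier ends the current stretch
    return start


def summarize(name, values):
    """Print minimum, median, mean, population variance and settle point."""
    print(f"{name}: min={min(values)} median={median(values)} "
          f"mean={mean(values):.1f} variance={pvariance(values):.2f} "
          f"settles at iteration {settles_at(values)}")


summarize("VirtualBox", virtualbox)
summarize("plain FreeDOS", plain_dos)
```

With the values above, the median is 104 cycles for the VirtualBox run and 82 cycles for plain FreeDOS. The plain run stays at 82 from iteration 10 on, while the VirtualBox run never settles on its median.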

Where does the difference come from? Why is the program slower under VirtualBox? Well, as mentioned at the beginning: there we are an application in ring 3 and are additionally emulated.

Nevertheless, I would be glad to see test runs and reports from other environments.