Author Topic: Windows XP Mode is a miracle  (Read 1200 times)

Gunther

  • Forgive your enemies, but never forget their names
« on: August 13, 2014, 12:04:06 PM »
Well, this is partly DOS stuff (not really  :lol:), so I decided to post it here. What's the story?

I've installed the current version of DJGPP (gcc for DOS) on my DOS pen drive, and it works fine. DJGPP produces a 32-bit DOS protected-mode EXE with the necessary DOS extender linked in. It uses the FLAT memory model, similar to 32-bit Windows. The attached archive FSUM.ZIP therefore contains both the DOS and the Win32 EXE (in different folders; please check the readme.txt file for the directory and file structure). The sources are the same for both operating systems; all it took was two different compiler runs. That applies to gcc only. The DOS version of JWASM isn't up to date and can't be used (or one would have to code XGETBV and a few other instructions via DB statements). Fortunately, we can use the Windows OBJ file without changes, because the COFF format is used in both cases. The bfsum.bat files are therefore slightly different, because the assembly step isn't necessary under DOS.
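For anyone wondering what "coding XGETBV via DB statements" amounts to: the instruction's opcode bytes are 0F 01 D0, so an assembler that lacks the mnemonic can still emit it as raw bytes. The sketch below shows the same idea in GCC inline assembly (`.byte` playing the role of DB) rather than in JWASM syntax. The function names are made up for illustration and are not taken from FSUM.ZIP; the OSXSAVE check (CPUID leaf 1, ECX bit 27) is there because XGETBV faults when the OS has not enabled it.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>

/* Returns 1 if the OS has enabled XSAVE/XGETBV (CPUID.1:ECX bit 27). */
int os_has_xsave(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 27) & 1u);
}

/* Reads XCR0 by emitting the XGETBV opcode as raw bytes.
   Only call this when os_has_xsave() returned 1. */
uint64_t xgetbv0(void)
{
    uint32_t lo, hi;
    __asm__ volatile (".byte 0x0f, 0x01, 0xd0"   /* XGETBV */
                      : "=a"(lo), "=d"(hi)
                      : "c"(0));                 /* ECX = 0 selects XCR0 */
    return ((uint64_t)hi << 32) | lo;
}
#else
/* Hypothetical fallback so the sketch still compiles on non-x86 targets. */
int os_has_xsave(void) { return 0; }
uint64_t xgetbv0(void) { return 0; }
#endif
```

Bit 0 of XCR0 (x87 state) is always set when the register is readable; bits 1 and 2 tell whether the OS saves the SSE and AVX registers.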

Here are the interesting results. First the DOS exe under FreeDos 1.1:
Quote

Calculating the sum of a float array in 3 different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 14.45 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.70 Seconds
Performance Boost = 216%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.26 Seconds
Performance Boost = 1143%

Please, press enter to end the application ...
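
The real sources are in FSUM.ZIP and are not reproduced in this post, so the following is only a minimal sketch of what the first two variants could look like. The function names are hypothetical. The array contents are a guess as well: the printed sum 8390656 happens to equal 1 + 2 + ... + 4096, so the sketch fills the array that way.

```c
#include <stddef.h>

/* Fill a[i] = i + 1, i.e. 1, 2, ..., n (assumed test data, see above). */
void fill_ramp(float *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (float)(i + 1);
}

/* One accumulator: every addition has to wait for the previous one. */
float sum_simple(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: the four additions in the loop body
   do not depend on each other, so the CPU can overlap them. */
float sum_4acc(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    return (s0 + s1) + (s2 + s3);
}
```

With this test data every partial sum is an integer below 2^24, so both functions return exactly the same value. For arbitrary float data the different summation order can change the last digits.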


The same DOS EXE runs flawlessly under Windows 7-32 inside VirtualBox:
Quote

Calculating the sum of a float array in 3 different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 13.90 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.59 Seconds
Performance Boost = 211%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.21 Seconds
Performance Boost = 1150%

Please, press enter to end the application ...
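
Judging from the printed numbers, the "Performance Boost" lines look like the time of the simple C loop divided by the time of the faster variant, expressed as a percentage (for example 14.45 / 6.70 is about 216%). That reading is an inference from the output, not something taken from the sources. A small sketch of that calculation, with hypothetical helper names and standard `clock()` as an assumed timer:

```c
#include <time.h>

/* Baseline time relative to the variant's time, in percent.
   200 means the variant needs half the time of the baseline. */
double boost_percent(double base_seconds, double variant_seconds)
{
    return 100.0 * base_seconds / variant_seconds;
}

/* Time one call of fn() with clock(). The resolution and meaning of
   clock() (CPU time vs. wall time) differ between C libraries, so
   short runs should be repeated many times before comparing. */
double elapsed_seconds(void (*fn)(void))
{
    clock_t t0 = clock();
    fn();
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Since the displayed times are rounded to two decimals, recomputing the percentage from them can differ from the printed value by a few points.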

That's not so bad for a DOS emulation inside an emulation, and it's even a bit faster. But that's nothing. The big surprise was the same DOS EXE under XP Mode with Virtual PC:
Quote

Calculating the sum of a float array in 3 different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 13.35 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.21 Seconds
Performance Boost = 215%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.15 Seconds
Performance Boost = 1157%

Please, press enter to end the application ...

What is happening here? Under plain DOS there's only one application running: no scheduler intervenes, there are no task switches, etc. The FLAT memory model is the same, yet DOS is slower. The only difference is that DJGPP must link everything statically, because DOS doesn't know DLLs and late binding. But that can't make the difference.

Of course, the native Windows EXE gives similar results under XP Mode:
Quote

Calculating the sum of a float array in 3 different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 13.45 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.78 Seconds
Performance Boost = 198%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.14 Seconds
Performance Boost = 1179%

Please, press enter to end the application ...

But the fastest is the Win32 EXE running under Windows 7-64 (WOW64):
Quote

Calculating the sum of a float array in 3 different ways.
That'll take a little while. Please be patient ...

Simple C implementation:
------------------------
sum1              = 8390656.00
Elapsed Time      = 12.83 Seconds

C implementation with 4 accumulators:
-------------------------------------
sum2              = 8390656.00
Elapsed Time      = 6.43 Seconds
Performance Boost = 200%

Assembly Language with 4 XMM accumulators:
------------------------------------------
sum3              = 8390656.00
Elapsed Time      = 1.10 Seconds
Performance Boost = 1166%

Please, press enter to end the application ...


Does anyone have an explanation? How does it behave under other configurations? Any help is welcome.

Gunther
Get your facts first, and then you can distort them.