Time to crash the "best algo" to format dword unsigned numbers

Started by frktons, December 26, 2012, 11:06:06 AM

frktons

Quote from: dedndave on December 30, 2012, 03:11:36 AM
it's not "how many times" you run the test, really
it's how much time you spend running it
you want to run it for a long enough period of time so that OS intervention is negligible
or, at least, consistent - lol

Quote
then, for each test, try to choose a loop count that yields a 500 ms test

The program I'm going to use this function in will never spend
that long running this specific function, so I really don't understand
the point of such a long test. The results I get are not the speed the
function is going to have when I actually use it.

By the way I've almost finished:
Quote
----------------------------------------------------------------------------
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz

Instructions: MMX, SSE1, SSE2, SSE3, SSSE3
----------------------------------------------------------------------------
3.268   cycles for NumFormatX - IDIV / Stack
2.332   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.514   cycles for NumFormatF-I - IDIV / Stack / Table
1.282   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.199   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
891     cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3.371   cycles for NumFormatX - IDIV / Stack
2.253   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.592   cycles for NumFormatF-I - IDIV / Stack / Table
1.237   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.296   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
973     cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3.220   cycles for NumFormatX - IDIV / Stack
2.306   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.477   cycles for NumFormatF-I - IDIV / Stack / Table
1.267   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.215   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
908     cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------

In a few days I'm going to post the final results and program. :biggrin:
There are only two days a year when you can't do anything: one is called yesterday, the other is called tomorrow, so today is the right day to love, believe, do and, above all, live.

Dalai Lama
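
A note on the "Reciprocal IMUL" label in the timings above: the routines themselves are not shown in this post, so what follows is only a minimal sketch of the general technique, written in Python for readability rather than MASM. It replaces an unsigned 32-bit division by 10 with a multiplication by the constant 0xCCCCCCCD followed by a right shift of 35 bits. The function names are hypothetical and are not taken from the attached program.

```python
MAGIC_DIV10 = 0xCCCCCCCD   # ceil(2**35 / 10)
SHIFT_DIV10 = 35

def div10_reciprocal(n):
    """Return (n // 10, n % 10) for a dword without using division."""
    assert 0 <= n <= 0xFFFFFFFF
    # In x86 terms: a 32x32->64 multiply, then the high dword shifted right by 3.
    q = (n * MAGIC_DIV10) >> SHIFT_DIV10
    r = n - q * 10
    return q, r

def digits_reciprocal(n):
    """Return the decimal digits of a dword, most significant first."""
    digits = []
    while True:
        n, r = div10_reciprocal(n)
        digits.append(chr(ord("0") + r))
        if n == 0:
            break
    return "".join(reversed(digits))

if __name__ == "__main__":
    for v in (0, 9, 10, 999, 4294967295):
        print(v, digits_reciprocal(v))
```

The constant works for every dword because 10 * 0xCCCCCCCD exceeds 2**35 by only 2, so the rounding error stays below one unit for all n under 2**34.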

dedndave

the purpose is merely to get stable, repeatable results
it isn't for my benefit - it is for yours
it will make it easier to interpret the data, is all
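
A minimal sketch of the advice quoted earlier (choose a loop count so that each test runs for roughly 500 ms), written in Python rather than MASM purely for illustration. The names `calibrate_loops` and `work`, and the doubling strategy, are assumptions and not anything posted in the thread.

```python
import time

def calibrate_loops(func, target_seconds=0.5, start=1000):
    """Double the loop count until one timed run lasts at least target_seconds."""
    loops = start
    while True:
        t0 = time.perf_counter()
        for _ in range(loops):
            func()
        elapsed = time.perf_counter() - t0
        if elapsed >= target_seconds:
            return loops, elapsed
        loops *= 2

def work():
    # Stand-in for the routine under test (hypothetical).
    return format(4294967295, ",")

if __name__ == "__main__":
    loops, elapsed = calibrate_loops(work)
    print(f"{loops} loops took {elapsed:.3f} s "
          f"({elapsed / loops * 1e9:.0f} ns per call)")
```

Once the loop count is fixed this way, a short interruption by the OS is a small fraction of the total run, which is what makes repeated runs comparable.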

frktons

Quote from: dedndave on December 30, 2012, 08:07:30 AM
the purpose is merely to get stable, repeatable results
it isn't for my benefit - it is for yours
it will make it easier to interpret the data, is all

If the principle of getting stable, repeatable results
refers to algos in general, I understand everything that
you, Michael, and others have said.
My doubts concern only the application of these
principles to a case in which I expect the routine to
be called 100-200 times per cycle.

frktons

Time to move on to other routines and to the main
project: Text File Scanner & Compressor.

For the time being I have already met the goal of an algo
that is 2:1 faster; in reality it is about 3:1 faster.
My routines right-align the numbers, as you can see in
one of the two attached programs; clive's align left.
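
To illustrate the right-aligned versus left-aligned output mentioned above, here is a rough Python sketch. It assumes that "Table" in the routine names means a lookup table of precomputed three-digit groups, which the post does not confirm; `GROUPS`, `FIELD_WIDTH` and `format_dword` are hypothetical names.

```python
# Precomputed "000".."999" strings: one lookup yields three digits at once.
GROUPS = [f"{i:03d}" for i in range(1000)]
FIELD_WIDTH = 13  # len("4,294,967,295"), the widest formatted dword

def format_dword(n, sep=",", right_align=True):
    """Format a dword with thousands separators in a fixed-width field."""
    assert 0 <= n <= 0xFFFFFFFF
    parts = []
    while n >= 1000:
        n, low = divmod(n, 1000)
        parts.append(GROUPS[low])      # zero-padded inner group
    parts.append(str(n))               # leading group, no zero padding
    text = sep.join(reversed(parts))
    return text.rjust(FIELD_WIDTH) if right_align else text.ljust(FIELD_WIDTH)

if __name__ == "__main__":
    for v in (0, 999, 1000, 4294967295):
        print("[" + format_dword(v) + "]",
              "[" + format_dword(v, right_align=False) + "]")
```

With a table of this kind only one division by 1000 is needed per group of three digits, instead of one division by 10 per digit.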
Quote
----------------------------------------------------------------------------
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz

Instructions: MMX, SSE1, SSE2, SSE3, SSSE3
----------------------------------------------------------------------------
3.091   cycles for NumFormatX - IDIV / Stack
2.463   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.425   cycles for NumFormatF-I - IDIV / Stack / Table
1.225   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.193   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  912   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3.252   cycles for NumFormatX - IDIV / Stack
2.447   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.408   cycles for NumFormatF-I - IDIV / Stack / Table
1.261   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.177   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  929   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3.240   cycles for NumFormatX - IDIV / Stack
2.434   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1.403   cycles for NumFormatF-I - IDIV / Stack / Table
1.230   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1.179   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  902   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------

If somebody posts their results, that would be nice.

dedndave

 ;)
Quote
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

...Documents and Settings\Dave\Desktop => pf

          0                   0                               0                   0
          9                   9                               9                   9
          10                  10                             10                  10
          99                  99                             99                  99
          100                 100                           100                 100
          999                 999                           999                 999
          1,000               1,000                       1.000               1.000
          9,999               9,999                       9.999               9.999
          10,000              10,000                     10.000              10.000
          99,999              99,999                     99.999              99.999
          100,000             100,000                   100.000             100.000
          999,999             999,999                   999.999             999.999
          1,000,000           1,000,000               1.000.000           1.000.000
          9,999,999           9,999,999               9.999.999           9.999.999
          10,000,000          10,000,000             10.000.000          10.000.000
          99,999,999          99,999,999             99.999.999          99.999.999
          100,000,000         100,000,000           100.000.000         100.000.000
          999,999,999         999,999,999           999.999.999         999.999.999
          1,000,000,000       1,000,000,000       1.000.000.000       1.000.000.000
          4,294,967,295       4,294,967,295       4.294.967.295       4.294.967.295

prescott w/htt
Quote
----------------------------------------------------------------------------
Intel(R) Pentium(R) 4 CPU 3.00GHz

Instructions: MMX, SSE1, SSE2, SSE3
----------------------------------------------------------------------------
8,328   cycles for NumFormatX - IDIV / Stack
3,178   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,364   cycles for NumFormatF-I - IDIV / Stack / Table
2,581   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,283   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
1,124   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
8,322   cycles for NumFormatX - IDIV / Stack
3,201   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,365   cycles for NumFormatF-I - IDIV / Stack / Table
2,564   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,280   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
1,125   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
8,266   cycles for NumFormatX - IDIV / Stack
3,172   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,382   cycles for NumFormatF-I - IDIV / Stack / Table
2,545   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,285   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
1,109   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------

frktons

Thanks, Dave, for your collaboration. As you can see, your
student is trying to learn something, my master.  :lol:

Oh, sorry. I forgot the usual $50 for you.  :t

frktons

I have decided not to do the SSE version for the time being;
maybe in the future. For now there are a thousand more routines to prepare  :lol:

frktons

If somebody with an English Windows setting can run
this test and post the results, I'll be glad.

I have to verify that the routines use the appropriate thousands
separator for English and non-English settings.

These routines have been checked and corrected so that
they adapt to the locale setting of the system.
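
As an illustration of adapting the separator to the system locale, here is a small Python sketch. It is not the attached MASM code: `thousands_separator` and `group_digits` are hypothetical names. On Windows the same setting can be read with GetLocaleInfo and LOCALE_STHOUSAND, though the thread does not say which call the attached program uses.

```python
import locale

def thousands_separator(default=","):
    """Return the thousands separator of the user's locale, or `default`."""
    try:
        locale.setlocale(locale.LC_NUMERIC, "")   # adopt the user's settings
    except locale.Error:
        return default
    sep = locale.localeconv()["thousands_sep"]
    return sep or default      # the "C" locale reports an empty string

def group_digits(n, sep):
    """Insert `sep` between the three-digit groups of a dword."""
    assert 0 <= n <= 0xFFFFFFFF
    return format(n, ",").replace(",", sep)

if __name__ == "__main__":
    sep = thousands_separator()
    print(group_digits(4294967295, sep))
```

Keeping the separator as a parameter means the same formatting routine can be checked against both an English and a non-English setting without changing the system configuration.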

Thanks

sinsi



          0                   0                               0                   0
          9                   9                               9                   9
          10                  10                             10                  10
          99                  99                             99                  99
          100                 100                           100                 100
          999                 999                           999                 999
          1,000               1,000                       1,000               1,000
          9,999               9,999                       9,999               9,999
          10,000              10,000                     10,000              10,000
          99,999              99,999                     99,999              99,999
          100,000             100,000                   100,000             100,000
          999,999             999,999                   999,999             999,999
          1,000,000           1,000,000               1,000,000           1,000,000
          9,999,999           9,999,999               9,999,999           9,999,999
          10,000,000          10,000,000             10,000,000          10,000,000
          99,999,999          99,999,999             99,999,999          99,999,999
          100,000,000         100,000,000           100,000,000         100,000,000
          999,999,999         999,999,999           999,999,999         999,999,999
          1,000,000,000       1,000,000,000       1,000,000,000       1,000,000,000
          4,294,967,295       4,294,967,295       4,294,967,295       4,294,967,295


frktons

I forgot the final touch needed to get an algo 3:1 faster than the previous ones:
Quote
----------------------------------------------------------------------------
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz

Instructions: MMX, SSE1, SSE2, SSE3, SSSE3
----------------------------------------------------------------------------
3,243   cycles for NumFormatX - IDIV / Stack
2,432   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,463   cycles for NumFormatF-I - IDIV / Stack / Table
1,278   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,173   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  763   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3,293   cycles for NumFormatX - IDIV / Stack
2,419   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,453   cycles for NumFormatF-I - IDIV / Stack / Table
1,250   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,170   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  776   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
3,259   cycles for NumFormatX - IDIV / Stack
2,411   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,453   cycles for NumFormatF-I - IDIV / Stack / Table
1,238   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,174   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  768   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------

Now it's OK.  :t
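
To put the 3:1 figure in context, the ratios can be worked out from the first run posted above. This is only arithmetic on those numbers; the helper name `speedups` is made up for the example, and which routines count as the "previous ones" is not stated in the post.

```python
# Cycle counts from the first run posted above (Core 2 6600).
cycles = {
    "NumFormatX": 3243,
    "NumFormatX2": 2432,
    "NumFormatF-I": 1463,
    "NumFormatF-IA": 1278,
    "NumFormatF-II": 1173,
    "NumFormatF-III": 763,
}

def speedups(results, fastest="NumFormatF-III"):
    """Return how many times slower each routine is than `fastest`."""
    base = results[fastest]
    return {name: round(count / base, 2) for name, count in results.items()}

if __name__ == "__main__":
    for name, ratio in speedups(cycles).items():
        print(f"{name:<15} {ratio:>5.2f} : 1")
```

On these figures NumFormatF-III comes out at about 4.25:1 against NumFormatX and about 3.19:1 against NumFormatX2.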

jj2007

 :t
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz

Instructions: MMX, SSE1, SSE2, SSE3
--------------------------------------------------------------------
2,896   cycles for NumFormatX - IDIV / Stack
2,279   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,327   cycles for NumFormatF-I - IDIV / Stack / Table
1,231   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,055   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  777   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
--------------------------------------------------------------------
2,895   cycles for NumFormatX - IDIV / Stack
2,279   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,328   cycles for NumFormatF-I - IDIV / Stack / Table
1,241   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,076   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  777   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
--------------------------------------------------------------------
2,889   cycles for NumFormatX - IDIV / Stack
2,286   cycles for NumFormatX2 - Reciprocal IMUL / Stack

1,341   cycles for NumFormatF-I - IDIV / Stack / Table
1,271   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,056   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  777   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table

dedndave

prescott w/htt
Quote
----------------------------------------------------------------------------
Intel(R) Pentium(R) 4 CPU 3.00GHz

Instructions: MMX, SSE1, SSE2, SSE3
----------------------------------------------------------------------------
8,306   cycles for NumFormatX - IDIV / Stack
3,730   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,355   cycles for NumFormatF-I - IDIV / Stack / Table
2,556   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,293   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  968   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
8,285   cycles for NumFormatX - IDIV / Stack
3,744   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,368   cycles for NumFormatF-I - IDIV / Stack / Table
2,569   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,293   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  995   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------
8,381   cycles for NumFormatX - IDIV / Stack
3,719   cycles for NumFormatX2 - Reciprocal IMUL / Stack

3,364   cycles for NumFormatF-I - IDIV / Stack / Table
2,543   cycles for NumFormatF-IA - IDIV / STRUCT / Table
1,286   cycles for NumFormatF-II - Reciprocal IMUL / Stack / Table
  972   cycles for NumFormatF-III - Reciprocal IMUL / STRUCT / Table
----------------------------------------------------------------------------

frktons

Dave,
on your PC my latest algo is 4:1 faster than the previous ones.
Very good, my Master, you taught it well.  :t
You'll find your $50 at the usual link.  :lol:

dedndave

what do you think you are, some kind of jedi knight or something ?
waving your link around in front of me
i'm an assembly programmer
mind tricks don't work on me
only money !