ArrayIndex timings

daydreamer · February 10, 2019, 08:19:25 AM

Quote from: guga on February 10, 2019, 07:26:29 AM
Quote
Real number precision is a can of worms :(

Indeed. I'm having to make a hard choice right now about the best approach for RosAsm's debugger. The numbers are rounded to the nearest (as in OllyDbg and Idapro) but, perhaps, it would be better to keep the numbers precise (for the debugger, mainly). The FPU code in RosAsm was last updated more than a decade ago and I haven't updated it since. The routines for the assembler, disassembler and debugger are similar to each other, so I'm thinking about what the best choice would be. Maybe create an option that lets the user choose whether FPU values are rounded or not (especially for the disassembler, when facing numbers such as 99.99999999999997). We chose to round that kind of number in the disassembler, but it may be good to make this a user option rather than an automatic mode.

Thinking :icon_rolleyes: :icon_rolleyes:
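A minimal sketch of what such a user option could look like. Python is used only for illustration (RosAsm itself is assembly), and the function name and the `round_for_display` flag are hypothetical, not part of RosAsm. It assumes that "rounded" means 15 significant digits and "precise" means 17, which is enough to round-trip a Real8:

```python
def format_fpu_value(value, round_for_display=True):
    """Format a double either rounded (15 digits) or precise (17 digits).

    The flag name and the digit counts are assumptions for this sketch.
    """
    significant_digits = 15 if round_for_display else 17
    # %g drops trailing zeros, so a value that rounds to 100 prints as "100".
    return "%.*g" % (significant_digits, value)


if __name__ == "__main__":
    sample = 99.99999999999997
    print(format_fpu_value(sample, round_for_display=True))
    print(format_fpu_value(sample, round_for_display=False))
```

Only the displayed string changes here; the stored value is left untouched in both modes.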

I have to work on regular If macros later, because I want to make special macros for CIELAB conversions first, plus other macros for calculations.
So the first macro I want to make checks, without a jump, whether a grey pixel is between two values in a table, so you have the freedom to use either a conditional mov or a conditional jump.
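One common way to do such a range check without an inner jump is the unsigned-subtraction trick: `low <= value <= high` becomes a single unsigned compare of `value - low` against `high - low`. The sketch below only illustrates the arithmetic; it is not the macro described above, the names are hypothetical, and it assumes 32-bit values with `low <= high`. In assembly the final compare would feed a `setbe`, `cmovbe` or `jbe`, whichever the caller prefers:

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around


def in_range_unsigned(value, low, high):
    """Return 1 if low <= value <= high, else 0, using one unsigned compare.

    Assumes low <= high and that all three fit in 32 bits.
    """
    # If value < low, the subtraction wraps to a huge unsigned number,
    # so a single "below or equal" test covers both bounds.
    offset = (value - low) & MASK32
    span = (high - low) & MASK32
    return int(offset <= span)


if __name__ == "__main__":
    for grey in (49, 50, 100, 150, 151):
        print(grey, in_range_unsigned(grey, 50, 150))
```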

jj2007 · February 10, 2019, 11:28:16 AM

Hi everybody,

I am testing two slightly different versions of my binary search algos. Can I please have some timings, especially also on AMD, on older, mobile or otherwise exotic CPUs? Thanks :icon14:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
461 ms for 1000000 matches - integer version A
497 ms for 1000000 matches - integer version B

564 ms for 1000000 matches - double version A
581 ms for 1000000 matches - double version B

460 ms for 1000000 matches - integer version A
457 ms for 1000000 matches - integer version B

578 ms for 1000000 matches - double version A
569 ms for 1000000 matches - double version B

452 ms for 1000000 matches - integer version A
455 ms for 1000000 matches - integer version B

586 ms for 1000000 matches - double version A
611 ms for 1000000 matches - double version B

474 ms for 1000000 matches - integer version A
448 ms for 1000000 matches - integer version B

588 ms for 1000000 matches - double version A
575 ms for 1000000 matches - double version B

483 ms for 1000000 matches - integer version A
465 ms for 1000000 matches - integer version B

610 ms for 1000000 matches - double version A
587 ms for 1000000 matches - double version B

P.S.: During the first run it builds two databases. The sorting may take about 5-10 seconds.
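For readers who want to reproduce the shape of this test without the original executable, here is a rough sketch of a binary-search timing loop. It is not the ArrayIndex code (which was not posted in this thread); it uses Python's `bisect` module, much smaller hypothetical array sizes, and its timings are not comparable with the figures above:

```python
import bisect
import random
import time


def find_index(sorted_values, key):
    """Binary search: return the index of key in sorted_values, or -1."""
    i = bisect.bisect_left(sorted_values, key)
    if i < len(sorted_values) and sorted_values[i] == key:
        return i
    return -1


def time_matches(sorted_values, keys):
    """Look up every key; return (number of hits, elapsed milliseconds)."""
    start = time.perf_counter()
    hits = sum(1 for key in keys if find_index(sorted_values, key) >= 0)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return hits, elapsed_ms


if __name__ == "__main__":
    random.seed(1)
    # Sizes are assumptions, kept small so the sketch runs quickly.
    integers = sorted(random.randrange(10**9) for _ in range(100_000))
    doubles = sorted(random.random() for _ in range(100_000))
    for label, table in (("integer", integers), ("double", doubles)):
        keys = random.sample(table, 10_000)
        hits, ms = time_matches(table, keys)
        print("%.0f ms for %d matches - %s version" % (ms, hits, label))
```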

guga · February 10, 2019, 04:33:34 PM

Hi Jochen :t

Building two 10 Mio elements databases...

Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz
312 ms for 1000000 matches - integer version A
337 ms for 1000000 matches - integer version B

420 ms for 1000000 matches - double version A
431 ms for 1000000 matches - double version B

332 ms for 1000000 matches - integer version A
309 ms for 1000000 matches - integer version B

427 ms for 1000000 matches - double version A
423 ms for 1000000 matches - double version B

307 ms for 1000000 matches - integer version A
318 ms for 1000000 matches - integer version B

428 ms for 1000000 matches - double version A
418 ms for 1000000 matches - double version B

314 ms for 1000000 matches - integer version A
314 ms for 1000000 matches - integer version B

422 ms for 1000000 matches - double version A
411 ms for 1000000 matches - double version B

314 ms for 1000000 matches - integer version A
328 ms for 1000000 matches - integer version B

416 ms for 1000000 matches - double version A
423 ms for 1000000 matches - double version B

--- hit any key ---

About the problem in the RosAsm debugger: the error was caused by an FPU_EXCEPTION_PRECISION on the control word of the FPU. It does return the correct (?) number 99.9999999999999714 but, because of that exception, the rounding was applied only to the buffer holding the string converted from the FPU value to ASCII, and not to the FPU value itself.

I'm in doubt about what to do with this kind of exception. I can treat the number as 100 anyway and force it to become 100, right? It does seem to be within the margin of error, though.
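One way to put a number on "margin of error" is to count how many representable doubles lie between the value and 100. A small sketch, assuming the value is held as a Real8 (the helper names are hypothetical, and the bit trick is only valid for finite values of the same sign):

```python
import struct


def double_bits(x):
    """Reinterpret a double's 8 bytes as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def ulp_distance(a, b):
    """Number of representable doubles between a and b.

    Assumes a and b are finite and have the same sign.
    """
    return abs(double_bits(a) - double_bits(b))


if __name__ == "__main__":
    value = 99.99999999999997
    print(value == 100.0)              # the stored value is not 100
    print(ulp_distance(value, 100.0))  # how many steps away it is
```

A debugger could then snap to the short value only when the distance is below a user-chosen threshold, instead of always rounding the string.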

Siekmanski · February 10, 2019, 05:03:01 PM

Building two 10 Mio elements databases...

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
348 ms for 1000000 matches - integer version A
302 ms for 1000000 matches - integer version B

393 ms for 1000000 matches - double version A
410 ms for 1000000 matches - double version B

307 ms for 1000000 matches - integer version A
285 ms for 1000000 matches - integer version B

410 ms for 1000000 matches - double version A
396 ms for 1000000 matches - double version B

285 ms for 1000000 matches - integer version A
291 ms for 1000000 matches - integer version B

417 ms for 1000000 matches - double version A
426 ms for 1000000 matches - double version B

298 ms for 1000000 matches - integer version A
316 ms for 1000000 matches - integer version B

445 ms for 1000000 matches - double version A
405 ms for 1000000 matches - double version B

317 ms for 1000000 matches - integer version A
297 ms for 1000000 matches - integer version B

418 ms for 1000000 matches - double version A
429 ms for 1000000 matches - double version B

daydreamer · February 10, 2019, 05:25:40 PM

Guga, maybe think like an engineer: put the measurement 100+-0.1 on the blueprint,
so the workers know how much precision is needed to let metal bits through, or to throw them away because they have become too big (>100.01) or too small (<99.99).
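The blueprint idea can be sketched as a plain tolerance check. The function name is hypothetical, and the default tolerance of 0.01 is an assumption taken from the 100.01 / 99.99 limits in the post (the post also mentions +-0.1, so treat it as a parameter):

```python
def classify_part(measured, nominal=100.0, tolerance=0.01):
    """Classify a measurement against nominal +/- tolerance."""
    if measured > nominal + tolerance:
        return "too big"
    if measured < nominal - tolerance:
        return "too small"
    return "ok"


if __name__ == "__main__":
    for measured in (99.98, 99.99999999999997, 100.02):
        print(measured, classify_part(measured))
```

With any tolerance of that size, 99.99999999999997 is accepted as 100 without the stored value ever being changed.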

TimoVJL · February 10, 2019, 08:18:33 PM

Building two 10 Mio elements databases...

AMD Athlon(tm) II X2 220 Processor
889 ms for 1000000 matches - integer version A
799 ms for 1000000 matches - integer version B

934 ms for 1000000 matches - double version A
935 ms for 1000000 matches - double version B

795 ms for 1000000 matches - integer version A
795 ms for 1000000 matches - integer version B

934 ms for 1000000 matches - double version A
932 ms for 1000000 matches - double version B

807 ms for 1000000 matches - integer version A
816 ms for 1000000 matches - integer version B

936 ms for 1000000 matches - double version A
929 ms for 1000000 matches - double version B

810 ms for 1000000 matches - integer version A
811 ms for 1000000 matches - integer version B

933 ms for 1000000 matches - double version A
927 ms for 1000000 matches - double version B

805 ms for 1000000 matches - integer version A
795 ms for 1000000 matches - integer version B

931 ms for 1000000 matches - double version A
943 ms for 1000000 matches - double version B

--- hit any key ---

Siekmanski · February 10, 2019, 08:22:57 PM

Quote from: daydreamer on February 10, 2019, 05:25:40 PM
Guga, maybe think like an engineer: put the measurement 100+-0.1 on the blueprint,
so the workers know how much precision is needed to let metal bits through, or to throw them away because they have become too big (>100.01) or too small (<99.99).

Maybe engineers can get some gain and precision with precalculated coefficients..... 8)

guga · February 11, 2019, 01:20:23 AM

Siekmanski

Hi DayDreamer

The problem is that the precision is OK: it converts the FPU value to 17 digits (counting the dot). The problem is that the functions that convert FPU values to strings are old, and I'm having to review them since they cause a bug in both the debugger and the disassembler. They are an adaptation of Raymond's function made by René (the original author) but, as I said, some numbers were supposed to be rounded only when the number reaches the 16th-17th final digit, ending with anything from "0" to "9". For some odd reason it is considering the last 3 digits and, to make things worse, when the generated exponent is negative it tries to compute the maximum exponent (similar to the scientific routine in Raymond's work), but it gives an error because the generated exponent turns negative, and I haven't been able to find out how to fix that yet.
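As a reference point when debugging the negative-exponent case, it can help to compare against a conversion that is known to handle it. The sketch below is not Raymond's routine or the RosAsm code; it simply leans on Python's own formatter and splits the result into digits and a signed decimal exponent (the function name is hypothetical):

```python
def to_scientific(value, significant_digits=17):
    """Return (mantissa_text, decimal_exponent) for a double.

    Relies on Python's correctly rounded "%e" formatting; the split
    works the same way for positive and negative exponents.
    """
    text = "%.*e" % (significant_digits - 1, value)
    mantissa, exponent = text.split("e")
    return mantissa, int(exponent)


if __name__ == "__main__":
    print(to_scientific(0.00123, 3))
    print(to_scientific(12345.678, 4))
```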

The main problem with rounding a Real8 or TenByte (Real10) is that 99.999999997145 is still not 100! But I could, however, treat 99.999999999995, 99.999999999991 or 99.999999999999 as 100 (for the disassembler and debugger routines), or should I keep those numbers precise, disregarding the margin of error in the FPU itself? This is the kind of thing I'm trying to decide, because Olly and IDA take different approaches here. While OllyDbg seems to round anyway (as we did in RosAsm), Idapro treats the number as more precise, including the margin of error (and this caused another error, with numbers being converted as 2.546465 e314, for example, when the correct value was something else).

guga · February 11, 2019, 04:36:54 AM

Ok, guys, fixed one part of the RosAsm FPU code for the disassembler mode. It correctly outputs the numbers with a bit more precision than Idapro.

Quote
RosAsm
[Data04090B8: F$ 1.84467440737095516e+19]
[Data04090BC: R$ -5.42101086242752217e-20]

Idapro

flt_4090B8 dd 1.8446744e19
dbl_4090BC dq -5.421010862427522e-20

There are still a few more things to fix before I decide on the proper method to handle this rounding. :icon_rolleyes: :icon_rolleyes:
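A quick cross-check of the two listings above, under the assumption that `F$` / `dd` is a 4-byte single and `R$` / `dq` an 8-byte double. Parsing each printed string back into its type suggests that both tools describe the same stored values, which happen to be 2^64 and -2^-64; the extra digits change the text, not the value recovered from it:

```python
import struct


def as_float32(text):
    """Parse text and round it to the nearest 4-byte single."""
    return struct.unpack("<f", struct.pack("<f", float(text)))[0]


# Strings copied from the RosAsm and Idapro listings above.
rosasm_single = as_float32("1.84467440737095516e+19")
ida_single = as_float32("1.8446744e19")
rosasm_double = float("-5.42101086242752217e-20")
ida_double = float("-5.421010862427522e-20")

if __name__ == "__main__":
    print(rosasm_single == ida_single == 2.0 ** 64)
    print(rosasm_double == ida_double == -(2.0 ** -64))
```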

FORTRANS · February 12, 2019, 01:19:39 AM

Hi Jochen,

Quote from: jj2007 on February 10, 2019, 11:28:16 AM
Hi everybody,

I am testing two slightly different versions of my binary search algos. Can I please have some timings, especially also on AMD, on older, mobile or otherwise exotic CPUs? Thanks :icon14:

It, of course, did not run on the P-III. So here is the next-oldest machine that still works.

F:\TEMP\TEST>ARRAYIND
Building two 10 Mio elements databases...

Intel(R) Pentium(R) M processor 1.70GHz
1111 ms for 1000000 matches - integer version A
990 ms for 1000000 matches - integer version B

1134 ms for 1000000 matches - double version A
1148 ms for 1000000 matches - double version B

989 ms for 1000000 matches - integer version A
990 ms for 1000000 matches - integer version B

1131 ms for 1000000 matches - double version A
1137 ms for 1000000 matches - double version B

993 ms for 1000000 matches - integer version A
986 ms for 1000000 matches - integer version B

1203 ms for 1000000 matches - double version A
1152 ms for 1000000 matches - double version B

1009 ms for 1000000 matches - integer version A
1009 ms for 1000000 matches - integer version B

1159 ms for 1000000 matches - double version A
1157 ms for 1000000 matches - double version B

1007 ms for 1000000 matches - integer version A
1008 ms for 1000000 matches - integer version B

1133 ms for 1000000 matches - double version A
1162 ms for 1000000 matches - double version B

--- hit any key ---

HTH,

Steve N.

The MASM Forum
