Author Topic: 64 bit OS and Real 4 (float) - not very reliable!  (Read 28601 times)

Gunther

  • Member
  • *****
  • Posts: 3585
  • Forgive your enemies, but never forget their names
Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #15 on: November 08, 2012, 02:51:18 AM »
Hi qWord,

your later answers sound better and better, and I hope we're on the same page.

Even so, this sounds a bit like you won't think about the error of a calculation and instead hope that the high precision solves that problem. I would estimate the error and then choose the correct data type or, if needed, use a high/arbitrary-precision library instead.

Excuse me please, but with the best will in the world, I can't see an error. In my testbed we have the following situation:
  • No number has a fractional part. So, rounding errors which have to do with 0.1 and multiples of that can't occur.
  • We don't subtract nearly equal values, which would lead to dangerous digit cancellation.
  • The valid range for REAL 4 is from 1.175 x 10^(-38) up to 3.403 x 10^38. We're far away from those limits.

Therefore, the FPU does a good job in that range and delivers the right results. What you are calling "error" has only to do with the precision shrink. This shrink isn't a law of nature, but the work of man done by compiler builders.

Gunther
Get your facts first, and then you can distort them.

qWord

  • Member
  • *****
  • Posts: 1476
  • The base type of a type is the type itself
    • SmplMath macros
Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #16 on: November 08, 2012, 09:36:45 AM »
  • No number has a fractional part. So, rounding errors which have to do with 0.1 and multiples of that can't occur.
all numbers are normalized to 1.xxx...*2^y

  • The valid range for REAL 4 is from 1.175 x 10^(-38) up to 3.403 x 10^38. We're far away from those limits.
yes, with 24 precision bits. Even with 64 precision bits, you will only get around 18 decimal digits of precision.

This shrink isn't a law of nature, but the work of man done by compiler builders.
no, it is a technical limit.

Therefore, the FPU does a good job in that range and delivers the right results.
no doubts that it does a good job, but you will also get it using double precision variables with SSEx.

If I use float or double variables in my programs, I assume that the compiler will create code that does the calculations at least with that precision. However I would never assume that higher precision is used. As your example shows, the compiler creators and most other SW developers share this opinion.
Your assumption is based on your experience that it was done that way in the past. I'm sure it was never the intention of any compiler builder to always use REAL10/REAL8; they simply know that it is damn slow to switch the FPU flags.

regards, qWord
MREAL macros - when you need floating point arithmetic while assembling!

qWord

Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #17 on: November 08, 2012, 10:02:47 AM »
addendum: I've taken a look at the latest C standard and found this:
Quote from: ISO/IEC 9899:201x Committee Draft — April 12, 2011
Except for assignment and cast (which remove all extra range and precision), the values
yielded by operators with floating operands and values subject to the usual arithmetic
conversions and of floating constants are evaluated to a format whose range and precision
may be greater than required by the type. The use of evaluation formats is characterized
by the implementation-defined value of FLT_EVAL_METHOD:
  • -1 indeterminable;
  • 0 evaluate all operations and constants just to the range and precision of the
    type;
  • 1 evaluate operations and constants of type float and double to the
    range and precision of the double type, evaluate long double
    operations and constants to the range and precision of the long double
    type;
  • 2 evaluate all operations and constants to the range and precision of the
    long double type.

All other negative values for FLT_EVAL_METHOD characterize implementation-defined
behavior.
What does your compiler return for FLT_EVAL_METHOD?
I currently can't find similar statements for C++...

Gunther

Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #18 on: November 10, 2012, 09:35:29 PM »
Hi qWord,

all numbers are normalized to 1.xxx...*2^y

That's clear, but I meant the decimal representation of the numbers.

yes, with 24 precision bits. Even with 64 precision bits, you will only get around 18 decimal digits of precision.
My point was REAL 4 and, by the way, I strongly doubt that you will get 18 valid decimal digits with REAL 8; 16 digits is probably realistic.

This shrink isn't a law of nature, but the work of man done by compiler builders.
no, it is a technical limit.
The decision to use the xmm registers and not the FPU is the work of man and has nothing to do with technical limits.

no doubts that it does a good job, but you will also get it using double precision variables with SSEx.
That remains to be checked.

If I use float or double variables in my programs, I assume that the compiler will create code that does the calculations at least with that precision. However I would never assume that higher precision is used. As your example shows, the compiler creators and most other SW developers share this opinion.
Or, and that's another possibility, the compiler builders don't know enough about the dangers and risks.

What does your compiler return for FLT_EVAL_METHOD?
I currently can't find similar statements for C++...

I have to check that.

Gunther

MichaelW

  • Global Moderator
  • Member
  • *****
  • Posts: 1209
Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #19 on: November 11, 2012, 05:59:55 AM »
Everything I have is 32-bit. For my most recent installation of MinGW (with gcc version 4.5.2), __FLT_EVAL_METHOD__ is set to 2. This is the relevant section of math.h:
Code: [Select]
/* Use the compiler's builtin define for FLT_EVAL_METHOD to
   set float_t and double_t.  */
#if defined(__FLT_EVAL_METHOD__) 
# if ( __FLT_EVAL_METHOD__== 0)
typedef float float_t;
typedef double double_t;
# elif (__FLT_EVAL_METHOD__ == 1)
typedef double float_t;
typedef double double_t;
# elif (__FLT_EVAL_METHOD__ == 2)
typedef long double float_t;
typedef long double double_t;
#endif
#else /* ix87 FPU default */
typedef long double float_t;
typedef long double double_t;
#endif

I cannot find any reference to __FLT_EVAL_METHOD__ in the header files that shipped with my installation of the 2003 PSDK, the Windows Server 2003 SP1 DDK, or the Microsoft Visual C++ Toolkit 2003.
Well Microsoft, here’s another nice mess you’ve gotten us into.

Gunther

Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #20 on: November 11, 2012, 09:48:18 AM »
Hi Mike,

thank you for your investigation. My installed gcc version 4.7.2 contains the same section in math.h.

Gunther

MichaelW

Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #21 on: November 11, 2012, 02:36:43 PM »
It seems to me that there should be a way to control the value, a command line option or some such, but in my quick search of the GCC docs I did not find one. And then there is the question of how this relates to the effective deprecation of long double under Windows.

Gunther

Re: 64 bit OS and Real 4 (float) - not very reliable!
« Reply #22 on: November 11, 2012, 09:34:12 PM »
Hi Mike,

It seems to me that there should be a way to control the value, a command line option or some such, but in my quick search of the GCC docs I did not find one. And then there is the question of how this relates to the effective deprecation of long double under Windows.

gcc supports long double under Windows but can't print such values, because it uses the Windows libc. It's a bit strange.

Gunther