Sorry if this is the wrong place for this.
My fpu coding is horrid at best.
How would you code the "C" example from this post of mine (in JWasm or MASM)?
http://cboard.cprogramming.com/c-programming/155373-verify-mingw-issue.html
Do you feel it is unrealistic to expect 1267?
It is beginning to appear only gcc and BCc 5.5 return 1266.
James
The result depends only on the FPU rounding mode. If it's "down", then 1266.9999999999997730 becomes indeed 1266...
include \masm32\include\masm32rt.inc
.code
start:
fld FP8(12.67)
push 100
fild dword ptr [esp]
fmul
fistp dword ptr [esp]
pop eax
inkey str$(eax)
exit
end start
Yes, using the FPU directly, and with everything at the defaults I get 1267, but by changing the rounding mode I can get 1266.
;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
FRC_NEAREST equ 0 ; or to even if equidistant (initialized state)
FRC_DOWN equ 400h ; toward -infinity
FRC_UP equ 800h ; toward +infinity
FRC_TRUNCATE equ 0c00h ; toward zero
SETRC MACRO rc
IFNDEF __fpu__cw__
.data?
__fpu__cw__ dw ?
.code
ENDIF
fstcw __fpu__cw__
and __fpu__cw__, NOT 0c00h
or __fpu__cw__, rc
fldcw __fpu__cw__
ENDM
;==============================================================================
.data
.code
;==============================================================================
foo proc d:REAL8
printf("%.15G\n", d)
fld8 100.0
fld d
fmul
push eax
fistp DWORD PTR [esp]
pop eax
ret
foo endp
;==============================================================================
start:
;==============================================================================
invoke foo, FP8(12.67)
printf("%d\n",eax)
SETRC FRC_DOWN
invoke foo, FP8(12.67)
printf("%d\n",eax)
SETRC FRC_UP
invoke foo, FP8(12.67)
printf("%d\n",eax)
SETRC FRC_TRUNCATE
invoke foo, FP8(12.67)
printf("%d\n\n",eax)
inkey
exit
;==============================================================================
end start
12.67
1267
12.67
1266
12.67
1267
12.67
1266
In the C code you should be able to display the current FPU rounding mode, but then there is the possibility that one is using SSE2 instead of the FPU, and I can't conveniently test the effects of that ATM.
Quote from: jcfuller on March 26, 2013, 11:10:51 PM
this "c" example
According to the C (ISO) standard, floating point values are truncated when they are converted to integer values (return d*100). Therefore 1266 is a conforming result.
EDIT: the same applies for c++
I checked the rounding mode with fegetround(): gcc (MinGW), gcc (MinGW TDM) and PellesC all return zero, but as stated
only gcc (MinGW) displays 1266, so it appears something else is causing the difference?
James
I can reproduce that by setting the precision to REAL10 and the rounding mode to nearest, doing the calculation, and then changing the rounding mode to truncate (round toward zero). When storing the integer result, I also get 1266. If the precision is REAL8, the result is 1267.
You might try to check the constant FLT_EVAL_METHOD (http://masm32.com/board/index.php?topic=881.msg7820#msg7820).
the only place i see "finit" on this page is where Michael mentions infinity :biggrin:
Thanks everyone.
As I never do any thing in moderation and can become obsessed at the drop of a ... I persisted to see why the difference?
This is my latest test piece. When compiled with the newest 32bit gcc 4.8.0 only the first one displays 1266
James
#include <stdio.h>
#include <math.h>
int foo (double);
int main (int,char**);
int foo (double d)
{
printf("%s% .15G\n","d = ",(double)d);
return (d*100); // 1266
//return ceil(d*100); // 1267
//return floor(d*100); // 1267
//return trunc(d*100); // 1267
//return round(d*100); // 1267
}
int main (int argc,char** argv)
{
int rv={0};
double d =12.67;
rv= foo( d);
printf("%s% d\n","rv = ",(int)rv);
return 0;
}
Quote from: dedndave on March 27, 2013, 01:53:12 AM
the only place i see "finit" on this page is where Michael mentions infinity :biggrin:
On program entry the FPU setting is NEAR, 53 bits. Here is an extended example.
include \masm32\include\masm32rt.inc
.code
start:
fld FP8(12.67)
push 100
fild dword ptr [esp]
fmul
fistp dword ptr [esp]
pop eax
print str$(eax), 13, 10 ; default setting is NEAR, 53 ->1267
push 847Fh ; DOWN, 24 ->1266
fldcw word ptr [esp]
pop eax
fld FP8(12.67)
push 100
fild dword ptr [esp]
fmul
fistp dword ptr [esp]
pop eax
print str$(eax), 13, 10
push 807Fh ; NEAR, 24 ->1267
fldcw word ptr [esp]
pop eax
fld FP8(12.67)
push 100
fild dword ptr [esp]
fmul
fistp dword ptr [esp]
pop eax
inkey str$(eax)
exit
end start
As has already been said, there are two problems: one is the different evaluation precision of the compilers and the other is the truncation on double->int conversion.
Whether you get 1266 or 1267 depends on the result of d*100: if the result is equal to or slightly larger than 1267, it will be truncated to 1267; otherwise you get 1266.
Many C/C++ compilers use 53-bit precision for evaluation, where the result of the calculation is equal to (or slightly larger than) 1267 due to rounding of the result. GCC (32-bit) uses 64-bit precision, and the result is below 1267. Therefore GCC truncates the result to 1266, whereas the other compilers show the intuitive result. The other functions (ceil, floor, ...) take a double argument, so GCC first converts from 64-bit to 53-bit precision -> the value becomes 1267 before the actual function is called.
Jochen, yes - that's how the windows os sets it up for us
but, the C compiler init code may set it up differently
I got 1267 in all cases, so in my opinion the difference comes from the 12.67 constant being encoded differently, bit for bit, by the different runtimes. What this topic really needs is a hex dump of the values in question; without it we'll never get a clear answer.
As I said before MingwTDM,tcc,PellesC,FreeBASIC,PowerBASIC,VC all display 1267.
I am trying to figure out why gcc would or want to be different?
I guess I need to find out how to communicate with the builders.
James
Edit: gcc on 32bit linux displays 1266 also.
Hi James,
Quote from: jcfuller on March 27, 2013, 04:37:30 AM
As I said before MingwTDM,tcc,PellesC,FreeBASIC,PowerBASIC,VC all display 1267.
I am trying to figure out why gcc would or want to be different?
I guess I need to find out how to communicate with the builders.
James
Edit: gcc on 32bit linux displays 1266 also.
yes, and that's all right, because it has to do with the C standard. Qword did explain it very well. The answer to your question is in reply #3 and #9 of this thread.
Gunther
Gunther,
Yes I did read that a little fast and it does answer the what (thanks Qword) but it still does not answer the why when it appears almost all (clang on 32bit linux returns 1267 too) other compilers use 53 bit??
James
James,
I think there's another reason. Your operating system starts the good old FPU in round-to-nearest mode. That makes sense. The C standard says that the cast from float or double to an integer has to be done with truncation. So the following happens: before the cast, the compiler saves the FPU control word, changes the rounding mode to truncate, performs the conversion, and afterwards restores the old control word. That's the way.
gcc applies this method very strictly; other compilers, I'm not so sure. I hope that helps.
Gunther
I started out looking for a gcc command line option that would control the truncation. I did not find one, but I did determine that with gcc 4.7.2 I can push the result to 1267 with a cast.
#include <stdio.h>
#include <conio.h>
int foo (double);
int main (int,char**);
int foo (double d)
{
printf("d = %.15G\n", (double)d);
return (double)(d*100); /* was return d*100; */
}
int main (int argc,char** argv)
{
int rv={0};
rv= foo(12.67);
printf("rv = %d\n",(int)rv);
getch();
return 0;
}
Assembly output without cast:
.text
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
LFB11:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
subl $56, %esp
fldl 8(%ebp)
fstl 4(%esp)
movl $LC0, (%esp)
fstpl -40(%ebp)
call _printf
flds LC1
fldl -40(%ebp)
fmulp %st, %st(1)
fnstcw -10(%ebp)
movw -10(%ebp), %ax
orb $12, %ah
movw %ax, -12(%ebp)
fldcw -12(%ebp)
fistpl -16(%ebp)
fldcw -10(%ebp)
movl -16(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE11:
Assembly output with cast:
.text
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
LFB11:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
subl $56, %esp
fldl 8(%ebp)
fstl 4(%esp)
movl $LC0, (%esp)
fstpl -40(%ebp)
call _printf
flds LC1
fldl -40(%ebp)
fmulp %st, %st(1)
fstpl -16(%ebp)
fldl -16(%ebp)
fnstcw -18(%ebp)
movw -18(%ebp), %ax
orb $12, %ah
movw %ax, -20(%ebp)
fldcw -20(%ebp)
fistpl -24(%ebp)
fldcw -18(%ebp)
movl -24(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE11:
The difference is that the cast causes the result of the multiply to be stored to memory as a double and then reloaded, reducing the precision from 64 bits to 53 bits. I'm having trouble understanding how this could cause a rounding by truncation to round up, and beginning to suspect that a cast is not a reliable way to get around the problem.
In detail, if we do the operation by hand, using 12.67 stored as a double constant, we get the following result with a precision of 64 bits:
9E5FFFFF FFFFFFC0 (<= fraction bits, high DWORD followed by low DWORD)
Here we can see that rounding to 53 bits causes the result to become the value 9E6000... (==> 1267 after normalization of the result).
However, for 64 bit precision this result is not an integer and is a bit below 1267, thus the language feature truncated the result to 1266.
Hi qWord,
good research. Thank you. :t I hadn't enough time yesterday to do that.
Gunther
Quote from: qWord on March 27, 2013, 12:35:51 PM
However, for 64 bit precision this result is not an integer and is a bit below 1267, thus the language feature truncated the result to 1266.
That applies to all three precision levels, except if you use REAL10 to fld the 12.67.
I guess the main lesson here is "don't trust your intuition, check the (C) language specification" ;-)
Thank you all for the information.
My main goal (if possible) is to get the same response from MinGW gcc/g++ as I do with all other compilers I test for my bc9 project.
I found this article:
http://www.linuxtopia.org/online_books/an_introduction_to_gcc/gccintro_70.html
So because I am out of my element here and only have a vague idea of what I'm doing I changed my code to this:
#include <stdio.h>
void
set_fpu (unsigned int mode)
{
asm ("fldcw %0" : : "m" (*&mode));
}
int foo (double d)
{
printf("%s% .15G\n","d = ",(double)d);
return (d*100);
}
int main (int argc,char** argv)
{
int rv={0};
set_fpu(0x27F);
rv= foo( 12.67);
printf("%s% d\n","rv = ",(int)rv);
return 0;
}
and now I get 1267
compiled with MinGw 4.8.0 32bit: gcc -Wall rv.c -orv.exe
James
you can use FSTCW to store the current control word to a 16-bit memory location
alter the control bits, as desired
then, use FLDCW to load the control word back into the FPU
from Ray's FPU tutorial...
Quote:
The RC field (bits 11 and 10) or Rounding Control determines how the FPU will round results in one of four ways:
00 = Round to nearest, or to even if equidistant (this is the initialized state)
01 = Round down (toward -infinity)
10 = Round up (toward +infinity)
11 = Truncate (toward 0)
now, it says round to nearest is the init state - but that is after an FINIT
the compiler start-up code may alter this
http://www.ray.masmcode.com/
rather than testing 1266/1267, just look at those 2 bits, as they are in each platform
Thanks Dave but all I really want is gcc to behave like all the other c/c++ compilers (if possible) for the duration of the app not for any specific call.
If this code does it and does not muck up something really important I may just use it.
James
A common solution for your problem is this macro, which rounds FP values to the nearest integer:
#define nearest(x) ((x)>=0?(int)((x)+0.5):(int)((x)-0.5))
maybe have a look at gcc options ?
http://gcc.gnu.org/wiki/FloatingPointMath
I did find the gcc options page but didn't try all of them.
One that does work (sorta) is -fsingle-precision-constant but it does extend(?) d
d = 12.6700000762939
rv = 1267
I can't find out how to detect if the compiler is MinGW or MinGWTDM so I think the best direction is QWords last one with the nearest macro.
Thank you all for your time and suggestions.
James
James,
I think that Agner Fog has a solution for your problem, too.
Gunther
Quote from: Gunther on March 28, 2013, 09:10:05 AM
James,
I think that Agner Fog has a solution for your problem, too.
Gunther
Agner's place is a vast site. Do you have any particular item in mind?
James
I had another go at manipulating the command line options, and when I changed:
-std=c99
To:
-std=gnu89
Then the compiled code, without the (double) cast that I used previously, returned:
rv = 1267
http://tigcc.ticalc.org/doc/comopts.html#SEC6
I didn't test any further, so there may be other options that will produce the same result.
MichaelW,
I do not see that; a couple of different MinGW builds (4.6.2, 4.7.2, 4.8.0) all return 1266.
James
Just to muddy the waters a bit more - if compiled as a 64bit app (-m64) with no other options, 4.8.0 displays 1267
James
James,
I think it's inside the asmlib. (http://agner.org/optimize/#asmlib) It's 32 bit code, round to nearest and truncate. Please check it out.
Quote from: jcfuller on March 29, 2013, 01:00:59 AM
Just to muddy the waters a bit more - if compiled as a 64bit app (-m64) with no other options, 4.8.0 displays 1267
That's not very surprising for me: different developer teams, different implementations. You should write a simple test bed and contact the developers. It could be important.
Gunther
Quote from: Gunther on March 29, 2013, 03:36:14 AM
I think it's inside the asmlib. (http://agner.org/optimize/#asmlib) It's 32 bit code, round to nearest and truncate. Please check it out.
Why should he use a platform-specific solution when the language itself has a simple one?
I did find Round in asmlib but as QWord points out; From the asmlib-instructions.pdf:
Compilers with C99 or C++0x support have the identical functions lrint and lrintf
So by using return lrint(d*100) all should display the same.
James
conditional assembly ? :P
IFDEF lrint
return lrint(d*100)
ELSE
return d*100
ENDIF
just a thought
The issue for me was MinGW not working the same as most other compilers.
I don't like the original MinGW distro anyway as it does not statically link needed libraries as does TDM-GCC and the nuwen distro.
QWords nearest macro works on ALL 32/64 bit compilers I've tried so I am satisfied.
I don't know if this is an actual issue that should be reported? The c forum post from above seems to indicate it's perfectly valid.
James
James,
the point is that you must realize that the precision of calculation is implementation-specific by definition of the standard; therefore all compilers do it the right way.
The problem was, or is, that you were not aware of this typecast pitfall, which is, BTW, also present in many other HLLs (C++, C#, Java, ...).
James,
The attachment contains my test source, the batch file I compiled with, and the -std=gnu99 EXE. The compiler is reported as:
GNU C (GCC) version 4.7.2 (mingw32)
MichaelW,
drop the -Os . I cannot use any optimizations. I think it was gcc 4.5 where the -O started clobbering some of my bc9 library code.
James
Yep, the 1267 result was apparently an unintended side effect of the optimization.
Quote from: qWord on March 29, 2013, 05:23:51 AM
James,
the point is that you must realize that the precision of calculation is implementation-specific by definition of the standard; therefore all compilers do it the right way.
The problem was, or is, that you were not aware of this typecast pitfall, which is, BTW, also present in many other HLLs (C++, C#, Java, ...).
See also the prior thread http://masm32.com/board/index.php?topic=1155.msg11385#msg11385
Dave.
Thanks for the thread Dave.
I have decided on message #20 approach.
Would some kind person translate this to intel syntax please. I need it for Borland 5.5
void set_fpu (unsigned int mode)
{
asm ("fldcw %0" : : "m" (*&mode));
}
Thanks again for all your insights.
James
that code looks a little iffy :P
it does not appear to mask off the rounding control bits, which is what you really want
you do not want to alter the other bits (i guess they build the entire control word, external to the function)
i already mentioned the bits used
http://masm32.com/board/index.php?topic=1700.msg17328#msg17328
and, you can use the stack as a temporary holding place
push eax ;create a temporary variable on the stack
fstcw word ptr [esp]
fwait
pop eax ;AX now holds the current control word
;manipulate bits, as desired - bits 10 and 11 hold the rounding control
push eax ;put new control word on the stack
fldcw word ptr [esp]
fwait
pop eax ;clean up the stack
Dave,
Did you read the gcc information in the link from that message?
I am not one to question anyone on what is iffy but it sounded like exactly the same thing as QWord pointed out with gcc using 64bit precision as opposed to everyone (maybe not Borland ) using 53bit?
James
ok - the code is basically the same - you just need to alter bits 8 and 9 :biggrin:
again, from Ray's tutorial...
Quote:
The PC field (bits 9 and 8) or Precision Control determines to what precision the FPU rounds results after each arithmetic instruction in one of three ways:
00 = 24 bits (REAL4)
01 = Not used
10 = 53 bits (REAL8)
11 = 64 bits (REAL10) (this is the initialized state)
the "initialized state" means "after FINIT"
the OS hands it over to you in 53-bit precision, i think
as you know, the compiler start-up code may alter this
it might be worth mentioning that.....
the compiler code may depend on the precision and rounding control bits being set a specific way (especially precision)
how you leave these set may affect other calculations - possibly even generate exceptions
you may have to set it to the mode you like, perform your work, then set it back the way it was :P
one of the reasons i prefer ASM - lol
EDIT: it may be better to alter this behaviour with the proper command-line switch, if available
that way, the compiler code should work with you instead of against you
/Fa with vc++ express give the correct result at the first time
Quote from: ToutEnMasm on March 31, 2013, 03:08:27 AM
/Fa with vc++ express give the correct result at the first time
Yes I know only the MinGW gcc and Borland 5.5 compilers give a different result.
James
It looks like I'm going to have to do the procedure in Jwasm and link the obj if I want it bad enough because the free Borland 5.5 lacks the tasm assembler so no inline asm
James
You can also use the CRT to control the FPU:
http://msdn.microsoft.com/en-US/library/e9b52ceh(v=vs.80).aspx
I made an attempt to use __control87_2, but MSVCRT.DLL on my XP SP3 system does not export the function.
Also, from the MinGW float.h:
/*
MSVCRT.dll _fpreset initializes the control register to 0x27f,
the status register to zero and the tag word to 0FFFFh.
This differs from asm instruction finit/fninit which set control
word to 0x37f (64 bit mantissa precison rather than 53 bit).
By default, the mingw version of _fpreset sets fp control as
per fninit. To use the MSVCRT.dll _fpreset, include CRT_fp8.o when
building your application.
*/
And the lib directory includes CRT_fp8.o and CRT_fp10.o
MichaelW,
Thanks for the information but _fpreset() does not work or I misunderstand it.
My OS is Win7 64.
After re-reading the page from the link in msg #20 I revised my code a bit.
I'm really not sure if these are related or not?
James
using MinGw 4.8.0
output with: gcc fputest.c -ofputest.exe
unexpected result
d = 12.67
rv = 1266
output with: gcc fputest.c -ofputest.exe -DFPRESET
unexpected result
d = 12.67
rv = 1266
output with: gcc fputest.c -ofputest.exe -DDOUBLE
comparison succeeds
d = 12.67
rv = 1267
using TDM-GCC
output with: gcc fputest.c -ofputest.exe
comparison succeeds
d = 12.67
rv = 1267
New test code:
#include <stdio.h>
#include <float.h>
#ifdef DOUBLE
void
set_fpu (unsigned int mode)
{
asm ("fldcw %0" : : "m" (*&mode));
}
#endif
int foo (double d)
{
printf("%s% .15G\n","d = ",(double)d);
return (d*100);
}
int main (int argc,char** argv)
{
int rv={0};
double d={0};
double c={0};
double a=3.0;
double b=7.0;
#ifdef DOUBLE
set_fpu (0x27F);
#endif
#ifdef FPRESET
_fpreset();
#endif
c= a/ b;
if(c==(a/b))
{
printf("%s\n","comparison succeeds");
}
else
{
printf("%s\n","unexpected result");
}
d= 12.67;
rv= foo( d);
printf("%s% d\n","rv = ",(int)rv);
return 0;
}
My bad I forgot to link in crt_fp8.o.
It appears to work as advertised
James