fpu example

Gunther · March 27, 2013, 09:57:23 AM

James,

I think there's another reason. Your operating system starts the good old FPU in round-to-nearest mode. That makes sense. The C standard, however, says that a conversion from float or double to an integer type has to truncate. So the following happens: before the conversion, the compiler saves the FPU control word, changes the rounding mode to truncate, performs the conversion, and afterwards restores the old control word. That's the way.

gcc follows this method very strictly; about other compilers I'm not so sure. I hope that helps.

Gunther
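
As a rough sketch of the save/modify/restore sequence described above, the control word handling can be modelled with plain integer arithmetic. The helper name `truncating_cw` is made up for this illustration and no FPU state is touched; the mask 0x0C00 covers the RC field (bits 10 and 11), which is what the `orb $12, %ah` in the gcc listings further down in this thread sets.

```c
#include <stdint.h>

/* Hypothetical helper (not part of gcc): given a saved x87 control
   word, return the copy that would be loaded before the conversion.
   Setting both RC bits (10 and 11) selects truncation toward zero. */
static uint16_t truncating_cw(uint16_t saved_cw)
{
    return (uint16_t)(saved_cw | 0x0C00u);
}
```

With the FINIT default of 0x037F this yields 0x0F7F. The saved value itself is left unchanged, so it can be loaded again once the conversion is done.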

MichaelW · March 27, 2013, 10:57:43 AM

I started out looking for a gcc command line option that would control the truncation. I did not find one, but I did determine that with gcc 4.7.2 I can push the result to 1267 with a cast.

Code:


#include <stdio.h>
#include <conio.h>

int foo (double);
int main (int,char**);

int foo (double d)
{
    printf("d = %.15G\n", (double)d);
    return (double)(d*100);                  /* was return d*100; */
}

int main (int argc,char** argv)
{
    int rv={0};
    rv= foo(12.67);
    printf("rv = %d\n",(int)rv);
    getch();
    return 0;
}

Assembly output without cast:

Code:


	.text
	.globl	_foo
	.def	_foo;	.scl	2;	.type	32;	.endef
_foo:
LFB11:
	.cfi_startproc
	pushl	%ebp
	.cfi_def_cfa_offset 8
	.cfi_offset 5, -8
	movl	%esp, %ebp
	.cfi_def_cfa_register 5
	subl	$56, %esp
	fldl	8(%ebp)
	fstl	4(%esp)
	movl	$LC0, (%esp)
	fstpl	-40(%ebp)
	call	_printf
	flds	LC1
	fldl	-40(%ebp)
	fmulp	%st, %st(1)
	fnstcw	-10(%ebp)
	movw	-10(%ebp), %ax
	orb	$12, %ah
	movw	%ax, -12(%ebp)
	fldcw	-12(%ebp)
	fistpl	-16(%ebp)
	fldcw	-10(%ebp)
	movl	-16(%ebp), %eax
	leave
	.cfi_restore 5
	.cfi_def_cfa 4, 4
	ret
	.cfi_endproc
LFE11:

Assembly output with cast:

Code:


	.text
	.globl	_foo
	.def	_foo;	.scl	2;	.type	32;	.endef
_foo:
LFB11:
	.cfi_startproc
	pushl	%ebp
	.cfi_def_cfa_offset 8
	.cfi_offset 5, -8
	movl	%esp, %ebp
	.cfi_def_cfa_register 5
	subl	$56, %esp
	fldl	8(%ebp)
	fstl	4(%esp)
	movl	$LC0, (%esp)
	fstpl	-40(%ebp)
	call	_printf
	flds	LC1
	fldl	-40(%ebp)
	fmulp	%st, %st(1)
	fstpl	-16(%ebp)
	fldl	-16(%ebp)
	fnstcw	-18(%ebp)
	movw	-18(%ebp), %ax
	orb	$12, %ah
	movw	%ax, -20(%ebp)
	fldcw	-20(%ebp)
	fistpl	-24(%ebp)
	fldcw	-18(%ebp)
	movl	-24(%ebp), %eax
	leave
	.cfi_restore 5
	.cfi_def_cfa 4, 4
	ret
	.cfi_endproc
LFE11:

The difference is that the cast causes the result of the multiply to be stored to memory as a double and then reloaded, reducing the significand precision from 64 bits to 53 bits. I'm having trouble understanding how this could cause a rounding by truncation to round up, and I'm beginning to suspect that a cast is not a reliable way to get around the problem.

qWord · March 27, 2013, 12:35:51 PM

In detail: if we do the operation by hand, using 12.67 stored as a double constant, we get the following result at 64-bit precision:
9E5FFFFF FFFFFFC0 (<= significand bits, high DWORD followed by low DWORD)
Here we can see that rounding to 53 bits turns the result into 9E600000 00000000 (==> exactly 1267).
At 64-bit precision, however, the result is not an integer and lies slightly below 1267, so the truncating conversion to int yields 1266.
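
The hand calculation above can be replayed with integer arithmetic. This is only a sketch: the function names are invented here, the significand is treated as a left-aligned 64-bit integer, and the binary exponent is assumed to be 10 (values between 1024 and 2048), so the top 11 bits are the integer part.

```c
#include <stdint.h>

/* Round a 64-bit x87 significand to 53 significant bits
   (round to nearest, ties to even), keeping it left-aligned.
   A carry out of the top bit is not handled in this sketch. */
static uint64_t round_sig_to_53(uint64_t sig)
{
    uint64_t kept = sig >> 11;           /* upper 53 bits */
    uint64_t dropped = sig & 0x7FFu;     /* lower 11 bits */
    if (dropped > 0x400u || (dropped == 0x400u && (kept & 1u)))
        kept += 1;
    return kept << 11;
}

/* Integer part for a value in [1024, 2048): with exponent 10 the
   top 11 bits of the significand hold the integer part. */
static unsigned integer_part(uint64_t sig)
{
    return (unsigned)(sig >> 53);
}
```

For 9E5FFFFF FFFFFFC0, `integer_part` gives 1266 on the unrounded significand and 1267 after `round_sig_to_53`, matching the two results seen in this thread.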

Gunther · March 27, 2013, 06:09:27 PM

Hi qWord,

Good research, thank you. I didn't have enough time yesterday to do that.

Gunther

jj2007 · March 27, 2013, 06:15:53 PM

Quote from: qWord on March 27, 2013, 12:35:51 PM
However, for 64 bit precision this result is not an integer and is a bit below 1267, thus the language feature truncated the result to 1266.

That applies to all three precision levels, except if you use REAL10 to fld the 12.67.

I guess the main lesson here is "don't trust your intuition, check the (C) language specification" ;-)

jcfuller · March 27, 2013, 10:00:14 PM

Thank you all for the information.
My main goal (if possible) is to get the same response from MinGW gcc/g++ as I do with all other compilers I test for my bc9 project.
I found this article:
http://www.linuxtopia.org/online_books/an_introduction_to_gcc/gccintro_70.html
Since I am out of my element here and only have a vague idea of what I'm doing, I changed my code to this:

Code:



#include <stdio.h>

/* Load a new x87 control word; fldcw reads only the low 16 bits of mode. */
void
set_fpu (unsigned int mode)
{
  asm ("fldcw %0" : : "m" (*&mode));
}


int foo (double d)
{
  printf("%s% .15G\n","d = ",(double)d);
  return (d*100);
}


int main (int argc,char** argv)
{
  int      rv={0};
  set_fpu(0x27F);  /* 53-bit precision, round to nearest, exceptions masked */
  rv= foo( 12.67);
  printf("%s% d\n","rv = ",(int)rv);
  return 0;
}

and now I get rv = 1267

compiled with MinGW 4.8.0 32bit: gcc -Wall rv.c -orv.exe

James
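
For anyone wondering what 0x27F actually selects, here is a small decoding sketch. The helper names are illustrative only, and the precision-control encoding used (00 = 24 bits, 10 = 53 bits, 11 = 64 bits, 01 reserved) is the x87 layout as I understand it from Intel's documentation.

```c
/* Field extraction for an x87 control word (names are illustrative). */
static unsigned pc_field(unsigned cw) { return (cw >> 8) & 3u; }   /* precision control, bits 8-9 */
static unsigned rc_field(unsigned cw) { return (cw >> 10) & 3u; }  /* rounding control, bits 10-11 */

/* Significand width selected by the PC field; 0 marks the reserved encoding. */
static unsigned precision_bits(unsigned cw)
{
    static const unsigned bits[4] = { 24, 0, 53, 64 };
    return bits[pc_field(cw)];
}
```

`precision_bits(0x27F)` is 53 and `rc_field(0x27F)` is 0, whereas the FINIT default 0x37F gives 64. So the rounding mode stays at round to nearest and only the working precision drops to 53 bits, which would be consistent with qWord's analysis of why the product then lands on exactly 1267.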

dedndave · March 27, 2013, 10:54:08 PM

you can use FSTCW to store the current control word to a 16-bit memory location
alter the control bits, as desired
then, use FLDCW to load the control word back into the FPU

from Ray's FPU tutorial...

Quote: The RC field (bits 11 and 10) or Rounding Control determines how the FPU will round results in one of four ways:

00 = Round to nearest, or to even if equidistant (this is the initialized state)
01 = Round down (toward -infinity)
10 = Round up (toward +infinity)
11 = Truncate (toward 0)

now, it says round to nearest is the init state - but that is after an FINIT
the compiler start-up code may alter this

http://www.ray.masmcode.com/

rather than testing 1266/1267, just look at those 2 bits, as they are in each platform
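
A minimal sketch of "just look at those 2 bits": once a control word has been stored with FSTCW, the RC field can be turned into the names from the quoted table. `rc_name` is a hypothetical helper, and obtaining the control word itself (inline assembly or a compiler intrinsic) is left out because that part is compiler specific.

```c
/* Map the RC field (bits 10 and 11) of a stored control word to the
   names used in the quoted table. rc_name is a hypothetical helper. */
static const char *rc_name(unsigned short cw)
{
    static const char *const names[4] = {
        "round to nearest",   /* 00 */
        "round down",         /* 01 */
        "round up",           /* 10 */
        "truncate"            /* 11 */
    };
    return names[(cw >> 10) & 3u];
}
```

For example, `rc_name(0x037F)` returns "round to nearest" and `rc_name(0x0F7F)` returns "truncate".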

jcfuller · March 27, 2013, 11:11:44 PM

Thanks Dave but all I really want is gcc to behave like all the other c/c++ compilers (if possible) for the duration of the app not for any specific call.
If this code does it and does not muck up something really important I may just use it.

James

qWord · March 28, 2013, 01:06:51 AM

A common solution for your problem is this macro, which rounds FP values to the nearest integer:

Code:

#define nearest(x) ((x)>=0?(int)((x)+0.5):(int)((x)-0.5))
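
A short sketch of how the macro might be applied to the foo() case from earlier in the thread. `scaled_cents` is a made-up name, and the macro is the one from the post, only spaced out.

```c
/* The macro from the post, spaced out for readability. */
#define nearest(x) ((x) >= 0 ? (int)((x) + 0.5) : (int)((x) - 0.5))

/* Hypothetical replacement for foo(): scale and round to the
   nearest int instead of truncating toward zero. */
static int scaled_cents(double d)
{
    return nearest(d * 100);
}
```

`scaled_cents(12.67)` returns 1267 whether the product is held at 53 or 64 bits, because a value a hair below 1267 plus 0.5 still truncates to 1267. Two things to keep in mind: the macro evaluates its argument twice, so arguments with side effects should be avoided, and the result must fit in an int.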

dedndave · March 28, 2013, 01:13:37 AM

maybe have a look at gcc options ?

http://gcc.gnu.org/wiki/FloatingPointMath

jcfuller · March 28, 2013, 02:14:20 AM

I did find the gcc options page but didn't try all of them.
One that does work (sorta) is -fsingle-precision-constant, but it changes the printed value of d (the constant is then treated as single precision):
d = 12.6700000762939
rv = 1267

I can't find out how to detect whether the compiler is MinGW or MinGW TDM, so I think the best direction is qWord's last suggestion, the nearest macro.

Thank you all for your time and suggestions.

James
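
The changed value of d can be reproduced without the compiler switch by writing the constant as a float and widening it. This is only a sketch of the effect, assuming that -fsingle-precision-constant treats 12.67 like 12.67f; the function names are invented.

```c
/* 12.67 stored as a float and widened back to double; the float
   nearest to 12.67 lies slightly above it. */
static double widened_constant(void)
{
    float f = 12.67f;
    return (double)f;
}

/* Same computation as foo(), starting from the widened value. */
static int scaled_from_float(void)
{
    return (int)(widened_constant() * 100);
}
```

Printing `widened_constant()` with "%.15G" should give the 12.6700000762939 shown in the post. Because that value lies above 12.67, the product lies above 1267 and truncation yields 1267, but only because the input itself has changed.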

Gunther · March 28, 2013, 09:10:05 AM

James,

I think that Agner Fog has a solution for your problem, too.

Gunther

jcfuller · March 28, 2013, 08:50:35 PM

Quote from: Gunther on March 28, 2013, 09:10:05 AM
James,

I think that Agner Fog has a solution for your problem, too.

Gunther

Agner's place is a vast site. Do you have any particular item in mind?

James

MichaelW · March 28, 2013, 10:03:39 PM

I had another go at manipulating the command line options, and when I changed:
-std=c99
To:
-std=gnu89
Then the compiled code, without the (double) cast that I used previously, returned:
rv = 1267

http://tigcc.ticalc.org/doc/comopts.html#SEC6

I didn't test any further, so there may be other options that will produce the same result.

jcfuller · March 29, 2013, 12:00:10 AM

MichaelW,
I do not see that. A couple of different MinGW builds (4.6.2, 4.7.2, 4.8.0) all return 1266.

James

The MASM Forum
