fpu example

Started by jcfuller, March 26, 2013, 11:10:51 PM


Gunther

James,

I think there's another reason. Your operating system starts the good old FPU in round-to-nearest mode, which makes sense. The C standard, however, says that a conversion from float or double to an integer type has to truncate. So the following happens: before the conversion, the compiler saves the FPU control word, changes the rounding mode to truncate, performs the conversion, and then restores the old control word.

gcc follows this method very strictly; with other compilers I'm not so sure. I hope that helps.

Gunther
You have to know the facts before you can distort them.
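
As a small illustration of the point about conversions (a sketch, not taken from the thread; the helper names `to_int` and `show_truncation` are made up for this example): in C, converting a double to int discards the fractional part, so a value a hair below 1267 becomes 1266.

```c
#include <stdio.h>

/* Hypothetical helper: the plain C conversion, which truncates toward zero. */
int to_int(double x)
{
    return (int)x;
}

/* Prints what the conversion does to values just short of an integer. */
void show_truncation(void)
{
    printf("%d\n", to_int(1266.999999));   /* fraction is dropped */
    printf("%d\n", to_int(-1266.999999));  /* toward zero, not toward -infinity */
}
```

Calling show_truncation() prints both results; neither value is rounded to the nearest integer.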

MichaelW

I started out looking for a gcc command line option that would control the truncation. I did not find one, but I did determine that with gcc 4.7.2 I can push the result to 1267 with a cast.

#include <stdio.h>
#include <conio.h>

int foo (double);
int main (int,char**);

int foo (double d)
{
    printf("d = %.15G\n", (double)d);
    return (double)(d*100);                  /* was return d*100; */
}

int main (int argc,char** argv)
{
    int rv={0};
    rv= foo(12.67);
    printf("rv = %d\n",(int)rv);
    getch();
    return 0;
}

Assembly output without cast:

.text
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
LFB11:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
subl $56, %esp
fldl 8(%ebp)
fstl 4(%esp)
movl $LC0, (%esp)
fstpl -40(%ebp)
call _printf
flds LC1
fldl -40(%ebp)
fmulp %st, %st(1)
fnstcw -10(%ebp)
movw -10(%ebp), %ax
orb $12, %ah
movw %ax, -12(%ebp)
fldcw -12(%ebp)
fistpl -16(%ebp)
fldcw -10(%ebp)
movl -16(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE11:

Assembly output with cast:

.text
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
LFB11:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
subl $56, %esp
fldl 8(%ebp)
fstl 4(%esp)
movl $LC0, (%esp)
fstpl -40(%ebp)
call _printf
flds LC1
fldl -40(%ebp)
fmulp %st, %st(1)
fstpl -16(%ebp)
fldl -16(%ebp)
fnstcw -18(%ebp)
movw -18(%ebp), %ax
orb $12, %ah
movw %ax, -20(%ebp)
fldcw -20(%ebp)
fistpl -24(%ebp)
fldcw -18(%ebp)
movl -24(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE11:


The difference is that the cast causes the result of the multiply to be stored to memory as a double and then reloaded, reducing the precision from 64 bits to 53 bits. I'm having trouble understanding how this could cause a rounding by truncation to round up, and beginning to suspect that a cast is not a reliable way to get around the problem.
Well Microsoft, here's another nice mess you've gotten us into.
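
The store-and-reload effect described in this post can be made explicit in C with a volatile temporary. This is only a sketch: the function name `scale_via_double` is invented here, and the thread itself only shows the effect through a cast with gcc 4.7.2. Writing the product to a volatile double forces it to be rounded to 53 bits before the conversion to int truncates it.

```c
/* Hypothetical helper: round d*100 to double precision, then truncate. */
int scale_via_double(double d)
{
    volatile double product = d * 100;  /* the store rounds to 53 bits */
    return (int)product;                /* truncation sees the rounded value */
}
```

For d = 12.67 the rounded product is exactly 1267.0, so the truncation has nothing left to cut off.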

qWord

In detail: if we do the operation by hand, using 12.67 stored as a double constant, we get the following result at a precision of 64 bits:
9E5FFFFF FFFFFFC0 (<= fraction bits, high DWORD followed by low DWORD)
Here we can see that rounding to 53 bits turns the result into 9E6000... (==> 1267 after normalization of the result).
However, for 64 bit precision this result is not an integer and is a bit below 1267, thus the language feature truncated the result to 1266.
MREAL macros - when you need floating point arithmetic while assembling!
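
A rough way to repeat this by-hand calculation in C is to do the multiply in long double. This is a sketch under an assumption: it only shows the effect where long double has at least a 64-bit significand (as with gcc on x86); where long double is the same as double, it simply returns 1267. Both function names are made up for this example.

```c
#include <float.h>

/* Hypothetical helper: multiply at long double precision, then truncate. */
int scale_via_long_double(double d)
{
    long double product = (long double)d * 100.0L;  /* not rounded to 53 bits */
    return (int)product;                            /* truncates toward zero */
}

/* Returns 1 when long double carries at least 64 significand bits. */
int long_double_is_extended(void)
{
    return LDBL_MANT_DIG >= 64;
}
```

With an extended long double, the product of the double constant 12.67 and 100 is held exactly, stays just below 1267, and truncates to 1266.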

Gunther

Hi qWord,

Good research. Thank you. I didn't have enough time yesterday to do that.

Gunther

jj2007

Quote from: qWord on March 27, 2013, 12:35:51 PM
However, for 64 bit precision this result is not an integer and is a bit below 1267, thus the language feature truncated the result to 1266.

That applies to all three precision levels, except if you use REAL10 to fld the 12.67.

I guess the main lesson here is "don't trust your intuition, check the (C) language specification" ;-)

jcfuller

Thank you all for the information.
My main goal (if possible) is to get the same response from MinGW gcc/g++ as I do with all other compilers I test for my bc9 project.
I found this article:
http://www.linuxtopia.org/online_books/an_introduction_to_gcc/gccintro_70.html
So because I am out of my element here and only have a vague idea of what I'm doing I changed my code to this:


#include <stdio.h>

/* Load a new x87 control word (only the low 16 bits of mode are used). */
void
set_fpu (unsigned int mode)
{
  asm ("fldcw %0" : : "m" (*&mode));
}


int foo (double d)
{
  printf("%s% .15G\n","d = ",(double)d);
  return (d*100);
}


int main (int argc,char** argv)
{
  int      rv={0};
  set_fpu(0x27F);  /* 53-bit precision, round to nearest */
  rv= foo( 12.67);
  printf("%s% d\n","rv = ",(int)rv);
  return 0;
}




and now I get 1267

compiled with MinGW 4.8.0 32bit: gcc -Wall rv.c -orv.exe

James
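
For readers wondering what 0x27F actually selects, here is a minimal sketch that decodes the precision control field of an x87 control word (bits 9 and 8). The function names are invented for this example, and nothing here touches the real FPU; it only inspects the number.

```c
/* Hypothetical helper: extract the PC field (bits 9 and 8). */
unsigned precision_control(unsigned cw)
{
    return (cw >> 8) & 3u;
}

/* Significand width selected by the PC field; 0 marks the reserved value. */
int significand_bits(unsigned cw)
{
    switch (precision_control(cw)) {
    case 0:  return 24;   /* single precision */
    case 2:  return 53;   /* double precision */
    case 3:  return 64;   /* extended precision */
    default: return 0;    /* encoding 01 is reserved */
    }
}
```

significand_bits(0x27F) is 53, while the value 0x37F that FINIT leaves behind gives 64. That difference is why the product is rounded to 1267.0 before it is truncated.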

dedndave

you can use FSTCW to store the current control word to a 16-bit memory location
alter the control bits, as desired
then, use FLDCW to load the control word back into the FPU

from Ray's FPU tutorial...
Quote: The RC field (bits 11 and 10) or Rounding Control determines how the FPU will round results in one of four ways:

00 = Round to nearest, or to even if equidistant (this is the initialized state)
01 = Round down (toward -infinity)
10 = Round up (toward +infinity)
11 = Truncate (toward 0)

now, it says round to nearest is the init state - but that is after an FINIT
the compiler start-up code may alter this

http://www.ray.masmcode.com/

rather than testing 1266/1267, just look at those 2 bits, as they are in each platform
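
The "alter the control bits" step can be sketched in plain C as a pure function on the 16-bit value. This is a hypothetical helper, not something from Ray's tutorial; the names are made up. The `orb $12, %ah` in the gcc listings earlier in the thread does the same thing for the truncate case.

```c
/* Values of the RC field, as listed in the quoted tutorial text. */
enum { RC_NEAREST = 0, RC_DOWN = 1, RC_UP = 2, RC_TRUNCATE = 3 };

/* Hypothetical helper: return cw with bits 11 and 10 replaced by rc. */
unsigned short with_rounding(unsigned short cw, unsigned rc)
{
    return (unsigned short)((cw & ~0x0C00u) | ((rc & 3u) << 10));
}
```

Starting from 0x037F, with_rounding(0x037F, RC_TRUNCATE) gives 0x0F7F; that value would be stored to memory and handed to FLDCW, and the saved original loaded back afterwards.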

jcfuller

Thanks Dave, but all I really want is for gcc to behave like all the other C/C++ compilers (if possible) for the duration of the app, not for any specific call.
If this code does it and does not muck up something really important I may just use it.

James

qWord

A common solution for your problem is this macro, which rounds FP values to the nearest integer:
#define nearest(x) ((x)>=0?(int)((x)+0.5):(int)((x)-0.5))
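
The same idea written as a function, as a sketch (the name `nearest_int` is made up, and it assumes the value fits in an int). Adding or subtracting 0.5 before the truncating conversion turns a product that is a hair below 1267 into 1267.

```c
/* Hypothetical function form of the macro: round half away from zero. */
int nearest_int(double x)
{
    return x >= 0 ? (int)(x + 0.5) : (int)(x - 0.5);
}
```

Unlike the macro, the function evaluates its argument only once, which matters if the argument has side effects.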


jcfuller

I did find the gcc options page but didn't try all of them.
One that does work (sorta) is -fsingle-precision-constant, but it does alter d:
d =  12.6700000762939
rv =  1267

I can't find out how to detect whether the compiler is MinGW or MinGWTDM, so I think the best direction is qWord's last one, the nearest macro.

Thank you all for your time and suggestions.

James
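
Why -fsingle-precision-constant changes d can be imitated without the flag by writing the constant as a float. This is a sketch with function names invented here: 12.67f rounds to a value slightly above 12.67, so the product lands above 1267 and truncation gives 1267.

```c
#include <stdio.h>

/* Hypothetical helper: same shape as foo() in the thread. */
int scale(double d)
{
    return (int)(d * 100);
}

/* Prints d and the scaled result for a single-precision constant. */
void show_single_precision_constant(void)
{
    double d = 12.67f;               /* float constant widened to double */
    printf("d = %.15G\n", d);
    printf("rv = %d\n", scale(d));
}
```

The result is 1267 only because the single-precision constant happens to err upward; a constant that errs downward as a float would still truncate one too low.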

Gunther

James,

I think that Agner Fog has a solution for your problem, too.

Gunther

jcfuller

Quote from: Gunther on March 28, 2013, 09:10:05 AM
James,

I think that Agner Fog has a solution for your problem, too.

Gunther

Agner's place is a vast site. Do you have any particular item in mind?

James

MichaelW

I had another go at manipulating the command line options, and when I changed:
-std=c99
To:
-std=gnu89
Then the compiled code, without the (double) cast that I used previously, returned:
rv = 1267

http://tigcc.ticalc.org/doc/comopts.html#SEC6

I didn't test any further, so there may be other options that will produce the same result.


jcfuller

MichaelW,
  I do not see that. A couple of different MinGW builds (4.6.2, 4.7.2, 4.8.0) all return 1266.

James
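
Since different builds disagree, one way to check what a given build does with intermediate results, rather than detecting the compiler by name, is the C99 macro FLT_EVAL_METHOD from <float.h>. This is a sketch and not something suggested in the thread; the function name is made up. A value of 2 means double expressions are evaluated in long double, which is the situation discussed above; 0 means they are evaluated in double.

```c
#include <float.h>

/* Hypothetical helper: 1 if double expressions may carry extra precision. */
int intermediates_may_be_wider(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD != 0;
#else
    return 1;   /* pre-C99 <float.h>: unknown, so assume the worst */
#endif
}
```

When this returns 1, code that depends on the truncated value of d*100 is safer with an explicit rounding step such as the nearest macro.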