Author Topic: Rounding Mode in FPU question  (Read 2009 times)

raymond

  • Member
  • **
  • Posts: 234
    • Raymond's page
Re: Rounding Mode in FPU question
« Reply #15 on: February 14, 2019, 06:50:30 AM »
Whenever all the bits in the exponent field of a real number format are set to 1, the value is designated as a NAN (or as an infinity, if the fraction bits are all zero).

The number you used as an example would thus not qualify as a NAN. However, it would qualify as a denormalized number (http://www.ray.masmcode.com/tutorial/fpuchap2.htm#denormal) which is very different from a NAN.
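
As a rough illustration of that rule, here is a small Python sketch added for readers (it is not code from this thread). The helper name `split_real10` is made up, and the bit pattern is the 0x07ED13900F0000000000 value quoted in later replies. It pulls the three fields of an 80-bit value apart and tests the exponent:

```python
def split_real10(value: int):
    """Split an 80-bit extended-precision bit pattern into its three fields."""
    sign = (value >> 79) & 1                    # bit 79
    exponent = (value >> 64) & 0x7FFF           # bits 78..64, biased by 16383
    significand = value & 0xFFFFFFFFFFFFFFFF    # bits 63..0, bit 63 is the explicit integer bit
    return sign, exponent, significand


sign, exponent, significand = split_real10(0x07ED13900F0000000000)
print(hex(exponent))            # 0x7ed
print(exponent == 0x7FFF)       # False: not all ones, so neither NAN nor infinity
print(exponent == 0)            # False: not all zeros, so not a denormal either
```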
Whenever you assume something, you risk being wrong half the time.
http://www.ray.masmcode.com/

AW

  • Member
  • *****
  • Posts: 2316
  • Let's Make ASM Great Again!
Re: Rounding Mode in FPU question
« Reply #16 on: February 14, 2019, 07:08:18 AM »
To be honest, I have no idea what guga is talking about, and I can't even see how he obtains the values he mentions.  :dazzled:

guga

  • Member
  • *****
  • Posts: 1043
  • Assembly is a state of art.
    • RosAsm
Re: Rounding Mode in FPU question
« Reply #17 on: February 14, 2019, 07:12:45 AM »
Hi Raymond, I'm not sure I fully understand this.

From what I understood from the documentation, the Word 07ED (the last 2 bytes of the TenByte) holds the sign bit (bit 79) and the exponent, right? So the number can't be denormalized, since the last Word of the TenByte is not zero and the second Dword is not zero either. Denormalized numbers are the ones where the second Dword of the TenByte is zero and the last Word is either zero or 08000, right?

So, does this number qualify as -Infinite? I don't understand why you say that it is denormalized.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

guga

  • Member
  • *****
  • Posts: 1043
  • Assembly is a state of art.
    • RosAsm
Re: Rounding Mode in FPU question
« Reply #18 on: February 14, 2019, 07:16:50 AM »
I'm quite confused right now. :dazzled: :dazzled:

I created a function to categorize the FPU numbers (Tenbyte in memory). Is this correct ?

Code: [Select]

    RealTenFPUNumberCategory
        This function identifies the errors present in Real10 FPU data.

    Parameters:
        Float80Pointer - A pointer to a variable containing a TenByte (80 bit) value

    Returned Values:
   
        The function will return one of the following equates:

        Equate                          Value   Description
       
        SpecialFPU_PosValid             0       The FPU contains a valid positive number.
        SpecialFPU_NegValid             1       The FPU contains a valid negative number.
        SpecialFPU_PosSubNormal         2       The FPU produced a positive Subnormal (denormalized) number.
                                                Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
                                                Ex: 0000 00000000 00000000
                                                    0000 00000000 FFFFFFFF
                                                    0000 00000000 00008000
                                                    0000 00000001 00000000
                                                    0000 FFFFFFFF FFFFFFFF
        SpecialFPU_NegSubNormal         3       The FPU produced a negative Subnormal (denormalized) number.
                                                Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
                                                Ex: 8000 00000000 00000000 (0) (Negative zero must be considered only as zero)
                                                    8000 00000000 FFFFFFFF (-0.0000000156560127730E-4933)
                                                    8000 01000000 00000000 (-0.2626643080556322880E-4933)
                                                    8000 FFFFFFFF 00000001 (-6.7242062846585856000E-4932)
        SpecialFPU_QNAN                 4       QNAN - Quiet NAN (Not a number)
        SpecialFPU_SNAN                 5       SNAN - Signaling NAN (Not a number)
        SpecialFPU_NegInf               6       Negative Infinite
        SpecialFPU_PosInf               7       Positive Infinite
        SpecialFPU_Indefinite           8       Indefinite
        SpecialFPU_SpecialIndefQNan     9       Special INDEFINITE QNAN
        SpecialFPU_SpecialIndefSNan     10      Special INDEFINITE SNAN
        SpecialFPU_SpecialIndefInfinite 11      Special INDEFINITE Infinite
__________________________________________________________________________

; Equates related to the function

[SpecialFPU_PosValid 0] ; The FPU contains a valid positive result
[SpecialFPU_NegValid 1] ; The FPU contains a valid negative result
[SpecialFPU_PosSubNormal 2] ; The FPU produced a positive Subnormal (denormalized) number. Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
[SpecialFPU_NegSubNormal 3] ; The FPU produced a negative Subnormal (denormalized) number. Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
[SpecialFPU_QNAN 4] ; QNAN
[SpecialFPU_SNAN 5] ; SNAN
[SpecialFPU_NegInf 6] ; Negative Infinite
[SpecialFPU_PosInf 7] ; Positive Infinite
[SpecialFPU_Indefinite 8] ; Indefinite
[SpecialFPU_SpecialIndefQNan 9] ; Special INDEFINITE QNAN
[SpecialFPU_SpecialIndefSNan 10] ; Special INDEFINITE SNAN
[SpecialFPU_SpecialIndefInfinite 11] ; Special INDEFINITE Infinite

; Updated in 12/02/2019
Proc RealTenFPUNumberCategory:
    Arguments @Float80Pointer
    Local @FPUErrorMode
    Uses edi, ebx


    mov ebx D@Float80Pointer
    mov D@FPUErrorMode SpecialFPU_PosValid

    ...If_And W$ebx+8 = 0, D$ebx+4 = 0 ; This is denormalized, but it is possible.
        ; 0000 00000000 00000000
        ; 0000 00000000 FFFFFFFF
        mov D@FPUErrorMode SpecialFPU_PosSubNormal
    ...Else_If_And W$ebx+8 = 0, D$ebx+4 > 0 ; This is Ok.
        ; 0000 00000001 00000000
        ; 0000 FFFFFFFF FFFFFFFF
        mov D@FPUErrorMode SpecialFPU_PosSubNormal
    ...Else_If_And W$ebx+8 > 0, W$ebx+8 < 07FFF; This is ok only if the fraction Dword is bigger or equal to 080000000
        .If D$ebx+4 < 080000000
            .Test_If D$ebx+4 040000000
                ; QNAN 40000000
                mov D@FPUErrorMode SpecialFPU_QNAN
            .Test_Else
                If_And D$ebx+4 > 0, D$ebx > 0
                    ; SNAN only if at least 1 bit is set
                    mov D@FPUErrorMode SpecialFPU_SNAN
                Else ; All fraction Bits are 0
                    ; Bit 15 is never reached. The bit is 0 from W$ebx+8
                    ; -INFINITE ; Bit15 = 0
                    mov D@FPUErrorMode SpecialFPU_NegInf
                End_If
            .Test_End
        .End_If
    ...Else_If W$ebx+8 = 07FFF; This is ok only if the fraction Dword is bigger or equal to 080000000
        .Test_If D$ebx+4 040000000
            ; QNAN 40000000
            mov D@FPUErrorMode SpecialFPU_QNAN
        .Test_Else
            If_And D$ebx+4 > 0, D$ebx > 0
                ; SNAN only if at least 1 bit is set
                mov D@FPUErrorMode SpecialFPU_SNAN
            Else ; All fraction Bits are 0
                ; Bit 15 is never reached. The bit is 0 from W$ebx+8
                ; -INFINITE ; Bit15 = 0
                mov D@FPUErrorMode SpecialFPU_NegInf
            End_If
        .Test_End
        ; Below is similar to W$ebx+8 = 0
    ...Else_If_And W$ebx+8 = 08000, D$ebx+4 = 0 ; This is denormalized, but possible.
        ; 8000 00000000 00000000 (0)
        ; 8000 00000000 FFFFFFFF (-0.0000000156560127730E-4933)
        mov D@FPUErrorMode SpecialFPU_NegSubNormal
    ...Else_If_And W$ebx+8 = 08000, D$ebx+4 > 0 ; This is Ok.
        ; 8000 01000000 00000000 (-0.2626643080556322880E-4933)
        ; 8000 FFFFFFFF 00000001 (-6.7242062846585856000E-4932)
        mov D@FPUErrorMode SpecialFPU_NegSubNormal
    ...Else_If_And W$ebx+8 > 08000, W$ebx+8 < 0FFFF; This is ok only if the fraction Dword is bigger or equal to 080000000
        .If D$ebx+4 < 080000000
            .Test_If D$ebx+4 040000000
                ; QNAN 40000000
                mov D@FPUErrorMode SpecialFPU_QNAN
            .Test_Else
                If_And D$ebx+4 > 0, D$ebx > 0
                    ; SNAN only if at least 1 bit is set
                    ;mov D$edi 'SNaN', B$edi+4 0
                    mov D@FPUErrorMode SpecialFPU_SNAN
                Else ; All fraction Bits are 0
                    ; Bit 15 is always reached. The bit is 1 from W$ebx+8
                    ; +INFINITE ; Bit15 = 1
                    ;mov D$edi '+INF', B$edi+4 0
                    mov D@FPUErrorMode SpecialFPU_PosInf
                End_If
            .Test_End
        .End_If

    ...Else_If W$ebx+8 = 0FFFF; This is where we identify indefinite or other NAN values

        .If_And D$ebx+4 >= 040000000, D$ebx = 0
            ; INDEFINITE
            mov D@FPUErrorMode SpecialFPU_Indefinite
        .Else
            .Test_If D$ebx+4 040000000
                ; Special INDEFINITE QNAN 40000000
                mov D@FPUErrorMode SpecialFPU_SpecialIndefQNan
            .Test_Else
                If_And D$ebx+4 > 0, D$ebx > 0
                    ; Special INDEFINITE SNAN only if at least 1 bit is set
                    mov D@FPUErrorMode SpecialFPU_SpecialIndefSNan
                Else ; All fraction Bits are 0
                    ; Bit 15 is always reached. The bit is 1 from W$ebx+8
                    ; Special INDEFINITE +INFINITE ; Bit15 = 1
                    mov D@FPUErrorMode SpecialFPU_SpecialIndefInfinite
                End_If
            .Test_End
        .End_If
    ...End_If

    .If D@FPUErrorMode = SpecialFPU_PosValid
        Test_If B$ebx+9 0_80
            mov D@FPUErrorMode SpecialFPU_NegValid
        Test_End
    .End_If
    mov eax D@FPUErrorMode

EndP
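
For comparison, here is a hedged Python sketch of how the Intel manuals group the 80-bit encodings. It is not a port of the routine above; the name `classify_real10` and the returned strings are invented for this example. One difference worth noting: in the routine above, a value with a nonzero exponent and bit 63 clear goes through the NAN/Infinite tests (the sample bytes end up as SpecialFPU_NegInf because D$ebx is zero), whereas this sketch reports such a pattern as an "unnormal".

```python
def classify_real10(raw: bytes) -> str:
    """Classify a 10-byte little-endian x87 extended-precision value."""
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    value = int.from_bytes(raw, "little")
    sign = (value >> 79) & 1
    exponent = (value >> 64) & 0x7FFF
    significand = value & 0xFFFFFFFFFFFFFFFF
    integer_bit = significand >> 63              # explicit leading bit (bit 63)
    fraction = significand & (2**63 - 1)         # bits 62..0

    if exponent == 0:
        if significand == 0:
            kind = "zero"
        elif integer_bit == 0:
            kind = "denormal"
        else:
            kind = "pseudo-denormal"
    elif exponent < 0x7FFF:
        # A nonzero exponent requires the integer bit to be set.
        kind = "normal" if integer_bit else "unnormal (invalid operand)"
    elif integer_bit == 0:
        kind = "pseudo-infinity or pseudo-NaN (invalid operand)"
    elif fraction == 0:
        kind = "infinity"
    elif (fraction >> 62) == 0:
        kind = "SNaN"                            # bit 62 clear, other fraction bits set
    elif sign == 1 and fraction == 1 << 62:
        kind = "QNaN (real indefinite)"
    else:
        kind = "QNaN"
    return ("-" if sign else "+") + kind


# The byte sequence discussed in this thread: db 0,0,0,0,0,0F,090,013,0ED,07
print(classify_real10(bytes.fromhex("00000000000F9013ED07")))
# +unnormal (invalid operand)
```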

AW

  • Member
  • *****
  • Posts: 2316
  • Let's Make ASM Great Again!
Re: Rounding Mode in FPU question
« Reply #19 on: February 14, 2019, 07:33:46 AM »
On my computer, when I load 0x07ED13900F0000000000 from memory I get +1.5836591183212933e-4322.
I see neither a NaN nor a denormalized number. I can see a denormalized number only if I look at the number backwards.
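
The magnitude quoted here can be reproduced by applying the usual formula, significand * 2^(exponent - 16383 - 63), to the raw fields without checking bit 63. The Python sketch below does that with the standard `decimal` module. `real10_formal_value` is a hypothetical helper; it only shows the arithmetic, not what a particular FPU or debugger will display for this pattern.

```python
from decimal import Decimal, localcontext


def real10_formal_value(value: int) -> Decimal:
    """Evaluate an 80-bit pattern as (-1)^sign * significand * 2^(e - 16383 - 63)."""
    sign = -1 if (value >> 79) & 1 else 1
    exponent = (value >> 64) & 0x7FFF
    significand = value & 0xFFFFFFFFFFFFFFFF
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN encoding has no numeric value")
    # An exponent field of 0 uses the same scale as an exponent field of 1.
    unbiased = max(exponent, 1) - 16383
    with localcontext() as ctx:
        ctx.prec = 30
        return sign * Decimal(significand) * Decimal(2) ** (unbiased - 63)


v = real10_formal_value(0x07ED13900F0000000000)
print(f"{v:.5E}")   # about 1.58366E-4322
```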

BTW, since we are talking about backwards, coffee spelled backwards is eeffoc. Just know that I don't give eeffoc until I've had some coffee.

K_F

  • Member
  • *****
  • Posts: 1505
  • Anybody out there?
Re: Rounding Mode in FPU question
« Reply #20 on: February 14, 2019, 08:35:24 AM »
A possible way of re-introducing the lost resolution is to run a 'random' generator at the 'resolution of lost bits', and then add that to the truncated Real4 (Real8).
A way of minimising FPU error propagation. ;)
One particular way of solving this type of problem is to have ONE type for all calculations. I'm using this:
Code: [Select]
MACE TYPEDEF REAL10
True, sometimes you have to convert down from Real10 ;)
'Sire, Sire!... the peasants are Revolting !!!'
'Yes, they are.. aren't they....'

guga

  • Member
  • *****
  • Posts: 1043
  • Assembly is a state of art.
    • RosAsm
Re: Rounding Mode in FPU question
« Reply #21 on: February 14, 2019, 10:14:35 AM »
AW, I presume you are feeding the bytes in inverted order. That's why you are getting different results from me and Raymond.


I was talking about this sequence: db  0, 0, 0, 0, 0, 0F, 090, 013, 0ED, 07

The exponent and the sign are the last 2 bytes of the tenbyte.
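
To make the byte order concrete, here is a short Python sketch (added for illustration, standard library only) that reads the sequence above as a little-endian TenByte and pulls out the same pieces the RosAsm code addresses as D$ebx, D$ebx+4 and W$ebx+8:

```python
# The ten bytes in memory order, lowest address first.
raw = bytes([0x00, 0x00, 0x00, 0x00, 0x00, 0x0F, 0x90, 0x13, 0xED, 0x07])

value = int.from_bytes(raw, "little")
print(f"{value:020X}")          # 07ED13900F0000000000

low_dword = int.from_bytes(raw[0:4], "little")          # D$ebx
high_dword = int.from_bytes(raw[4:8], "little")         # D$ebx+4
sign_and_exponent = int.from_bytes(raw[8:10], "little") # W$ebx+8

print(f"{low_dword:08X} {high_dword:08X} {sign_and_exponent:04X}")
# 00000000 13900F00 07ED
```

Written as one hexadecimal number, the same ten bytes read 0x07ED13900F0000000000, with the sign and exponent word at the most significant end.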

I managed to port the function to masm. Sorry for the lack of macros; I don't actually remember the masm syntax, but I ported the whole function in case someone is interested in converting it to proper masm syntax.

I don't understand why Raymond says it is denormalized, while in my version and in OllyDbg this number shows as -Infinite. What am I doing wrong?

By the way, the usage specification of the function is as follows:

Code: [Select]
    FloatToUString - Updated in 10/02/2019
   
    This function converts a FPU Number to decimal string (positive or negative) to be displayed on the debugger

    Parameters:
       
        Float80Pointer - A pointer to a variable containing a TenByte (80 bit) value to be converted to decimal string.
       
        DestinationPointer - A buffer to hold the converted string. The size of the buffer must be at least 32 bytes.
                             A maximum of 19 chars (including the dot) will be converted.
                             If the number cannot be converted, the buffer will contain a string representing the proper
                             category of the FPU error, such as: QNAN, SNAN, Infinite, Indefinite.
       
        TruncateBytes - The total number of bytes (characters) to truncate. You can truncate a maximum of 3 digits
                        at the end of the string. The truncation is meant to prevent the rounding errors that some
                        compilers produce when converting floating point values.
                        For TenBytes we can discard the last 3 bytes, which are related to a difference in the error mode.
                        But, if you want to maintain accuracy, leave this parameter as 0.
       
        AddSubNormalMsg - A flag that enables appending a warning message to the number stored in the buffer at DestinationPointer,
                          labeling it as a "(Bad)" number (positive or negative), meaning that the number is below the lower limit for
                          a normal FPU TenByte and is losing precision.

                          To append the warning message, set this flag to &TRUE. Otherwise, set it to &FALSE.

                          The 80-bit floating point format has a range (including subnormals) from approximately 3.65e-4951 to 1.18e4932.
                          Normal numbers (within the range 3.36210314311209351e-4932 to 1.18e4932) keep their accuracy.
                          Numbers below that limit are called "SubNormal" (or denormalized) and range from 3.65e-4951 to 3.362103...e-4932.

                          All subnormal numbers lose precision as they move away from the limit of a normal number.
                          There are approximately 2^63 subnormal numbers, all very close to zero and with decreasing precision.

                          The limit of a normal number is: 3.36210314311209351e-4932 (equivalent to declaring it as: "FPULimit: D$ 0, 080000000, W$ 01")
                         
                         

    Return Values:

        The function will return one of the following equates:

        Equate                          Value   Description
       
        SpecialFPU_PosValid             0       The FPU contains a valid positive number.
        SpecialFPU_NegValid             1       The FPU contains a valid negative number.
        SpecialFPU_PosSubNormal         2       The FPU produced a positive Subnormal (denormalized) number.
                                                Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
                                                Ex: 0000 00000000 00000000
                                                    0000 00000000 FFFFFFFF
                                                    0000 00000000 00008000
                                                    0000 00000001 00000000
                                                    0000 FFFFFFFF FFFFFFFF
        SpecialFPU_NegSubNormal         3       The FPU produced a negative Subnormal (denormalized) number.
                                                Its magnitude is below the normal limit of 3.36...e-4932, so it has lost precision, but it is still valid
                                                Ex: 8000 00000000 00000000 (0) (Negative zero must be considered only as zero)
                                                    8000 00000000 FFFFFFFF (-0.0000000156560127730E-4933)
                                                    8000 01000000 00000000 (-0.2626643080556322880E-4933)
                                                    8000 FFFFFFFF 00000001 (-6.7242062846585856000E-4932)
        SpecialFPU_QNAN                 4       QNAN - Quiet NAN (Not a number)
        SpecialFPU_SNAN                 5       SNAN - Signaling NAN (Not a number)
        SpecialFPU_NegInf               6       Negative Infinite
        SpecialFPU_PosInf               7       Positive Infinite
        SpecialFPU_Indefinite           8       Indefinite
        SpecialFPU_SpecialIndefQNan     9       Special INDEFINITE QNAN
        SpecialFPU_SpecialIndefSNan     10      Special INDEFINITE SNAN
        SpecialFPU_SpecialIndefInfinite 11      Special INDEFINITE Infinite
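
The range limits quoted in the specification above can be checked against exact powers of two. This is a minimal Python sketch using the standard `decimal` module; the variable names are invented for the example, and the bit layouts in the comments assume the usual 15-bit exponent with a bias of 16383 and a 64-bit significand.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 25
    # Exponent field 0, only bit 0 of the significand set.
    smallest_denormal = Decimal(2) ** -16445
    # Exponent field 1, integer bit set, fraction 0 ("D$ 0, 080000000, W$ 01").
    smallest_normal = Decimal(2) ** -16382
    # Exponent field 0x7FFE, all 64 significand bits set.
    largest_normal = (Decimal(2) - Decimal(2) ** -63) * Decimal(2) ** 16383

print(f"{smallest_denormal:.4E}")   # 3.6452E-4951
print(f"{smallest_normal:.4E}")     # 3.3621E-4932
print(f"{largest_normal:.4E}")      # 1.1897E+4932
```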


jj2007

  • Member
  • *****
  • Posts: 9687
  • Assembler is fun ;-)
    • MasmBasic
Re: Rounding Mode in FPU question
« Reply #22 on: February 14, 2019, 12:15:43 PM »
For exploring the mysteries of REAL10 notation 8)

picked=5
ife picked
  NaN   REAL10 0x07ED13900F0000000000
elseif picked eq 1
  NaN   REAL10 0x00000000000F9013ED07
elseif picked eq 2
  NaN   db 07, 0EDh, 13h, 90h, 0Fh, 00, 00, 00, 00, 00
elseif picked eq 3
  NaN   db 0, 0, 1, 0, 0, 0, 0, 0, 0, 0
elseif picked eq 4
  NaN   REAL10 0x00000000000000000001
elseif picked eq 5
  NaN   db 1, 0, 0, 0, 0, 0, 0, 0, 0, 0
elseif picked eq 6
  NaN   db 00, 00h, 00h, 00h, 00h, 00h, 00, 80h, 00, 00
endif
...
  int 3
  fld REAL10 ptr NaN
  fld FP10(1.0e4900)
  fmul


picked 5 produces 3.6452e-51 after multiplying with 1.0e4900 - remember what Wiki says about the range?

Specific Watcom mysteries: I expected picked 4 & 5 to be identical. What exactly does the REAL10 0x111 notation mean?

raymond

  • Member
  • **
  • Posts: 234
    • Raymond's page
Re: Rounding Mode in FPU question
« Reply #23 on: February 14, 2019, 12:43:15 PM »
> Hi Raymond, I'm not sure I fully understand this.
>
> From what I understood from the documentation, the Word 07ED (the last 2 bytes of the TenByte) holds the sign bit (bit 79) and the exponent, right? So the number can't be denormalized, since the last Word of the TenByte is not zero and the second Dword is not zero either. Denormalized numbers are the ones where the second Dword of the TenByte is zero and the last Word is either zero or 08000, right?

Maybe 'right' or 'wrong' depending on what you happen to consider the last 2 bytes. You would be right if they are the last 2 bytes in memory; wrong if you consider the 80 bits as a whole number.

I think we are getting confused with the actual location of the 10 bytes. The most significant byte of a TBYTE would have the sign bit followed by the 7 most significant bits of the biased exponent. The next byte would have the remaining 8 bits of the 15-bit biased exponent.

In the case we are discussing, the most significant byte appears to be either 7 or ED (the 7 indicating a positive number or the ED indicating a negative number), with the other byte being the remaining bits of the exponent depending on how your display of W$ 07ED is interpreted. Either way, such could never be considered as a NAN.

raymond

  • Member
  • **
  • Posts: 234
    • Raymond's page
Re: Rounding Mode in FPU question
« Reply #24 on: February 14, 2019, 02:44:29 PM »
I may have solved your mystery. If you look closely at the description of a REAL10, you will notice the following:

As opposed to the REAL4 and REAL8 formats, the first bit of the number is explicitly included in the significand field and followed by the fraction bits f1, f2, etc.

Therefore, if bit #63 of a REAL10 in memory is not set to 1, it would not be considered a valid extended precision "float". It would have neither the format of a NAN, if the exponent bits are not all 1s, nor that of a valid denormalized number, if the exponent bits are not all 0s. The FPU would simply refuse to process it.

Such could happen easily if you try using a float from memory and referring to it as a REAL10 when it is not.
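
The condition described above can be written as a one-line test. The Python sketch below is only an illustration; `is_unnormal` is a made-up name, borrowing the term "unnormal" that Intel's documentation uses for a nonzero, non-maximal exponent combined with a clear bit 63.

```python
def is_unnormal(value: int) -> bool:
    """True if the exponent field is neither all 0s nor all 1s and bit 63 is clear."""
    exponent = (value >> 64) & 0x7FFF
    integer_bit = (value >> 63) & 1
    return 0 < exponent < 0x7FFF and integer_bit == 0


print(is_unnormal(0x07ED13900F0000000000))  # True: the pattern from this thread
print(is_unnormal(0x3FFF8000000000000000))  # False: this is 1.0, with bit 63 set
```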

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 6660
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Rounding Mode in FPU question
« Reply #25 on: February 14, 2019, 03:55:07 PM »
 :biggrin:

> coffee spelled backwards is eeffoc.

I wonder if this is 00EEFF0Ch  :P
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

AW

  • Member
  • *****
  • Posts: 2316
  • Let's Make ASM Great Again!
Re: Rounding Mode in FPU question
« Reply #26 on: February 14, 2019, 06:50:05 PM »
> AW, I presume you are feeding the bytes in inverted order. That's why you are getting different results from me and Raymond.
> I was talking about this sequence: db  0, 0, 0, 0, 0, 0F, 090, 013, 0ED, 07
I am talking about the same thing as you are.
0x07ED13900F0000000000 is laid out in memory exactly like that in all little endian systems.
The number can't be a NaN because the exponent is not all ones. It can't be subnormal because the exponent is not all zeros. Is that difficult?

jj2007

  • Member
  • *****
  • Posts: 9687
  • Assembler is fun ;-)
    • MasmBasic
Re: Rounding Mode in FPU question
« Reply #27 on: February 14, 2019, 08:09:54 PM »
> 0x07ED13900F0000000000 is laid out in memory exactly like that in all little endian systems.
Code: [Select]
include \masm32\MasmBasic\MasmBasic.inc ; download
align 16
R10a REAL10 0x11223344556677889900
db 0AAh, 0BBh, 0CCh, 0DDh, 0EEh, 0FFh ; fill with AA BB CC DD EE FF
R10b db 00h, 99h, 88h, 77h, 66h, 55h, 44h, 33h, 22h, 11h
db 0AAh, 0BBh, 0CCh, 0DDh, 0EEh, 0FFh ; fill with AA BB CC DD EE FF
R8a REAL8 0x1122334455667788
REAL8 0
R8b db 88h, 77h, 66h, 55h, 44h, 33h, 22h, 11h
REAL8 0

  Init
  Inkey HexDump$(offset R10a, 64, notext)
EndOfCode

Result:
Code: [Select]
00000000  00 00 00 00 00 32 11 EF 1D 40 AA BB CC DD EE FF
00000010  00 99 88 77 66 55 44 33 22 11 AA BB CC DD EE FF
00000020  00 00 00 E0 9D 59 D5 41 00 00 00 00 00 00 00 00
00000030  88 77 66 55 44 33 22 11 00 00 00 00 00 00 00 00
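
Decoding this dump suggests what happened to the `0x...` literals. The Python sketch below (added for illustration, standard library only) reads R10a and R8a back from the bytes shown above. The results, 0x77889900 and 0x55667780, look like the low 32 bits of each literal converted to a floating-point value, with the second one rounded to 24 significant bits, rather than a raw bit pattern. That reading is an inference from this one dump, not documented assembler behaviour.

```python
import struct
from fractions import Fraction

r10a = bytes.fromhex("00000000003211EF1D40")   # first 10 bytes of the dump, offset 0x00
r8a = bytes.fromhex("000000E09D59D541")        # first 8 bytes at offset 0x20

# Decode the REAL10 by hand: significand * 2^(exponent - 16383 - 63).
raw = int.from_bytes(r10a, "little")
exponent = (raw >> 64) & 0x7FFF
significand = raw & 0xFFFFFFFFFFFFFFFF
r10a_value = Fraction(significand) * Fraction(2) ** (exponent - 16383 - 63)

# The REAL8 is an ordinary IEEE double, so struct can decode it directly.
r8a_value = struct.unpack("<d", r8a)[0]

print(hex(int(r10a_value)))   # 0x77889900
print(hex(int(r8a_value)))    # 0x55667780
```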

HSE

  • Member
  • *****
  • Posts: 1106
  • <AMD>< 7-32>
Re: Rounding Mode in FPU question
« Reply #28 on: February 15, 2019, 01:44:50 AM »
It's done this way:
Code: [Select]
include \masm32\MasmBasic\MasmBasic.inc ; download
align 16
R10a REAL10 11223344556677889900r
db 0AAh, 0BBh, 0CCh, 0DDh, 0EEh, 0FFh ; fill with AA BB CC DD EE FF
R10b db 00h, 99h, 88h, 77h, 66h, 55h, 44h, 33h, 22h, 11h
db 0AAh, 0BBh, 0CCh, 0DDh, 0EEh, 0FFh ; fill with AA BB CC DD EE FF
R8a REAL8 1122334455667788r
REAL8 0.0
R8b db 88h, 77h, 66h, 55h, 44h, 33h, 22h, 11h
REAL8 0.0
NaN REAL8  7FF8000000000000r     ; I use this one a lot

  Init
  Inkey HexDump$(offset R10a, 64+8, notext)
EndOfCode

Result:
Code: [Select]
00000000  00 99 88 77 66 55 44 33 22 11 AA BB CC DD EE FF
00000010  00 99 88 77 66 55 44 33 22 11 AA BB CC DD EE FF
00000020  88 77 66 55 44 33 22 11 00 00 00 00 00 00 00 00
00000030  88 77 66 55 44 33 22 11 00 00 00 00 00 00 00 00
00000040  00 00 00 00 00 00 F8 7F
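
As a quick cross-check of the last dump line, the eight bytes produced by `7FF8000000000000r` can be decoded as a little-endian double. This is a Python sketch for illustration only, using the standard `struct` and `math` modules:

```python
import math
import struct

nan_bytes = bytes.fromhex("000000000000F87F")    # dump line at offset 0x40
bits = int.from_bytes(nan_bytes, "little")
x = struct.unpack("<d", nan_bytes)[0]

print(f"{bits:016X}")               # 7FF8000000000000
print(math.isnan(x))                # True
print((bits >> 52) & 0x7FF == 0x7FF)  # True: exponent field all ones
print((bits >> 51) & 1)             # 1: top fraction bit set, i.e. a quiet NaN
```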

jj2007

  • Member
  • *****
  • Posts: 9687
  • Assembler is fun ;-)
    • MasmBasic
Re: Rounding Mode in FPU question
« Reply #29 on: February 15, 2019, 03:07:50 AM »
> It's in this way

Yep, the "r" thing, thanks for reminding me. Qword used this notation years ago.