Real4 to Real8 conversion problem

jj2007 · July 24, 2017, 08:38:42 PM

This function expects a REAL4 on the stack, and a pointer to a destination REAL8:

R4ToR8 proc uses esi ecx rx4:REAL4, pR8
  mov esi, rx4	; load real4 into one register
  mov ecx, esi
  shr esi, 23
  and esi, 11111111b	; 8 bits expo
  add esi, 1023-127	; new bias
  and ecx, 11111111111111111111111b	; 23 bits mantissa
  if 0
	shl ecx, 1	; seems to "improve" the results but...
  endif
  shl esi, 20		; 52-32
  or esi, ecx		; merge expo and mantissa
  test byte ptr rx4[3], 128	; is passed REAL4 negative?
  .if Sign?
	or esi, 10000000000000000000000000000000b	; make REAL8 negative, too
  .endif
  mov edx, pR8
  and dword ptr [edx], 0	; zero out the less significant bits of the mantissa
  mov dword ptr [edx+4], esi
  ret
R4ToR8 endp

Looks simple enough, but this loop...

  For_ fct=-1.2 To 1.2 Step 0.1
	invoke R4ToR8, fct, addr r8new
	Print Str$("R4: %5f", fct)
	Print At(16, Locate(y)) Str$(" R8: %8f\n", r8new)
  Next

... produces the following:

R4: -1.2000      R8: -1.6000004
R4: -1.1000      R8: -1.8000002
R4: -1.0000      R8: -1.0000000
R4: -0.90000     R8: -0.69999981
R4: -0.80000     R8: -0.89999962
R4: -0.70000     R8: -1.1999989
R4: -0.60000     R8: -1.5999985
R4: -0.50000     R8: -1.9999971
R4: -0.40000     R8: -0.44999933
R4: -0.30000     R8: -0.39999938
R4: -0.20000     R8: -0.22499943
R4: -0.100000    R8: -1.7999907
R4: 7.3016e-08   R8: 1.0728837e-07
R4: 0.10000      R8: 1.8000097
R4: 0.20000      R8: 0.22500062
R4: 0.30000      R8: 0.40000057
R4: 0.40000      R8: 0.45000052
R4: 0.50000      R8: 0.50000048
R4: 0.60000      R8: 1.6000013
R4: 0.70000      R8: 1.2000017
R4: 0.80000      R8: 0.90000105
R4: 0.90000      R8: 0.70000124
R4: 1.0000       R8: 1.0000010
R4: 1.1000       R8: 1.8000011
R4: 1.2000       R8: 1.6000013

The results are partly wrong. Can anybody spot the error in my code or in my logic? I am pretty bad at these things, so bear with me if I did something stupid.

And of course, I know that fld R4, fstp R8 is shorter and faster. This is intended to convert between REAL10 and REAL16 - if it is possible. Full MB source is attached.
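To see where the odd values come from, it can help to model the posted routine outside the assembler. The following is a minimal sketch in Python, assuming a straight port of the instructions above; the function names `r4_bits` and `r4_to_r8_original` are made up for this illustration. It shows that the 23-bit mantissa is OR-ed into a field that only has room for 20 bits in the high dword, so its top three bits land on the exponent.

```python
import struct

def r4_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def r4_to_r8_original(x):
    """Hypothetical Python port of the routine as first posted."""
    bits = r4_bits(x)
    expo = ((bits >> 23) & 0xFF) + (1023 - 127)  # rebias the exponent
    mant = bits & 0x7FFFFF                       # 23-bit mantissa
    high = (expo << 20) | mant                   # not shifted: bits 20..22 hit the exponent
    high |= bits & 0x80000000                    # copy the sign bit
    low = 0                                      # low dword is zeroed, as in the original
    return struct.unpack("<d", struct.pack("<II", low, high))[0]

print(r4_to_r8_original(1.2))  # about 1.6000004, as in the posted output
print(r4_to_r8_original(1.0))  # 1.0: the mantissa is zero, so nothing collides
```

Values with an all-zero mantissa (1.0, 0.5) survive, which matches the few correct lines in the output above.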

aw27 · July 24, 2017, 09:04:37 PM

I did not run your code but it appears that you forgot that real8 is a qword.

shl esi, 20 ; 52-32

This looks wrong.

jj2007 · July 24, 2017, 09:46:58 PM

The shl puts the expo into its position, and that part works.
No, the issue is here:
add esi, 1023-127
shr ecx, 3
and ecx, ...
mov eax, rx4
shl eax, 29
mov [edx], eax

Works! Internet is down here, so I am posting from a mobile. Horrible.
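The fix described in this post can be checked against a reference conversion. Below is a minimal Python sketch, assuming normal (non-zero, non-denormal, finite) inputs only; `r4_to_r8` and `as_r4` are illustrative names, not part of the attached project. The top 20 mantissa bits go into the high dword and the remaining 3 bits are shifted to the top of the low dword.

```python
import struct

def r4_to_r8(x):
    """Widen a REAL4 to a REAL8 by bit manipulation (normal numbers only)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits & 0x80000000
    expo = ((bits >> 23) & 0xFF) + (1023 - 127)  # rebias the exponent
    mant = bits & 0x7FFFFF                       # 23-bit mantissa
    high = sign | (expo << 20) | (mant >> 3)     # top 20 mantissa bits
    low = (mant << 29) & 0xFFFFFFFF              # bottom 3 bits end up in bits 29..31
    return struct.unpack("<d", struct.pack("<II", low, high))[0]

def as_r4(x):
    """Reference: round x to REAL4 and let Python widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

for v in (-123456.789, -6.7890625, 0.1, 1.2, 49373.2109):
    assert r4_to_r8(v) == as_r4(v)  # difference must be exactly zero
```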

mabdelouahab · July 24, 2017, 10:50:42 PM

I think the problem is not in the R4ToR8 proc. Maybe the problem is in For_ or in Print with the REAL4. Try to output it as binary and see.

aw27 · July 24, 2017, 11:13:35 PM

R4ToR8 proc uses ebx esi ecx rx4:REAL4, pR8
  mov esi, rx4	; load real4 into one register
  mov ecx, esi
  
  shr esi, 23
  and esi, 11111111b	; 8 bits expo
  add esi, 1023-127	; new bias
  and ecx, 11111111111111111111111b	; 23 bits mantissa
  if 0
	shl ecx, 1	; seems to "improve" the results but...
  endif
  shl esi, 20		; 52-32
  mov ebx, ecx
  shr ebx, 3
  or esi, ebx
;  and ecx, 111b no need
  shl ecx, 29
  
  test byte ptr rx4[3], 128	; is passed REAL4 negative?
  .if Sign?
	or esi, 10000000000000000000000000000000b	; make REAL8 negative, too
  .endif
  mov edx, pR8
  mov dword ptr [edx], ecx	; fixed on 28th August (previously was "and dword ptr [edx], ecx")
  mov dword ptr [edx+4], esi
  ret
R4ToR8 endp

This works.

jj2007 · July 25, 2017, 10:09:47 AM

Internet is back :lol:

Here is the correct version with the changes announced above:

R4ToR8 proc uses esi ecx rx4:REAL4, pR8
  mov esi, rx4	; load real4 into one register
  mov ecx, esi
  shr esi, 23
  and esi, 11111111b	; 8 bits expo
  add esi, 1023-127	; new bias
  mov eax, ecx	; save last 3 bits
  shr ecx, 3		; bingo!
  and ecx, 11111111111111111111111b	; 23 bits mantissa
  shl esi, 20		; 52-32
  or esi, ecx		; merge expo and mantissa
  test byte ptr rx4[3], 128	; is passed REAL4 negative?
  .if Sign?
	or esi, 10000000000000000000000000000000b	; make REAL8 negative, too
  .endif
  mov edx, pR8
  mov eax, rx4
  shl eax, 32-3
  mov dword ptr [edx], eax	; put the less significant bits of the mantissa
  mov dword ptr [edx+4], esi
  ret
R4ToR8 endp

Project attached, including José's almost correct version. The exe shows the difference between the original and the converted number, which should be zero, of course. Start from a DOS prompt, I forgot to put an inkey 8)

aw27 · July 25, 2017, 02:24:15 PM

Quote
Project attached, including José's almost correct version.

You are so funny. I can't spot any error in my code, but I can spot some in yours:

1)
mov eax, ecx   ; save last 3 bits
You save but don't reuse the saved value.
2)
shr ecx, 3      ; bingo!
and ecx, 11111111111111111111111b   ; 23 bits mantissa
You use a 23-bit bitmask after you shift right to keep only 20 bits.
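A side note on point 2, offered as a reading of the code rather than something stated in the thread: the 23-bit mask lets bits 20..22 through, and after the shift those hold the low three bits of the REAL4 exponent. Since the rebias constant 1023-127 = 896 has its low seven bits clear, the rebiased exponent has the same low three bits, so the OR changes nothing. A small Python check, with `high_dword` as a hypothetical port of the high-dword computation without the sign:

```python
REBIAS = 1023 - 127  # 896 = 0b1110000000, the low 7 bits are zero

def high_dword(bits, mask):
    """High dword as built in the posted routine, sign omitted (hypothetical port)."""
    expo = ((bits >> 23) & 0xFF) + REBIAS
    return (expo << 20) | ((bits >> 3) & mask)

# Compare the 23-bit mask with a 20-bit mask for every exponent and a few mantissas
for e in range(256):
    for m in (0, 1, 0x19999A, 0x7FFFFF):
        bits = (e << 23) | m
        assert high_dword(bits, 0x7FFFFF) == high_dword(bits, 0xFFFFF)
```

So the wider mask is misleading to read but does not affect the result; a 20-bit mask would express the intent more clearly.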

jj2007 · July 25, 2017, 07:59:31 PM

Thank you, José - you are right, the mov eax, ecx was redundant. Here is the final version:

R4ToR8 proc rx4:REAL4, pR8 ; version jj
  mov edx, rx4 ; load real4 into one register
  mov eax, edx ; make a copy
  shr edx, 23
  movzx edx, dl ; 8 bits expo
  add edx, 1023-127 ; new bias
  shr eax, 3 ; bingo!
  and eax, 11111111111111111111111b ; 23 bits mantissa
  shl edx, 20 ; 52-32
  or edx, eax ; merge expo and mantissa
  test byte ptr rx4[3], 128 ; is passed REAL4 negative?
  .if Sign?
	bts edx, 31 ; make REAL8 negative, too
  .endif
  push edx
  mov eax, pR8
  mov edx, rx4
  shl edx, 32-3
  mov dword ptr [eax], edx ; put the less significant bits of the mantissa
  pop dword ptr [eax+4] ; expo and most significant bits of mantissa
  ret
R4ToR8 endp

And here is the relevant output:

-- R4ToR8 jj --
R4: -123456.789  R8: -123456.789  diff: 0.0     10100000000000000000000000000000.11000000111111100010010000001100
R4: -111111.789  R8: -111111.789  diff: 0.0     10100000000000000000000000000000.11000000111110110010000001111100
R4: -98766.7891  R8: -98766.7891  diff: 0.0     10100000000000000000000000000000.11000000111110000001110011101100
R4: -86421.7891  R8: -86421.7891  diff: 0.0     10100000000000000000000000000000.11000000111101010001100101011100
R4: -74076.7891  R8: -74076.7891  diff: 0.0     10100000000000000000000000000000.11000000111100100001010111001100
R4: -61731.7891  R8: -61731.7891  diff: 0.0     01000000000000000000000000000000.11000000111011100010010001111001
R4: -49386.7891  R8: -49386.7891  diff: 0.0     01000000000000000000000000000000.11000000111010000001110101011001
R4: -37041.7891  R8: -37041.7891  diff: 0.0     01000000000000000000000000000000.11000000111000100001011000111001
R4: -24696.7891  R8: -24696.7891  diff: 0.0     10000000000000000000000000000000.11000000110110000001111000110010
R4: -12351.7891  R8: -12351.7891  diff: 0.0     00000000000000000000000000000000.11000000110010000001111111100101
R4: -6.78906250  R8: -6.78906250  diff: 0.0     00000000000000000000000000000000.11000000000110110010100000000000
R4: 12338.2109   R8: 12338.2109   diff: 0.0     00000000000000000000000000000000.01000000110010000001100100011011
R4: 24683.2109   R8: 24683.2109   diff: 0.0     10000000000000000000000000000000.01000000110110000001101011001101
R4: 37028.2109   R8: 37028.2109   diff: 0.0     11000000000000000000000000000000.01000000111000100001010010000110
R4: 49373.2109   R8: 49373.2109   diff: 0.0     11000000000000000000000000000000.01000000111010000001101110100110
-- R4ToR8 aw27 --
R4: -123456.789  R8: -123456.781  diff: 0.0078  10000000000000000000000000000000.11000000111111100010010000001100
R4: -111111.789  R8: -111111.781  diff: 0.0078  10000000000000000000000000000000.11000000111110110010000001111100
R4: -98766.7891  R8: -98766.7812  diff: 0.0078  10000000000000000000000000000000.11000000111110000001110011101100
R4: -86421.7891  R8: -86421.7812  diff: 0.0078  10000000000000000000000000000000.11000000111101010001100101011100
R4: -74076.7891  R8: -74076.7812  diff: 0.0078  10000000000000000000000000000000.11000000111100100001010111001100
R4: -61731.7891  R8: -61731.7812  diff: 0.0078  00000000000000000000000000000000.11000000111011100010010001111001
R4: -49386.7891  R8: -49386.7812  diff: 0.0078  00000000000000000000000000000000.11000000111010000001110101011001
R4: -37041.7891  R8: -37041.7812  diff: 0.0078  00000000000000000000000000000000.11000000111000100001011000111001
R4: -24696.7891  R8: -24696.7812  diff: 0.0078  00000000000000000000000000000000.11000000110110000001111000110010
R4: -12351.7891  R8: -12351.7891  diff: 0.0     00000000000000000000000000000000.11000000110010000001111111100101
R4: -6.78906250  R8: -6.78906250  diff: 0.0     00000000000000000000000000000000.11000000000110110010100000000000
R4: 12338.2109   R8: 12338.2109   diff: 0.0     00000000000000000000000000000000.01000000110010000001100100011011
R4: 24683.2109   R8: 24683.2031   diff: -0.0078 00000000000000000000000000000000.01000000110110000001101011001101
R4: 37028.2109   R8: 37028.1875   diff: -0.023  00000000000000000000000000000000.01000000111000100001010010000110
R4: 49373.2109   R8: 49373.1875   diff: -0.023  00000000000000000000000000000000.01000000111010000001101110100110
62      bytes for R4ToR8
72      bytes for R4ToR8aw

Project attached. As before, "diff" is the difference between the Real4 and the converted Real8, and it should be zero, of course. This time there is an Inkey at the end, so you can start it directly from the archive (old versions included for comparison).
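The size of the differences in the second table follows from the three mantissa bits that belong in the low dword. If they are not all stored (for example when the low dword is combined with the old memory content by `and` rather than written with `mov`), the error is a small multiple of the REAL4 spacing at that magnitude. A minimal Python sketch, assuming the worst case where the destination dword was zero; `widen` and its `low_mask` parameter are made up for this illustration:

```python
import struct

def widen(x, low_mask):
    """Widen a REAL4 to a REAL8, keeping only the low-dword bits set in low_mask."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mant = bits & 0x7FFFFF
    expo = ((bits >> 23) & 0xFF) + (1023 - 127)
    high = (bits & 0x80000000) | (expo << 20) | (mant >> 3)
    low = ((mant << 29) & 0xFFFFFFFF) & low_mask
    return struct.unpack("<d", struct.pack("<II", low, high))[0]

x = 37028.2109
exact = widen(x, 0xFFFFFFFF)  # all three low mantissa bits stored
lossy = widen(x, 0x00000000)  # destination was zero, so the AND stores nothing
print(exact, lossy, lossy - exact)  # 37028.2109375 37028.1875 -0.0234375
```

This reproduces the 37028.1875 and the -0.023 shown for that value in the aw27 table.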

Adamanteus · July 26, 2017, 12:24:12 AM

Denormalised values do not appear to be handled, so why not use the coprocessor for it:

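R4ToR8 proc rx4:REAL4, pR8
  fld rx4	; load the REAL4 onto the FPU stack
  mov edx, pR8	; edx is a scratch register, so nothing needs to be preserved
  fstp qword ptr [edx]	; store as REAL8 (qword, not dword)
  ret
R4ToR8 endp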
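The remark about denormalised values can be illustrated with a model of the rebias-and-shift approach. This is a minimal Python sketch, not the code from the attached project; `r4_to_r8_rebias` is a hypothetical name. It suggests that zero, infinity and REAL4 denormals all need separate handling if the conversion is done by bit manipulation, because an exponent field of 0 or 255 is rebiased like any other.

```python
import math
import struct

def r4_to_r8_rebias(x):
    """Rebias-and-shift widening, valid for normal numbers only (hypothetical port)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mant = bits & 0x7FFFFF
    expo = ((bits >> 23) & 0xFF) + (1023 - 127)
    high = (bits & 0x80000000) | (expo << 20) | (mant >> 3)
    low = (mant << 29) & 0xFFFFFFFF
    return struct.unpack("<d", struct.pack("<II", low, high))[0]

print(r4_to_r8_rebias(1.5))       # 1.5, a normal number converts correctly
print(r4_to_r8_rebias(0.0))       # 2**-127 instead of 0.0
print(r4_to_r8_rebias(math.inf))  # 2**128 instead of inf
print(r4_to_r8_rebias(1e-40))     # a REAL4 denormal comes out far too large
```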

jj2007 · July 26, 2017, 12:34:40 AM

Quote from: Adamanteus on July 26, 2017, 12:24:12 AM
why not use coprocessor for it

Quote from: jj2007 on July 24, 2017, 08:38:42 PM
And of course, I know that fld R4, fstp R8 is shorter and faster.

jj2007 · July 31, 2017, 09:08:11 AM

In the meantime, I have polished the routines a little bit, and made them fit for conversions between REAL16 and other numbers. The syntax is hopefully simple enough - FpuPush somequad and Quad(somenumber):

include \masm32\MasmBasic\MasmBasic.inc
SetGlobals REAL16 quadNumber
SetGlobals MyFloat:REAL4=1234567890.1234567890
SetGlobals MyDouble:REAL8=1234567890.1234567890
SetGlobals MyReal10:REAL10=1234567890.1234567890
Init quad
MovVal quadNumber, "1234567890.123456789012345678901234567890" ; convert string to REAL16
PrintLine "Real16= ", Tb$, Quad$(quadNumber) ; print a REAL16 at full 33 digit precision
FpuPush quadNumber ; convert REAL16 and push it on the FPU
fstp MyReal10 ; save as REAL10
PrintLine Str$("Real10=\t\t%Jf", MyReal10) ; and print it
PrintLine "Real10= ", Tb$, Quad$(Quad(MyReal10)) ; convert REAL10 to REAL16 and print it
PrintLine "Real8= ", Tb$, Quad$(Quad(MyDouble)) ; convert REAL8 to REAL16 and print it
movups quadNumber, Quad(MyFloat) ; convert REAL4 to REAL16
PrintLine "Real4= ", Tb$, Quad$(quadNumber) ; and print it
Inkey "ok?"
EndOfCode

For the time being, building this code requires the 30 July beta (and running it needs two GCC DLLs).

Output:

Real16=         1.23456789012345678901234567890123e+09
Real10=         1234567890.123456789
Real10=         1.23456789012345678894780576229095e+09
Real8=          1.23456789012345671653747558593750e+09
Real4=          1.23456793600000000000000000000000e+09
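The Real4 line is what one would expect from the 24-bit precision of a REAL4. A short Python check, assuming only the standard `struct` module (the variable names are illustrative): between 2**30 and 2**31 adjacent REAL4 values are 128 apart, so 1234567890.123... rounds to the nearest multiple of 128.

```python
import struct

value = 1234567890.1234567890                                # parsed by Python as a REAL8 (double)
as_real4 = struct.unpack("<f", struct.pack("<f", value))[0]  # round to REAL4, widen back

print(f"{as_real4:.1f}")  # 1234567936.0
print(2.0 ** (30 - 23))   # 128.0: spacing of REAL4 values between 2**30 and 2**31
```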

jj2007 · August 30, 2017, 03:31:56 AM

José has posted a new version of his code, and it works now :t

-- R4ToR8 jj --
423 ms
-- R4ToR8 aw27 --
405 ms
-- R4ToR8 jj --
R4: -123456.789      R8: -123456.789       diff: 0.0
R4: -111111.789      R8: -111111.789       diff: 0.0
R4: -98766.7891      R8: -98766.7891       diff: 0.0
R4: -86421.7891      R8: -86421.7891       diff: 0.0
R4: -74076.7891      R8: -74076.7891       diff: 0.0
R4: -61731.7891      R8: -61731.7891       diff: 0.0
R4: -49386.7891      R8: -49386.7891       diff: 0.0
R4: -37041.7891      R8: -37041.7891       diff: 0.0
R4: -24696.7891      R8: -24696.7891       diff: 0.0
R4: -12351.7891      R8: -12351.7891       diff: 0.0
R4: -6.78906250      R8: -6.78906250       diff: 0.0
R4: 12338.2109       R8: 12338.2109        diff: 0.0
R4: 24683.2109       R8: 24683.2109        diff: 0.0
R4: 37028.2109       R8: 37028.2109        diff: 0.0
R4: 49373.2109       R8: 49373.2109        diff: 0.0
-- R4ToR8 aw27 --
R4: -123456.789      R8: -123456.789       diff: 0.0
R4: -111111.789      R8: -111111.789       diff: 0.0
R4: -98766.7891      R8: -98766.7891       diff: 0.0
R4: -86421.7891      R8: -86421.7891       diff: 0.0
R4: -74076.7891      R8: -74076.7891       diff: 0.0
R4: -61731.7891      R8: -61731.7891       diff: 0.0
R4: -49386.7891      R8: -49386.7891       diff: 0.0
R4: -37041.7891      R8: -37041.7891       diff: 0.0
R4: -24696.7891      R8: -24696.7891       diff: 0.0
R4: -12351.7891      R8: -12351.7891       diff: 0.0
R4: -6.78906250      R8: -6.78906250       diff: 0.0
R4: 12338.2109       R8: 12338.2109        diff: 0.0
R4: 24683.2109       R8: 24683.2109        diff: 0.0
R4: 37028.2109       R8: 37028.2109        diff: 0.0
R4: 49373.2109       R8: 49373.2109        diff: 0.0
73      bytes for R4ToR8jj
72      bytes for R4ToR8aw

Two projects attached, #1 as shown above, #2 includes timings of the MasmBasic RealX->Real16 conversion macro, Quad().

The MASM Forum