The MASM Forum

General => The Workshop => Topic started by: guga on May 23, 2025, 10:17:45 PM

Title: FputoString format
Post by: guga on May 23, 2025, 10:17:45 PM
Hi Guys

I'm updating a FputoString function and have some doubts about the scientific output format. With the normal printf, it seems that a floating-point 0 can be represented as 0.00000e+00, 0.00000e-00, 0.000000, etc., right?

My question is: is this format for representing 0 really necessary? Wouldn't it be simpler to represent it as a single 0?

What's the purpose of using formats such as 0.00000e+00, 0.000, etc.?
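For reference, a small sketch of what the printf-style conversions produce for zero. Python's % operator follows the C printf conventions for %e, %f and %g, so it is used here as a stand-in; the exponent width can differ between C runtimes, and a conforming printf is not expected to print "e-00" for zero.

```python
# What the printf-style conversions emit for zero.
# Python's % operator mirrors C's %e / %f / %g for these cases.
zero = 0.0

sci = "%.5e" % zero    # scientific notation, 5 decimals
fixed = "%f" % zero    # fixed notation, default 6 decimals
short = "%g" % zero    # shortest form, trailing zeros dropped

print(sci, fixed, short)
```

So the single "0" is already available through %g; the longer forms exist because the caller asked for a fixed number of decimals.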
Title: Re: FputoString format
Post by: daydreamer on May 24, 2025, 12:35:57 AM
Quote from: guga on May 23, 2025, 10:17:45 PM
Hi Guys

I'm updating a FputoString function and have some doubts about the scientific output format. With the normal printf, it seems that a floating-point 0 can be represented as 0.00000e+00, 0.00000e-00, 0.000000, etc., right?

My question is: is this format for representing 0 really necessary? Wouldn't it be simpler to represent it as a single 0?

What's the purpose of using formats such as 0.00000e+00, 0.000, etc.?
REAL4 and DWORD zeros share the same encoding (all bits clear), so if you want, you can test for zero with a conditional jump and print "0" when you find one.
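A minimal sketch of that check, assuming IEEE-754 single precision for REAL4. The helper names real4_bits and is_zero_real4 are made up for illustration. Note that -0.0 has the sign bit set, so a plain compare against the integer 0 only catches +0.0.

```python
import struct

def real4_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as an IEEE-754 single (REAL4)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def is_zero_real4(bits: int) -> bool:
    """True for +0.0 and -0.0: everything except the sign bit is clear."""
    return (bits & 0x7FFFFFFF) == 0

pos_zero = real4_bits(0.0)    # same bits as the integer 0
neg_zero = real4_bits(-0.0)   # only the sign bit is set
```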
 
Title: Re: FputoString format
Post by: FORTRANS on May 24, 2025, 01:14:05 AM
Hi,

Quote from: guga on May 23, 2025, 10:17:45 PM
What's the purpose of using formats such as 0.00000e+00, 0.000, etc.?

   The first thing that comes to mind is output alignment.  Or "pretty
printing".

Regards,

Steve N.
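To illustrate the alignment point, here is a small sketch (Python's printf-style formatting standing in for C's; the sample values are arbitrary). With a fixed width and precision every row has the same length, so zero has to be printed as 0.00000e+00 to keep the column intact.

```python
values = [0.0, 1.5, -273.15, 6.02214076e23]

# Fixed-width scientific format: every row has the same length,
# so the decimal points and exponents line up in a column.
rows = ["%13.5e" % v for v in values]
for row in rows:
    print(row)
```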
Title: Re: FputoString format
Post by: raymond on May 24, 2025, 02:52:47 AM
Quote
Shouldn't it be simpler to represent it as a single 0?

That was the philosophy for the FpuFLtoA function of the Fpulib, although it did include the possibility of front padding with spaces for potential "alignment" of the decimal delimiter when specified.

If you can have a logical reason to offer some other different option in your own functions, it would be entirely up to you (and be entirely acceptable as always).
Title: Re: FputoString format
Post by: guga on May 24, 2025, 03:41:25 AM
Thanks guys

Hi Raymond, thanks. I was thinking about removing this and keeping just 0, but maybe it would be better to add an optional flag in case someone wants this kind of output. I don't see any use for it, but maybe it will be useful for others.
Title: Re: FputoString format
Post by: NoCforMe on May 24, 2025, 05:07:30 AM
Quote from: guga on May 23, 2025, 10:17:45 PM
Hi Guys

I'm updating a FputoString function and have some doubts about the scientific output format. With the normal printf, it seems that a floating-point 0 can be represented as 0.00000e+00, 0.00000e-00, 0.000000, etc., right?

My question is: is this format for representing 0 really necessary? Wouldn't it be simpler to represent it as a single 0?

What's the purpose of using formats such as 0.00000e+00, 0.000, etc.?

False precision is all I can think of.

Heh; I'm always amused by seeing things on Microsoft Learn where they give constants like "0x00000001", where a simple "1" would suffice.
Title: Re: FputoString format
Post by: guga on May 24, 2025, 11:44:38 AM
Hi David. Yes, I agree. I usually opt for the simplest way. I'm updating another function that I created specifically for these conversions, but I got confused when I looked at the format that the M$ printf function exported when it came to FPU. I found it very strange. Unfortunately, it seems that some people use this, so I'm going to try to do as Raymond suggested and implement this as an option. The problem is trying to do it in a way that doesn't change the functionality too much (I mean, the processing speed). The worst thing is that I've already finished the function, it was ready, but I tried to look at this M$ function to see if I could adapt it to other ways of representing the FPU and I came across this weird format of 0.000e+00 etc.

Title: Re: FputoString format
Post by: NoCforMe on May 24, 2025, 12:36:45 PM
Quote from: guga on May 24, 2025, 11:44:38 AM
Hi David. Yes, I agree. I usually opt for the simplest way. I'm updating another function that I created specifically for these conversions, but I got confused when I looked at the format that the M$ printf function exported when it came to FPU. I found it very strange. Unfortunately, it seems that some people use this, so I'm going to try to do as Raymond suggested and implement this as an option. The problem is trying to do it in a way that doesn't change the functionality too much (I mean, the processing speed).

Again, I have to ask: WHY?
Who the hell cares how many more microseconds your conversion routine takes?
Think about it: you're converting a binary value to a visible format. Not some other internal binary format.
Which means that, by necessity, this is going to be used for some kind of output display, where speed is not of the essence. Not for converting into some other format where speed is important, like processing a huge database or some such.

Worst case, you could probably use wsprintf() to accomplish this formatting. (I use that function all the time.)

Just code the damn thing and be done with it.
Title: Re: FputoString format
Post by: guga on May 24, 2025, 12:50:29 PM
Quote
Which means that, by necessity, this is going to be used for some kind of output display, where speed is not of the essence. Not for converting into some other value where speed is important, like processing a huge database or some such.
Indeed. It makes sense.
Title: Re: FputoString format
Post by: daydreamer on May 25, 2025, 01:50:28 PM
I agree with David, because afterwards the print or the SendMessage to a GUI control takes milliseconds anyway.
Title: Re: FputoString format
Post by: sinsi on May 25, 2025, 03:19:43 PM
If you write to a file or capture some output it might need to be a specific format when it gets parsed?
That's probably why printf etc differentiate between decimal and scientific output.

99.9% of the time it won't matter.

Then again, the FPU can have -0 and +0  :badgrin:
Title: Re: FputoString format
Post by: daydreamer on May 26, 2025, 01:12:19 AM
Quote from: sinsi on May 25, 2025, 03:19:43 PM
If you write to a file or capture some output it might need to be a specific format when it gets parsed?
That's probably why printf etc differentiate between decimal and scientific output.

99.9% of the time it won't matter.

Then again, the FPU can have -0 and +0  :badgrin:

The latest calculator I made supports showing the Unicode infinity and -infinity symbols.

Title: Re: FputoString format
Post by: NoCforMe on May 26, 2025, 04:36:58 AM
Quote from: sinsi on May 25, 2025, 03:19:43 PM
Then again, the FPU can have -0 and +0  :badgrin:

True, but do we really need to display that?
I mean, when does it matter to us which zero the FPU is giving us?
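One case where the sign of zero is visible, as a small sketch: the two zeros compare equal, but functions with a branch cut (atan2 here) and printf-style output both look at the sign bit. This only shows that the distinction exists; whether a conversion routine should print it is still a design choice.

```python
import math

pos, neg = 0.0, -0.0

same_value = (pos == neg)                 # the two zeros compare equal
sign_of_neg = math.copysign(1.0, neg)     # but the sign bit is still there

# The sign of zero selects the branch of atan2 along the negative x axis.
angle_pos = math.atan2(pos, -1.0)         # +pi
angle_neg = math.atan2(neg, -1.0)         # -pi

text_neg = "%g" % neg                     # printf-style output keeps the sign
```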
Title: Re: FputoString format
Post by: guga on May 26, 2025, 06:25:53 AM
Quote from: NoCforMe on May 26, 2025, 04:36:58 AM
Quote from: sinsi on May 25, 2025, 03:19:43 PM
Then again, the FPU can have -0 and +0  :badgrin:

True, but do we really need to display that?
I mean, when does it matter to us which zero the FPU is giving us?

Yeah, I saw that too. But it's useless IMHO (unless we are working with imaginary numbers, I suppose). Allowing the function to output only +Infinite or -Infinite is enough.
Title: Re: FputoString format
Post by: daydreamer on May 26, 2025, 07:15:06 AM
Unicode Character "∞" (U+221E)
For those who want to use it in a GUI control, send it with invoke SendMessageW


Title: Re: FputoString format
Post by: NoCforMe on May 26, 2025, 08:27:25 AM
Quote from: daydreamer on May 26, 2025, 07:15:06 AM
Unicode Character "∞" (U+221E)
For those who want to use it in a GUI control, send it with invoke SendMessageW

I think "+INF" and "-INF" would be far safer bets than trying to display Unicode characters.
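A minimal sketch of that choice, with ASCII labels as the default and the Unicode symbol as an opt-in. The function name special_text and its use_unicode flag are hypothetical, not part of any routine discussed in this thread.

```python
import math

def special_text(x: float, use_unicode: bool = False):
    """Return a label for infinities and NaNs, or None for ordinary numbers."""
    if math.isnan(x):
        return "NAN"
    if math.isinf(x):
        sign = "-" if x < 0 else "+"
        # U+221E needs a Unicode-aware output path; "INF" is plain ASCII.
        return sign + ("\u221e" if use_unicode else "INF")
    return None
```

The ASCII labels survive any code page or log file, which is the "safer bet" argument; the Unicode form only makes sense when the caller knows the output goes to a wide-character control.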
Title: Re: FputoString format
Post by: guga on May 26, 2025, 09:15:42 AM
Quote from: NoCforMe on May 26, 2025, 08:27:25 AM
Quote from: daydreamer on May 26, 2025, 07:15:06 AM
Unicode Character "∞" (U+221E)
For those who want to use it in a GUI control, send it with invoke SendMessageW

I think "+INF" and "-INF" would be far safer bets than trying to display Unicode characters.

Indeed. Currently, the function outputs these messages in case of errors:

And for internal debugging, I added this while I'm testing for valid numbers:

I believe this covers all common FPU errors. I'm not sure if there are additional error modes to output.

And when the function identifies a positive or negative subnormal value, it tries to normalize it by multiplying the input (which is always converted to a tenbyte first) by 2^64 and immediately multiplying the result by the reciprocal, 1/(2^64) (the same as dividing it again by 2^64).
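A small sketch of why scaling by a power of two is safe here, using Python doubles as a stand-in for the 80-bit format (an assumption; the exponent ranges differ, the principle is the same). Multiplying by 2^64 only changes the exponent, so no mantissa bits are lost and the round trip is exact.

```python
import math
import sys

tiny = 5e-324                       # smallest positive subnormal double (2**-1074)
is_subnormal = 0.0 < tiny < sys.float_info.min

scaled = math.ldexp(tiny, 64)       # multiply by 2**64: exact, result is a normal number
restored = math.ldexp(scaled, -64)  # multiply by 2**-64: exact, back to the input

mantissa, exponent = math.frexp(tiny)   # normalized view of the same value
```

If the scaled value is used for digit extraction instead of being scaled back, the decimal exponent has to be corrected for the factor that was applied.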
Title: Re: FputoString format
Post by: NoCforMe on May 26, 2025, 12:51:10 PM
Quote from: guga on May 26, 2025, 09:15:42 AM
Indeed. Currently, the function outputs these messages in case of errors:
  • QNAN
  • SNAN
Are those "quiet" and "signaling" NANs? Do users really want to be informed of internal FPU conditions at that level of detail?

Have you checked with the FPU-meister here, Raymond F? (I don't know enough about it to be of much help.) He should be able to tell us what information here is relevant and what isn't.
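For readers wondering where the QNAN/SNAN distinction comes from, here is a rough sketch of classifying a 10-byte x87 extended value from its raw bytes. This is not the RealTenFPUNumberCategory routine used later in the thread (its source is not shown); the function name and the returned labels are made up for illustration.

```python
def classify_real10(raw: bytes) -> str:
    """Classify a 10-byte little-endian x87 extended-precision value."""
    if len(raw) != 10:
        raise ValueError("expected 10 bytes")
    mantissa = int.from_bytes(raw[:8], "little")   # 64 bits, explicit integer bit on top
    sign_exp = int.from_bytes(raw[8:], "little")   # sign bit + 15-bit exponent
    sign = "-" if sign_exp & 0x8000 else "+"
    exponent = sign_exp & 0x7FFF
    integer_bit = mantissa >> 63
    fraction = mantissa & 0x7FFFFFFFFFFFFFFF

    if exponent == 0:
        if mantissa == 0:
            return sign + "ZERO"
        return sign + "SUBNORMAL"      # also covers pseudo-denormals (integer bit set)
    if exponent == 0x7FFF:
        if not integer_bit:
            return "UNSUPPORTED"       # pseudo-infinity / pseudo-NaN encodings
        if fraction == 0:
            return sign + "INF"
        # The top fraction bit (bit 62) is the "quiet" bit.
        return "QNAN" if fraction >> 62 else "SNAN"
    if not integer_bit:
        return "UNSUPPORTED"           # unnormal encoding
    return sign + "NORMAL"
```

For display purposes the two NaN kinds could be merged into a single "NAN"; the finer split is mainly of interest in a debugger, which is the use case guga describes.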
Title: Re: FputoString format
Post by: guga on May 26, 2025, 03:12:28 PM
Hi David. Yeah, I'm using a variation of his function to check for the categories. I'm just trying to output as many as possible, so it can be used in debuggers for additional information, for example.

RosAsm uses this information in the debugger (in a similar way to OllyDbg or IDA Pro), but I'm updating the function so that it is more portable and others can use it as well.

This is the older FloatToAscii routine, based on Raymond's FPU library. By the way, I'll try porting this to MASM once I'm done.
; Flags:

[REGULAR 0    SCIENTIFIC 1]

Proc FloatToAscii:
    Arguments @Source, @Destination, @Decimal, @Flag
    Local @temporary, @eSize, @oldcw, @truncw, @stword
    Structure @BCD 12, @bcdstr 0

        fclex                   ;clear exception flags on FPU

      ; Get the specified number of decimals for result (MAX = 15):
        On D@Decimal > 0F, mov D@Decimal 0F

      ; The FPU will be initialized only if the source parameter is not taken
      ; from the FPU itself (D@ Source <> &NULL):
        .If D@Source = &NULL
            fld st0             ;copy it to preserve the original value
        .Else
            mov eax D@Source
            If eax > 0400_000
                finit | fld T$eax
              ; Check first if value on FPU is valid or equal to zero:
                ftst                    ;test value on FPU
                fstsw W@stword          ;get result
                test W@stword 04000     ;check it for zero or NAN
                jz L0>                  ;continue if valid non-zero
                test W@stword 0100      ;now check it for NAN
                jnz L1>                 ;Src is NAN or infinity - cannot convert
                  ; Here: Value to be converted = 0
                    mov eax D@Destination | mov W$eax '0' ; Write '0', 0 szstring
                    mov eax &TRUE | finit | ExitP
            Else
L1:             finit | mov eax &FALSE | ExitP
            End_If
        .End_If

      ; Get the size of the number:
L0:     fld st0                 ;copy it
        fabs                    ;ensures a positive value
        fld1 | fldl2t
        fdivp ST1 ST0           ;->1/[log2(10)]
        fxch | fyl2x            ;->[log2(Src)]/[log2(10)] = log10(Src)

        fstcw W@oldcw           ;get current control word
        mov ax W@oldcw
        or ax 0C00              ;code it for truncating
        mov W@truncw ax
        fldcw W@truncw          ;change rounding code of FPU to truncate

        fist D@eSize            ;store characteristic of logarithm
        fldcw W@oldcw           ;load back the former control word

        ftst                    ;test logarithm for its sign
        fstsw W@stword          ;get result
        test W@stword 0100      ;check if negative
        jz L0>
            dec D@eSize

L0:     On D@eSize > 15, mov D@Flag SCIENTIFIC

      ; Multiply the number by a power of 10 to generate a 16-digit integer:
L0:     fstp st0                ;get rid of the logarithm
        mov eax 15
        sub eax D@eSize         ;exponent required to get a 16-digit integer
        jz L0>                  ;no need if already a 16-digit integer
            mov D@temporary eax
            fild D@temporary
            fldl2t | fmulp ST1 ST0       ;->log2(10)*exponent
            fld st0 | frndint | fxch
            fsub st0 st1        ;keeps only the fractional part on the FPU
            f2xm1               ;->2^(fractional part)-1
            fld1
            faddp ST1 ST0       ;add 1 back
            fscale              ;re-adjust the exponent part of the REAL number
            fxch
            fstp st0
            fmulp ST1 ST0       ;->16-digit integer

L0:     fbstp T@bcdstr          ;transfer it as a 16-digit packed decimal
        fstsw W@stword          ;retrieve exception flags from FPU
        test W@stword 1         ;test for invalid operation
        jnz L1<<                ;clean-up and return error

      ; Unpack bcd, the 10 bytes returned by the FPU being in the little-endian style:
        push ecx, esi, edi
            lea esi D@bcdstr+9
            mov edi D@Destination
            mov al B$esi        ;sign byte
            dec esi | dec esi
            If al = 080
                mov al minusSign      ;insert sign if negative number
            Else
                mov al Space      ;insert space if positive number
            End_If
            stosb

            ...If D@Flag = REGULAR
              ; Verify number of decimals required vs maximum allowed:
                mov eax 15 | sub eax D@eSize
                cmp eax D@Decimal | jae L0>
                    mov D@Decimal eax

              ; Check for integer digits:
L0:             mov ecx D@eSize
                or ecx ecx           ;is it negative
                jns L3>
                  ; Insert required leading 0 before decimal digits:
                    mov ax '0.' | stosw
                    neg ecx
                    cmp ecx D@Decimal | jbe L0>
                        jmp L8>>

L0:                 dec ecx | jz L0>
                        stosb | jmp L0<
L0:
                    mov ecx D@Decimal | inc ecx
                    add ecx D@eSize | jg L4>
                        jmp L8>>

              ; Do integer digits:
L3:             inc ecx
L0:             movzx eax B$esi | dec esi | ror ax 4 | ror ah 4
                add ax '00' | stosw | sub ecx 2 | jg L0<
                jz L0>
                    dec   edi

L0:             cmp D@Decimal 0 | jz L8>>
                    mov al pointSign | stosb
                    If ecx <> 0
                        mov al ah | stosb
                        mov ecx D@Decimal | dec ecx | jz L8>>
                    Else
                        mov ecx D@Decimal
                    End_If

              ; Do decimal digits:
L4:             movzx eax B$esi
                dec esi
                ror ax 4 | ror ah 4 | add ax 03030 | stosw
                sub ecx 2 | jg L4<
                jz L1>
                    dec edi
L1:             jmp L8>>

          ; scientific notation
            ...Else
                mov ecx D@Decimal | inc ecx
                movzx eax B$esi | dec esi
                ror ax 4 | ror ah 4 | add ax '00' | stosb
                mov al pointSign | stosb
                mov al ah | stosb
                sub ecx 2 | jz L7>
                jns L0>
                    dec edi | jmp L7>
L0:             movzx eax B$esi
                dec esi
                ror ax 4 | ror ah 4
                add ax '00' | stosw | sub ecx 2 | jg L0<
                jz L7>
                    dec edi

L7:             mov al 'E' | stosb
                mov al plusSign, ecx D@eSize | or ecx ecx | jns L0>
                    mov al minusSign | neg ecx
L0:             stosb
              ; Note: the absolute value of the size could not exceed 4931
                mov eax ecx
                mov cl 100
                div cl          ;->thousands & hundreds in AL, tens & units in AH
                push eax
                    and eax 0FF ;keep only the thousands & hundreds
                    mov cl 10
                    div cl      ;->thousands in AL, hundreds in AH
                    add ax '00' ;convert to characters
                    stosw       ;insert them
                pop eax
                shr eax 8       ;get the tens & units in AL
                div cl          ;tens in AL, units in AH
                add ax '00'     ;convert to characters
                stosw           ;insert them
            ...End_If

L8:         mov B$edi Space         ;string terminating character
        pop edi, esi, ecx

        finit | mov eax D@eSize
EndP
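The core idea of the routine above (take floor(log10(|x|)) as the decimal exponent, then scale by a power of ten so the value becomes a 16-digit integer ready for FBSTP) can be sketched in a few lines. This is an illustration in Python doubles, not a port; the function names are made up. The assembly truncates the logarithm and decrements it when negative, which amounts to the same floor.

```python
import math

def decimal_exponent(x: float) -> int:
    """floor(log10(|x|)): power of ten of the leading digit (x must be non-zero)."""
    return math.floor(math.log10(abs(x)))

def sixteen_digits(x: float) -> int:
    """Scale |x| by 10**(15 - exponent) so the result is a 16-digit integer."""
    e = decimal_exponent(x)
    return round(abs(x) * 10 ** (15 - e))

digits = str(sixteen_digits(12345.678))
print(decimal_exponent(12345.678), digits)
```

Close to exact powers of ten the logarithm can round to the wrong side, so a real implementation needs a correction step after scaling.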

And this is the one I'm currently updating.

Proc FloatToString:
    Arguments @Float80Pointer, @InputFlag, @DestinationPointer, @TruncateBytes, @AddSubNormalMsg
    Local @ExponentSize, @ControlWord, @FPUStatusHandle, @tempdw, @extra10x, @FPUMode
    Structure @TmpStringBuff 128, @pTempoAsciiFpuDis 0, @pBCDtempoDis 64, @pTmpInputDis 96
    Uses esi, edi, edx, ecx, ebx

    ; @FPUStatusHandle = 0 Default
    ; @FPUStatusHandle = 1 Increased. Positive exponent value
    ; @FPUStatusHandle = 2 Decreased. Negative exponent value

    call 'RosMem.FastZeroMem' D@TmpStringBuff, 128
    mov D@ExponentSize 0, D@FPUStatusHandle 0, D@extra10x 0 D@FPUMode SpecialFPU_PosValid; D@IsNegative &FALSE,
    mov edi D@DestinationPointer, eax D@Float80Pointer

    ; always work with a copy of the input to prevent it from being changed, especially when the number is subnormal.
    lea ebx D@pTmpInputDis
    finit | fclex | fstcw W@ControlWord ; Save FPU control word, clear exceptions, reset FPU
    If D@InputFlag = FPU_STR_REAL4_INT
        fild F$eax
    Else_if D@InputFlag = FPU_STR_REAL4_FLOAT
        fld F$eax
    Else_if D@InputFlag = FPU_STR_REAL8_INT
        fild R$eax
    Else_if D@InputFlag = FPU_STR_REAL8_FLOAT
        fld R$eax
    Else
        fld T$eax
    End_If
    fstp T$ebx
    fldcw W@ControlWord | fwait         ; Restore FPU control word

    ; Check for zero (positive or negative)
    .If_and D$ebx = 0, D$ebx+4 = 0
        If_Or W$ebx+8 = 0, W$ebx+8 = 08000
            ;mov W$ebx+8 0 ; no longer needed since we are working only with a copy
            mov B$edi '0', B$edi+1 0
            mov eax D@FPUMode
            ExitP
        End_If
    .End_If


    ; Handle sign and special number categories
    call RealTenFPUNumberCategory ebx
    mov D@FPUMode eax
    If eax >= SpecialFPU_QNAN ; do we have any special FPU being used ? Yes, display the proper message and exit
        mov ebx eax
        call WriteFPUErrorMsg eax, edi
        mov eax ebx
        ExitP
    End_If

    Test_If B$ebx+9 0_80
        mov B$edi '-' | inc edi
        xor B$ebx+9 0_80
    Test_End

    ; Special handling for subnormal numbers
    .If_Or D@FPUMode = SpecialFPU_PosSubNormal, D@FPUMode = SpecialFPU_NegSubNormal

        finit | fclex | fstcw W@ControlWord ; Save FPU control word, clear exceptions, reset FPU

        fld T$ebx                           ; Load the subnormal number into ST(0) (e.g., X)
        fld T$Float_NormalizationFactor ; Load 2^60 into ST(0), pushing X to ST(1)
        fmulp ST1 ST0                      ; Compute X * 2^60. Result (normalized) is in ST(0). ST(1) is popped.
                                            ; FPU Stack: ST(0) = X * 2^60

        fld T$Float_DenormalizationFactor ; Load 1/2^60 into ST(0), pushing (X * 2^60) to ST(1)
        fmulp ST1, ST0                      ; Compute (X * 2^60) * (1/2^60). Result (original X, now normalized) is in ST(0).
                                            ; FPU Stack: ST(0) = X (normalized)

        fstp T$ebx                          ; Store the normalized value back to the local copy (ebx -> @pTmpInputDis); the caller's input is untouched

        ; NO ADJUSTMENT TO D@ExponentSize IS NEEDED HERE,
        ; because the number's actual mathematical value hasn't changed.
        ; Its internal representation is just optimized.

        fldcw W@ControlWord | fwait         ; Restore FPU control word
    .End_If


    ; extract the exponent. 1e4933
    finit | fclex | fstcw W@ControlWord
    fld T$ebx
    call GetExponentFromST0 &FPU_EXCEPTION_INVALIDOPERATION__&FPU_EXCEPTION_DENORMALIZED__&FPU_EXCEPTION_ZERODIV__&FPU_EXCEPTION_OVERFLOW__&FPU_EXCEPTION_UNDERFLOW__&FPU_EXCEPTION_PRECISION__&FPU_PRECISION_64BITS
    mov D@ExponentSize eax
    ffree ST0
    .If D@ExponentSize < FPU_ROUND
        fld T$ebx
        fld st0 | frndint | fcomp st1 | fstsw ax
        Test_If ax &FPU_EXCEPTION_STACKFAULT
            lea ecx D@pBCDtempoDis
            fbstp T$ecx     ; -> TBYTE containing the packed digits
            fwait
            lea eax D@pTempoAsciiFpuDis
            lea ecx D@pBCDtempoDis
            call FloatToBCD_SSE ecx, eax
            mov eax FPU_MAXDIGITS+1 | mov ecx D@ExponentSize | sub eax ecx | inc ecx
            lea esi D@pTempoAsciiFpuDis | add esi eax

            If B$esi = '0'
                inc esi | dec ecx
            End_If

            call 'FastCRT.StrncpyEx' edi, esi, ecx | add edi eax
            xor eax eax
            jmp L9>>
        Test_End
        ffree ST0
    .End_If

    ; Necessary for 80-bit FPU values. If the exponent is 0, the correct output is just 0 and not 0.e+XXXXX.
    If D@ExponentSize = 080000000
        mov D@ExponentSize 0
    Else_If D@ExponentSize = 0
        mov D@ExponentSize 0
    End_If

    ; multiply the number by the power of 10 to generate required integer and store it as BCD

    ; We need to extract here all the exponents of a given number and multiply the result by the power of FPU_MAXDIGITS+1 (1e17)
    ; So, if our number is 4.256879e9, the result must be 4.256879e17. If we have 3e-2 the result is 3e17.
    ; If we have 0.1 the result is 1e17 and so on.
    ; If we have as a result a power of 1e16. It means that we need to decrease the iExp by 1, because the original
    ; exponential value is wrong.
    ; This result will be stored in ST0


    ..If D@ExponentSize <s 0
        mov eax D@ExponentSize | neg eax | add eax FPU_MAXDIGITS ; always add MaxDigits-1
        mov edx D@ExponentSize | lea edx D$eax+edx
        .If eax > 4932
            mov edx eax | sub edx 4932 | sub edx FPU_MAXDIGITS
            mov D@extra10x edx | add D@extra10x FPU_MAXDIGITS
            mov eax 4932
            If D@extra10x >= FPU_MAXDIGITS
                inc D@ExponentSize
            End_If
        .Else_If edx >= FPU_MAXDIGITS
            inc D@ExponentSize
        .End_If
    ..Else_If D@ExponentSize > 0
        mov eax FPU_MAXDIGITS+1 | sub eax D@ExponentSize
    ..Else ; Exponent size = 0
        mov eax FPU_MAXDIGITS+1
    ..End_If
    mov D@tempdw eax


    ; Apply scaling
    fild D@tempdw
    call ST0PowerOf10
    fld T$ebx | fmulp ST0 ST1

    If D@extra10x > 0
        ; Calculate the exponential value of the extra digits
        fild F@extra10x
        call ST0PowerOf10
        fmulp ST0 ST1 ; and multiply by it so that we get XXe17 or XXe16
    End_If

    ; now we must get the power of FPU_MAXDIGITS+1. In this case, we will get the value 1e17.
    fstp T$FloatToStr_ScaledValue ; Store the scaled number to memory (pops ST0)

    Fpu_If T$FloatToStr_ScaledValue < T$FloatToStr_Reference
        fld T$FloatToStr_ScaledValue ; Reload the scaled value into ST0
        fmul R$FloatToStr_Ten             ; Perform the multiplication
        dec D@ExponentSize
        fstp T$FloatToStr_ScaledValue ; Store the result back, or use it directly if this is the last FPU operation
    Fpu_End_If

    ; Now, get the FPU status *after* the macro has executed its comparison and potentially fmul.
    ; This will capture the FPU status word including any exception flags from the fmul.
    ; If FPURoundFix needs the comparison result (CF, ZF, PF), this still won't work perfectly
    ; because the SAHF already transferred them to EFLAGS, and we need to get the full FSW.
    ; However, if FPURoundFix mostly cares about *exceptions* (invalid, underflow, overflow), this might suffice.

    fstsw ax | fwait | mov D@FPUStatusHandle eax ; save exception flags. This is the best place to capture all exceptions
                                                 ; including the ones related to the comparisons

    fld T$FloatToStr_ScaledValue ; Reload the final scaled value for fbstp.

    ; Final conversion to BCD
    lea ecx D@pBCDtempoDis
    fbstp T$ecx             ; ->TBYTE containing the packed digits
    fwait

    lea ecx D@pBCDtempoDis
    lea eax D@pTempoAsciiFpuDis
    call FloatToBCD_SSE ecx, eax

    ; Adjust the exponent when exceptions occur and try to fix the rounding of the digits whenever possible
    lea eax D@pTempoAsciiFpuDis
    call FPURoundFix eax, D@FPUStatusHandle, D@ExponentSize, D@TruncateBytes
    mov D@ExponentSize eax

    lea esi D@pTempoAsciiFpuDis | mov ecx D@ExponentSize
    inc ecx

    ..If_And ecx <= FPU_ROUND, ecx > 0
        mov eax 0
        While B$esi <= ' ' | inc esi | End_While
        While B$esi = '0'
            inc esi
            ; In rare cases where ecx = 0-1, esi may contain only '0' characters.
            ; So while skipping them, if everything is '0', we write one single '0' to edi to avoid
            ; an empty string.
            If B$esi = 0
                mov B$edi '0' | inc edi
                jmp L9>>
            End_If
        End_While

        C_call 'FastCRT.FormatStr' edi, {'%s' 0}, esi | add edi D@ExponentSize | inc edi
        add esi D@ExponentSize | inc esi
        .If B$esi <> 0
            mov B$edi '.' | inc edi
            C_call 'FastCRT.FormatStr' edi, {'%s' 0}, esi | add edi eax
            While B$edi-1 = '0' | mov B$edi-1 0 | dec edi | End_While
            If B$edi-1 = '.'
                dec edi
            End_If
        .End_If
    ..Else
        While B$esi <= ' ' | inc esi | End_While
        .While B$esi = '0'
            inc esi
            If B$esi = 0
                mov B$edi '0' | inc edi
                jmp L9>>
            End_If
         .End_While

        .If B$esi <> 0
            movsb | mov B$edi '.' | inc edi
            C_call 'FastCRT.FormatStr' edi, {'%s' 0}, esi | add edi eax
            ; Clean last Zeros at the end of the Number String.
            While B$edi-1 = '0' | mov B$edi-1 0 | dec edi | End_While
            If B$edi-1 = '.'
                dec edi
            End_If

            mov B$edi 'e' | mov eax D@ExponentSize
            mov B$edi+1 '+'

            Test_If eax 0_8000_0000
                neg eax | mov B$edi+1 '-'
            Test_End

            inc edi | inc edi
            C_call 'FastCRT.FormatStr' edi, {'%d' 0}, eax | add edi eax
        .Else
            mov B$edi '0' | inc edi
        .End_If
    ..End_If

    ; For developers only:
    ; Uncomment these calls if you want to analyse the exception modes of the FPU data.
    ;call TestingFPUExceptions D@FPUStatusHandle ; Control Word exceptions
    ;call TestingFPUStatusRegister D@FPUStatusHandle ; Status registers involved in the operation

L9:

    mov B$edi 0
    fldcw W@ControlWord | fwait

    .If D@AddSubNormalMsg = &TRUE
        If_Or D@FPUMode = SpecialFPU_PosSubNormal, D@FPUMode = SpecialFPU_NegSubNormal
            call 'FastCRT.StrCpy' edi, {B$ " (Denormalized)", 0}
        End_If
    .End_If
    mov eax D@FPUMode

EndP
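The last stage of the routine above (strip leading zeros, place the decimal point, trim trailing zeros, and fall back to d.ddde+X outside a plain range) can be sketched like this. It is a simplified illustration, not a translation: format_digits and plain_limit are hypothetical names, with plain_limit standing in for FPU_ROUND, whose value is not shown in the listing.

```python
def format_digits(digits: str, exp10: int, plain_limit: int = 15) -> str:
    """Format a digit string whose first significant digit has weight 10**exp10."""
    digits = digits.lstrip("0")
    if not digits:
        return "0"                       # all zeros: avoid an empty string
    if 0 <= exp10 < plain_limit:
        # Plain notation: exp10 + 1 integer digits, then the fraction.
        digits = digits.ljust(exp10 + 1, "0")
        int_part = digits[:exp10 + 1]
        frac = digits[exp10 + 1:].rstrip("0")
        return int_part + ("." + frac if frac else "")
    # Scientific notation: one digit, point, trimmed fraction, signed exponent.
    frac = digits[1:].rstrip("0")
    mantissa = digits[0] + ("." + frac if frac else "")
    sign = "-" if exp10 < 0 else "+"
    return f"{mantissa}e{sign}{abs(exp10)}"
```

With this layout a zero input never reaches the scientific branch, which matches the early exit for zero at the top of the routine.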

The new version will include the output formats that RosAsm lacks, so one can output the numbers in a wide range of formats as needed.

The goal is to create a general-purpose function and export it from a DLL, and also to add this function as an extension to a FormatString I created for RosAsm (a variation of ws_printf). I'm nearly done. I just need to review the code, clean it up and create the routines to handle non-scientific outputs, such as 0.000121234, 12344.5645, etc.
Title: Re: FputoString format
Post by: daydreamer on May 26, 2025, 03:41:55 PM
Don't forget to also cover the common case of SSE/SSE2 denormals and such: there is a high probability that FP code runs with SSE/SSE2 instead, and even if you don't write SSE/SSE2 code yourself, a math function invoked from a library might be coded in SSE/SSE2.

I am curious whether it works on all OSes used today (Win7, Win8, Win10, Win11) with Unicode in a GUI control, or does the earliest Win7 lack Unicode capabilities?

Title: Re: FputoString format
Post by: guga on May 26, 2025, 10:37:49 PM
Quote from: daydreamer on May 26, 2025, 03:41:55 PM
Don't forget to also cover the common case of SSE/SSE2 denormals and such: there is a high probability that FP code runs with SSE/SSE2 instead, and even if you don't write SSE/SSE2 code yourself, a math function invoked from a library might be coded in SSE/SSE2.

I am curious whether it works on all OSes used today (Win7, Win8, Win10, Win11) with Unicode in a GUI control, or does the earliest Win7 lack Unicode capabilities?

Hi Daydreamer

I'm handling SSE/SSE2 denormals specifically in the pow, log and exp functions. For the FPU it's not necessary, since I'm also focusing on precision and the FPU guarantees that when dealing with 80 bits. The helper functions I created that use SSE inside FPUtoString are for the BCD conversion, the dwtoAscii conversion and copying the error messages to the output.

The older FloatToBCD inside RosAsm was this:

; new on 20/05/2025
Proc FloatToBCD:
    Arguments @Input, @Output
    Uses esi, edi, ecx, eax

    mov esi D@Input
    add esi 9
    mov edi D@Output

    ; The first byte of the BCD will always be 0, so we need to bypass it to
    ; get a better result when dealing with negative exponent values or other non-integer
    ; values that may produce more than one zero byte at the start. Example:
    ; Sometimes when we are analysing the number 1, it can have the following starting bytes:
    ; 00999999999999999 . So, the first byte is ignored to form this in edi: 09999999999.
    ; In this example it is the number 0.999999999999999e-6, which is in fact 1e-6
    mov ecx 10

    If B$esi = 0
        dec ecx
        dec esi
    End_If

    Do
        ; Process two nibbles at once
        mov al B$esi | mov ah al | shr al 4 | and ah 0F
        add ax '00'
        mov W$edi ax
        add edi 2
        dec esi
        dec ecx
    Repeat_Until_Zero
    mov B$edi 0

EndP
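As a reference for what the unpacking loop does, here is a compact sketch of converting the 10-byte packed BCD written by FBSTP into a digit string. It assumes the standard layout (byte 0 holds the two least significant digits, byte 9 is the sign byte) and, unlike the routine above, always emits all 18 digits. Both function names are made up; pack_bcd exists only to build test input.

```python
def unpack_bcd(packed: bytes) -> str:
    """Turn the 10-byte packed BCD written by FBSTP into a signed digit string."""
    if len(packed) != 10:
        raise ValueError("expected 10 bytes")
    sign = "-" if packed[9] & 0x80 else ""
    digits = []
    for byte in reversed(packed[:9]):              # byte 8 holds the most significant pair
        digits.append(chr(0x30 + (byte >> 4)))     # high nibble first
        digits.append(chr(0x30 + (byte & 0x0F)))   # then low nibble
    return sign + "".join(digits)

def pack_bcd(value: int) -> bytes:
    """Inverse helper for the demo: build the FBSTP layout from an integer."""
    text = "%018d" % abs(value)
    # Reading a decimal digit pair as hex yields the BCD byte, e.g. "45" -> 0x45.
    body = bytes(int(text[i:i + 2], 16) for i in range(16, -2, -2))
    return body + (b"\x80" if value < 0 else b"\x00")
```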

But, i updated it again to use SSE2 as:

; equates for pshufd and friends
[SSE_INVERT_DWORDS 27]      ; invert the order of dwords
[SSE_ROTATE_64BITS 78]       ; the same things and result as SSE_SWAP_QWORDS, SSE_ROTATE_RIGHT_64BITS, SSE_ROTATE_LEFT_64BITS, SSE_ROTATE_64BITS

; global variables used:
[<16 SSE2_NibbleMask: B$ 0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F,0F]
[<16 SSE2_AsciiZero:  B$ '0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0']

Proc FloatToBCD_SSE:
    Arguments @Input, @Output
    Structure @XmmPreserve 128, @XMMReg0Dis 0, @XMMReg1Dis 16, @XMMReg2Dis 32, @XMMReg3Dis 48
    Uses esi, edi, ecx, eax

    ; Preserve XMM registers efficiently
    movdqu X@XMMReg0Dis XMM0
    movdqu X@XMMReg1Dis XMM1
    movdqu X@XMMReg2Dis XMM2
    movdqu X@XMMReg3Dis XMM3

    mov esi D@Input
    mov edi D@Output

    ; Check for leading zero byte
    mov ecx 2
    If B$esi+9 = 0
        dec ecx
    End_If
    add esi ecx

    ; Prepare SSE2 constants
    movdqa xmm3 X$SSE2_NibbleMask    ; 0x0F0F0F0F...
    movdqa xmm2 X$SSE2_AsciiZero     ; '0' repeated in every byte

    ; Process 8 bytes at a time (SSE2 can do 16, but we're limited by BCD format)
    movq xmm0 Q$esi        ; Load 8 BCD bytes starting at esi (Input + ecx)
    MOVDQA xmm1 xmm0       ; xmm1 for lower nibbles
    PAND xmm1 xmm3         ; Isolate lower nibbles
    POR xmm1 xmm2          ; Convert lower nibbles to ASCII

    PSRLW xmm0 4           ; Shift upper nibbles to lower positions
    PAND xmm0 xmm3         ; Isolate shifted upper nibbles
    POR xmm0 xmm2          ; Convert upper nibbles to ASCII

    ;SSE_SWAP_D_HI_LOW xmm2 xmm0
    ; --- Interleave the digits (Upper, Lower, Upper, Lower...) ---
    ; The result for the first 16 digits will be in xmm0
    PUNPCKLBW xmm0 xmm1    ; xmm0 now has [U0, L0, U1, L1, U2, L2, U3, L3, U4, L4, U5, L5, U6, L6, U7, L7]


    PSHUFHW xmm0 xmm0 SSE_INVERT_DWORDS   ; Reverse 16-bit word order in high 64 bits of xmm0
    PSHUFLW xmm0 xmm0 SSE_INVERT_DWORDS   ; Reverse 16-bit word order in low 64 bits of xmm0
    PSHUFD  xmm0 xmm0 SSE_ROTATE_64BITS   ; Swap the two 64-bit halves, completing the reversal of all 8 words

    ; Store 16 ASCII characters
    movdqu X$edi xmm0
    add edi 16

    ; Process the remaining bytes one at a time:
    sub esi ecx
    Do
        ; Process two nibbles at once
        mov al B$esi | mov ah al | shr al 4 | and ah 0F
        add ax '00'
        mov W$edi ax
        add edi 2
        dec esi
        dec ecx
    Repeat_Until_Zero

    mov B$edi 0


    ; Restore XMM registers
    movdqu XMM0 X@XMMReg0Dis
    movdqu XMM1 X@XMMReg1Dis
    movdqu XMM2 X@XMMReg2Dis
    movdqu XMM3 X@XMMReg3Dis

EndP
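
For anyone who wants to experiment with the same idea outside the assembler, the mask / shift / interleave / reverse sequence above can be sketched in C with SSE2 intrinsics. This is only an illustration under stated assumptions, not the RosAsm routine itself: the function name `bcd8_to_ascii16_sse2` is made up, it handles exactly 8 packed BCD bytes (least significant byte first), and the remaining bytes would still need the scalar loop.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

/* Convert 8 packed BCD bytes (least significant byte first) into
   16 ASCII digits, most significant digit first.
   Hypothetical helper; out must have room for 16 bytes (no terminator is written). */
void bcd8_to_ascii16_sse2(const uint8_t *bcd, char *out)
{
    const __m128i mask = _mm_set1_epi8(0x0F);   /* 0x0F in every byte */
    const __m128i zero = _mm_set1_epi8('0');    /* '0' in every byte  */

    __m128i v  = _mm_loadl_epi64((const __m128i *)bcd);          /* 8 bytes into the low half */
    __m128i lo = _mm_or_si128(_mm_and_si128(v, mask), zero);     /* low nibbles  -> ASCII */
    __m128i hi = _mm_or_si128(
                     _mm_and_si128(_mm_srli_epi16(v, 4), mask),  /* high nibbles -> ASCII */
                     zero);

    /* Interleave so each 16-bit word holds (high digit, low digit) of one BCD byte */
    __m128i d = _mm_unpacklo_epi8(hi, lo);

    /* Reverse the order of the eight 16-bit words */
    d = _mm_shufflehi_epi16(d, 0x1B);   /* reverse words in the high 64 bits */
    d = _mm_shufflelo_epi16(d, 0x1B);   /* reverse words in the low 64 bits  */
    d = _mm_shuffle_epi32(d, 0x4E);     /* swap the two 64-bit halves        */

    _mm_storeu_si128((__m128i *)out, d);
}
```

The shuffle constants 0x1B and 0x4E are the same values as the equates 27 and 78 used in the assembly version.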


I'm using just a few SSE2 instructions to optimize the function a little. It's not strictly necessary, but it won't hurt, especially if the function is used on large databases where someone needs to convert FPU values to ascii strings. For example, i'll have to use it anyway on a large set of 130,000 variables i'll use in the pow function. So it won't hurt to make the function a bit faster.
Title: Re: FputoString format
Post by: daydreamer on May 27, 2025, 04:55:03 PM
Guga, great to see BCD coded with SSE2 :thumbsup:
also because in x64 mode the special x86 bcd opcodes won't work

but the best performance when producing big ascii files comes from
1: allocating a big enough buffer in memory, which is fastest, converting all numbers to ascii text there, and using code that adds strings together ("3.14159 " + "1.4215 " + ...), which is a lot faster than print/fprint
2: block reading/writing the whole buffer to file, which is a lot faster than calling fprint for 130,000 numbers
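
The two points above can be sketched in C roughly as follows. This is a minimal illustration, not anyone's actual code from this thread: `format_all` and `write_all` are hypothetical names, and `snprintf` with `%g` merely stands in for whatever float-to-string routine is really used.

```c
#include <stdio.h>
#include <stddef.h>

/* Append n values to buf as space-separated text.
   Returns the number of characters written (excluding the terminator),
   or 0 if buf is too small. */
size_t format_all(const double *v, size_t n, char *buf, size_t cap)
{
    size_t used = 0;
    if (cap > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int k = snprintf(buf + used, cap - used, "%g ", v[i]);
        if (k < 0 || (size_t)k >= cap - used)
            return 0;                   /* buffer too small */
        used += (size_t)k;
    }
    return used;
}

/* One block write instead of one fprintf call per number. */
int write_all(const char *path, const char *buf, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t done = fwrite(buf, 1, len, f);
    return (fclose(f) == 0 && done == len) ? 0 : -1;
}
```

The conversion loop only ever touches memory, and the file system is hit once at the end with the whole buffer.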
Title: Re: FputoString format
Post by: guga on May 27, 2025, 06:51:53 PM
Hi Daydreamer

Tks  :thumbsup:  :thumbsup:  I didn't know about that problem with the bcd opcodes in x64. I thought it could be done easily. Porting this BCD routine to SSE2 for x86 was hell, because i wasn't able to reverse the order of all 18 to 20 ascii chars at once (i could use pshufb, but that is SSSE3, and for now i can't use anything newer than SSE2). Fortunately, i remembered an old test in another function that swapped words, and i gave it one last try to see if it could be adapted to work as expected :)

About optimizing big ascii files: yeah, i'm aware that keeping such a large dataset in memory is faster, but i haven't tested it on the pow function yet. My original idea is still to use a small table of variables, split into a few blocks (3 perhaps) of only 128, 256 or 512 Real8, as i did when i succeeded in porting the M$ function. The problem is that both (mine and M$) start losing precision after the 15th to 16th digit, and i have no idea how M$ scaled this thing. I mean, i couldn't recreate the exact math equations they used to produce the values in those tables.

That's why i tried a new table regardless of its size. The problem is that, as you saw, it may lose performance, and it will occupy a lot of space in the data section, resulting in an even larger dll (the math functions will be used in a dll, btw).
On the other hand, if i use them in memory, i'll have to precalculate the values of all the tables before passing them to the pow (or log, ln etc) function, and that will kill performance and make the code slow. Alternatively, to get around this, i can simply add a Generate Table function (or Initialize/Setup function, or whatever name it ends up with) and tell users that if they want to use the math functions, this initialization function must be called first.

Or, since the functions will be exported from a dll, i can call the initialization functions when the dll starts (right after DLL_PROCESS_ATTACH), so if someone wants to use pow, log, etc, they won't lose speed, since the tables will already be generated in memory.
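
A minimal C sketch of the "generate the tables once, before anyone calls pow/log" idea. Everything here is a placeholder: the table contents (powers of ten), its size, and the names `init_math_tables` and `pow10_lookup` are assumptions for illustration only, since the real tables in this thread are something else entirely. It is also not thread-safe as written.

```c
#include <stddef.h>

#define POW10_TABLE_SIZE 23   /* hypothetical size: 10^0 .. 10^22 are exact in a double */

static double pow10_table[POW10_TABLE_SIZE];
static int    tables_ready = 0;

/* Fill the lookup table once. In a DLL this could be called from DllMain
   when the reason code is DLL_PROCESS_ATTACH. */
void init_math_tables(void)
{
    if (tables_ready)
        return;
    double p = 1.0;
    for (size_t i = 0; i < POW10_TABLE_SIZE; i++) {
        pow10_table[i] = p;
        p *= 10.0;
    }
    tables_ready = 1;
}

/* Calls the initializer defensively; after the first call it is only a flag test.
   Returns 0.0 for an index outside the table. */
double pow10_lookup(size_t n)
{
    init_math_tables();
    return n < POW10_TABLE_SIZE ? pow10_table[n] : 0.0;
}
```

With the defensive call inside the lookup, users who forget the setup function still get correct results, at the cost of one flag test per call.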

I'll have to test all of this to see which is better. Ideally it would be better to use just the smaller tables but, no matter what math equation i try, i haven't yet managed to make the result precise up to the 17th digit and match the value given by wolframalpha. The only way i've found so far is with those larger tables.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 05:01:37 AM
Quote from: guga on May 27, 2025, 06:51:53 PMHi Daydreamer

Tks  :thumbsup:  :thumbsup:  I didn't know about that problem with the bcd opcodes in x64. I thought it could be done easily. Porting this BCD routine to SSE2 for x86 was hell, because i wasn't able to reverse the order of all 18 to 20 ascii chars at once (i could use pshufb, but that is SSSE3, and for now i can't use anything newer than SSE2).

This is all very interesting, but tell us this:
Who the hell even uses BCD anymore?
I'm pretty sure that format went the way of the dinosaurs with "big iron" IBM mainframes and such.
Do you know of even one instance of someone actually using that nowadays?
Sheesh, the amount of effort that people expend here on things that nobody[1] is ever going to use ...

[1] For certain values of "nobody".
Title: Re: FputoString format
Post by: FORTRANS on May 28, 2025, 07:39:39 AM
Hi,

Quote from: guga on May 26, 2025, 06:25:53 AM
Quote from: NoCforMe on May 26, 2025, 04:36:58 AM
Quote from: sinsi on May 25, 2025, 03:19:43 PMThen again, the FPU can have -0 and +0  :badgrin:

True, but do we really need to display that?
I mean, when does it matter to us which zero the FPU is giving us?

yeah. I saw that too. But it´s useless IMHO (Unless we are working with Imaginary numbers, i suppose). Allowing the function output only +Infinite or -Infinite is enough.

   Any normal usage of imaginary numbers does not have the concept of
a "signed" zero.  Plus and minus zeroes are an artifact of the internal
representation of a floating point number used by the FPU.  At least those
made by Intel.

Cheers,

Steve N.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 07:51:40 AM
Good. So we're paring down the number of cases that actually need to be presented to the user here.
That should make your life a little bit easier, @guga.
Title: Re: FputoString format
Post by: guga on May 28, 2025, 08:35:56 AM
Quote from: NoCforMe on May 28, 2025, 05:01:37 AM
Quote from: guga on May 27, 2025, 06:51:53 PMHi Daydreamer

Tks  :thumbsup:  :thumbsup:  I didn't know about that problem with the bcd opcodes in x64. I thought it could be done easily. Porting this BCD routine to SSE2 for x86 was hell, because i wasn't able to reverse the order of all 18 to 20 ascii chars at once (i could use pshufb, but that is SSSE3, and for now i can't use anything newer than SSE2).

This is all very interesting, but tell us this:
Who the hell even uses BCD anymore?
I'm pretty sure that format went the way of the dinosaurs with "big iron" IBM mainframes and such.
Do you know of even one instance of someone actually using that nowadays?
Sheesh, the amount of effort that people expend here on things that nobody[1] is ever going to use ...

[1] For certain values of "nobody".

Hi David.

The bcd part of the code is needed to unpack the values stored by the FBSTP opcode.
https://c9x.me/x86/html/file_module_x86_id_83.html
https://masm32.com/masmcode/rayfil/BCDtut.html
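
For reference, FBSTP stores a 10-byte packed BCD value: bytes 0 to 8 hold 18 decimal digits, two per byte with the least significant pair first, and bit 7 of byte 9 is the sign. A plain C sketch of the unpacking step, with a made-up function name and no claim to match Fpulib or the routine in this thread, could look like this:

```c
#include <stdint.h>

/* Unpack the 10-byte packed BCD layout written by FBSTP into text:
   an optional '-', then 18 digits, most significant first.
   Hypothetical helper; out needs at least 20 bytes. */
void packed_bcd_to_ascii(const uint8_t bcd[10], char *out)
{
    if (bcd[9] & 0x80)                 /* sign lives in bit 7 of the last byte */
        *out++ = '-';
    for (int i = 8; i >= 0; i--) {     /* most significant digit pair first */
        *out++ = (char)('0' + (bcd[i] >> 4));
        *out++ = (char)('0' + (bcd[i] & 0x0F));
    }
    *out = '\0';
}
```

Stripping leading zeros and placing the decimal point are left to the caller, which is where the format flags discussed in this thread come in.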

As for other uses, I don't know whether BCD is used for anything besides these FPU conversion operations.

QuoteGood. So we're paring down the number of cases that actually need to be presented to the user here.
That should make your life a little bit easier, @guga.

Yes. I don't see the point of forcing the function to output +0 or -0. It already returns the proper error messages and message codes related to the FPU.

Today i finished all the main representations and their corresponding flags. The user now has full control over the rounding mode and the number of digits to output. Ex: rounding numbers such as 1.99999965858 to 2.0, truncating the output to a certain number of digits, padding the trailing digits with 0 (if someone needs this for alignment) etc. It works for both scientific and non-scientific format.
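
As an illustration of the rounding case (1.99999965858 becoming 2.0), here is a small C sketch that rounds a string of significand digits in place and propagates the carry. It assumes one particular rounding mode (round half up on the magnitude), and the name `round_digits` is hypothetical; the function described in this post supports several modes and is not shown here.

```c
#include <string.h>

/* Round a NUL-terminated string of decimal digits to `keep` digits, in place.
   keep must be at least 1. Returns 1 if the carry ran off the front
   (e.g. "9996" kept to 3 digits becomes "100" and the caller must add 1
   to the decimal exponent), otherwise 0. */
int round_digits(char *digits, size_t keep)
{
    size_t len = strlen(digits);
    if (keep == 0 || keep >= len)
        return 0;                       /* nothing to drop */

    int round_up = digits[keep] >= '5'; /* look at the first dropped digit */
    digits[keep] = '\0';
    if (!round_up)
        return 0;

    for (size_t i = keep; i-- > 0; ) {
        if (digits[i] != '9') {
            digits[i]++;
            return 0;
        }
        digits[i] = '0';                /* 9 + 1 carries into the next digit */
    }
    digits[0] = '1';                    /* all nines: the value gained a digit */
    return 1;
}
```

With the digits "199999965858" and keep = 2 the buffer ends up as "20", which the caller would then print as 2.0.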

Now i'll finish the cases of negative exponents smaller than 18 (Ex: 1.5e-16 etc), so they can be represented as 0.00000000000000015 etc. Then i'll clean up the code and port it to masm.
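
The expansion of a small negative exponent into plain notation is mostly a matter of counting zeros. A hedged C sketch, assuming the significand digits are already available as a string whose first digit sits before the decimal point (the name `sci_to_fixed` and its signature are invented for this example):

```c
#include <stddef.h>
#include <string.h>

/* digits: significand digits, first digit is the one before the decimal
   point ("15" with exp10 = -16 means 1.5e-16). exp10 must be negative.
   Writes "0.", then (-exp10 - 1) zeros, then the digits.
   Returns 0 on success, -1 if exp10 >= 0 or out is too small. */
int sci_to_fixed(const char *digits, int exp10, char *out, size_t cap)
{
    if (exp10 >= 0)
        return -1;
    size_t zeros = (size_t)(-exp10) - 1;      /* zeros between "0." and the first digit */
    size_t len = strlen(digits);
    if (2 + zeros + len + 1 > cap)
        return -1;
    memcpy(out, "0.", 2);
    memset(out + 2, '0', zeros);
    memcpy(out + 2 + zeros, digits, len + 1); /* copies the terminator too */
    return 0;
}
```

For 1.5e-16 this writes fifteen zeros after "0." followed by "15".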

Later i'll adapt a function i created to mimic the behavior of ws_printf (and printf), to allow things like printf("%f", myFloat) while still giving the user full control over the output.

A general function that converts FPU values to strings is helpful for a wide range of things people create, such as showing values in controls (edit controls, static controls etc) or using them in huge tables and so on. I believe a single general function is easier than forcing people to write their own FPU conversion routine for each need.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 09:27:10 AM
Quote from: guga on May 28, 2025, 08:35:56 AM
Quote from: NoCforMe on May 28, 2025, 05:01:37 AM
Quote from: guga on May 27, 2025, 06:51:53 PMHi Daydreamer

Tks  :thumbsup:  :thumbsup:  I didn´t knew that problem in x64 related to bcd opcodes. I thought it could be done easily. When i ported this BCD routine to SSE2 for x86, it was a hell, because i wasn´t being able to reverse the order of all 18 to 20 ascii chars at once (i could use pshufb, but for now, i can´t try using SSE4 opcodes yet).

This is all very interesting, but tell us this:
Who the hell even uses BCD anymore?
I'm pretty sure that format went the way of the dinosaurs with "big iron" IBM mainframes and such.
Do you know of even one instance of someone actually using that nowadays?
Sheesh, the amount of effort that people expend here on things that nobody[1] is ever going to use ...

[1] For certain values of "nobody".
The bcd part of the code is necessary to unpack the values stored in FBSTP opcode.

Yes, but again, who is going to use FBSTP? That would indicate someone who is doing computations in BCD; who does that nowadays?

The only use case I can think of is someone either running a COBOL program or using a COBOL emulator, which might use BCD as a numeric storage format.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 09:36:19 AM
... unless you plan on advertising your FPU functions as "capable of handling all FPU data types" ...
Title: Re: FputoString format
Post by: guga on May 28, 2025, 10:06:06 AM
Take a look at Raymond's FpuFLtoA here (https://masm32.com/masmcode/rayfil/downloads/Fpulib2_341.zip). I don't know of any faster way to convert these (packed) values stored in a TenByte than his method (which i adapted to use SSE2 only in this specific part of the code).

Quote... unless you plan on advertising your FPU functions as "capable of handling all FPU data types" ...

I did it already. It handles all FPU types and returns the error messages accordingly (except for +0 and -0, which are useless, IMHO. After all, all they represent is a kind of +INF and -INF, which the function already handles). The idea is to make a general function that is as simple to use as possible, and yet able to output all the types of messages (and values) needed to work with the FPU.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 11:19:08 AM
Quote from: guga on May 28, 2025, 10:06:06 AMTake a look at Raymond´s FpuFLtoA here (https://masm32.com/masmcode/rayfil/downloads/Fpulib2_341.zip). I don't know other faster method to convert these values (which are packed) stored in a TenByte without using his method (which i adapted to work with SSE2 only on this specific part of the code).

So again, at the risk of sounding like a broken record[1]:
Who cares how fast the conversions are?

[1] Probably not meaningful to those who've never heard a 78 rpm record with a crack ...
Title: Re: FputoString format
Post by: guga on May 28, 2025, 11:51:52 AM
No problem, but...Why would I take Raymond's function and make it slower? What's the point of using the function he created using a specific opcode for this type of conversion (fbstp) and making the result dozens of times slower? If you don't use fbstp, the alternative would be to convert each byte to decimal with opcodes like div that would loop up to 10 times to transform each byte into decimal ascii. Do you see how much slower this would make the function? Why would I ruin his code like this, if there is already a specific opcode for this that allows you to do the conversion all at once?
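
For comparison, the division-based alternative mentioned above looks roughly like this in C. It is a generic sketch for an unsigned 64-bit integer, not code from Fpulib or from this thread, and the name `u64_to_ascii` is made up. The point is only that every output digit costs one pass through the divide loop, whereas FBSTP delivers all 18 digits with one instruction.

```c
#include <stdint.h>

/* Convert v to decimal text by repeated division by 10.
   Returns the number of digits written; out needs at least 21 bytes. */
int u64_to_ascii(uint64_t v, char *out)
{
    char tmp[20];                       /* a 64-bit value has at most 20 digits */
    int n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);  /* digits come out least significant first */
        v /= 10;
    } while (v != 0);

    for (int i = 0; i < n; i++)         /* reverse into the output buffer */
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```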

The function I created (adapted from Raymond's) is for general use. I can't assume that someone will use it solely and exclusively to put text (Float ascii, in fact) in an edit box. If the function is for general use, it has to be effective for other uses, especially if the user uses it for large databases, for example. Otherwise, it would be easier to just use printf to generate the float ascii.

The core of the function is already done. Why would I change it and make it slower? There is no harm in making the function faster for general uses.
Title: Re: FputoString format
Post by: NoCforMe on May 28, 2025, 12:21:35 PM
I'm not saying, obviously, that you should make it slower. That would be exceedingly stupid.

I'm just saying you shouldn't obsess over how fast it is. Of course you're going to code it in an efficient way that'll probably be at least as fast as Raymond's code (which BTW isn't particularly fast) or faster.

Besides, with your 0.1% of the share of the assembler "market", I'm not sure your efforts make all that much difference anyway.
Title: Re: FputoString format
Post by: fearless on May 28, 2025, 06:49:14 PM
https://www.youtube.com/watch?v=kw-U6smcLzk
Title: Re: FputoString format
Post by: Siekmanski on May 28, 2025, 11:25:44 PM
Hi all,

I wrote this routine 7 years ago because I needed to show 256 real4 values at once at 60Hz in realtime on my screen.
This SIMD routine is 28 times faster than sprintf for real4 values on my old PC.

Edit: latest version

align 4
Real4_2_ASCII proc Real4string:DWORD,floatnumber:REAL4

    mov         ecx,Real4string
    mov         eax,floatnumber
    test        eax,eax
    je          message_PosZero
    cmp         eax,080000000h
    je          message_NegZero
   
    ; check floating-point exceptions
    cmp         eax,07F7FFFFFh
    ja          TestExceptions

ProcessReal4:   
    mov         byte ptr [ecx],020h     ; Write " " to the string
    test        eax,80000000h           ; Check the sign bit
    jz          No_SignBit
    mov         byte ptr [ecx],02dh     ; write "-" character to the string
    and         floatnumber,7FFFFFFFh   ; Make it an absolute value
    and         eax,7FFFFFFFh           ; Remove the sign bit
No_SignBit:

    ; Fast Log10(x)-1 routine to calculate the number of digits
    shr         eax,23                  ; Get the 8bit exponent
    sub         eax,127                 ; Adjust for the exponent bias
    cvtsi2ss    xmm0,eax                ; Convert int32 to real4
    mulss       xmm0,Log10_2            ; Approximate Log10(x) == Log10(2) * exponent bits ==  0.30102999566398119 * exponent bits
    addss       xmm0,PowersOfTen[37*4]  ; Add one to get the approximated number of digits from the floating point value
    cvtss2si    eax,xmm0                ; Convert real4 to int32
    mov         ecx,eax                 ; Save approximated number of digits
    mov         edx,38+1                ; Highest possible number of digits + 1
    add         eax,edx                 ; Get the Power Of Ten offset for the digits rounding check

    ; Now do the check to get the exact rounded number of digits from the floating point value
    ; We can do this by comparing it to the closest Power Of Ten below the floating point value 
    movss       xmm0,floatnumber
    comiss      xmm0,PowersOfTen[eax*4-4]
    jc          ExactLog10xMin1         ; Is it below the closest Power Of Ten?
    cmp         ecx,edx                 ; It is above, also check the approximated number of digits
    je          ExactLog10xMin1         ; If it is already at the highest possible number of digits, skip the adjustment
    dec         edx                     ; Adjust the number of digits by subtracting one
ExactLog10xMin1:                        ; Now we are almost done getting the exact number of digits
                                        ; There is one exception, the lowest Power Of Ten check value is out of range ( 1.0E+39 )
                                        ; See the last added value in the PowersOfTen table, it's used for the out of range check
    sub         edx,ecx                 ; edx holds the offsets for the PowersOfTen table and the scientific notation string table
    mulss       xmm0,PowersOfTen[edx*4] ; Get the calculated Power Of Ten value and multiply it with the floating point value
    comiss      xmm0,PowersOfTen[38*4]  ; Compare to 10.0
    jnc         ExactNumDigits          ; Is it below 10.0?
    inc         edx                     ; It is below, adjust the offset for the scientific notation string
    mulss       xmm0,PowersOfTen[38*4]  ; Adjust decimal position ( it also solves the out of range issue )
ExactNumDigits:                         ; At this point we have the exact number of digits from the floating point value
    mulss       xmm0,PowersOfTen[42*4]  ; Get the 7 significant digits from the range -1.175494E-38 to 3.402823E+38
    cvtss2si    eax,xmm0                ; We want a Natural Number
    cvtsi2ss    xmm0,eax                ; So, remove the digits after the decimal point

    shufps      xmm0,xmm0,0             ; Splat...., make 4 copies from the real4 number
    movaps      xmm1,xmm0               ; Copy to a total of 8 copies
    mulps       xmm0,dividers           ; Produce base10 numbers
    mulps       xmm1,dividers+16        ; Produce base10 numbers
    movaps      xmm2,xmm0               ; Copy them
    movaps      xmm3,xmm1               ; Copy them
    mulps       xmm2,div10              ; Nullify least significant base10 numbers
    mulps       xmm3,div10              ; Nullify least significant base10 numbers
    cvttps2dq   xmm0,xmm0               ; Truncate remaining fractions
    cvttps2dq   xmm1,xmm1               ; Truncate remaining fractions
    cvtdq2ps    xmm0,xmm0               ; Convert back to real4
    cvtdq2ps    xmm1,xmm1               ; Convert back to real4
    cvttps2dq   xmm2,xmm2               ; Truncate remaining fractions
    cvttps2dq   xmm3,xmm3               ; Truncate remaining fractions
    cvtdq2ps    xmm2,xmm2               ; Convert back to real4
    cvtdq2ps    xmm3,xmm3               ; Convert back to real4
    mulps       xmm2,mul10              ; Move them back in the correct base10 position
    mulps       xmm3,mul10              ; Move them back in the correct base10 position
    subps       xmm0,xmm2               ; Subtract to get the extracted digits
    subps       xmm1,xmm3               ; Subtract to get the extracted digits
    cvttps2dq   xmm0,xmm0               ; Convert back to int32
    cvttps2dq   xmm1,xmm1               ; Convert back to int32

    shufps      xmm0,xmm0,11100001b     ; Swap the 2 first digits, the first digit is always zero
                                        ; Now we can write the decimal point for free ( no more memory swaps )
                                        ; Using a prepared ASCIIconverter constant

    packssdw    xmm0,xmm1               ; Pack 8 x 32bit to 8 x 16bit ( signed but, we are within the limit )
    packuswb    xmm0,xmm0               ; Pack 8 x 16bit to 16 x 8bit unsigned
    movq        xmm1,ASCIIconverterE    ; Prepared to insert a decimal point and convert to ASCII in one go
    paddb       xmm1,xmm0               ; Convert the number to ASCII

    mov         edx,Scientific_sz[edx*4]; Get the 4 byte scientific notation string
    mov         ecx,Real4string

    movq        qword ptr [ecx+1],xmm1  ; Write the 7 significant digits
    mov         [ecx+9],edx             ; Write the scientific notation string
;    mov         byte ptr [ecx+13],0     ; Terminate the string ( not needed we are inside the 16 bytes )
    ret

TestExceptions:
    cmp         eax,07F800000h
    je          message_Inf
    cmp         eax,07F800001h
    je          message_SNaN
    cmp         eax,07FBFFFFFh
    je          message_SNaN
    cmp         eax,07FC00000h
    je          message_QNaN
    cmp         eax,07FFFFFFFh
    je          message_QNaN
    cmp         eax,0FFC00001h
    je          message_QnegNaN
    cmp         eax,0FFBFFFFFh
    je          message_SnegNaN
    cmp         eax,0FF800001h
    je          message_SnegNaN
    cmp         eax,0FFC00000h
    je          message_Indeterm
    cmp         eax,0FF800000h
    je          message_NegInf
    cmp         eax,0FFFFFFFFh
    je          message_QnegNaN
    jmp         ProcessReal4            ; No exceptions found, proceed...
message_QnegNaN:
    movaps  xmm0,oword ptr szQnegNaN
    movaps  oword ptr [ecx],xmm0
    ret
message_SnegNaN:
    movaps  xmm0,oword ptr szSnegNaN
    movaps  oword ptr [ecx],xmm0
    ret
message_Indeterm:
    movaps  xmm0,oword ptr szIndeterm
    movaps  oword ptr [ecx],xmm0
    ret
message_NegInf:
    movaps  xmm0,oword ptr szNegInf
    movaps  oword ptr [ecx],xmm0
    ret
;message_NegNorm:
;    movaps  xmm0,oword ptr szNegNorm
;    movaps  oword ptr [ecx],xmm0
;    ret
;message_Norm:
;    movaps  xmm0,oword ptr szNorm
;    movaps  oword ptr [ecx],xmm0
;    ret
message_Inf:
    movaps  xmm0,oword ptr szInf
    movaps  oword ptr [ecx],xmm0
    ret
message_SNaN:
    movaps  xmm0,oword ptr szSNaN
    movaps  oword ptr [ecx],xmm0
    ret
message_QNaN:
    movaps  xmm0,oword ptr szQNaN
    movaps  oword ptr [ecx],xmm0
    ret
message_PosZero:
    movaps  xmm0,oword ptr szPosZero
    movaps  oword ptr [ecx],xmm0
    ret
message_NegZero:
    movaps  xmm0,oword ptr szNegZero
    movaps  oword ptr [ecx],xmm0
    ret

Real4_2_ASCII endp
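
The "fast Log10" step in the routine above (binary exponent times log10(2), then one comparison against a power of ten to fix the estimate) can be sketched in portable C. This is an illustration of the idea only, not a translation of the SIMD code: it handles positive normal floats only, builds the power of ten with a loop instead of the PowersOfTen table, and the names `pow10i` and `decimal_exponent` are invented.

```c
#include <stdint.h>
#include <string.h>

/* 10^n by repeated multiplication or division; accurate enough for the range of a float. */
static double pow10i(int n)
{
    double p = 1.0;
    for (; n > 0; n--) p *= 10.0;
    for (; n < 0; n++) p /= 10.0;
    return p;
}

/* floor(log10(x)) for a positive, normal float. */
int decimal_exponent(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);               /* reinterpret the float as raw bits */
    int e2 = (int)((bits >> 23) & 0xFF) - 127;    /* unbiased binary exponent */

    double t = e2 * 0.30102999566398119;          /* log10(2) * e2: a lower bound on log10(x) */
    int est = (int)t;
    if (t < est)
        est--;                                    /* floor, also for negative values */

    /* The estimate is either exact or one too small; one comparison settles it. */
    if ((double)x >= pow10i(est + 1))
        est++;
    return est;
}
```

For 3.402823e+38 the binary exponent is 127, the estimate is 38, and the comparison against 1e39 leaves it unchanged, which matches the e+38 printed in the timing output later in the thread.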
Title: Re: FputoString format
Post by: guga on May 29, 2025, 12:53:32 AM
Tks a lot SiekManski

I´ll take a look.

Quote64 bit exp: 4 float: 1.000000e+004 power10: 1.000000e+002 result: 1000000.000000


64 bit exp: 0 float: 1.000000e-045

mxcsr_register: 1FA2h exp: 0 mantissa: 00000001h
 PowerOfTen: 1.401298e-045 00000001h


SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

AMD Ryzen 5 2400G with Radeon Vega Graphics

 Routine timers running now....

Real4_2_ASCII Cycles: 86 RoutineTime: 0.021150000 seconds
sprintf       Cycles: 1960 RoutineTime: 0.554756400 seconds

Result Real4_2_ASCII:  3.402823e+38
Result sprintf      : 3.402823e+038

Press any key to continue...


Btw...This (https://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/) article is really interesting
Title: Re: FputoString format
Post by: daydreamer on May 29, 2025, 12:09:15 PM
Great code Siekmanski  :thumbsup: