IMPORTANT:
You will need Microsoft (R) Macro Assembler Version 14.20.27323.0 (included in VS 2019 Preview 2) or later to build the source successfully, because the code relies on bug fixes made in that assembler version.
The included readr10_86.asm contains the full x86 source code for the StrtoReal10 function, which reads a string and outputs a real10 number.
Testing is done by test.asm: after obtaining the real10, the reverse operation is performed by printing it using 2 methods:
1) The ExactReal10toStr method explained here (http://masm32.com/board/index.php?topic=7706.0) (it had a few bugs, which I fixed before re-uploading the zip).
2) The FpuFLtoA method from fpu.lib.
OUTPUT:
1) String representation of number to be converted to real10=1.5836591183212933e-43
2) ExactReal10toStr:
Sign=+, Exponent=0x3f70 (dec=16240), Mantissa=0xe20702862f984643
+0.00000000000000000000000000000000000000000015836591183212932999986759293858105397404559278349068388199049732228336074783578942142611317873079897800416053128058632769141478302543646350386552512645721435546875
Normalized Number
3) FpuFltoA representation of the floating point:
0.000000000000000
1) String representation of number to be converted to real10=3.141592653589793238462643
2) ExactReal10toStr:
Sign=+, Exponent=0x4000 (dec=16384), Mantissa=0xc90fdaa22168c235
+3.14159265358979323851280895940618620443274267017841339111328125
Normalized Number
3) FpuFltoA representation of the floating point:
3.141592653589793
1) String representation of number to be converted to real10=4.1e3900
2) ExactReal10toStr:
Sign=+, Exponent=0x729c (dec=29340), Mantissa=0xbc1430dcb22d2253
+4099999999999999999706278637845604346427548300111674249860023718182240274467641822399903164887976687129811233018323034490661092338298746270613099669516450390472389955092673057305231644401768368701000254802596173500598162434026131682860998194880037056873969862463983503294591305143974896393022405907844016922151545477321379021310629580333595093459988395655880269866445941273347672201610066942198402847455297992094505325783950819315153010901360514741961454059476629540079975211617590449357509354619505270524997793330644804912041948243456911164271614417039627387482094158924573035383136415961587625580783177717855327496862201530126312370939405912878761553329246083284342581928043274495965058272365695432799339860139722097855444459694149153102824878404972460373719238010109362695697787621395866963230879759214757250555026732972172551222171601665814590127894474486374423005124867015893531803620907937030024588320563613105347859654952373774125945841304651003819118501208343950480572842448039045198452142447584440135901884098773153647682152714220088995926104053204997167001219830482580230654323967883414475589274536332200670744221768178062497749037572071011326942415265734359888387675654660234087383446720062996434955481034309983226934279327425694911446909858076514398614893708544465891149555913516901996374572365353375016994960373166058960551864675554441357032801701478038219535763277890121219882702702514725332426104503790995990239028706047611800445177827594822852606500364651454120185770204174304131824277460693945825968192895977938038700039591264248515805954840421260741693616099427778201896674704790666860473126367578713925459359894597193495841295615474579833659150274005364231030590881536496550047231478893485933444575623519400571864454011894415389328666311516072554772620601980570887445180459977411550239395613364182001020628415847592487579318718141798412979468423487126638641212381179985616513011303392746289749935470547474601455474388455240463410900899759460259614893322740290715038757147849623155
598794526546442639626885548059801389712086093139204827663284323756100213405521513378573394588483962842757462131301204537415811484978931510272387036884941427480702770648818172565406416935266315240772129045896558416394916060941237319819638944536227335184615819077581847820652200200480257807846007649668448708975622815109325128149442337382569861742738575358793167185054988522423709964696907516485863163684045653686913838632843172801269131064883043360137530196891765752609597758504117983890453115812862938418698513420289336532399076063096563298276016028903790567972371733859968849644213177775759385859436716133747940376734576157038913801596835737414468479271807106910149782318347442774980003862898236672311873001788503728215241242850767928387748587726565476444969681663312363059920351355508204005509723290911062808490762365334811650352181409162796303794343481038164307085248019257422743469254568876728612954943170297128928364479912841340546315580536836905446885297371829194469814063054668894750123340384304965848851105189004896816516101164930654899426686175515365839930353862980178215762370130038968657571070139172268612345572840239288595963678162093828537464765007530310613214432623188221973980028713653872457113974623270357424863262120663513625664035312352697933914276687488151942858756885173103208596246047057930436871798727307076768134283697065369220844123255304465498613011246711715566044305461011883412618049411620536733553113442764633731411115995287194958689699643952140026313524761922255794529915477735888457675787861972719061924793590861708765457349701091014498054526710948730225451046014682800513298357733863053587926552966348431618718370817894206972042529452811808997235158386292731936843440178458295050302243896505175497947082441476954092055484176254119249623423472723714250155011403586046722382155405485761651691710803089616710955875404222707441218150943823397972288854721564858730478907162624.
Normalized Number
3) FpuFltoA representation of the floating point:
4.100000000000001E+3900
1) String representation of number to be converted to real10=-12345.798
2) ExactReal10toStr:
Sign=-, Exponent=0x400c (dec=16396), Mantissa=0xc0e73126e978d4fd
-12345.79799999999999915445414444548077881336212158203125
Normalized Number
3) FpuFltoA representation of the floating point:
-12345.79800000000
1) String representation of number to be converted to real10=12345.798
2) ExactReal10toStr:
Sign=+, Exponent=0x400c (dec=16396), Mantissa=0xc0e73126e978d4fd
+12345.79799999999999915445414444548077881336212158203125
Normalized Number
3) FpuFltoA representation of the floating point:
12345.79800000000
1) String representation of number to be converted to real10=12345.798e12
2) ExactReal10toStr:
Sign=+, Exponent=0x4034 (dec=16436), Mantissa=0xaf71c06108f00000
+12345798000000000.
Normalized Number
3) FpuFltoA representation of the floating point:
1.234579800000000E+0016
Edited 24 February
Fixed bug mentioned in reply #12, added 2 more test numbers and updated the output in this message
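For anyone decoding the Sign / Exponent / Mantissa lines in the output above, here is a minimal C sketch. It assumes a normalized real10 (explicit integer bit set, exponent bias 16383) and only returns a double, so the low mantissa bits are rounded away. The name real10_fields_to_double is made up for this illustration and is not part of the zip.

```c
#include <math.h>
#include <stdint.h>

/* Approximate value of an x87 real10 from the fields printed in the output.
   exp15 is the 15-bit biased exponent, mant the 64-bit mantissa including
   its explicit integer bit. Assumes a normalized number (not zero,
   denormal, infinity or NaN). The result is rounded to double. */
double real10_fields_to_double(int negative, uint16_t exp15, uint64_t mant)
{
    /* value = mant * 2^(exp15 - 16383 - 63) */
    double v = ldexp((double)mant, (int)exp15 - 16383 - 63);
    return negative ? -v : v;
}
```

For example, Exponent=0x4000 with Mantissa=0xc90fdaa22168c235 gives the mantissa divided by 2^62, which is the pi test value.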
This looks good Jose, sad to say I don't understand enough of it. Works fine on my middle aged Haswell.
Quote from: hutch-- on February 21, 2019, 09:04:33 PM
This looks good Jose, sad to say I don't understand enough of it.
I roughly based it on some Pascal code that I am not allowed to post, then partially converted it to C+ASM (because MS C does not know about 80-bit floats), and finally put it all in ASM.
This is the C part (the ASM part is already supplied):
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
typedef struct _real10Struc
{
    uint64_t frac;
    uint16_t exp;
} real10Struc;
/* Implemented in the ASM part. */
extern int ReadReal10Number(real10Struc* AOut, char** ABuffer, char* CurrChar);
extern int ReadReal10Exponent(char** ABuffer, char* CurrChar);
extern int doPower10(real10Struc* AOut, int32_t lPower, int16_t lSign);
#define CMaxExponent 4999
#define CExponent 'E'
#define CPlus '+'
#define CMinus '-'
#define NextChar(lcurchar, chPtr) { \
    lcurchar = *chPtr; \
    chPtr++; \
}
#define SkipWhitespace(lcurchar, chPtr) { \
    while (lcurchar == ' ') \
        NextChar(lcurchar, chPtr); \
}
#define ReadSign(lsign, lcurchar, chPtr) { \
    lsign = 1; \
    if (lcurchar == CPlus) { \
        NextChar(lcurchar, chPtr); \
    } \
    else if (lcurchar == CMinus) { \
        NextChar(lcurchar, chPtr); \
        lsign = -1; \
    } \
}
real10Struc StrtoReal10(char *Value)
{
    uint16_t LSavedCtrlWord;
    char LCurrChar;
    int16_t LSign;
    int32_t LPower;
    real10Struc LResult = {0};
    NextChar(LCurrChar, Value);
    SkipWhitespace(LCurrChar, Value);
    if (LCurrChar != 0)
    {
        ReadSign(LSign, LCurrChar, Value);
        if (LCurrChar != 0)
        {
            /* integer part */
            ReadReal10Number(&LResult, &Value, &LCurrChar);
            if (LCurrChar == '.')
            {
                /* fractional part: each digit read lowers the power of 10 by one */
                NextChar(LCurrChar, Value);
                LPower = -ReadReal10Number(&LResult, &Value, &LCurrChar);
            }
            else
                LPower = 0;
            /* (c & 0xDF) folds 'e' to 'E' */
            if ((LCurrChar & 0xDF) == CExponent)
            {
                NextChar(LCurrChar, Value);
                LPower += ReadReal10Exponent(&Value, &LCurrChar);
            }
            SkipWhitespace(LCurrChar, Value);
            doPower10(&LResult, LPower, LSign);
        }
    }
    return LResult;
}
Is it possible to get a portable version of VS 2019?
I really dislike web installers and always avoid these cretins.
Being a South African I second that (K_F query), as we really have had enough of corruption. We do need simple and precise in these uncertain times.
LOL at nidud :biggrin: perfect! Now to convert or asm-alate..
For peeps who are always suggesting 'nix solutions to save the world, be aware that we can use the free Intel compiler, which knows about 80-bit long double.
The free Intel compiler works from inside Visual Studio and, by default (we can change that, but there is no need in this case), uses the Visual Studio libraries :( But don't despair: we don't really need any Visual Studio math library, so we can compile the following with the Intel compiler without further complications (it has been tested) and without using a single line of ASM:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
typedef union _real10Struc
{
    struct
    {
        uint64_t mant;
        int16_t exp;
    };
    long double bigone;
} real10Struc;
real10Struc StrtoReal10A(const char *Value);
#define CMaxExponent 4999
#define CExponent 'E'
#define CPlus '+'
#define CMinus '-'
#define NextChar(lcurchar, chPtr) { \
    lcurchar = *chPtr; \
    chPtr++; \
}
#define SkipWhitespace(lcurchar, chPtr) { \
    while (lcurchar == ' ') \
        NextChar(lcurchar, chPtr); \
}
#define ReadSign(lsign, lcurchar, chPtr) { \
    lsign = 1; \
    if (lcurchar == CPlus) { \
        NextChar(lcurchar, chPtr); \
    } \
    else if (lcurchar == CMinus) { \
        NextChar(lcurchar, chPtr); \
        lsign = -1; \
    } \
}
#define ReadReal10Number(retVal, Aout, lcurchar, chPtr) { \
    retVal = 0; \
    while ((lcurchar >= '0') && (lcurchar <= '9')) { \
        Aout *= 10; \
        Aout += (long double)((unsigned char)lcurchar - (unsigned char)('0')); \
        NextChar(lcurchar, chPtr); \
        retVal++; \
    } \
}
#define ReadReal10Exponent(retVal, lcurchar, chPtr) { \
    int16_t tempSign; \
    retVal = 0; \
    ReadSign(tempSign, lcurchar, chPtr); \
    while ((lcurchar >= '0') && (lcurchar <= '9')) { \
        retVal = retVal * 10; \
        retVal += (long double)((unsigned char)lcurchar - (unsigned char)('0')); \
        NextChar(lcurchar, chPtr); \
    } \
    if (retVal > CMaxExponent) \
        retVal = CMaxExponent; \
    retVal *= tempSign; \
}
long double Power10(long double longDoubleValue, int32_t power)
{
    long double tempPower = 1.0;
    int32_t i;
    if (power > 0)
        for (i = 0; i < power; i++)
            tempPower *= 10.0;
    else if (power < 0)
        for (i = 0; i < -power; i++)
            tempPower /= 10.0;
    return longDoubleValue*tempPower;
}
real10Struc StrtoReal10A(const char *Value)
{
    uint16_t LSavedCtrlWord;
    char LCurrChar;
    int16_t LSign;
    int32_t LPower;
    int32_t tempVal;
    real10Struc LResult = {0};
    int longdouble = sizeof(long double);
    printf("Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? %s\n", (longdouble>8)?"YES":"NO");
    if (longdouble <= 8)
    {
        printf("You need the Intel compiler or other that supports 80-bit long double\n");
        return LResult;
    }
    NextChar(LCurrChar, Value);
    SkipWhitespace(LCurrChar, Value);
    if (LCurrChar != 0)
    {
        ReadSign(LSign, LCurrChar, Value);
        if (LCurrChar != 0)
        {
            ReadReal10Number(LPower, LResult.bigone, LCurrChar, Value);
            if (LCurrChar == '.')
            {
                NextChar(LCurrChar, Value);
                ReadReal10Number(LPower, LResult.bigone, LCurrChar, Value);
                LPower = -LPower;
            }
            else
                LPower = 0;
            if ((LCurrChar & 0xDF) == CExponent)
            {
                NextChar(LCurrChar, Value);
                ReadReal10Exponent(tempVal, LCurrChar, Value);
                LPower += tempVal;
            }
            SkipWhitespace(LCurrChar, Value);
            LResult.bigone = Power10(LResult.bigone, LPower)*LSign;
        }
    }
    return LResult;
}
Note that the Intel compiler aligns long double to a 16 byte boundary so it reports sizeof(long double) as 16 bytes.
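Since sizeof(long double) depends on padding (it can be 10, 12 or 16 bytes for the same 80-bit format, and it is also larger than 8 for unrelated formats such as IEEE quad), a check on the mantissa width may be more reliable than the sizeof test used in StrtoReal10A. A minimal sketch, assuming a standard <float.h>; the function name long_double_is_real10 is made up for this example:

```c
#include <float.h>

/* Returns 1 when long double has a 64-bit mantissa and the x87 exponent
   range, i.e. it is the 80-bit extended format, regardless of how many
   padding bytes sizeof(long double) reports. */
int long_double_is_real10(void)
{
    return LDBL_MANT_DIG == 64 && LDBL_MAX_EXP == 16384;
}
```

With MS C, where long double is the same as double, this returns 0 because LDBL_MANT_DIG is 53 there.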
Tested like this:
#define SRC1_REAL (2)
#define SRC2_DIMM (0x800)
#define STR_SCI (0x8000)
extern void _stdcall FpuFLtoA(long double* lpScr1, uint32_t lpSrc2, char* lpszDest, uint32_t uID);
const char * Real10TestArray[] = { "3.141592653589793238462643", "1.5836591183212933e-43", "1918827719982564.1e489", "5.7262511e4475", "5.7262511e-4475" };
#define n_array (sizeof (Real10TestArray) / sizeof (const char *))
int main()
{
    real10Struc myStruct;
    int i;
    for (i = 0; i < n_array; i++) {
        char buff[48] = { 0 };
        myStruct = StrtoReal10A(Real10TestArray[i]);
        // Yeah, I have used MASM here:
        FpuFLtoA(&myStruct.bigone, 0x0110, buff, SRC1_REAL + SRC2_DIMM + STR_SCI);
        printf("%s\n\n", buff);
    }
    return 0;
}
OUTPUT
Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? YES
3.141592653589793E+0000
Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? YES
1.583659118321293E-0043
Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? YES
1.918827719982564E+0504
Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? YES
5.726251100000001E+4475
Are we using a compiler that suports 80-bit long double (ex: the Intel compiler)? YES
5.726251100000000E-4475
It also works in 64-bit, but we must set it not to use SSE or AVX and must compile with the /Qlong-double switch.
Of course, FpuFLtoA does not work in 64-bit.
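The Power10 loop in the C listing performs one rounded multiplication or division per unit of exponent, so an input like 5.7262511e4475 goes through thousands of roundings. This may be one contributor to the trailing ...001 in the output, although that has not been verified here. A hedged alternative is exponentiation by squaring, sketched below. Power10BySquaring is a hypothetical name, and the result is still not guaranteed to be correctly rounded.

```c
#include <stdint.h>

/* Computes value * 10^power with O(log |power|) multiplications.
   Sketch only: for negative powers it divides by 10^|power|, so it
   loses the gradual underflow of the original loop once 10^|power|
   itself overflows long double. */
long double Power10BySquaring(long double value, int32_t power)
{
    long double base = 10.0L;
    long double scale = 1.0L;
    /* Magnitude as unsigned, which also copes with INT32_MIN. */
    uint32_t n = (power < 0) ? 0u - (uint32_t)power : (uint32_t)power;

    while (n != 0) {
        if (n & 1u)
            scale *= base;
        n >>= 1;
        if (n != 0)
            base *= base;   /* 10, 10^2, 10^4, 10^8, ... */
    }
    return (power < 0) ? value / scale : value * scale;
}
```

For an exponent of 4475 this needs about two dozen multiplications instead of several thousand.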
Quote from: AW on February 23, 2019, 06:36:21 PM
and without using a single line of ASM:
and where is the fun on that? :shock:
Hi Aw, nice work. :t :t :t
Did you test the precision and the limit on the number of output chars? I mean, it is outputting more than 20 chars, but how precise is the result?
Exact extended double / REAL10 values are sometimes presented as hex, since the string length is not known in advance.
Quote
1) String representation of number to be converted to real10=12345.798
2) ExactReal10toStr:
Sign=-, Exponent=0x7fff (dec=32767), Mantissa=0x0000000000000000
#IND
3) FpuFltoA representation of the floating point:
ERROR
What happened???????
12345.798 seems to be a valid number as an input.
Quote from: guga on February 24, 2019, 03:32:27 AM
Did you test the precision and the limit on the number of output chars? I mean, it is outputting more than 20 chars, but how precise is the result?
Precision is an extremely complicated subject, although it does not look that way. Not everything can be solved by the so-called round-trip approach suggested by IEEE. I will come back to this sometime in the future.
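For readers unfamiliar with the term, the round-trip approach is usually understood like this: print the binary value with enough decimal digits, parse the text back, and check that the same binary value comes out. A minimal sketch for double, where 17 significant digits are enough (a real10 with its 64-bit mantissa would need 21). It assumes the C library's printf and strtod are correctly rounded, and double_round_trips is a name invented for this example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if printing x with 17 significant digits and parsing the
   text back yields exactly the same double. */
int double_round_trips(double x)
{
    char text[64];
    snprintf(text, sizeof text, "%.17g", x);
    return strtod(text, NULL) == x;
}
```

A passing round trip only says that the conversion pair is consistent. It says nothing about how accurate the original input was, which is part of why it does not settle every precision question.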
Quote from: raymond on February 24, 2019, 05:50:20 AM
1) String representation of number to be converted to real10=12345.798
2) ExactReal10toStr:
Sign=-, Exponent=0x7fff (dec=32767), Mantissa=0x0000000000000000
#IND
3) FpuFltoA representation of the floating point:
ERROR
Yes, it is a bug in the ASM version. It works fine in the last C version.
Fixed the bug mentioned in reply #12 and replaced the zipped file of first message. Added 2 other types of numbers for test.
Quote
The ExactReal10toStr method
Such a title for a method to convert a REAL10 to a string is definitely prone to mislead those who don't have a good knowledge of basic mathematics, nor of the IEEE standard for floating points and their REAL accuracy. Any digit past the 19th significant one in the resulting decimal representation is strictly garbage, regardless of the accuracy of the input. Some of those 19 digits may also be garbage depending on the actual accuracy of the input.
This will always remind me of the days when an American association (of which I was a member) started to issue some of their standards in the metric system along with their U.S. units. One of the procedures required taking a 1-quart (U.S.) sample for some analysis. The metric version of the document came out requiring a 0.95 liter sample!!!!!!
The meaning here is the following:
Every floating point value is stored as a rational number a/b where b is a power of 2. This number in turn has its exact decimal representation and this is what the program displays.
The decimal representation is as precise as the stored binary representation.
I am not talking about the precision of the stored rational number, the idea is to look at the matter from a different angle. 8)
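The "a/b where b is a power of 2" idea can be sketched in a few lines of C. This is not the ExactReal10toStr code, only an illustration under a simplifying assumption: the value is mant / 2^shift with shift between 1 and 60, so plain 64-bit arithmetic is enough. For a normalized real10, shift would be 63 - (Exponent - 16383); the 12345.798 test value has shift 50. The helper name exact_decimal is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes the exact decimal expansion of mant / 2^shift into out and
   returns its length, or -1 on error. Only 1 <= shift <= 60 is handled,
   so that the fractional remainder times 10 still fits in 64 bits. */
int exact_decimal(uint64_t mant, unsigned shift, char *out, size_t outsize)
{
    if (shift < 1 || shift > 60)
        return -1;
    uint64_t mask = (UINT64_C(1) << shift) - 1;
    uint64_t frac = mant & mask;          /* numerator of the fractional part */
    int n = snprintf(out, outsize, "%llu.", (unsigned long long)(mant >> shift));
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    size_t pos = (size_t)n;
    /* Each multiplication by 10 removes one factor of 2 from the
       denominator, so the loop ends after at most `shift` digits. */
    while (frac != 0) {
        if (pos + 1 >= outsize)
            return -1;
        frac *= 10;
        out[pos++] = (char)('0' + (frac >> shift));
        frac &= mask;
    }
    out[pos] = '\0';
    return (int)pos;
}
```

Calling it with mant = 0xc0e73126e978d4fd and shift = 50 should produce a terminating expansion with exactly 50 fractional digits, which can be compared with the ExactReal10toStr line for 12345.798 in the first post.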
Please don't get me wrong. I appreciate the work you did on this.
However, I'm only insinuating that a lot of people consider exact = precise, and in that sense the title could be very misleading.
Quote from: nidud on February 23, 2019, 07:10:44 AM
:biggrin:
Asmc converts to quad float and scale down (faster):
https://github.com/nidud/asmc/tree/master/source/lib32/quadmath
I don't think your ldtoquad works as expected, but it is small and looks nice. :biggrin:
What I mean is that I tested with 2 other different methods which agree between them but disagree with yours.
Quote from: raymond on February 26, 2019, 04:05:48 AM
However, I'm only insinuating that a lot of people consider exact = precise, and in that sense the title could be very misleading.
I appreciated your comments and I know some people may interpret it that way. Thank you. :t
I have never used asmc and am not going to learn it just for this. If you don't provide instructions for a dummy to build a simple example, I will have to pass.
I tested the x86 dtoquad below with an approximation of PI as input. The results were not as expected.
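dtoquad proc uses ebx p:ptr, ld:ptr
    mov eax,p
    mov ecx,ld
    mov dx,[ecx+8]      ; sign and exponent word of the real10
    mov [eax+14],dx     ; same word goes to the top of the quad
    mov edx,[ecx+4]     ; high dword of the mantissa
    mov ecx,[ecx]       ; low dword of the mantissa
    shl ecx,1           ; shift the explicit integer bit out
    rcl edx,1
    mov [eax+6],ecx
    mov [eax+10],edx
    xor ecx,ecx         ; clear the low 48 bits of the quad
    mov [eax],ecx
    mov [eax+4],cx
    ret
dtoquad endp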
Quote from: nidud on February 26, 2019, 11:03:16 AM
Well, it works like this:
n equ <3.141592653589793238462643383279502884197169399375105820974945>
.data
real2 n ; 0x4248
real4 n ; 0x40490FDB
real8 n ; 0x400921FB54442D18
real10 n ; 0x4000C90FDAA22168C235
real16 n ; 0x4000921FB54442D18469898CC51701B7
; ldtoquad() ; 0x4000921FB54442D1846A000000000000
What did you expect?
Very very good work, Nidud. :t :t
Same result as in:
https://rosettacode.org/wiki/Arithmetic-geometric_mean/Calculate_Pi
Well... on the link they compute pi to thousands of digits :dazzled: :dazzled: :dazzled: :dazzled:, but your result for Pi is the same as the one used with Julia. :t :t :t :t :t
It is also the same result as here :t :t :t :t :t:
https://www.perlmonks.org/?node_id=992580;displaytype=selectcode
Some guy produced an article Reading binary floating-point numbers (http://blogs.perl.org/users/rurban/2012/09/reading-binary-floating-point-numbers-numbers-part2.html) with a function cvt_num10_num16(unsigned char *dest, const unsigned char *src) that I now see is wrong. Strangely, I found an alternative in asm, along the same lines.
Nidud code is indeed correct. :t
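For those who do not read ASM, the same bit shuffling can be sketched in C. Both formats use a 15-bit exponent with bias 16383, so only the mantissa has to be repacked: the explicit integer bit of the real10 is dropped and the remaining 63 bits go to the top of the 112-bit quad fraction. The struct and function names below are invented for this example, and denormals are deliberately not handled.

```c
#include <stdint.h>

/* Integer views of the two formats (little-endian x86 layout). */
typedef struct { uint64_t mant; uint16_t sign_exp; } Real10Bits;
typedef struct { uint64_t lo; uint64_t hi; } Real16Bits;

/* Widens a real10 to IEEE binary128 by re-packing bits. Sketch only:
   assumes zero, a normalized number, or infinity; denormals and
   pseudo-denormals would need renormalizing. */
Real16Bits real10_to_real16(Real10Bits in)
{
    Real16Bits out;
    uint64_t frac = in.mant << 1;     /* drop the explicit integer bit */
    out.hi = ((uint64_t)in.sign_exp << 48) | (frac >> 16);
    out.lo = frac << 48;              /* remaining 16 bits, then zeros */
    return out;
}
```

Feeding it the real10 pi from the listing above (0x4000 and 0xC90FDAA22168C235) gives hi = 0x4000921FB54442D1 and lo = 0x846A000000000000, matching the ldtoquad() line.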
Nice finding, AW.
The fast sqrt seems very handy
for Real4
https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf
for Real8
https://stackoverflow.com/questions/11644441/fast-inverse-square-root-on-x64
I wonder if the precision can be increased as well. Seems interesting (and fast)
https://github.com/herumi/misc/blob/master/rsqrt.cpp
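On the precision question, a minimal C sketch of the well-known Real4 trick, with the number of Newton-Raphson steps as a parameter. Each extra step roughly squares the relative error, so precision can be traded against speed. This is the textbook version with the usual 0x5f3759df constant, not the code from the links above, and fast_rsqrt is a name chosen for this example.

```c
#include <stdint.h>
#include <string.h>

/* Approximates 1/sqrt(x) for positive, normal x. */
float fast_rsqrt(float x, int iterations)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);          /* reinterpret the float as an integer */
    bits = 0x5f3759dfu - (bits >> 1);        /* initial guess from the exponent trick */
    memcpy(&y, &bits, sizeof y);
    while (iterations-- > 0)
        y = y * (1.5f - 0.5f * x * y * y);   /* one Newton-Raphson refinement */
    return y;
}
```

With one iteration the result for x = 4 is within about 0.2% of 0.5; a second iteration brings it to within a few parts per million.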