The MASM Forum

General => The Laboratory => Topic started by: zedd151 on September 26, 2015, 04:52:10 AM

Title: ascii adder
Post by: zedd151 on September 26, 2015, 04:52:10 AM
Just for fun, I once made a procedure that adds two unsigned ascii decimal numbers.

I couldn't find the original version, so I rewrote it today. I wanted to find a better way
but still using only registers (no external calls, SSE, MMX, etc)

I'm sure the coding gurus here can find a better method.

bugfix

This is the only version that works 100% regardless of input string size. One string could be 1 byte and the other thousands of bytes, and it will still work properly. (AFAIK the console input version also works 100%.)

During the optimizations, however, that accuracy was lost, and the new 'optimised' algo would
only work on input strings of the same length.

Code: [Select]


        include \masm32\include\masm32rt.inc

        asciiadder PROTO :DWORD, :DWORD, :DWORD

    .data
       align 4
       asciinumber1 db 1024 dup (39h)
       dd 0
       align 4
       asciinumber2 db 1024 dup (39h)
       db 3 dup (0)
       align 4
       asciidest    db 1024 dup (0)
       db 4 dup (0)

    .code

    start:

        print offset asciinumber1, 13, 10, 13, 10
        print chr$("plus"), 13, 10, 13, 10
        print offset asciinumber2, 13, 10, 13, 10
        print chr$("equals"), 13, 10, 13, 10

        invoke asciiadder, addr asciinumber1, addr asciinumber2, addr asciidest
        print offset asciidest, 13, 10

        inkey
        exit

    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    align 16
    nops 13
    asciiadder proc src1:dword, src2:dword, dst:dword
        push ebp
        push ebx
        push esi
        push edi
        mov esi, [esp+14h]
        mov ebp, [esp+18h]
        mov edi, [esp+1Ch]
       
        invoke StrLen, ebp
        mov ebx, eax
       
        invoke StrLen, esi
        mov ecx, eax
       
        cmp ecx, ebx ;  effectively right aligning all strings - then work our way left
        jl ebxgr
        mov edx, ecx
        jmp setdst
        ebxgr:
        mov edx, ebx

        setdst:
        inc edx                         ; add an extra byte for potential carry

    calc:
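        mov al, 0
        test ecx, ecx                   ; src1 index still inside the string?
        js @f                           ; no (index < 0): treat the digit as 0
        mov al, byte ptr [esi+ecx]
        @@:
        test ebx, ebx                   ; src2 index still inside the string?
        js @f                           ; no (index < 0): add nothing
        add al, byte ptr [ebp+ebx]
        @@: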
        sub al, 30h
        cmp al, 0Fh
        jl @f
        sub al, 30h
        @@:
        cmp al, 0Ah
        jl @f
        sub al, 0Ah
        inc byte ptr [edi+edx-1]
        @@:

        add al, 30h
        add al, byte ptr [edi+edx]

        cmp al, 39h
        jng @f
        sub al, 0Ah
        inc byte ptr [edi+edx-1]
        @@:
        mov byte ptr [edi+edx], al

        dec ebx
        dec ecx
        dec edx
    jnz calc
        cmp byte ptr [edi+edx], 30h     ; test to see if first byte in dest is the carry byte
        jg @f
        add byte ptr [edi], 30h         ; if so, convert to ascii
        @@:
        pop edi
        pop esi
        pop ebx
        pop ebp
        ret 12
    asciiadder endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef


    end start
   

 :biggrin:
final version attached
Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 06:37:07 AM
did you consider using the AAA instruction ?
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 06:46:37 AM
Quote from: dedndave
did you consider using the AAA instruction ?

 :P

Code: [Select]

        add al, byte ptr [ebp+ebx]
        @@:
        aaa
        add al, 30h
        mov byte ptr [edi+edx], al
        mov byte ptr [edi+edx-1], ah
        @@:




something like that?

Nope that's not right.
The first result looked ok (using AAA), but tried other input values.....
And the result wasn't right, my implementation was somehow wrong.
thinking....
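
For reference, a minimal sketch of one way AAA can drive the whole loop. It is not the routine from this thread: it assumes both inputs have the same number of digits, and the names num1, num2, result, NUMDIGITS and nextdigit are made up for the example. ADC adds the two ASCII digits plus the carry from the previous digit, AAA turns the sum into a digit 0..9 in AL and leaves the decimal carry in CF, and LEA and DEC are used afterwards because neither of them changes CF.

Code: [Select]

        include \masm32\include\masm32rt.inc

        NUMDIGITS equ 4                 ; digits in each input (assumed equal length)

    .data
        num1    db "4567", 0
        num2    db "5678", 0
        result  db "00000", 0           ; one digit longer than the inputs

    .code

    start:
        mov esi, offset num1
        mov edi, offset num2
        mov ebx, offset result
        mov ecx, NUMDIGITS
        clc                             ; no carry into the lowest digit
    nextdigit:
        mov al, byte ptr [esi+ecx-1]
        adc al, byte ptr [edi+ecx-1]    ; ascii + ascii + carry from previous digit
        aaa                             ; AL = 0..9, CF = decimal carry (AH is bumped too, ignored here)
        lea eax, [eax+30h]              ; back to ascii without touching the flags
        mov byte ptr [ebx+ecx], al
        dec ecx                         ; DEC leaves CF alone
        jnz nextdigit
        adc byte ptr [ebx], 0           ; final carry turns the leading '0' into '1'

        print offset result, 13, 10
        exit

    end start

With the values shown this should print 10245 (4567 + 5678). Unequal lengths would still need the right-aligning logic from the first post.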
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 06:55:57 AM
Okay, that's ok, but I know there has to be a foolproof way to do it 4 bytes at a time.
But the endian-ness also gets in the way. (don't want to use bswap for obvious reasons)

lemme think on it...

trouble is the way I'm doing it, I'm using up all the registers.
I don't really want to use local variables.

and eax, 30303030h ?

thinking....
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 07:31:33 AM
Okay, I tried different things using all of eax, to try to add 4 bytes at a time.
The results were less than good.

Hmmm.....

Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 07:50:26 AM
i'm sure there is - lol

if you want a faster byte-oriented routine, though...
you could use a look-up-table
any 2 numeric ascii bytes can only have 1 of 20 possible results (if you include previous carry)
so table with 20 values - some way to set carry or something
maybe double the result values and set the low bit to carry a one
you pick up the result, and shift right - sets the carry flag for the next operation
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 07:55:55 AM
hmmm, lookup table. That's a possibility.

My implementation of AAA was wrong somehow. It looked good in olly, even carried in the right
place but.....  ;)

The results were not correct

I'm still looking for another way

The original version is the best (accurate) I have so far.
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 07:58:26 AM
Adding a very large pair of numbers. They don't have to be equal length, btw. I just did it here as an example. Using the original version, with a 256 byte destination buffer.

Code: [Select]
123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789

plus

123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890

equals

0246913578135802468024691357013580245912469134802358023691346912580245801469135690358024679246913578135802468024691357013580245912469134802358023691346912580245801469135690358024679

It may not be the fastest method, but it gets the job done.  8)

Now to get a pencil and paper to verify the results. I don't have a calculator that will work with such large numbers.   :dazzled:

If this is good, I will do one for subtraction, then if that turns out well,
I will *attempt* to make one for multiplication and division.  :shock:
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 10:53:36 AM
Full console input and output added:

Code: [Select]


    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;                                                                            ;;
    ;;  asciiadder :: takes two unsigned ascii decimal strings and adds them      ;;
    ;;  designed to add very long numbers. Surely an easier way, but it was fun.  ;;
    ;;                                                                            ;;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;                                        ;;
    ;;  build with console assemble and link  ;;
    ;;                                        ;;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


    include \masm32\include\masm32rt.inc

    asciiadder PROTO :DWORD, :DWORD, :DWORD
   
    zinput MACRO prompt:VARARG
      LOCAL txt
      LOCAL buffer
      IFNB <prompt>
        .data
          txt db prompt, 0
          align 4
        .data?
          buffer db 2052 dup (?)
          align 4
        .code
        invoke StdOut,ADDR txt
        invoke StdIn,ADDR buffer,2048
        mov BYTE PTR [buffer+eax], 0
        invoke StripLF,ADDR buffer
        EXITM <OFFSET buffer>
      ELSE
        .data?
          buffer db 2052 dup (?)
          align 4
        .code
        invoke StdIn,ADDR buffer,2048
        mov BYTE PTR [buffer+eax], 0
        invoke StripLF,ADDR buffer
        EXITM <OFFSET buffer>
      ENDIF
    ENDM


    .data
       align 4
 ;        asciinumber1 db 2048 dup (0)
       lpstring1 dd 0
       
       align 4
 ;        asciinumber2 db 2048 dup (0)
       lpstring2 dd 0
       
       align 4
       asciidest    db 2048 dup (0)
       db 4 dup (0)

    .code

    start:

        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        ;;                                    ;;
        ;;  get first number from user input  ;;
        ;;                                    ;;
        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
       
        print "enter first number - 2048 digits max", 13, 10
        mov lpstring1, zinput()
        invoke locate,0,4
       
        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        ;;                                      ;;
        ;;   get second number from user input  ;;
        ;;                                      ;;
        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
       
        print "enter second number - 2048 digits max", 13, 10
        mov lpstring2, zinput()
        invoke locate,0,8

        cls     ;   clear the screen to get the inputting outta the way

        ;;;;;;;;;;;;;;;;;;;;;;;;;;;
        ;;                       ;;
        ;;  print full equation  ;;
        ;;                       ;;
        ;;;;;;;;;;;;;;;;;;;;;;;;;;;
       
        print lpstring1, 13, 10, 13, 10
        print chr$("plus"), 13, 10, 13, 10
        print lpstring2, 13, 10, 13, 10


        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        ;;                        ;;
        ;;  perform the addition  ;;
        ;;                        ;;
        ;;;;;;;;;;;;;;;;;;;;;;;;;;;;
       
        invoke asciiadder, lpstring1, lpstring2, addr asciidest

        ;;;;;;;;;;;;;;;;;;;;;;;;;
        ;;                     ;;
        ;;  print the results  ;;
        ;;                     ;;
        ;;;;;;;;;;;;;;;;;;;;;;;;;
       
        print chr$("equals"), 13, 10, 13, 10
        print offset asciidest, 13, 10

        inkey
        exit

    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    align 16
    nops 13
    asciiadder proc src1:dword, src2:dword, dst:dword
        push ebp
        push ebx
        push esi
        push edi
        mov esi, [esp+14h]
        mov ebp, [esp+18h]
        mov edi, [esp+1Ch]
        xor ecx, ecx
        @@:
        cmp byte ptr [esi+ecx+1], 0     ; get the last ascii byte
        jz @f
        inc ecx
        jmp @b
        @@:

        xor ebx, ebx
        @@:
        cmp byte ptr [ebp+ebx+1], 0     ; get the last ascii byte
        jz @f
        inc ebx
        jmp @b
        @@:

        cmp ecx, ebx ;  effectively right aligning all strings - then work our way left
        jl ebxgr
        mov edx, ecx
        jmp setdst
        ebxgr:
        mov edx, ebx

        setdst:
        inc edx                         ; add an extra byte for potential carry

    calc:
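        mov al, 0
        test ecx, ecx                   ; src1 index still inside the string?
        js @f                           ; no (index < 0): treat the digit as 0
        mov al, byte ptr [esi+ecx]
        @@:
        test ebx, ebx                   ; src2 index still inside the string?
        js @f                           ; no (index < 0): add nothing
        add al, byte ptr [ebp+ebx]
        @@: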
        sub al, 30h
        cmp al, 0Fh
        jl @f
        sub al, 30h
        @@:
        cmp al, 0Ah
        jl @f
        sub al, 0Ah
        inc byte ptr [edi+edx-1]
        @@:

        add al, 30h
        add al, byte ptr [edi+edx]

        cmp al, 39h
        jng @f
        sub al, 0Ah
        inc byte ptr [edi+edx-1]
        @@:
        mov byte ptr [edi+edx], al

        dec ebx
        dec ecx
        dec edx
    jnz calc
        cmp byte ptr [edi+edx], 30h     ; test to see if first byte in dest is the carry byte
        jg @f
        add byte ptr [edi], 30h         ; if so, convert to ascii
        @@:
        pop edi
        pop esi
        pop ebx
        pop ebp
        ret 12
    asciiadder endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef


    end start


Still trying to make the routine a bit faster....
But for demonstration purposes, it is fine as is, I suppose.

Using modified 'input()' macro, allowing up to 2048 digit input

Code: [Select]
72687256878728728578278273764572634567527634762347624623645268275497274697367264
59726597629769726979767654976479567269726957629764592769247629767629347697676576
876587872876575764578568745687564758756

plus

10101010101010101010101010101010101010101010101010101010101010101010101010101010
10101010101100110101010101010010100101001010101001010010100100101010010100101001
010100101001010010100101001010010100101001010010010100100101

equals

01010101010101010101017369735788882973867928837477467364466853773577244863472465
53692855982856984682746982760763986982798077775597748957736982705863977469287024
8639867730348707686676977588882886675865579578755697664858857
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 12:03:36 PM
cycle counts using one of dedndave's templates, adding two 256-digit ascii strings :dazzled:

Code: [Select]
5498
5499
5497
5500
5499
5501
5495
5498
5501
5498

still working, trying to make it faster, but so far no luck
I can make it faster, but accuracy is questionable.
Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 12:57:12 PM
i would take one of two approaches
and, you'd have to test them to see which is fastest

one approach, seeing as you know the length of the longest input string,
you can estimate the length of the output string
and from that, you can calculate the size of a binary integer that would contain it

zero the required dwords on the stack
you can keep a running binary total in a register
when it overflows 32 bits, you will get a carry flag
at that time, you add the dword (with required carries) to the large integer on the stack
zero the accumulator register and start over until the input values are done

after all that, use a routine to convert the large integer to an ascii decimal string
the thinking behind this method is to eliminate un-aligned (probably byte) writes while processing the strings
my Ling Long Kai Fang routines can crank out the big ascii string - pretty fast, too
seems like a lot of code to write though - lol

the other way is the look-up table i mentioned earlier
when you add two ascii decimal digits together (plus a possible previous carry bit),
the end results will be values from 60h to 73h, inclusive
so, make a table with 20 byte values in it
access the table for the result rather than all that add/subtract stuff

this will give you an idea what the loop might look like
Code: [Select]
TopOfLoop:
    movzx   edx,byte ptr [esi]
    adc     dl,byte ptr [ebx]         ;with carry flag from previous loop pass
    mov     al,MyTable[edx-60h]
    shr     eax,1                     ;the carry flag is used on the next loop pass
    mov     MyDest[edi-1],al
    dec     esi
    dec     ebx
    dec     edi
    jnz     TopOfLoop
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 01:13:29 PM
It's true, you never pass up the chance to say "Ling Long Kai Fang".  :biggrin:

Yeah, I've been kicking around some ideas here myself. Even experimented using a lookup table
as you had suggested earlier.

I'll come up with something...

Can't be that hard, but at the moment my brain isn't yet connecting the dots.






Eventually.


Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 01:14:12 PM
i guess that loop needs a little work, as you would loop on the length of the shortest input value, not the output string
at the end, you'll have a carry bit to ripple through the remaining bytes of the longer value
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 01:34:42 PM
I did a quick search to find this ling long kai fang that I've heard you mention several times, and I'm looking at it at the moment......
Code: [Select]

;bignum integer to string - by DednDave
;ling long kai fang......

2009, is that the only version (LLKF9_1a)?
Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 02:04:11 PM
there were previous versions, LLKF8_1 and LLKF9_1
inside the LLKF9_1a package, you will find 3 versions of the routine:

one is for signed values
one is for unsigned values
one will handle either type (an extra argument to tell it which mode)
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 04:07:44 PM
I went back and took another look at using AAA. It replaced a whole series of discrete instructions. Here are the results, using the same test methods as in the first (except only 5 reps):

Code: [Select]
original
5470
5498
5499
5496
5501

with AAA
2659
2557
2558
2556
2556

And this is the revised algo:

Code: [Select]
bugs

I found out that I was using a mov before, where I should have had an add.   ::)

Now lemme work on the code before the loop....
Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 04:11:29 PM
that's a little better   :t

here's a table
each byte is meant to be SHR'ed
bit 0 becomes the carry flag for the next pass
the remaining bits are the ASCII byte total

Code: [Select]
AddAscTable db 60h,62h,64h,66h,68h,6Ah,6Ch,6Eh,70h,72h
            db 61h,63h,65h,67h,69h,6Bh,6Dh,6Fh,71h,73h
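
The table values follow a simple rule, so they don't have to be typed in by hand. A hedged sketch (not dave's code; total and digit are names invented here): for each possible total 0..19, the entry is the ASCII digit of total MOD 10 shifted left one bit, with total / 10 (the carry) in bit 0. The REPEAT block below should emit the same 20 bytes as the table above, and the few instructions after it show a single lookup.

Code: [Select]

        include \masm32\include\masm32rt.inc

    .data
        AddAscTable LABEL BYTE
        total = 0
        REPEAT 20
            db ((30h + (total MOD 10)) SHL 1) OR (total / 10)
            total = total + 1
        ENDM

        digit   db 0, 0                 ; one result character plus terminator

    .code

    start:
        mov edx, 37h + 38h              ; '7' + '8' = 6Fh, a total of 15
        movzx eax, AddAscTable[edx-60h] ; entry 15 = 6Bh
        shr eax, 1                      ; AL = 35h ('5'), CF = 1 for the next digit
        mov digit, al

        print offset digit, 13, 10
        exit

    end start

This should print 5; the carry of 1 is left in CF right after the SHR, which is what the ADC on the next loop pass would pick up.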
Title: Re: ascii adder
Post by: rrr314159 on September 26, 2015, 04:55:22 PM
FWIW,

Here's what I came up with. According to my (unstable) timings, takes about 2000 cycles for the inputs you used for timing (111... and 222...). Takes about 400 more for the other test inputs (where the answer is 0101010101010101010101736973578888
297386792883747746736446685377357724486347246553692855982856
984682746982760763986982798077775597748957736987058639774692
870248639867730348707686676977588882886675865579578755697664
858857) since that pair involves carries. On AMD takes considerably longer.

Here's the main routine:

Code: [Select]
; ascii adder by rrr314159 9/25/2015

include support.inc
Timer_Data

ASCII_LENGTH = 256                      ; must be divisible by 4
ASCII_LENGTH_DWORDS = ASCII_LENGTH / 4

copy_from_end MACRO dest, src, cnt
    lea esi, src
    lea edi, dest
    mov ecx, cnt
    inc ecx
    std
    rep movsb
    cld
ENDM

.data

; didn't bother to input these from console ...

a1  db "72687256878728728578278273764572634567527634762347624623645268275497274697367264"
    db "59726597629769726979767654976479567269726957629764592769247629767629347697676576"
    db "876587872876575764578568745687564758756", 0

;a1 db 256 dup (31h), 0          ; these data take about 2000 cycles, 400 less, since no carry

    a1_end = $ - 1
    a1_length = a1_end - a1

a2  db "10101010101010101010101010101010101010101010101010101010101010101010101010101010"
    db "10101010101100110101010101010010100101001010101001010010100100101010010100101001"
    db "010100101001010010100101001010010100101001010010010100100101", 0

;a2 db 256 dup (32h), 0

    a2_end = $ - 1
    a2_length = a2_end - a2

    num1 db ASCII_LENGTH dup(30h)
    num1_end = $
    db 0
    num2 db ASCII_LENGTH dup(30h)
    num2_end = $
    db 0
    ans db ASCII_LENGTH dup(30h)
    ans_end = $
    db 0
    ans_length dd 0
.code

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
aa:
    pp "****************************************************************\n"
    pp "num1 %s\n\n", offset a1
    pp "num2 %s\n\n", offset a2

    InitCycles                          ; start timer

    copy_from_end num1_end, a1_end, a1_length
    copy_from_end num2_end, a2_end, a2_length

;    get_time inputting took             ; uncomment to separate input timing

; add per dword, and propagate the carry thru the dword
    mov ecx, ASCII_LENGTH_DWORDS
    dec ecx
    mov dl, 0
    add_loop:
        mov eax, DWORD PTR num1[ecx*4]
        add eax, DWORD PTR num2[ecx*4]
        sub eax, 30303030h
        add al, dl
        cmp al, 3ah
        jl @f
            sub al, 0ah
            add ah, 1
        @@:
        mov dl, 0
        cmp ah, 3ah
        jl @F
            sub ah, 0ah
            mov dl, 1 
        @@:
        bswap eax
        add ah, dl
        cmp ah, 3ah
        jl @f
            sub ah, 0ah
            add al, 1
        @@:
        mov dl, 0
        cmp al, 3ah
        jl @F
            sub al, 0ah
            mov dl, 1 
        @@:
        bswap eax
        mov DWORD PTR ans[ecx*4], eax
        dec ecx
        jge add_loop
; if dl = 1 the addition overflowed

; find first non-zero digit to determine length of answer
    xor ecx, ecx
    @@:
        cmp BYTE PTR ans[ecx], 30h
        jne @F
        inc ecx
        jmp @B
    @@:
    sub ecx, ASCII_LENGTH
    neg ecx
    mov ans_length, ecx

; get time, print answer
    get_time adding took
    lea esi, ans_end
    sub esi, ans_length
    pp "\nans  %s \n\n", esi

ret
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end aa
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

I really have no idea if this is worth anything, if anyone cares I can clean it up in various ways.
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 05:07:37 PM
Quote from: rrr314159
I really have no idea if this is worth anything, if anyone cares I can clean it up in various ways.

No, the code looks fine. It's still doing around 5000 cycles, I think bswap is killing it.

Is the main loop aligned? I haven't checked.

later:

with the main loop aligned, no change
Code: [Select]

adding took 5016 cycles



Anyway, I was trying earlier to do something very similar 4 bytes at a time.

Could you set up my original algo - the faster one - in your testbed for better comparison?

I would try to put yours into mine but it might alter certain facets of your algo.

Mine is a little more forgiving I think. You can put a call to it from a simple proc as in my example.

later:

I tried putting my algo in your testbed for better comparison, but this is the result I got,
which I question.

Code: [Select]
adding took 7044 cycles - yours

adding took 6288 cycles - mine

So, I'm pretty sure I screwed it up. lol.
Title: Re: ascii adder
Post by: zedd151 on September 26, 2015, 07:05:01 PM
used a rolled up version of StrLen as a macro here, in place of the clumsy string length loops.

Code: [Select]
w StrLen macro and AAA
2055
2003
2001
2000
2002

with AAA
2643
2565
2567
2565
2565

original
5492
5549
5533
5549
5509

Press any key to continue ...


I found a bug in the original design.

Not in the algo itself, but with the stack balancing.  ::)
So I removed all the faulty code.....
Title: Re: ascii adder
Post by: dedndave on September 26, 2015, 10:47:34 PM
the one i am writing uses the masm32 StrLen function, because i know it's reasonably fast   :P
Title: Re: ascii adder
Post by: rrr314159 on September 26, 2015, 11:45:28 PM
Quote from: zedd151
No, the code looks fine. It's doing still around 5000 cycles, I think bswap is killing it.

- dunno, on my i5 it's usually about 2400 - or close to 2000, with the "111..." and "222..." test cases; but on AMD more like twice that. Do u have an AMD? I think bswap is slow on them ...
Title: Re: ascii adder
Post by: TWell on September 26, 2015, 11:58:51 PM
AMD E450 1.65 GHz
Code: [Select]
w strlen
3720
3708
3719
3825
3719
with AAA
4878
4874
4907
4873
4875
original
5926
5992
6011
6109
6145
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 12:16:48 AM
i tested mine with 256 dup(31h) and 256 dup(32h)
the counts are over 5000 cycles, so no use finishing that routine - lol
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 12:44:05 AM
here's the code, if you're interested

look-up table methods usually compare rather well
in this case, though, it's a pipeline pile-up in the loop   :lol:

Code: [Select]
        ALIGN   16

AddAscii PROC USES EBX ESI EDI lpszVal1:LPSTR,lpszVal2:LPSTR,lpszResult:LPSTR

;--------------------------------

        .DATA

AddAscTable db 60h,62h,64h,66h,68h,6Ah,6Ch,6Eh,70h,72h
            db 61h,63h,65h,67h,69h,6Bh,6Dh,6Fh,71h,73h

        ALIGN   4

;--------------------------------

        .CODE

    mov     edi,lpszVal1
    mov     esi,lpszVal2
    INVOKE  StrLen,edi
    xchg    eax,ebx
    INVOKE  StrLen,esi
    mov     edx,lpszResult                  ;EDX = base address of result buffer
    .if eax<ebx
        xchg    eax,ebx                     ;EDI,EBX = address,length of shorter input
        xchg    esi,edi                     ;ESI,EAX = address,length of longer input
    .endif
    add     edx,eax                         ;EDX = address of last result byte
    lea     esi,[esi+eax-1]                 ;ESI = address of last input byte (longer)
    xor     ecx,ecx                         ;ECX = 0
    sub     eax,ebx                         ;EAX = length difference (ripple carry count) (clears CF)
    mov byte ptr [edx+1],cl                 ;null terminate result
    mov     eax,ecx                         ;EAX = 0
    lea     edi,[edi+ebx-1]                 ;EDI = address of last input byte (shorter)
    .repeat
        mov     cl,[esi]
        dec     esi
        adc     cl,[edi]
        dec     edi
        mov     al,AddAscTable[ecx-60h]
        shr     eax,1
        dec     ebx
        mov     [edx],al
        lea     edx,[edx-1]
    .until ZERO?

;more code here to ripple carry

    ret

AddAscii ENDP
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 12:59:10 AM
Quote from: dedndave
the one i am writing uses the masm32 StrLen function, because i know it's reasonably fast   :P

Copycat. I had the same idea, but I macro-ized it. It did improve things.

Quote
look-up table methods usually compare rather well
I was working on one too.

I'll check yours out though. :t
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 01:09:16 AM
oops - one of the table values is wrong
won't speed things up, though
edited the two tables above
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 01:11:03 AM
 
Code: [Select]
zedds w strlen macro
2043
2005
2011
2009
2016

with AAA
2687
2706
2668
2579
2570

dave AddAscii
5102
4885
4946
4886
4871

I dunno, dave.

Run my testbed below on your machine to see what it says.

I fixed the table in the download, dave.
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 01:27:34 AM
Here, I swapped the calls to StrLen with the macro calls

Code: [Select]
dave AddAscii - using my StrLen macro

4954
4883
4875
4892
4900


Well, the macros use a rolled (not unrolled) StrLen.

and my fastest algo using a call to StrLen, instead of the macros...

Code: [Select]
dave AddAscii
2032
1979
1985
1990
1988


I think I will stick with my fast version, and the StrLen macro.  8)

Your code is fine dave, find the bottleneck.
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 01:38:20 AM
oh - well, if it runs well on your machine, i will finish the code
i am using an older P4 model CPU
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 01:43:53 AM
Quote from: dedndave
oh - well, if it runs well on your machine, i will finish the code
i am using an older P4 model CPU

My computer is a piece of  ^%$*^$#%#*^ also.

Interested to see the numbers from your machine using the same testbed.

My computer could be generating false results. It seems ok for side by side comparison though.

Not sure if the numbers actually represent actual cycles, though.

If you're interested, the fast algo is attached to the first post of the thread, among other places.

As for my 'fast' algo, it seems about as fast as it can be. Using the call to StrLen only offered a slight improvement over using the macro. I tried unrolling the macro, but there was not much improvement.

All in all, I am satisfied with it.

It was a good experience. Thank you for some of your ideas, and different approaches to consider.

It was a nice little exercise.

zedd
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 01:58:55 AM
try this one
notice that the result may not be left-justified in the buffer
but, the result address is returned in EAX

Code: [Select]
AddAscii PROTO :LPSTR,:LPSTR,:LPSTR
Code: [Select]
;***********************************************************************************************

        ALIGN   16

AddAscii PROC USES EBX ESI EDI lpszVal1:LPSTR,lpszVal2:LPSTR,lpszResult:LPSTR

;Call With: lpszVal1   = address of first ASCII decimal input string
;           lpszVal2   = address of second ASCII decimal input string
;           lpszResult = address of result buffer (assumed to be large enough)
;
;  Returns: EAX        = address of result string
;
;Also Uses: EBX, ESI, EDI, EBP are preserved

;--------------------------------

        .DATA
        ALIGN   4

AddAscTable db 60h,62h,64h,66h,68h,6Ah,6Ch,6Eh,70h,72h
            db 61h,63h,65h,67h,69h,6Bh,6Dh,6Fh,71h,73h

;--------------------------------

        .CODE

    mov     edi,lpszVal1
    mov     esi,lpszVal2
    INVOKE  StrLen,edi
    xchg    eax,ebx
    INVOKE  StrLen,esi
    mov     edx,lpszResult                  ;EDX = base address of result buffer
    .if eax<ebx
        xchg    eax,ebx                     ;EDI,EBX = address,length of shorter input
        xchg    esi,edi                     ;ESI,EAX = address,length of longer input
    .endif
    add     edx,eax                         ;EDX = address of last result byte
    lea     esi,[esi+eax-1]                 ;ESI = address of last input byte (longer)
    xor     ecx,ecx                         ;ECX = 0
    sub     eax,ebx                         ;EAX = length difference (ripple carry count) (clears CF)
    mov byte ptr [edx+1],cl                 ;null terminate result
    push    eax                             ;save ripple carry count
    mov     eax,ecx                         ;EAX = 0
    lea     edi,[edi+ebx-1]                 ;EDI = address of last input byte (shorter)
    .repeat
        mov     cl,[esi]
        dec     esi
        adc     cl,[edi]
        dec     edi
        mov     al,AddAscTable[ecx-60h]
        shr     eax,1
        dec     ebx
        mov     [edx],al
        lea     edx,[edx-1]
    .until ZERO?
    pop     ecx                             ;recall ripple carry count
    rcl     ebx,1                           ;EBX has the last carry bit
    .while (ecx) && (ebx)
        mov     al,[esi]
        inc     al
        dec     esi
        .if al==3Ah
            mov     al,30h
        .else
            mov     bl,bh
        .endif
        mov     [edx],al
        dec     ecx
        lea     edx,[edx-1]
    .endw
    .while ecx
        mov     al,[esi]
        dec     esi
        mov     [edx],al
        dec     ecx
        lea     edx,[edx-1]
    .endw
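    .if ebx                                 ;carry out of the most significant digit
        mov byte ptr [edx],31h              ;store a leading '1'
        dec     edx
    .endif
    lea     eax,[edx+1]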
    ret

AddAscii ENDP

;***********************************************************************************************
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 02:06:35 AM
here ya go, dave..

Code: [Select]

dave AddAscii new
4951
4860
4860
4898
4882

Press any key to continue ...
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 02:07:52 AM
well, that went to shit - lol
let me see if i can improve the last parts
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 02:10:11 AM
Quote from: dedndave
let me see if i can improve the last parts

That's ok dave. I'm done with it.

You can still try, though.

Next Topic:

Ascii Subtractor.

After that, Ascii Multiplier! (The divider will be a lot of work, dunno if I'd even attempt it)
Title: Re: ascii adder
Post by: rrr314159 on September 27, 2015, 02:29:16 AM
Quote from: zedd151
Could you set up my original algo - the faster one - in your testbed for better comparison?

- I hope I did it right :biggrin: used ascii_adder_4, got about 2100 cycles (varies +/- 100). Whereas using your adder_counts, it gets about 1600 (+/- 10). I already knew your (actually dedndave's, I believe) method is more stable (mine is made for on-the-fly performance analysis instead of stability, so is useful in its own way) but it seems your counter also gives a lower number, by about 25%. Fortunately they're both consistent so can still be used for comparison purposes. Anyway, my algo appears a bit slower (about 2400, after some fixes). But it's so different I can't immediately put it into your count algo. If I find the time I'll make it of the same form (a proc which accepts 3 inputs); seems worth the trouble for me to get on board with everyone else's approach - not just for this project, but for the future also. Let you know how it comes out.
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 02:33:08 AM

Quote from: rrr314159
... But it's so different I can't immediately put it into your count algo.


I had the same problem. Very different coding styles. Not that there is anything wrong with that.

Quote
Let you know how it comes out.

Sounds good. Was a fun little project, wasn't it?
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 02:40:24 AM
here's what i recommend....

create a routine that converts large ascii decimal strings into large binary integers (aka arbitrary integers)
create another routine that converts large integers into strings
well - we already have that one - Ding Dong Cafe - lol

now, write routines that add, subtract, multiply, and divide arbitrary integers

if you just want to add 2 strings together,  working directly on ascii may be ok
but, if you want to do more manipulation than just one operation, binary routines will be considerably faster

convert the string(s) to binary
do the math
convert the result back to a string
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 02:43:50 AM
a routine that converts large ascii strings to integers could be done using the Ling Long Kai Fang method (Horner's Rule)
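For anyone following along, here is a rough sketch of that Horner's Rule conversion. It is written in C rather than MASM just to show the idea, and the name dec_to_limbs and the 32-bit "limb" layout are assumptions made for the illustration, not anything from an existing library. Each new digit does value = value * 10 + digit across the whole array, carrying from limb to limb.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert an ASCII decimal string to a little-endian array of 32-bit limbs
   using Horner's rule: value = value * 10 + digit, one digit at a time.
   Returns the number of limbs used (0 for the value zero), or -1 if the
   string holds a non-digit or the value does not fit in max_limbs.
   dec_to_limbs is a hypothetical helper name. */
int dec_to_limbs(const char *s, uint32_t *limbs, size_t max_limbs)
{
    size_t used = 0;                      /* limbs currently holding the value */

    for (; *s != '\0'; ++s) {
        if (*s < '0' || *s > '9')
            return -1;

        /* carry starts as the new digit, so one pass does "* 10 + digit" */
        uint64_t carry = (uint64_t)(*s - '0');
        for (size_t i = 0; i < used; ++i) {
            uint64_t t = (uint64_t)limbs[i] * 10u + carry;
            limbs[i] = (uint32_t)t;       /* low 32 bits stay in this limb */
            carry = t >> 32;              /* the rest moves to the next limb */
        }
        if (carry != 0) {
            if (used == max_limbs)
                return -1;                /* out of room */
            limbs[used++] = (uint32_t)carry;
        }
    }
    return (int)used;
}
```

With a 4-limb buffer, "4294967296" (2^32) should come back as 2 limbs holding 0 and 1. The cost is one pass over the limbs per input digit, which is the conversion overhead to weigh against working on the ascii directly.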
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 02:48:07 AM
Quote from: dedndave
....convert the string(s) to binary
do the math
convert the result back to a string

I know that.  :P

It was just a little coding exercise. Subtraction, I imagine, will be pretty much the same.
Multiplication will be more complex. Division, forget it using the Ascii methods.

Better idea, buy a decent calculator or math software. :P

Yeah, I know you like your 'Moo Goo Guy Pan' procedures.  :lol:
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 02:56:33 AM
well, there are "BigNum" libraries out there
most of them are limited to 256 or 512 bits of precision, though

if you want fast, learn to use the FPU and SSE floats
even though they are limited in both precision and range, multiple-precision routines can be developed to extend them

for example, an 80-bit FPU real has 64 bits of binary precision
but, if i carry a single value in two 80-bit reals, it can have 128 bits of precision
and - math will be much faster than discrete integer math
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 02:58:56 AM
(https://upload.wikimedia.org/wikipedia/commons/thumb/2/26/IEEE_754_Extended_Floating_Point_Format.svg/762px-IEEE_754_Extended_Floating_Point_Format.svg.png)

a single value, then, might be expressed as two reals (A and B)

N = B * 2^64 + A
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 03:00:23 AM
Quote from: dedndave
well, there are "BigNum" libraries out there

Absolutely.
Quote
if you want fast, learn to use the FPU and SSE floats

I tried once or twice learning FPU, but it's a different animal than I am used to.
Quote
- math will be much faster than discrete integer math

Undoubtedly.

Was a fun project though.  8)
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 03:05:02 AM
the FPU isn't that bad   :biggrin:

you just have to learn to keep track of values on the FPU stack, is all

the FPU is like an 8-shooter, with a rotating cylinder
the "top of stack" is the one in the barrel at the time - lol

(http://files.harrispublications.com/wp-content/uploads/sites/6/2014/09/Smith-Wesson-PC-Model-929-2.jpg)
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 03:06:13 AM
Quote from: dedndave
the FPU is like an 8-shooter,

You frighten me.  :lol:

From time to time, I do look at Raymond's FPU tutes.

But this old brain is no longer a very absorbent sponge.

More like a sponge that was squeezed too often.  :P
Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 03:07:05 AM
i highly recommend Ray's tutorial (and his libraries, too)

http://www.ray.masmcode.com/
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 03:08:17 AM
Quote from: dedndave
i highly recommend Ray's tutorial (and his libraries, too)

http://www.ray.masmcode.com/

I just posted above yours :P
Title: Re: ascii adder
Post by: rrr314159 on September 27, 2015, 04:00:04 AM
Quote from: dedndave
if you just want to add 2 strings together,  working directly on ascii may be ok

- I bet just adding 2 numbers (or even doing a few additions) can be done faster working directly with ascii, than any other way. You have to remember that conversion from ascii decimal to binary integer - and back again - takes time. Not so sure about subtraction. For mult and div, or many adds/subtracts in a row, undoubtedly better to convert to binary first.

Quote from: dedndave
if you want fast, learn to use the FPU and SSE floats

- for exact precision of very large numbers (like 10^256, or 10^2048, as used in RSA algo) it's not so clear that floating point is the right way? 80 bits only gets you to 10^24 so you still need to chain them together - tricky to get exact precision? Don't forget 64-bit integers (with 64-bit code) are very easy to use. I suppose somebody knows what's the best approach; my guess, 64-bit integer is better than FPU (or SSE) - for exact precision. AVX, not to mention AVX512, is another story

Quote from: zedd151
Was a fun little project, wasn't it? ...
Was a fun project though.  8)

- well, sure! ... also, as I say, if you only want one or a few additions ascii approach is probably the best
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 04:06:50 AM
Quote from: rrr314159
... also, as I say, if you only want one or a few additions ascii approach is probably the best

 You just gave me an idea  :idea:

Suppose we have a variable number of additions to do. I am going to *try* to run a simple
test using 3, 4, 5 inputs, to determine how much longer each extra calculation takes.

During those tests, for simplicity all the input strings will have the same length.
Then again, they don't necessarily have to be the same length....

thinking....................
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 09:08:26 AM
Tried a minor rewrite, attempting to free up ebp from usage.

Code: [Select]
    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    align 16
    nops 12
    asciiadderNA proc src1:dword, src2:dword, dst:dword
        push ebx
        push esi
        push edi
        mov esi, [esp+10h]
        mov ebx, [esp+14h]
        invoke StrLen, esi
        push eax
        invoke StrLen, ebx
        mov edx, eax
        pop ecx
        mov edi, [esp+18h]
        add edi, ecx
        add edi, 1
    @@:
        xor eax, eax
        mov al, byte ptr [esi+ecx]
        add al, byte ptr [ebx+edx]
        AAA
        add al, 30h
        add byte ptr [edi], al
        mov byte ptr [edi-1], ah
        dec edi
        dec edx
        dec ecx
        cmp ecx, -1
        jnz @b
        dec edi
        add word ptr [edi], 3030h
        pop edi
        pop esi
        pop ebx
        ret 12
    asciiadderNA endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

Results on my machine were about 20 cycles slower. I even used the unrolled StrLen proc,

and took a shortcut (necessitating the input strings be the same length)

It was a half-assed try any way.

I have another idea to do this two bytes at a time, no shr or ror needed....
lemme try it out.
I still want to exclude ebp from casual usage.....
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 09:30:56 AM
 :biggrin:
Title: Re: ascii adder
Post by: rrr314159 on September 27, 2015, 09:45:23 AM
Quote from: zedd151
I managed to squeeze another 17 cycles out of the fastest (so far) algo. Now ebp is left alone.

 - I thought you were moving on to subtraction! Maybe I'll revisit the addition algo, see if I can squeeze out a few cycles.

- BTW, do u (and / or anyone else) agree that (as I think) for a few additions, this ascii-based way ought to be the fastest? If so it makes the exercise worthwhile; if not, not. For mult, div, any serious manipulations, clearly it's best to convert to binary and do all arithmetic there, then convert back. But there could be apps where just additions are desired. So do u (and / or anyone else) agree that "ascii arithmetic" is probably the fastest way to add two (or a few) large decimal ascii numbers?
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 10:00:17 AM
Ummmm...the project is on hold. I found a bug: if one string is much shorter than the other, there is a sort of buffer underrun, if that makes any sense.

Now back to the drawing board.

I know it was working 100% at one point
Code: [Select]
1

plus

99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
99999999999999999999999999999999999999999999999999999999999999999999999999999999
9999999999999999999999999999999999999999999999999999999999999999

equals

10000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000

all good now. Unfortunately that is basically the old algo.

Somewhere in the optimized main loop of the 'fast' algo, there is some sort of bug....
.................................................
..................................... now to go bughunting.

That'll teach me to make a bunch of changes without testing the results.  ::)

The old algo is the only one that works with inputs with uneven lengths.
I have tried to coerce the new algo to work properly, but have failed.

The new algo only works if the input string lengths are equal. :(
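For checking results with uneven lengths, a plain reference routine to compare against can help. The sketch below is C, not the algo from this thread, and ascii_add_ref is a made-up name. It right-aligns both inputs by counting digits from the right-hand end and never indexes before the start of either string, which is the underrun case described above. Like the asm versions, it always writes a leading digit, so the result can start with '0'.

```c
#include <stddef.h>
#include <string.h>

/* Add two unsigned ASCII decimal strings of any lengths.
   dst needs room for max(strlen(a), strlen(b)) + 2 bytes
   (one possible carry digit plus the terminating NUL).
   ascii_add_ref is a hypothetical reference routine. */
void ascii_add_ref(const char *a, const char *b, char *dst)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t n = (la > lb) ? la : lb;       /* digits in the longer input */
    int carry = 0;

    dst[n + 1] = '\0';
    /* walk right to left; i counts digits from the right-hand end */
    for (size_t i = 0; i < n; ++i) {
        int sum = carry;
        if (i < la) sum += a[la - 1 - i] - '0';   /* skip once a is used up */
        if (i < lb) sum += b[lb - 1 - i] - '0';   /* skip once b is used up */
        carry = (sum >= 10);
        dst[n - i] = (char)('0' + sum - 10 * carry);
    }
    dst[0] = (char)('0' + carry);         /* leading digit: '0' or '1' */
}
```

Feeding it "1" and "999" should give "1000", and "12" plus "7" should give "019".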


My software does not have bugs, it just has features you need to understand...
....to use it properly.  :biggrin:
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 01:22:18 PM
Okay, a fresh new look at this little program.

I removed ebp as a general purpose register. It also uses StrLen to obtain the lengths of the strings.

Code: [Select]

    include \masm32\include\masm32rt.inc

    asciiadder PROTO :DWORD, :DWORD, :DWORD

    .data
         ss1 db "1", 0
         ss2 db "99999999999999999999999999999999999", 0
         dd1 db 40h dup (0)
    .code

    start:
        invoke asciiadder, addr ss2, addr ss1, addr dd1
        invoke MessageBoxA, 0, addr dd1, 0, 0
        invoke ExitProcess, 0

    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    asciiadder proc src1:dword, src2:dword, dst:dword
        push ebx
        push esi
        push edi
        mov esi, [esp+10h]
        mov ebx, [esp+14h]
        mov edi, [esp+18h]

        invoke StrLen, esi
        dec eax
        push eax

        invoke StrLen, ebx
        dec eax
        mov edx, eax
        pop ecx

        cmp ecx, edx
        jl edxgr
        add edi, ecx
        jmp setdst
    edxgr:
        add edi, edx
    setdst:
        inc edi
        ; ---------------------- main loop
    calc:
        mov al, 0
        test ecx, ecx               ; index below 0: src1 is used up
        js @f
        mov al, byte ptr [esi+ecx]
    @@:
        test edx, edx               ; index below 0: src2 is used up
        js @f
        add al, byte ptr [ebx+edx]
    @@:
        sub al, 30h
        cmp al, 0Ah
        jl @f
        sub al, 30h
    @@:
        cmp al, 0Ah
        jl @f
        sub al, 0Ah
        inc byte ptr [edi-1]
    @@:
        add al, 30h
        add al, byte ptr [edi]
        cmp al, 39h
        jng @f
        sub al, 0Ah
        inc byte ptr [edi-1]
    @@:
        mov byte ptr [edi], al
        dec edi
        dec edx
        dec ecx
        cmp edi, [esp+18h]
        ja calc
        ; ---------------------- end main loop
        cmp byte ptr [edi], 30h
        jg @f
        add byte ptr [edi], 30h
    @@:
        pop edi
        pop esi
        pop ebx
        ret 12
    asciiadder endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

    end start

Now I will once again concentrate on optimizing the main loop. Or at least try to minimize the number of jumps in this thing.
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 02:24:52 PM
New Test Piece.

Easy to use fixture for testing the modifications I will be making while 'optimising' the AsciiAdder.
 ;)
Title: Re: ascii adder
Post by: rrr314159 on September 27, 2015, 03:04:58 PM
We both tried the approach of adding dwords at a time, and subtracting 30303030h 4 bytes at a time, etc. Makes sense but doesn't gain enough for the trouble; byte-by-byte is still faster. Obvious is to use SSE: 16 bytes at a time will probably be fastest way. If I get around to it ...
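For reference, the dword-at-a-time idea can be sketched like this in C. The helper name add4_ascii and the byte order (leftmost digit in the top byte, roughly what a BSWAP after the load would give) are assumptions for the illustration; it is not the code either of us tested, and it says nothing about speed. After subtracting 30303030h, every byte is biased by 246 so that a digit sum of 10 or more overflows into the byte to its left as a carry.

```c
#include <stdint.h>

/* Add four ASCII digits to four ASCII digits with 32-bit arithmetic.
   a4, b4: four digit characters each, most significant first.
   out4:   receives the four result digits.
   carry:  0 or 1 coming in from the group to the right.
   Returns the carry out of the leftmost digit.
   add4_ascii is a hypothetical helper name. */
unsigned add4_ascii(const unsigned char *a4, const unsigned char *b4,
                    unsigned char *out4, unsigned carry)
{
    /* leftmost digit goes in the top byte so carries move leftwards */
    uint32_t a = ((uint32_t)a4[0] << 24) | ((uint32_t)a4[1] << 16) |
                 ((uint32_t)a4[2] << 8)  |  (uint32_t)a4[3];
    uint32_t b = ((uint32_t)b4[0] << 24) | ((uint32_t)b4[1] << 16) |
                 ((uint32_t)b4[2] << 8)  |  (uint32_t)b4[3];

    a -= 0x30303030u;                     /* ASCII -> 0..9 in every byte */
    b -= 0x30303030u;

    /* bias each byte by 246: a digit sum >= 10 overflows into its neighbour */
    uint64_t s = (uint64_t)a + b + 0xF6F6F6F6u + carry;
    uint32_t lo = (uint32_t)s;

    /* bytes that did not overflow still hold the bias (high nibble F) */
    uint32_t fix = (lo >> 4) & 0x06060606u;       /* 6 for those bytes, else 0 */
    uint32_t r = (lo & 0x0F0F0F0Fu) - fix + 0x30303030u;  /* back to ASCII */

    out4[0] = (unsigned char)(r >> 24);
    out4[1] = (unsigned char)(r >> 16);
    out4[2] = (unsigned char)(r >> 8);
    out4[3] = (unsigned char)r;
    return (unsigned)(s >> 32);           /* carry out of the leftmost digit */
}
```

"1234" plus "5678" with no carry in should give "6912" and return 0; "9999" plus "0001" should give "0000" and return 1. The masking and fix-up steps are the extra work that may explain why this did not beat byte-by-byte in our tests.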
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 03:18:54 PM
I have reduced the register usage:

Code: [Select]
    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    asciiadder proc src1:dword, src2:dword, dst:dword
        push edi
        mov edi, [esp+10h]

        invoke StrLen, [esp+8]
        push eax

        invoke StrLen, [esp+10h]
        mov edx, eax
        pop ecx

        cmp ecx, edx
        jl edxgr
        add edi, ecx
        jmp setdst
    edxgr:
        add edi, edx
    setdst:
        add ecx, [esp+8]
        add edx, [esp+0Ch]
        dec ecx
        dec edx
        ; ---------------------- main loop
       
    calc:
        xor eax, eax
        cmp ecx, [esp+8]
        jl @f
        add al, byte ptr [ecx]
    @@:
        cmp edx, [esp+0Ch]
        jl @f
        add al, byte ptr [edx]
    @@:
        sub al, 30h
        cmp al, 0Ah
        jl @f
        sub al, 30h
    @@:
        cmp al, 0Ah
        jl @f
        sub al, 0Ah
        inc byte ptr [edi-1]
    @@:
        add al, 30h
        add al, byte ptr [edi]
        cmp al, 39h
        jng @f
        sub al, 0Ah
        inc byte ptr [edi-1]
    @@:
        mov byte ptr [edi], al
        dec ecx
        dec edx
        dec edi
        cmp edi, [esp+10h]
        jg calc
        inc edi
        ; ---------------------- end main loop
       
        cmp byte ptr [edi-1], 30h
        jg @f
        add byte ptr [edi-1], 30h
    @@:
        pop edi
        ret 12
    asciiadder endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

Now I think I can put AAA back in, using a different mechanism to detect the start of the uneven string. I had previously relied on checking the sign flag for the buffers' counter register.

But since the buffers no longer have counters associated, I am simply directly comparing with the src pointer for each buffer. It took a little while to get to this point.

Now I will go through the process again of trying to get it up to speed. (again)
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 03:53:23 PM
Okay!! I have finally managed to both keep register usage to a minimum and reinstate the AAA instruction.  :icon_exclaim:

It took longer than I thought it should.

Code: [Select]
    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
    asciiadder proc src1:dword, src2:dword, dst:dword
        push edi
        mov edi, [esp+10h]
            invoke StrLen, [esp+8]
        push eax
            invoke StrLen, [esp+10h]
        mov edx, eax
        pop ecx
        cmp ecx, edx
        jl edxgr
        add edi, ecx
        jmp setdst
    edxgr:
        add edi, edx
    setdst:
        add ecx, [esp+8]
        add edx, [esp+0Ch]
        dec ecx
        dec edx
    calc:
        xor eax, eax
        mov al, byte ptr [edi]
        cmp ecx, [esp+8]
      jl @f
        add al, byte ptr [ecx]
      @@:
        cmp edx, [esp+0Ch]
      jl @f
        add al, byte ptr [edx]
      @@:
        cmp al, 39h                 ; note: cmp also rewrites AF, which AAA tests
      jle @f
        AAA                         ; AAA per suggestion by dedndave
        add al, 30h
      @@:
        mov byte ptr [edi-1], ah    ; ah contains the carry byte
        mov byte ptr [edi], al      ; al contains the sum
        dec ecx
        dec edx
        dec edi
        cmp edi, [esp+10h]
        jg calc
        inc edi
        cmp byte ptr [edi-1], 30h
      jg @f
        add byte ptr [edi-1], 30h
      @@:
        pop edi
        ret 12
    asciiadder endp
    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

Still don't like all the cmp's and jxx's in there...
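One thing that may be worth checking in the loop above: AAA looks at the auxiliary flag (AF), and the cmp al, 39h sitting between the adds and the AAA sets AF itself. The C model below follows the documented behaviour (add 6 to AL and 1 to AH when the low nibble is above 9 or AF is set, then mask AL to its low nibble). The names aaa_model, af_after_add and af_after_cmp are made up for this sketch, and it is only meant for AL values where AL + 6 does not wrap past FFh.

```c
#include <stdint.h>

typedef struct { uint8_t al, ah; } ax_pair;   /* hypothetical helper type */

/* Model of AAA: af_in is the AF flag on entry. */
ax_pair aaa_model(uint8_t al, uint8_t ah, int af_in)
{
    ax_pair r;
    if ((al & 0x0F) > 9 || af_in) {
        al = (uint8_t)(al + 6);
        ah = (uint8_t)(ah + 1);
    }
    r.al = (uint8_t)(al & 0x0F);          /* only the low nibble survives */
    r.ah = ah;
    return r;
}

/* AF as left by "add x, y": carry out of the low nibble */
int af_after_add(uint8_t x, uint8_t y)
{
    return ((x & 0x0F) + (y & 0x0F)) > 0x0F;
}

/* AF as left by "cmp x, y": borrow into the low nibble */
int af_after_cmp(uint8_t x, uint8_t y)
{
    return (x & 0x0F) < (y & 0x0F);
}
```

Taking '1' + '2' = 63h: with the flag straight from the add, the model gives AL = 3, AH = 0. With the flag as cmp al, 39h would leave it (3 is below 9, so AF is set), it gives AL = 9, AH = 1. An all-nines test would not show this, since those sums get adjusted either way.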
Title: Re: ascii adder
Post by: zedd151 on September 27, 2015, 04:05:47 PM
Okay, here are the numbers so far..

Code: [Select]
newstyle --- * with the lower register usage. and AAA
3399
3344
3345
3363
3350

original old style - from the very first version, for comparison
5498
5543
5510
5513
5555

Title: Re: ascii adder
Post by: dedndave on September 27, 2015, 07:33:32 PM
P4 Prescott w/htt @3GHz, XP MCE2005 SP3, 4Gb RAM
Code: [Select]
newstyle --- *
8492
8504
8505
8504
8745
original old style
7783
8219
7853
7873
7852
Title: Re: ascii adder
Post by: TWell on September 27, 2015, 08:35:19 PM
AMD E450
Code: [Select]
newstyle --- *
6770
6780
6806
6771
6769
original old style
5607
5678
5683
5669
5706
Title: Re: ascii adder
Post by: zedd151 on October 07, 2015, 06:45:57 AM
Project aborted.  ::)

Too many inconsistencies involving certain lengths.
Title: Re: ascii adder
Post by: rrr314159 on October 07, 2015, 08:50:17 AM
Heck that's too bad. Well if you get bored tackle the more-standard way to do it, as dedndave said, convert to binary first, etc. That's a lot of work but, in a way, more straightforward
Title: Success!
Post by: zedd151 on July 26, 2018, 09:12:31 AM
Based on some of jimg's ideas, I now have a working ascii adder procedure, and it so far seems very accurate.
 
Code: [Select]
        ascii_adder proc src1:dword, src2:dword, dst:dword, lent:dword
        local carrie:dword
            push esi
            push edi
            push ebx
            mov esi, src1
            mov edi, src2
            mov ebx, dst
            mov ecx, lent
            mov carrie, 0
            top:
                mov eax,0
                mov al,byte ptr [esi+ecx-1]
                mov dl,byte ptr [edi+ecx-1]
                add al,dl
                sub al, 30h
                add eax, carrie
                mov carrie, 0
                cmp al, 39h
               
                jbe @f
                    mov carrie, 1
                    sub al, 10
                @@:
               
                mov [ebx+ecx-1], al
                dec ecx
                cmp ecx, 0
            jnz top
            pop ebx
            pop edi
            pop esi
            ret
        ascii_adder endp


Jim posted a routine in felipe's fibonacci thread, and it reminded me of this project.
So, I stripped out the essential parts and put them into their own procedure. Works well
for generating a fibonacci sequence.
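To show how a fixed-length adder like this can drive a fibonacci sequence, here is a small sketch in C (not the MASM above; ascii_add_fixed and fib_ascii are made-up names). All three buffers hold the same number of digits, left-padded with '0'. Unlike the proc above, the C version hands back the carry out of the leftmost digit so the caller can tell when the buffers are too short.

```c
#include <stddef.h>
#include <string.h>

/* Fixed-width add: a, b and dst each hold exactly len ASCII digits,
   left-padded with '0'. Returns the carry out of the leftmost digit.
   ascii_add_fixed is a hypothetical C counterpart of ascii_adder. */
int ascii_add_fixed(const char *a, const char *b, char *dst, size_t len)
{
    int carry = 0;
    while (len-- > 0) {                   /* rightmost digit first */
        int sum = (a[len] - '0') + (b[len] - '0') + carry;
        carry = (sum >= 10);
        dst[len] = (char)('0' + sum - 10 * carry);
    }
    return carry;
}

/* Write F(n) (with F(0) = 0, F(1) = 1) into out as width zero-padded
   digits plus a NUL. Returns 0 on success, -1 if width is outside 1..63
   or any sum computed on the way (up to F(n+1)) needs more digits. */
int fib_ascii(unsigned n, char *out, size_t width)
{
    char x[64], y[64], t[64];             /* x = F(i), y = F(i+1) */

    if (width == 0 || width > 63)
        return -1;
    memset(x, '0', width);
    memset(y, '0', width);
    y[width - 1] = '1';

    for (unsigned i = 0; i < n; ++i) {
        if (ascii_add_fixed(x, y, t, width))
            return -1;                    /* ran out of digits */
        memcpy(x, y, width);
        memcpy(y, t, width);
    }
    memcpy(out, x, width);
    out[width] = '\0';
    return 0;
}
```

fib_ascii(10, out, 4) should leave "0055" in out, and a width of 24 is enough for F(100).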