The fastest way to fill a dword array with string values

Started by frktons, December 09, 2012, 02:49:23 AM

frktons

I'd like to fill an array of 1000 dwords with the string values ranging
from '   0' to ' 999'.
To avoid a 4 KB increase in program size and writing 1000 values by hand
in .data, I prefer to declare the array in .data? and fill the
array with a proc: call FillArray.
I have some ideas on how to do that, but before starting the tests I'd like
your suggestions, code, and any experiments you've already done, to think about.

Let me know.

Frank
There are only two days a year when you can't do anything: one is called yesterday, the other is called tomorrow, so today is the right day to love, believe, do and, above all, live.

Dalai Lama
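
For anyone who wants something to compare results against, here is a minimal C sketch of the target layout described in the post above: 1000 four-byte entries, right-aligned, space-padded, with no terminators. It is not from the thread and makes no attempt at speed; the names fill_reference, ENTRIES and WIDTH are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { ENTRIES = 1000, WIDTH = 4 };

/* Hypothetical reference routine: writes "   0", "   1", ..., " 999"
   back to back into out (ENTRIES * WIDTH bytes, no terminators). */
static void fill_reference(char *out)
{
    assert(out != NULL);
    for (int n = 0; n < ENTRIES; ++n) {
        char tmp[WIDTH + 1];                   /* room for snprintf's NUL */
        snprintf(tmp, sizeof tmp, "%4d", n);   /* right-aligned, space-padded */
        memcpy(out + n * WIDTH, tmp, WIDTH);   /* copy the 4 characters only */
    }
}
```

Any faster routine can be checked by comparing its 4000 output bytes with this buffer.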

dedndave

FillArray PROC USES EBX ESI EDI

        mov     edi,offset MyArray   ;or whatever you name it - it should be 4-aligned
        xor     eax,eax
        mov     ebx,1
        mov     edx,2
        mov     esi,3
        mov     ecx,1000/4

FArry0: mov     [edi],eax
        mov     [edi+4],ebx
        mov     [edi+8],edx
        mov     [edi+12],esi
        add     eax,4
        add     ebx,4
        add     edx,4
        add     esi,4
        add     edi,16
        sub     ecx,1
        jnz     FArry0

        ret

FillArray ENDP

frktons

Quote from: dedndave on December 09, 2012, 03:04:46 AM
FillArray PROC USES EBX ESI EDI

        mov     edi,offset MyArray   ;or whatever you name it - it should be 4-aligned
        xor     eax,eax
        mov     ebx,1
        mov     edx,2
        mov     esi,3
        mov     ecx,1000/4

FArry0: mov     [edi],eax
        mov     [edi+4],ebx
        mov     [edi+8],edx
        mov     [edi+12],esi
        add     eax,4
        add     ebx,4
        add     edx,4
        add     esi,4
        add     edi,16
        sub     ecx,1
        jnz     FArry0

        ret

FillArray ENDP


Hi Dave, I'm not sure this routine matches my needs.
Are you sure it fills the array with 4-byte strings
containing the ASCII representation of the
numbers 0-999 with leading spaces?

dedndave

ohhhhhhhhhhhhhhh
no - i am sure that it does not - lol
it fills the array with the binary values

give me a minute.....

EDIT: it is run once at the beginning of the program
so, i would not think speed is super-critical, anyways

dedndave

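FillArray PROC USES EBX ESI EDI

        mov     edi,offset MyArray   ;or whatever you name it
        xor     esi,esi              ;ESI = current number, starting at 0
        mov     ebx,1000             ;EBX = entries left to write

FArry0: INVOKE  crt__ultoa,esi,edi,10 ;writes the decimal digits left-aligned, plus a null terminator
        add     edi,4                ;next 4-byte slot
        inc     esi
        dec     ebx
        jnz     FArry0

        ret

FillArray ENDP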
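
To make the difference in layout concrete, here is a small C sketch, not from the thread. An ultoa-style conversion leaves the digits at the start of the slot followed by a NUL, whereas the first post asks for right-aligned digits padded with spaces. The snprintf call is only a stand-in that mimics the ultoa layout, the zeroing assumes the array lives in zero-initialised memory such as .data?, and both function names are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for what an ultoa-style call leaves in a zeroed 4-byte slot:
   digits first, then a NUL. */
static void slot_ultoa_style(unsigned n, char slot[4])
{
    assert(n <= 999);
    memset(slot, 0, 4);             /* assumption: slot starts zeroed */
    snprintf(slot, 4, "%u", n);     /* at most 3 digits plus the NUL */
}

/* Right-aligned, space-padded conversion by repeated division. */
static void slot_padded(unsigned n, char slot[4])
{
    assert(n <= 999);
    memset(slot, ' ', 4);           /* start with four spaces */
    int pos = 3;
    do {                            /* emit digits from the right */
        slot[pos--] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
}
```

For the number 7, the first routine leaves the bytes '7', 0, 0, 0 and the second leaves three spaces followed by '7'.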

raymond

Before you design an algo, you first need to specify:
(i) whether each ASCII representation will be left-aligned or right-aligned within its 4-byte space when you print that 4000-byte string,
(ii) if right-aligned, whether numbers below 100 should have leading 0's or leading spaces,
(iii) which character you need as the 4th character, since each number needs at most 3 characters in ASCII format.
Whenever you assume something, you risk being wrong half the time.
http://www.ray.masmcode.com

frktons

Quote from: raymond on December 09, 2012, 03:31:05 AM
Before you design an algo, you first need to specify:
(i) whether each ASCII representation will be left-aligned or right-aligned within its 4-byte space when you print that 4000-byte string,
(ii) if right-aligned, whether numbers below 100 should have leading 0's or leading spaces,
(iii) which character you need as the 4th character, since each number needs at most 3 characters in ASCII format.

If you read the first post, all these questions are answered there.
All the answers lie in "   0" and " 999".  :P

Dave: the use of crt__ultoa makes me doubt it will be the fastest around.

In a few days we'll see.  :t

TouEnMasm


It seems there are not too many solutions if it has to be fast.
put 30h in a register = 0                >>> in memory
cmp 3ah
inc                              = 1               >>> //
..
inc   3Ah                    =  3130h = 10 >>> in memory
inc
cmp ...
Fa is a musical note to play with CL
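
One way to read the pseudo-code above is as an ASCII "odometer": keep the current 4-character string, bump the last digit, and carry into the next position when it passes '9'. A minimal C sketch of that reading follows. It is an interpretation, not the poster's code, and the name fill_odometer is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name. Writes "   0" .. " 999" (4000 bytes) into out
   without any division: the ASCII digits are incremented in place. */
static void fill_odometer(char *out)
{
    assert(out != NULL);
    char cur[4] = { ' ', ' ', ' ', '0' };
    for (int i = 0; i < 1000; ++i) {
        memcpy(out + 4 * i, cur, 4);            /* store the current string */
        for (int pos = 3; pos >= 1; --pos) {    /* increment with carry */
            if (cur[pos] == ' ') {              /* a leading space becomes '1' */
                cur[pos] = '1';
                break;
            }
            if (cur[pos] != '9') {              /* no carry needed */
                ++cur[pos];
                break;
            }
            cur[pos] = '0';                     /* '9' wraps, carry moves left */
        }
    }
}
```

Nine out of ten steps touch only the last byte, which is what makes this approach attractive in assembly.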

frktons

Quote from: ToutEnMasm on December 09, 2012, 03:52:45 AM

It seems there are not too many solutions if it has to be fast.
put 30h in a register = 0                >>> in memory
cmp 3ah
inc                              = 1               >>> //
..
inc   3Ah                    =  3130h = 10 >>> in memory
inc
cmp ...


This is a good solution, but I don't think it is the fastest around.
By the way if you want to post your test results we can compare
them with other solutions.

dedndave

i didn't test it - and it may have a bug or 2 to work out
but, you get the idea
FillArray PROC

        mov     edx,offset MyArray   ;or whatever you name it - it should be 4-aligned
        mov     eax,302020h          ;EAX = "  0",0
        jmp short FArry4

FArry0: and     eax,030FFFFh
        cmp     ah,39h
        jz      FArry1

        cmp     ah,20h
        jz      FArry2

        add     ah,1
        jmp short FArry4

FArry1: sub     eax,8FFh
        cmp     al,30h
        ja      FArry4

        mov     al,31h
        jmp short FArry4

FArry2: mov     ah,31h
        jmp short FArry4

FArry3: add     eax,10000h

FArry4: mov     [edx],eax
        add     edx,4
        cmp     eax,393030h
        jb      FArry3

        ja      FArry0

        ret

FillArray ENDP

frktons

Better if you do test it and post the results. We are in the lab here,
and, by the way, you should use:

        mov     edx,offset MyArray   
        mov     eax,30202020h          ;EAX = "   0"


instead of:


        mov     edx,offset MyArray   ;or whatever you name it - it should be 4-aligned
        mov     eax,302020h          ;EAX = "  0",0


to meet the requisites of the test.  :P

dedndave

Quote from: frktons on December 09, 2012, 04:26:58 AM
Better if you do test it and post the results. We are in the lab here...

better for who ?
i got you started - you can't get it from there ???   :lol:

Quote from: frktons on December 09, 2012, 04:26:58 AM
...and, by the way, you should use:

        mov     edx,offset MyArray   
        mov     eax,30202020h          ;EAX = "   0"


instead of:


        mov     edx,offset MyArray   ;or whatever you name it - it should be 4-aligned
        mov     eax,302020h          ;EAX = "  0",0


to meet the requisites of the test.  :P

no - we want to start with 2 spaces, an ASCII 0, and a null terminator
so the high byte of EAX should be binary 0

i will give you another idea, though
it might be slightly faster to start at the end of the array with "999",0 and work down
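
The disagreement over the two constants comes down to byte order: x86 stores the lowest byte of EAX first. A small C sketch, not from the thread, shows what each constant puts in memory. The helper name store_le32 is made up and ASCII is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes v the way an x86 `mov [mem],eax` would,
   lowest byte first. */
static void store_le32(uint32_t v, unsigned char dst[4])
{
    assert(dst != NULL);
    for (int i = 0; i < 4; ++i)
        dst[i] = (unsigned char)(v >> (8 * i));
}
```

With this helper, 30202020h lands in memory as three spaces followed by '0', which is the format from the first post, while 302020h lands as two spaces, '0', and a zero byte, which is the null-terminated 3-character variant.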

TouEnMasm


Quote
This is a good solution, but I don't think it is the fastest around.
By the way if you want to post your test results we can compare
them with other solutions.
I am not interested in winning a few µs in a program that is only run once.
It was just a pseudo-code sample.

jj2007

Folks,
If it is supposed to be fast, it must be a solution with dword-packed xmm regs.
Load them with
[   0   1   2   3]
first, then add
[   4   4   4   4]
to get
[   4   5   6   7]
Go ahead, Frank!
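
A rough sketch of the packed idea in C with SSE2 intrinsics follows. It is not from the thread. One caveat it works around: adding 4 directly to the ASCII strings breaks as soon as a digit passes '9', so this version keeps binary counters [0 1 2 3] in the four lanes, adds [4 4 4 4] per step, and converts each lane to a space-padded ASCII dword before storing. The name fill_packed, the multiply-by-0xCCCD division trick, and the scalar fallback for builds without SSE2 are all choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define FILL_USE_SSE2 1
#endif

/* Hypothetical name. Writes "   0" .. " 999" (4000 bytes) into out. */
static void fill_packed(unsigned char *out)
{
    assert(out != NULL);
#ifdef FILL_USE_SSE2
    const __m128i four  = _mm_set1_epi32(4);
    const __m128i ten   = _mm_set1_epi32(10);
    const __m128i magic = _mm_set1_epi32(0xCCCD); /* (n * 0xCCCD) >> 19 == n / 10 here */
    const __m128i space = _mm_set1_epi32(0x20);
    const __m128i bump  = _mm_set1_epi32(0x10);   /* ' ' + 0x10 == '0' */
    __m128i n = _mm_set_epi32(3, 2, 1, 0);        /* lane 0 holds the lowest number */

    for (int i = 0; i < 1000; i += 4) {
        /* All values are below 1000, so 16-bit multiplies are enough. */
        __m128i q10  = _mm_srli_epi32(_mm_mulhi_epu16(n, magic), 3);    /* n / 10 */
        __m128i ones = _mm_sub_epi32(n, _mm_mullo_epi16(q10, ten));     /* n % 10 */
        __m128i hund = _mm_srli_epi32(_mm_mulhi_epu16(q10, magic), 3);  /* n / 100 */
        __m128i tens = _mm_sub_epi32(q10, _mm_mullo_epi16(hund, ten));  /* tens digit */

        /* A digit byte is ' ' + digit, plus 0x10 only where a digit is shown. */
        __m128i m10  = _mm_and_si128(_mm_cmpgt_epi32(n, _mm_set1_epi32(9)), bump);
        __m128i m100 = _mm_and_si128(_mm_cmpgt_epi32(n, _mm_set1_epi32(99)), bump);
        __m128i b3 = _mm_add_epi32(ones, _mm_add_epi32(space, bump));
        __m128i b2 = _mm_add_epi32(tens, _mm_add_epi32(space, m10));
        __m128i b1 = _mm_add_epi32(hund, _mm_add_epi32(space, m100));

        /* Little-endian dword: byte 0 = ' ', then hundreds, tens, ones. */
        __m128i w = _mm_or_si128(space, _mm_slli_epi32(b1, 8));
        w = _mm_or_si128(w, _mm_slli_epi32(b2, 16));
        w = _mm_or_si128(w, _mm_slli_epi32(b3, 24));

        _mm_storeu_si128((__m128i *)(out + 4 * i), w);  /* four entries at once */
        n = _mm_add_epi32(n, four);                     /* [4 4 4 4] step */
    }
#else
    /* Scalar fallback so the sketch still builds without SSE2. */
    for (int i = 0; i < 1000; ++i) {
        unsigned char *p = out + 4 * i;
        p[0] = ' ';
        p[1] = i >= 100 ? (unsigned char)('0' + i / 100) : ' ';
        p[2] = i >= 10 ? (unsigned char)('0' + i / 10 % 10) : ' ';
        p[3] = (unsigned char)('0' + i % 10);
    }
#endif
}
```

Whether this beats a simple byte-wise loop for only 1000 entries would have to be measured; the sketch only shows that the packed add can be combined with the required formatting.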

frktons

Quote from: ToutEnMasm on December 09, 2012, 04:40:56 AM
Quote from: dedndave on December 09, 2012, 04:36:46 AM

i got you started - you can't get it from there ???   :lol:

I'm not able to correct your code, better if you do it yourself.  :eusa_snooty:

Quote
no - we want to start with 2 spaces, an ASCII 0, and a null terminator
so the high byte of EAX should be binary 0

Well that could be another test, now the test is what I stated in the first post.  :P


Quote
I am not interested in winning a few µs in a program that is only run once.
It was just a pseudo-code sample.

My intention is for this to be part of a standard proc/routine I'm thinking about.
If you are not interested in the matter, it doesn't matter; you are welcome all the
same.  :t

Quote from: jj2007 on December 09, 2012, 04:44:52 AM
Folks,
If it is supposed to be fast, it must be a solution with dword-packed xmm regs.
Load them with
[   0   1   2   3]
first, then add
[   4   4   4   4]
to get
[   4   5   6   7]
Go ahead, Frank!

This is what I was thinking about. The sequence that would make it the fastest is
not yet clear in my mind [how many xmm registers to use, which instructions...], but that
is the idea for a fast fill.  :t