Fast algorithm to copy an array inside a bigger array.

Started by popcalent, May 24, 2022, 05:06:48 AM

popcalent

Hi, all.

This is not exactly an ASM question, but hopefully someone will be able to help. I'm programming a little game for MS-DOS using Borland C. When I draw a new screen, I fill a buffer with 20x22 sprites, wait for a vertical retrace, and then copy the buffer to video memory.

The problem is that the function that I use to put the sprites in the buffer is too slow.
The buffer is 30804 positions (4 positions for metadata plus 220x140 pixels), the sprite is 404 positions (4 positions for metadata plus 20x22). Here's the code:


// x, y = coordinates where to put the sprite inside the buffer
void PutSpriteInBuffer (int x, int y, unsigned char sprite[])
        {
        int i, j;
        for (j=0;j<SPRITE_HEIGHT;j++)
                for (i=0;i<SPRITE_WIDTH;i++)
                        buffer[((y+j)*BUFFER_WIDTH)+x+i+4]= sprite[j*SPRITE_WIDTH+i+4];
        }


Example:
Let's say the buffer is an array of 5x5 zeroes, and the sprite is an array of 2x2 ones. I want to put the sprite in position (2,3) of the buffer. This is what I would get:

0,0,0,0,0,
0,0,0,0,0,
0,0,0,0,0,
0,0,1,1,0,
0,0,1,1,0,

This is very slow. I tried to hardcode it so that it only makes SPRITE_HEIGHT iterations (as opposed to SPRITE_HEIGHT*SPRITE_WIDTH), but it is still too slow. Any suggestions on how I could improve this code?
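For reference, one way to get down to SPRITE_HEIGHT iterations is to copy a whole sprite row per iteration with memcpy. This is only a sketch of that idea, not necessarily the hardcoded version mentioned above. The names HEADER, BUFFER_HEIGHT and PutSpriteInBufferRows are made up for illustration, and the sizes assume a 220x140 buffer and a sprite 20 wide by 22 high.

```c
#include <assert.h>
#include <string.h>

/* Assumed sizes, taken from the numbers in the post. */
#define BUFFER_WIDTH  220
#define BUFFER_HEIGHT 140
#define SPRITE_WIDTH  20
#define SPRITE_HEIGHT 22
#define HEADER        4   /* metadata positions at the start of each array */

unsigned char buffer[HEADER + BUFFER_WIDTH * BUFFER_HEIGHT];

/* Copies one whole sprite row per iteration, so the loop body runs
   SPRITE_HEIGHT times instead of SPRITE_HEIGHT*SPRITE_WIDTH times. */
void PutSpriteInBufferRows(int x, int y, const unsigned char sprite[])
{
    int j;

    /* The sprite must fit entirely inside the buffer. */
    assert(x >= 0 && x + SPRITE_WIDTH <= BUFFER_WIDTH);
    assert(y >= 0 && y + SPRITE_HEIGHT <= BUFFER_HEIGHT);

    for (j = 0; j < SPRITE_HEIGHT; j++)
        memcpy(&buffer[(y + j) * BUFFER_WIDTH + x + HEADER],
               &sprite[j * SPRITE_WIDTH + HEADER],
               SPRITE_WIDTH);
}
```

Whether this beats a hand-written loop depends on how the compiler's library implements memcpy, so it is worth timing on the target machine.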

Thanks!

daydreamer

The problem is that you perform two multiplications in your inner loop.
A faster version adds BUFFER_WIDTH and SPRITE_WIDTH to running offsets once per row instead,
because the multiplications compile to slow mul instructions, while the additions compile to fast add instructions.
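A rough sketch of that idea in C, assuming the 220x140 buffer and a sprite 20 wide by 22 high from the first post. The names HEADER, BUFFER_HEIGHT and PutSpriteInBufferAdd are invented for this example. The only multiplication left runs once per call, and the loops just add to two running pointers.

```c
#include <assert.h>

/* Assumed sizes, taken from the numbers in the first post. */
#define BUFFER_WIDTH  220
#define BUFFER_HEIGHT 140
#define SPRITE_WIDTH  20
#define SPRITE_HEIGHT 22
#define HEADER        4   /* metadata positions at the start of each array */

unsigned char buffer[HEADER + BUFFER_WIDTH * BUFFER_HEIGHT];

void PutSpriteInBufferAdd(int x, int y, const unsigned char sprite[])
{
    /* Start-of-row pointers; the multiplication happens once per call. */
    unsigned char *dst = buffer + HEADER + y * BUFFER_WIDTH + x;
    const unsigned char *src = sprite + HEADER;
    int i, j;

    /* The sprite must fit entirely inside the buffer. */
    assert(x >= 0 && x + SPRITE_WIDTH <= BUFFER_WIDTH);
    assert(y >= 0 && y + SPRITE_HEIGHT <= BUFFER_HEIGHT);

    for (j = 0; j < SPRITE_HEIGHT; j++) {
        for (i = 0; i < SPRITE_WIDTH; i++)
            dst[i] = src[i];
        dst += BUFFER_WIDTH;   /* next buffer row: add, no mul */
        src += SPRITE_WIDTH;   /* next sprite row: add, no mul */
    }
}
```

An optimizing compiler may do this transformation on its own, but older DOS compilers often will not, so writing it out by hand can help.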

FORTRANS

Hi,

   As stated, pull the multiplications out of the inner loop.
I am not a C programmer, by choice, so there may be errors.
But something along the lines of the following.


        int i, j;
        int bTEMP, sTEMP;
        for (j=0;j<SPRITE_HEIGHT;j++)
                {
                /* row start offsets, computed once per row */
                bTEMP = ((y+j)*BUFFER_WIDTH)+x+4;
                sTEMP = j*SPRITE_WIDTH+4;
                for (i=0;i<SPRITE_WIDTH;i++)
                        buffer[bTEMP+i]= sprite[sTEMP+i];
                }


Cheers,

Steve N.

popcalent

Thanks! That helped!

I'm still having issues with many other things, though. One step at a time.