include file for memset

Started by gelatine1, September 24, 2015, 01:21:07 AM


gelatine1

Hello, I'm trying to find the right include file to use the memset function (https://msdn.microsoft.com/en-us/library/aa246471(v=vs.60).aspx). I tried both libc and msvcrt, but MASM keeps telling me "error A2006: undefined symbol : memset".
Does anyone know which include file / lib file I should use?

Thanks in advance.

zedd151

Seems like you should probably convert one of these headers to an ".inc" include file.

Use a tool such as h2inc.exe to make the conversion.

Quote
<memory.h> or <string.h>

that is if you have those headers.

edited for clarity

zedd151

Also, when you see that a library requires a <header.h>:

The ".h" implies a C language header file.

You can use the library with masm32, but you must obtain the C header files and convert them using a tool such as the h2inc mentioned above.

Once the header is converted, the ".inc" include file should be ready to use with the library.

I have located h2incX, which is fundamentally the same as h2inc, I believe.

Attached...

TWell


zedd151


jj2007

include \masm32\include\masm32rt.inc ; plain Masm32 for the fans of pure assembler

.data
somestring db "Hello, I am a stupid string", 0

.code
AppName db "Masm32:", 0

start:
invoke crt_memset, offset somestring+7, "x", 13
inkey offset somestring
exit

end start
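
For readers more used to C, the call above is the plain CRT memset with the crt_ prefix that the masm32 include files add. A minimal sketch of the same operation, assuming the same string and the same offset and count as in the assembly (the helper name overwrite_middle is made up for illustration):

```c
#include <string.h>

/* Same data as the assembly example: a writable, NUL-terminated buffer. */
char somestring[] = "Hello, I am a stupid string";

/* Hypothetical helper mirroring
   invoke crt_memset, offset somestring+7, "x", 13
   It overwrites 13 bytes, starting at offset 7, with 'x'. */
const char *overwrite_middle(void)
{
    memset(somestring + 7, 'x', 13);
    return somestring;
}
```

Offset 7 is the "I" and the 13 bytes cover "I am a stupid", so the buffer afterwards reads "Hello, xxxxxxxxxxxxx string".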

rrr314159

Quote from: zedd151
Doh! Now why didn't I think of that?

- The fact that masm32 crt functions start with crt_ is not intuitive! crt_printf (for instance) confused me for a long time; I didn't understand it was just printf with a prefix.

Quote from: jj2007
for the fans of pure assembler

- Who isn't? BTW, do you put "AppName" there to make it easier to find the beginning of your code in a debugger?
I am NaN ;)

ragdog

Hi

memset, like the other msvcrt functions, ships with source code; you can find it in Visual Studio under VC\crt\src\intel.

memset (sets buffers to a specified character) is a simple function that fills a buffer with a char.

Something similar:

cld
lea edi,plainbuff
mov ecx,_len
shr ecx, 2
mov eax, "X"       ; <<<<<<<
rep stosd
mov ecx,_len
and ecx, 3
rep stosb


regards,

jj2007

Quote from: rrr314159 on September 24, 2015, 02:11:30 AM
do you put "AppName" there to make it easier to find beginning of your code in a debugger?

Exactly  :P

Quote from: ragdog on September 24, 2015, 03:10:58 AM
      mov      eax, "X"       ; <<<<<<<

mov eax, "XXXX" yields more convincing results 8)
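
The point of that remark can be sketched in C. rep stosd stores all four bytes of eax, so loading only "X" leaves three zero bytes in every dword written. The functions below are illustrative assumptions, not code from the thread: fill_like_stos imitates the stosd/stosb sequence, and replicate_byte builds the "XXXX" pattern.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Imitates the rep stosd / rep stosb sequence: len/4 dword stores of
   `pattern`, then len%4 byte stores of its low byte (what al would hold). */
void fill_like_stos(unsigned char *buf, size_t len, uint32_t pattern)
{
    size_t i;
    for (i = 0; i + 4 <= len; i += 4)
        memcpy(buf + i, &pattern, 4);       /* one "stosd" */
    for (; i < len; i++)
        buf[i] = (unsigned char)pattern;    /* one "stosb" */
}

/* Copies one byte into all four byte lanes of a dword: 'X' -> 0x58585858. */
uint32_t replicate_byte(unsigned char c)
{
    return (uint32_t)c * 0x01010101u;
}

/* Counts how many bytes of buf are equal to c. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char c)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == c)
            n++;
    return n;
}
```

With a zeroed 10-byte buffer, passing 'X' as the pattern leaves only 4 bytes equal to 'X' (one per dword plus the two tail bytes), while passing replicate_byte('X') fills all 10.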

ragdog

Hehe, right, Jochen

it was a quick example :biggrin:

gelatine1

Quote from: ragdog on September 24, 2015, 03:10:58 AM


cld
lea edi,plainbuff
mov ecx,_len
shr ecx, 2
mov eax, "X"       ; <<<<<<<
rep stosd
mov ecx,_len
and ecx, 3
rep stosb


I never actually use string instructions such as rep stosd, so I was wondering: what is in fact the difference between the following code (which I would normally use) and the code above?


lea edi,plainbuff
mov ecx,_len
mov eax, "X"
@@:
mov byte ptr [edi],al
inc edi
dec ecx
jnz @B

zedd151

The simple answer is the 'rep stosd' would be faster as it operates on a dword (4 bytes) at a time.

The other one only operates on a single byte at a time.


Now if you changed the bottom one to

lea edi,plainbuff
mov ecx,_len            ; assumes _len is a non-zero multiple of 4, otherwise ecx never reaches zero
mov eax, "XXXX"
@@:
mov dword ptr [edi], eax
add edi, 4              ; 4 bytes at a time
sub ecx, 4
jnz @B


it would run faster than the bytewise move as well.
There are better ways to code it though.

I'm still a n00b.

dedndave

REP STOSD is pretty fast, provided you are writing on 4-aligned addresses
it is a strong competitor with any discrete loop code, even SSE code

the reason is, CPU manufacturers have optimized internal microcode to get good performance
you are approaching the point where hardware limitations over-ride software limitations
i.e., no matter how fast the code is, the hardware can only clear memory so fast
and, that's where REP STOSD sits on most 32-bit machines
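
The "4-aligned addresses" condition can be met for any buffer by filling a few single bytes first. A rough C sketch of that idea, not taken from the thread and not a performance claim (the name fill_aligned is an assumption, and a compiler is free to turn the 4-byte memcpy into whatever store it likes):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills len bytes at dst with c, doing the bulk of the work in
   4-byte stores that start on a 4-byte boundary. */
void fill_aligned(unsigned char *dst, size_t len, unsigned char c)
{
    uint32_t pattern = (uint32_t)c * 0x01010101u;   /* c in all four bytes */

    /* Head: single bytes until dst sits on a 4-byte boundary. */
    while (len > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = c;
        len--;
    }
    /* Body: aligned dword stores, the part rep stosd would do. */
    while (len >= 4) {
        memcpy(dst, &pattern, 4);
        dst += 4;
        len -= 4;
    }
    /* Tail: the remaining 0..3 bytes. */
    while (len > 0) {
        *dst++ = c;
        len--;
    }
}
```

Calling fill_aligned(buf + 1, 13, 'X') on a larger buffer changes exactly bytes 1 through 13 and leaves the neighbours untouched, whatever the alignment of buf itself.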

also, per the windows ABI....
the direction flag is normally cleared (up direction) unless the process alters it
and, when calling many API functions, it must be cleared
so - if you want to go up, you don't have to mess with it (not like DOS days)

if you need to go down, STD and CLD are slow instructions on many machines
but - STD, do your thing, then CLD to clear it again before calling any API functions

jj2007

Just for fun :P
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5575    cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6393    cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14681   cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2159    cycles for 100 * SSE2

[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5571    cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6393    cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14675   cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2048    cycles for 100 * SSE2

[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5572    cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6403    cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14673   cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2172    cycles for 100 * SSE2

[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5652    cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6391    cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14676   cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2149    cycles for 100 * SSE2