Hello, I'm trying to find the right include file to use the memset function (https://msdn.microsoft.com/en-us/library/aa246471(v=vs.60).aspx). I tried both libc and msvcrt, but MASM keeps telling me "error A2006: undefined symbol : memset".
Does anyone know which include file / lib file I should use?
Thanks in advance.
It seems like you should convert one of these headers to an ".inc" include file,
using a tool such as h2inc.exe to make the conversion:
Quote
<memory.h> or <string.h>
That is, if you have those headers.
Also, when you see that a library requires a <header.h>,
the ".h" implies a C language header file.
You can use the library with masm32, but you must obtain the C header files and convert them
using a tool such as the h2inc mentioned above.
Once the header is converted, the ".inc" include file should be ready to use with
the library.
I have located h2incX, which is fundamentally the same as h2inc, I believe.
Attached...
msvcrt.inc
crt_memset
msvcrt.lib
Quote from: TWell on September 24, 2015, 01:52:43 AM
msvcrt.inc
crt_memset
msvcrt.lib
Doh! Now why didn't I think of that?
include \masm32\include\masm32rt.inc ; plain Masm32 for the fans of pure assembler
.data
somestring db "Hello, I am a stupid string", 0
.code
AppName db "Masm32:", 0
start:
invoke crt_memset, offset somestring+7, "x", 13 ; overwrite 13 bytes, starting at offset 7, with "x"
inkey offset somestring
exit
end start
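For comparison, here is a minimal C sketch of what the invoke line above asks the C runtime to do. crt_memset is the masm32 name for the msvcrt memset, so the arguments map one to one (destination, fill value, byte count). The function name fill_demo is hypothetical and only exists for this illustration.

```c
#include <string.h>

/* Same buffer as the MASM example. */
static char somestring[] = "Hello, I am a stupid string";

/* Hypothetical helper: performs the same call the MASM example makes
   through crt_memset, i.e. memset(dest, value, count). */
char *fill_demo(void)
{
    /* Overwrite 13 bytes, starting at offset 7, with 'x'. */
    memset(somestring + 7, 'x', 13);
    return somestring;
}
```

After the call, the 13 characters of "I am a stupid" are replaced by 'x' and the rest of the string is left untouched.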
Quote from: zedd151
Doh! Now why didn't I think of that?
- the fact that masm32 crt functions start with crt_ is not intuitive! crt_printf (for instance) confused me for a long time; I didn't understand it was just printf with a prefix
Quote from: jj2007
for the fans of pure assembler
- who isn't? BTW, do you put "AppName" there to make it easier to find the beginning of your code in a debugger?
Hi
The source for memset and the other msvcrt functions is available; you can find it in Visual Studio under VC\crt\src\intel.
memset (sets buffers to a specified character) is a simple function that fills a buffer with a char,
similar to:
cld
lea edi,plainbuff
mov ecx,_len
shr ecx, 2
mov eax, "X" ; <<<<<<<
rep stosd
mov ecx,_len
and ecx, 3
rep stosb
regards,
Quote from: rrr314159 on September 24, 2015, 02:11:30 AM
do you put "AppName" there to make it easier to find beginning of your code in a debugger?
Exactly :P
Quote from: ragdog on September 24, 2015, 03:10:58 AM
mov eax, "X" ; <<<<<<<
mov eax, "XXXX" yields more convincing results 8)
Hehe, right, Jochen
it was a quick example :biggrin:
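To spell out the point about "X" versus "XXXX": with mov eax, "X" the register holds 00000058h, so each rep stosd store writes one "X" followed by three zero bytes. The usual fix is to copy the fill byte into all four bytes of the dword first. A minimal C sketch of that step follows; the name replicate_byte is hypothetical and not taken from any library.

```c
#include <stdint.h>

/* Hypothetical helper: copy one byte into all four bytes of a dword.
   Multiplying by 01010101h places the byte at bit positions 0, 8, 16 and 24. */
uint32_t replicate_byte(unsigned char c)
{
    return (uint32_t)c * 0x01010101u;
}
```

For 'X' (58h) this gives 58585858h, which is the same value MASM produces for "XXXX".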
Thanks a lot guys
Quote from: ragdog on September 24, 2015, 03:10:58 AM
cld
lea edi,plainbuff
mov ecx,_len
shr ecx, 2
mov eax, "X" ; <<<<<<<
rep stosd
mov ecx,_len
and ecx, 3
rep stosb
I never actually use string instructions such as rep stosd, so I was wondering what the difference is between the following code (which I would normally use) and the code above?
lea edi,plainbuff
mov ecx,_len
mov eax, "X"
@@:
mov byte ptr [edi],al
inc edi
dec ecx
jnz @B
The simple answer is that 'rep stosd' would be faster, as it operates on a dword (4 bytes) at a time.
The other one only operates on a single byte at a time.
Now if you changed the bottom one to
lea edi,plainbuff
mov ecx,_len
mov eax, "XXXX"
@@:
mov dword ptr [edi], eax
add edi, 4 ; 4 bytes at a time
sub ecx, 4 ; assumes _len is a nonzero multiple of 4, otherwise ecx never hits zero
jnz @B
it would run faster than the bytewise move as well.
There are better ways to code it though.
I'm still a n00b.
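A rough C sketch of the dword-at-a-time idea discussed above, for lengths that are not a multiple of 4: fill whole dwords first (the shr ecx, 2 part), then finish the last 0 to 3 bytes one at a time (the and ecx, 3 part). The name fill_dwords is hypothetical, and this is only an illustration of the technique, not how the CRT memset is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill len bytes at dst with c, four bytes at a time. */
void fill_dwords(unsigned char *dst, unsigned char c, size_t len)
{
    uint32_t pattern = (uint32_t)c * 0x01010101u; /* c in all four bytes */
    size_t dwords = len / 4;                      /* like shr ecx, 2 */
    size_t tail = len % 4;                        /* like and ecx, 3 */

    while (dwords--) {
        memcpy(dst, &pattern, 4); /* 4-byte store that is safe for unaligned dst */
        dst += 4;
    }
    while (tail--) {
        *dst++ = c;               /* remaining 0..3 bytes */
    }
}
```

Because all four bytes of the pattern are identical, byte order does not matter here, and a length of 0 simply skips both loops.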
REP STOSD is pretty fast, provided you are writing on 4-aligned addresses
it is a strong competitor with any discrete loop code, even SSE code
the reason is, CPU manufacturers have optimized internal microcode to get good performance
you are approaching the point where hardware limitations over-ride software limitations
i.e., no matter how fast the code is, the hardware can only clear memory so fast
and, that's where REP STOSD sits on most 32-bit machines
also, per the windows ABI....
the direction flag is normally cleared (up direction) unless the process alters it
and, when calling many API functions, it must be cleared
so - if you want to go up, you don't have to mess with it (not like DOS days)
if you need to go down, STD and CLD are slow instructions on many machines
but - STD, do your thing, then CLD to clear it again before calling any API functions
Just for fun :P
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5575 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6393 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14681 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2159 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5571 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6393 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14675 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2048 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5572 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6403 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14673 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2172 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
5652 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
6391 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
14676 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
2149 cycles for 100 * SSE2
AMD E-450 APU with Radeon(tm) HD Graphics (SSE4)
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
9943 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
10118 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
22639 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
6668 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
9900 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
10122 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
22582 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
6752 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
9986 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
10279 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
22700 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
6754 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
9967 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
10196 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
22758 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
6761 cycles for 100 * SSE2
--- ok ---
that looks like the test is designed to favor SSE
let's just fill it with 0's :t
Genuine Intel(R) CPU T2060 @ 1.60GHz (SSE3)
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
8152 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
8299 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
21351 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
5799 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
7192 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
8294 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
21255 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
5803 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
7208 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
8313 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
21292 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
5882 cycles for 100 * SSE2
[rsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrsrrr#]
7304 cycles for 100 * rep stosd
[ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc#]
8315 cycles for 100 * crt memset
[GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG#]
21246 cycles for 100 * Gelatine
[sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sse2sss#]
5804 cycles for 100 * SSE2
--- ok ---
Quote from: dedndave on September 25, 2015, 04:37:40 AM
that looks like the test is designed to favor SSE
let's just fill it with 0's :t
:eusa_dance:
See Fast MemSet and Instr() algos (http://masm32.com/board/index.php?topic=94.msg49848#msg49848).