Author Topic: RtlCopyMemory  (Read 14579 times)

Zen

  • Member
  • ****
  • Posts: 962
  • slightly red-shifted
RtlCopyMemory
« on: July 04, 2014, 09:19:34 AM »
 :biggrin:
« Last Edit: July 20, 2014, 03:23:04 AM by Zen »
Zen

nidud

  • Member
  • *****
  • Posts: 1849
    • https://github.com/nidud/asmc
Re: RtlCopyMemory
« Reply #1 on: July 04, 2014, 09:30:00 AM »
The most reliable way to copy bytes from one allocated block to another:
Code:
mov edi,des
mov esi,src
mov ecx,count
rep movsb

jj2007

  • Member
  • *****
  • Posts: 10087
  • Assembler is fun ;-)
    • MasmBasic
Re: RtlCopyMemory
« Reply #2 on: July 04, 2014, 09:36:09 AM »
Try
   RtlCopyMemory equ crt_memcpy
   invoke RtlCopyMemory, addr dest, addr src, sizeof dest

MemCopy is also fine, so is rep movsb:
   push esi
   push edi
   mov esi, offset src
   mov edi, offset dest
   mov ecx, sizeof dest
   rep movsb
   pop edi
   pop esi

If speed matters, use rep movsd.
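
For anyone reading along in C, the rep movsb sequence above corresponds roughly to the byte-at-a-time loop below. This is only a sketch: copy_bytes is a name made up for illustration, and like rep movsb with the direction flag clear it always copies front to back, so it should not be used when the destination starts inside the source block.

```c
#include <stddef.h>

/* copy_bytes is a hypothetical name, not a CRT or MASM32 library routine.
   It copies count bytes front to back, like rep movsb with DF = 0. */
void *copy_bytes(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;          /* plays the role of edi */
    const unsigned char *s = src;     /* plays the role of esi */

    while (count--)                   /* ecx counts down to zero */
        *d++ = *s++;                  /* movsb: copy one byte, advance both pointers */
    return dest;
}
```

A call such as copy_bytes(dest, src, sizeof dest) matches the register setup in the assembly version.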

MichaelW

  • Global Moderator
  • Member
  • *****
  • Posts: 1209
Re: RtlCopyMemory
« Reply #3 on: July 04, 2014, 04:49:10 PM »
I was hoping to test the speed of RtlCopyMemory, but in my tests under Windows XP and Windows 7, ntoskrnl.exe did not export RtlCopyMemory. It did export RtlMoveMemory (along with a long list of other functions).
Code:
;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
.data
    hModule         HMODULE 0
    pRtlCopyMemory  dd      0
    pRtlMoveMemory  dd      0
    buff1           db      "my other brother darryl",0
    buff2           db      "                       ",0
.code
;==============================================================================
start:
;==============================================================================
    invoke LoadLibrary, chr$("ntoskrnl.exe")
    mov   hModule, eax
    invoke GetProcAddress, hModule, chr$("RtlCopyMemory")
    mov   pRtlCopyMemory, eax
    .IF eax == 0
        printf("%s\n",LastError$())
        jmp   @F
    .ENDIF
    push  SIZEOF buff1
    push  OFFSET buff1
    push  OFFSET buff2
    call  pRtlCopyMemory
  @@:
    invoke GetProcAddress, hModule, chr$("RtlMoveMemory")
    mov   pRtlMoveMemory, eax
    .IF eax == 0
        printf("%s\n",LastError$())
        jmp   @F
    .ENDIF
    push  SIZEOF buff1
    push  OFFSET buff1
    push  OFFSET buff2
    call  pRtlMoveMemory
    printf("%s\n\n", ADDR buff2)
  @@:
    inkey
    exit
;==============================================================================
end start
Well Microsoft, here’s another nice mess you’ve gotten us into.

jj2007

Re: RtlCopyMemory
« Reply #4 on: July 04, 2014, 05:51:18 PM »

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 7027
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: RtlCopyMemory
« Reply #5 on: July 04, 2014, 07:38:07 PM »
It's a bit to do with the hardware. Long ago a DWORD copy was faster, but at least some of the later processors had special-case circuitry that made REP MOVSB as fast as REP MOVSD. We did tests years ago that showed that under about 500 bytes incremented pointers were faster, but over that the special-case circuitry kicked in and was faster. You MAY get faster with an SSE copy, but from the tests I saw years ago this was barely the case.

If reliability is the main issue, REP MOVSB does the job fine. It's small and it can easily be inlined if it matters.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

MichaelW

Re: RtlCopyMemory
« Reply #6 on: July 04, 2014, 08:23:27 PM »
Quote
You can test it easily...

I'm not convinced that it's so simple. I need to examine ntoskrnl.lib.

nidud

Re: RtlCopyMemory
« Reply #7 on: July 04, 2014, 08:43:02 PM »
Rtl - Run Time Library

I think the C compiler inlines this function, so it may just be a macro.

Code:
#if defined(_M_AXP64) || defined(_M_IA64)

NTSYSAPI
VOID
NTAPI
RtlCopyMemory (
   VOID UNALIGNED *Destination,
   CONST VOID UNALIGNED *Source,
   SIZE_T Length
   );
...
#else

#define RtlMoveMemory(Destination,Source,Length) memmove((Destination),(Source),(Length))
#define RtlCopyMemory(Destination,Source,Length) memcpy((Destination),(Source),(Length))

nidud

Re: RtlCopyMemory
« Reply #8 on: July 04, 2014, 11:21:26 PM »
You may also assume crt_memcpy and crt_memmove are the same.

These functions (or this function) are fast on aligned data and thus difficult to beat with a conventional algo. However, they are not that fast on unaligned data. I wrote one that is a bit faster on unaligned data (at least on my CPU) but more or less the same speed on aligned data.
Code:
memcpy proc uses ebx esi edi s1:ptr byte, s2:ptr byte, count:dword ; ebx is clobbered in the unaligned loop, so preserve it
mov edi,s1
mov esi,s2
mov ecx,count
test edi,3
jnz utail
mov edx,ecx
and edx,3
shr ecx,2
rep movsd
jmp tails[edx*4]
tails dd toend,t1,t2,t3
     @@:
mov eax,[esi]
mov edx,[esi+4]
mov ebx,[esi+8]
mov [edi],eax
mov [edi+4],edx
mov [edi+8],ebx
add esi,12
add edi,12
sub ecx,12
   utail:
cmp ecx,12
jnb @B
     @@:
rep movsb
jmp toend
     t2:
mov ax,[esi]
mov [edi],ax
jmp toend
     t3:
mov ax,[esi]
mov [edi],ax
mov al,[esi+2]
mov [edi+2],al
jmp toend
     t1:
mov al,[esi]
mov [edi],al
  toend:
mov eax,s1
ret
memcpy endp
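
The aligned path of the routine above splits the count into whole DWORDs (shr ecx,2) plus a 0..3 byte tail (and edx,3). A rough C sketch of that idea follows. The name copy_words is hypothetical, and the 4-byte loads and stores go through memcpy only because that is the portable way to express an untyped 4-byte access in C; it mirrors the idea, not the exact instruction sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* copy_words is a hypothetical name used for illustration only. */
void *copy_words(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t words = count / 4;     /* shr ecx,2 : number of whole DWORDs */
    size_t tail  = count & 3;     /* and edx,3 : leftover bytes, 0..3   */

    while (words--) {             /* rep movsd */
        uint32_t w;
        memcpy(&w, s, 4);         /* 4-byte load  */
        memcpy(d, &w, 4);         /* 4-byte store */
        s += 4;
        d += 4;
    }
    while (tail--)                /* the t1/t2/t3 cases, written as a loop */
        *d++ = *s++;
    return dest;
}
```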

Quote
AMD Athlon(tm) II X2 245 Processor (SSE3)
--------------------------------------------------
16535      cycles - a   1..256  (  0) crt_memcpy
16470      cycles - a   1..256  (  0) crt_memmove
14555      cycles - a   1..256  (124) memcpy

28018      cycles - u   1..256  (  0) crt_memcpy
28032      cycles - u   1..256  (  0) crt_memmove
21436      cycles - u   1..256  (124) memcpy

1393642    cycles - a 400..4000 (  0) crt_memcpy
1396192    cycles - a 400..4000 (  0) crt_memmove
1383366    cycles - a 400..4000 (124) memcpy

4464020    cycles - u 400..4000 (  0) crt_memcpy
4485279    cycles - u 400..4000 (  0) crt_memmove
3433094    cycles - u 400..4000 (124) memcpy

ragdog

  • Member
  • ****
  • Posts: 610
Re: RtlCopyMemory
« Reply #9 on: July 04, 2014, 11:30:49 PM »
Hi

Quote
Microsoft plans to formally banish the popular programming function that's been responsible for an untold number of security vulnerabilities over the years, not just in Windows but in countless other applications based on the C language. Effective later this year, Microsoft will add memcpy(), CopyMemory(), and RtlCopyMemory() to its list of function calls banned under its secure development lifecycle.

http://msdn.microsoft.com/en-us/library/bb288454.aspx
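
The complaint behind the ban is that memcpy never sees the size of the destination buffer, so it cannot refuse a copy that is too long. The replacement Microsoft points to, memcpy_s, takes the destination size as an extra argument. Below is a minimal sketch of that idea in plain C. checked_copy is a hypothetical helper written for this post; it is not memcpy_s and does not reproduce its exact error handling.

```c
#include <stddef.h>
#include <string.h>

/* checked_copy is a hypothetical helper, not Microsoft's memcpy_s.
   Returns 0 on success and -1 if the copy was refused. */
int checked_copy(void *dest, size_t dest_size, const void *src, size_t count)
{
    if (dest == NULL || src == NULL)
        return -1;                /* nothing sensible to copy */
    if (count > dest_size)
        return -1;                /* would overrun the destination: refuse */
    memcpy(dest, src, count);
    return 0;
}
```

The caller has to pass the real capacity, for example checked_copy(buf, sizeof buf, src, n), and has to check the return value.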

MichaelW

Re: RtlCopyMemory
« Reply #10 on: July 05, 2014, 12:09:50 AM »
I finally managed to get a Windows 7 SDK installed, and in winnt.h RtlCopyMemory is defined as memcpy, and inlined only if _DBG_MEMCPY_INLINE_ is defined. The inline version checks for the source and destination overlapping, so it’s apparently inlined only for convenience.
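
For illustration, an overlap check of that general kind could look like the sketch below. This is not the code from winnt.h; ranges_overlap and debug_copy are hypothetical names, and the check simply asserts in debug builds before handing the work to memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if two len-byte blocks share any byte. */
int ranges_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t distance = pa > pb ? pa - pb : pb - pa;

    /* The blocks overlap when their start addresses are closer than len. */
    return distance < len;
}

/* Hypothetical debug wrapper: traps overlapping arguments, then copies. */
void *debug_copy(void *dest, const void *src, size_t len)
{
    assert(!ranges_overlap(dest, src, len));
    return memcpy(dest, src, len);
}
```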

And there is no ntoskrnl.lib.

Tedd

  • Member
  • ***
  • Posts: 377
  • Procrastinor Extraordinaire
Re: RtlCopyMemory
« Reply #11 on: July 05, 2014, 01:05:50 AM »
RtlCopyMemory/memcpy copies memory from A to B, under the assumption they do not overlap.
RtlMoveMemory/memmove copies memory from A to B, under the assumption they do overlap.

The latter will still work if they are not overlapping, but takes extra unnecessary steps in that case.
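
As far as the C standard goes, memcpy is undefined for overlapping buffers, while memmove has to give the right result either way. The extra step is essentially choosing a copy direction. A minimal sketch, assuming a plain byte loop (move_bytes is a hypothetical name, not the CRT implementation):

```c
#include <stddef.h>
#include <stdint.h>

/* move_bytes is a hypothetical name showing how a memmove-style routine
   can pick a direction so that overlapping blocks are copied correctly. */
void *move_bytes(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)d;
    uintptr_t sa = (uintptr_t)s;

    if (da < sa || da - sa >= count) {
        /* dest is below src, or far enough above it: copy front to back */
        while (count--)
            *d++ = *s++;
    } else {
        /* dest starts inside the source block: copy back to front */
        d += count;
        s += count;
        while (count--)
            *--d = *--s;
    }
    return dest;
}
```

With buf holding "abcdefgh", move_bytes(buf + 2, buf, 6) leaves "ababcdef" in buf. A forward-only byte loop would have overwritten 'c' and 'd' before reading them.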
Potato2

jj2007

Re: RtlCopyMemory
« Reply #12 on: July 05, 2014, 01:10:19 AM »
Quote
Microsoft plans to formally banish ... memcpy(), CopyMemory(), and RtlCopyMemory()

And Intel & AMD will ban rep movsb :lol:

Quote
And there is no ntoskrnl.lib.

But there is ..\system32\ntoskrnl.exe, and it's crammed full of interesting exports. No RtlCopyMemory, however :(

Just for fun:

include \masm32\MasmBasic\MasmBasic.inc      ; download
NtSTRING STRUCT
 NtLength          USHORT ?
 NtMaxLength       USHORT ?
 NtBuffer          PCHAR ?
NtSTRING ENDS

.data
src      NtSTRING <sizeof xsrc, sizeof xsrc, xsrc>
dest     NtSTRING <0, sizeof xdest, xdest>
xsrc     db "This is a string", 0
xdest    db 100 dup(?)

  Init                  ; ### RtlCopyString example ###
  Dll "ntoskrnl.exe"
  Declare RtlCopyString, 2
  void RtlCopyString(addr dest, addr src)
  Print Str$("%i bytes copied, result: [", dest.NtLength), offset xdest, "]"
  Exit
end start
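
For comparison, here is a user-mode C sketch of what a counted-string copy of this kind does: it copies the source length or the destination capacity, whichever is smaller, and records the result in the destination's length field. CountedString and copy_counted_string are hypothetical names that mirror the NtSTRING layout above; this is not the ntoskrnl routine itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the NtSTRING layout used above. */
typedef struct {
    unsigned short Length;          /* bytes currently in use      */
    unsigned short MaximumLength;   /* capacity of Buffer in bytes */
    char *Buffer;
} CountedString;

/* copy_counted_string is an illustrative name, not the ntoskrnl export. */
void copy_counted_string(CountedString *dest, const CountedString *src)
{
    unsigned short n;

    if (src == NULL) {              /* no source: destination becomes empty */
        dest->Length = 0;
        return;
    }
    n = src->Length;
    if (n > dest->MaximumLength)    /* never write past the destination */
        n = dest->MaximumLength;
    memcpy(dest->Buffer, src->Buffer, n);
    dest->Length = n;
}
```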

guga

  • Member
  • *****
  • Posts: 1074
  • Assembly is a state of art.
    • RosAsm
Re: RtlCopyMemory
« Reply #13 on: July 05, 2014, 01:25:51 AM »
RtlCopyMemory and some others are only macros, as defined in ntrtl.h and winnt.h:

Code:
#else
#define RtlEqualMemory(Destination,Source,Length) (!memcmp((Destination),(Source),(Length)))
#endif

#define RtlMoveMemory(Destination,Source,Length) memmove((Destination),(Source),(Length))
#define RtlCopyMemory(Destination,Source,Length) memcpy((Destination),(Source),(Length))
#define RtlFillMemory(Destination,Length,Fill) memset((Destination),(Fill),(Length))
#define RtlZeroMemory(Destination,Length) memset((Destination),0,(Length))
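
Because these are plain macros, a call compiles straight into the CRT function and there is nothing left over for a DLL to export. One detail worth noticing is the argument order: RtlFillMemory takes the length before the fill byte, while memset takes the fill byte first. A small sketch using the same definitions (guarded with #ifndef so it also compiles where the Windows headers are present; fill_then_copy is a hypothetical demo function):

```c
#include <stddef.h>
#include <string.h>

/* Same definitions as in the header excerpt, guarded against redefinition. */
#ifndef RtlCopyMemory
#define RtlCopyMemory(Destination,Source,Length) memcpy((Destination),(Source),(Length))
#endif
#ifndef RtlFillMemory
#define RtlFillMemory(Destination,Length,Fill) memset((Destination),(Fill),(Length))
#endif

/* fill_then_copy is a hypothetical demo function.
   The caller must make sure count <= dest_size. */
void fill_then_copy(char *dest, size_t dest_size, const char *src, size_t count)
{
    RtlFillMemory(dest, dest_size, '-');   /* length first, then fill byte */
    RtlCopyMemory(dest, src, count);       /* expands to memcpy(dest, src, count) */
}
```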
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

nidud

Re: RtlCopyMemory
« Reply #14 on: July 05, 2014, 02:09:57 AM »
Quote
RtlCopyMemory/memcpy copies memory from A to B, under the assumption they do not overlap.
RtlMoveMemory/memmove copies memory from A to B, under the assumption they do overlap.

The latter will still work if they are not overlapping, but takes extra unnecessary steps in that case.

If you look at the source code (.\crt\string\I386\MEMCPY.ASM) you will see it's the same code. So both of them check for overlap.
Code:
ifdef MEM_MOVE
_MEM_   equ <memmove>
else  ; MEM_MOVE
_MEM_   equ <memcpy>
endif  ; MEM_MOVE

% public  _MEM_

A bit smaller and faster version  :biggrin:
Code:
memcpy proc uses esi edi s1:ptr byte, s2:ptr byte, count:dword
mov edi,s1
mov esi,s2
mov ecx,count
test edi,3
jnz utail
mov edx,ecx
shr ecx,2
and edx,3
rep movsd
jz toend
dec edx
jz t1
mov ax,[esi]
mov [edi],ax
     t1:
mov al,[esi+edx]
mov [edi+edx],al
  toend:
mov eax,s1
ret
     @@:
mov eax,[esi]
mov edx,[esi+4]
mov [edi],eax
mov [edi+4],edx
mov eax,[esi+8]
mov edx,[esi+12]
mov [edi+8],eax
mov [edi+12],edx
lea esi,[esi+16]
lea edi,[edi+16]
lea ecx,[ecx-16]
   utail:
cmp ecx,16
jnb @B
     @@:
rep movsb
jmp toend
memcpy endp