Memory allocation test piece.

Started by hutch--, June 24, 2016, 08:43:47 PM


hutch--

The great seduction with 64 bit is the amount of memory you can allocate. This is a test piece to ensure that the allocation worked and is writable. I have dropped it down to 8 gig but tested it with 16, 32 and 48 gig and all seems to work OK. Note that the scanit algo is slow; its purpose is to test that the memory is writable, not to go fast.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    OPTION DOTNAME
   
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc
    include \masm64\include\msvcrt.inc

    includelib \masm64\lib\user32.lib   
    includelib \masm64\lib\kernel32.lib
    includelib \masm64\lib\msvcrt.lib

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  .const
    gbc equ <8>     ;; gigabyte count

  .data?
    msize db 32 dup (?)

  .data
    ptrm dq msize
    pttl db "Result",0

  .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL pMem  :QWORD
    LOCAL bcnt  :QWORD

    sub rsp, 48
    ; push rbp

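    invoke GlobalAlloc,GMEM_FIXED,1024*1024*1024*gbc
    mov pMem, rax
    mov bcnt, 0                 ; the result shown is 0 if the allocation fails
    test rax, rax               ; GlobalAlloc returns NULL on failure
    jz show

  ; ------------------------------
  ; test if all memory is writable
  ; ------------------------------
    mov rax, pMem
    mov rcx, 1024*1024*1024*gbc
    call scanit
    mov bcnt, rax               ; number of bytes written

  show:
    invoke _ui64toa,bcnt,ptrm,10

    invoke MessageBox,0,ptrm,ADDR pttl,0

    cmp pMem, 0
    je quit                     ; nothing to free
    invoke GlobalFree,pMem

  quit:
    invoke ExitProcess,0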

    ; pop rbp

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

scanit proc

    ; rax = address
    ; rcx = length

    xor r11, r11

  lbl:
    mov BYTE PTR [rax], "x"
    add rax, 1
    add r11, 1
    sub rcx, 1
    jnz lbl

    mov rax, r11

    retn

scanit endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment #

    https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx

    Volatile
    rax rcx rdx r8 r9 r10 r11

    Non Volatile
    r12 r13 r14 r15 rdi rsi rbx rbp rsp

    Volatile
    xmm0 ymm0
    xmm1 ymm1
    xmm2 ymm2
    xmm3 ymm3
    xmm4 ymm4
    xmm5 ymm5

    Nonvolatile (XMM), Volatile (upper half of YMM)
    xmm6-15
    ymm6-15

#

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  end

sinsi

Try 1TB :biggrin:

Seriously, I wonder how it would go, or if it would thrash the disk (or fail the allocation).
Maybe try alloc/dealloc in 8GB increments until it fails?
🍺🍺🍺
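
The alloc/dealloc idea can be sketched as a replacement "main" for the test piece above. This is a rough, untested sketch, not a definitive implementation: it assumes the same includes and the ptrm and pttl data items from the original post, the locals cursz and lastok are names made up here, and it copies the original stack setup as-is. It only checks whether each GlobalAlloc call succeeds and never writes to the memory, so it says nothing about disk thrashing.

```asm
main proc

    LOCAL pMem   :QWORD
    LOCAL cursz  :QWORD         ; size of the current attempt, in bytes (hypothetical name)
    LOCAL lastok :QWORD         ; largest size that succeeded (hypothetical name)

    sub rsp, 48                 ; same stack setup as the original test piece

    mov lastok, 0
    mov rax, 1024*1024*1024*8   ; start at 8 GB
    mov cursz, rax

  nxt:
    invoke GlobalAlloc,GMEM_FIXED,cursz
    test rax, rax               ; GlobalAlloc returns NULL on failure
    jz done

    mov pMem, rax
    invoke GlobalFree,pMem      ; release before trying the next size

    mov rax, cursz
    mov lastok, rax             ; remember the size that worked
    mov rcx, 1024*1024*1024*8
    add rax, rcx                ; next attempt is 8 GB larger
    mov cursz, rax
    jmp nxt

  done:
    invoke _ui64toa,lastok,ptrm,10      ; largest successful size in bytes, 0 if none

    invoke MessageBox,0,ptrm,ADDR pttl,0

    invoke ExitProcess,0

main endp
```

The message box shows the largest size in bytes that could be allocated, in 8 GB steps.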

hutch--

The box has 64 gig so I have tested 16, 32 and 48 gig and it works OK. It fails on 128 gig as the box does not have that much memory.

hutch--

Not that it matters in this test piece but this is a bit faster for the scanit algo.


scanit proc

    ; rax = address
    ; rcx = length

    mov r11, rax

  lbl:
    mov BYTE PTR [rax], "x"
    add rax, 1
    sub rcx, 1
    jnz lbl

    sub rax, r11

    retn

scanit endp

hutch--

Playing with the "scanit" algo, I tried what should have been a much faster version that writes 8 bytes at a time rather than 1, but the gain is modest: it ranges from about 30% faster on smaller scans to about twice as fast on larger scans.


scanit proc

    ; rax = address
    ; rcx = length

    mov r11, rax

    shr rcx, 3          ; QWORD count, assumes the length is a non-zero multiple of 8

    mov rdx, ppad       ; address of "xxxxxxxx"

  lbl:
    mov QWORD PTR [rax], rdx
    add rax, 8
    sub rcx, 1
    jnz lbl

    sub rax, r11

    retn

;     mov r11, rax
;
;   lbl:
;     mov BYTE PTR [rax], "x"
;     add rax, 1
;     sub rcx, 1
;     jnz lbl
;
;     sub rax, r11
;
;     retn


scanit endp
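
The 8 byte version works for whole-gigabyte sizes, but it relies on the length being a non-zero multiple of 8. A minimal sketch of a variant that also handles other lengths is below. It is untested, and two things in it are assumptions rather than part of the original: the pattern is loaded as an immediate instead of through ppad, and the labels tail, tlp and fin are made up here.

```asm
scanit proc

    ; rax = address
    ; rcx = length in bytes

    mov r11, rax                ; keep the start address
    mov r10, rcx                ; keep the full length for the tail
    mov rdx, 7878787878787878h  ; "xxxxxxxx" as an immediate (assumption, replaces ppad)

    shr rcx, 3                  ; rcx = number of whole QWORDs
    jz tail                     ; fewer than 8 bytes, skip the QWORD loop

  lbl:
    mov QWORD PTR [rax], rdx
    add rax, 8
    sub rcx, 1
    jnz lbl

  tail:
    and r10, 7                  ; r10 = leftover bytes, 0 to 7
    jz fin

  tlp:
    mov BYTE PTR [rax], "x"
    add rax, 1
    sub r10, 1
    jnz tlp

  fin:
    sub rax, r11                ; rax = number of bytes written

    retn

scanit endp
```

It uses only volatile registers (rax, rcx, rdx, r10, r11), so the caller needs no changes.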