Fast memory allocation

Started by nidud, December 21, 2015, 11:35:32 PM

nidud

From the Wayback Machine: this thread as it stood before nidud's mass deletion of all of his posts
nidud's deleted posts from this thread
Unfortunately, the zip files were not archived by the Wayback Machine, so the attachment links will not work.

guga

my results


Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (SSE4.2)
----------------------------------------------
512K -- 200000h
185290    cycles -    499K - 0: HeapAlloc
23885     cycles -    499K - 7: malloc - fixed
23607     cycles -    499K - 8: malloc - auto
17786     cycles -    499K - 9: malloc - dynamic
2M -- 200000h
1058865   cycles -   1996K - 0: HeapAlloc
24210     cycles -   1996K - 7: malloc - fixed
39535     cycles -   1996K - 8: malloc - auto
26909     cycles -   1996K - 9: malloc - dynamic
3M -- 200000h
1556853   cycles -   2904K - 0: HeapAlloc
1088794   cycles -   2904K - 7: malloc - fixed
50941     cycles -   2904K - 8: malloc - auto
694567    cycles -   2904K - 9: malloc - dynamic
4M -- 200000h
1624188   cycles -   3993K - 0: HeapAlloc
1235034   cycles -   3993K - 7: malloc - fixed
25434     cycles -   3993K - 8: malloc - auto
728995    cycles -   3993K - 9: malloc - dynamic
8M -- 200000h
2322818   cycles -   7986K - 0: HeapAlloc
1964005   cycles -   7986K - 7: malloc - fixed
71131     cycles -   7986K - 8: malloc - auto
1154075   cycles -   7986K - 9: malloc - dynamic

result:
   210648 cycles - code(544) 8: malloc - auto
  2622332 cycles - code(496) 9: malloc - dynamic
  4335928 cycles - code(325) 7: malloc - fixed
  6748014 cycles - code( 53) 0: HeapAlloc
hit any key to continue...



Question: can this new malloc (auto) be a replacement for VirtualAlloc?
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Here are some more, but honestly, I have no idea how to interpret your numbers ::)

Can you give a verbose version of this?
   176684 cycles - code(544) 8: malloc - auto
  2541613 cycles - code(496) 9: malloc - dynamic



Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (AVX)
----------------------------------------------
512K -- 200000h
173231    cycles -    499K - 0: HeapAlloc
15217     cycles -    499K - 7: malloc - fixed
14822     cycles -    499K - 8: malloc - auto
20625     cycles -    499K - 9: malloc - dynamic
2M -- 200000h
406836    cycles -   1996K - 0: HeapAlloc
16913     cycles -   1996K - 7: malloc - fixed
19218     cycles -   1996K - 8: malloc - auto
21011     cycles -   1996K - 9: malloc - dynamic
3M -- 200000h
729334    cycles -   2904K - 0: HeapAlloc
985867    cycles -   2904K - 7: malloc - fixed
50724     cycles -   2904K - 8: malloc - auto
735871    cycles -   2904K - 9: malloc - dynamic
4M -- 200000h
587725    cycles -   3993K - 0: HeapAlloc
1070638   cycles -   3993K - 7: malloc - fixed
29647     cycles -   3993K - 8: malloc - auto
711767    cycles -   3993K - 9: malloc - dynamic
8M -- 200000h
846687    cycles -   7986K - 0: HeapAlloc
1006395   cycles -   7986K - 7: malloc - fixed
62273     cycles -   7986K - 8: malloc - auto
1052339   cycles -   7986K - 9: malloc - dynamic

result:
   176684 cycles - code(544) 8: malloc - auto
  2541613 cycles - code(496) 9: malloc - dynamic
  2743813 cycles - code( 53) 0: HeapAlloc
  3095030 cycles - code(325) 7: malloc - fixed
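
One way to read the "result:" summary, offered as an observation from the numbers rather than from the benchmark's source (which is not available here): each result line appears to be the sum of that allocator's cycle counts over the five block sizes. A minimal C check using the i5-2450M figures above; the function and array names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-size cycle counts reported for one allocator. */
static unsigned long total_cycles(const unsigned long *cycles, size_t n)
{
    unsigned long sum = 0;
    assert(cycles != NULL);
    for (size_t i = 0; i < n; i++)
        sum += cycles[i];
    return sum;
}

/* Cycle counts copied from the i5-2450M run, in the order
   512K, 2M, 3M, 4M, 8M. */
static const unsigned long malloc_auto[5] = {14822, 19218, 50724, 29647, 62273};
static const unsigned long heap_alloc[5]  = {173231, 406836, 729334, 587725, 846687};
```

total_cycles(malloc_auto, 5) gives 176684 and total_cycles(heap_alloc, 5) gives 2743813, which match the "8: malloc - auto" and "0: HeapAlloc" result lines. So the summary ranks the allocators by total cycles across all five sizes, lowest first.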

nidud

#3
deleted

guga

Thanks, nidud.

By the way, I presume that your function uses a table of structures (M_BLOCK) to store the addresses of the memory to be allocated/deallocated, right?

So, if it is an array of these structures, is there a limit on the number of elements? I mean, what is the maximum number of M_BLOCK structures? When that number is exceeded, is another array created too?

I'm asking because, let's say, a program makes 300 calls to this function and the maximum number of entries is 100. Will the new memalloc create the extra 200 M_BLOCK entries so that all 300 calls can work? In other words, when the limit is exceeded, is there an equivalent of realloc, or a function that extends the memory blocks?

nidud

#5
deleted

Grincheux

Here are my results. What can you conclude from them? Which kind of memory allocation should we use? Is there one algorithm (HeapAlloc, malloc, VirtualAlloc...) for Intel and another for other CPUs, and/or one to use if the user has less than 8 MB of memory or more than 16 MB? What about the size of the memory allocated? Same question there.

I hope you are patient; this is a very hard analysis.

Quote
C:\Users\Grincheux\Downloads\malloc>timeit
AMD Athlon(tm) II X2 250 Processor (SSE3)
----------------------------------------------
512K -- 200000h
341027    cycles -    499K - 0: HeapAlloc
17903     cycles -    499K - 7: malloc - fixed
17255     cycles -    499K - 8: malloc - auto
23946     cycles -    499K - 9: malloc - dynamic
2M -- 200000h
659761    cycles -   1996K - 0: HeapAlloc
18834     cycles -   1996K - 7: malloc - fixed
22671     cycles -   1996K - 8: malloc - auto
25272     cycles -   1996K - 9: malloc - dynamic
3M -- 200000h
682142    cycles -   2904K - 0: HeapAlloc
765550    cycles -   2904K - 7: malloc - fixed
42529     cycles -   2904K - 8: malloc - auto
1188740   cycles -   2904K - 9: malloc - dynamic
4M -- 200000h
633392    cycles -   3993K - 0: HeapAlloc
908497    cycles -   3993K - 7: malloc - fixed
30416     cycles -   3993K - 8: malloc - auto
1344787   cycles -   3993K - 9: malloc - dynamic
8M -- 200000h
1255745   cycles -   7986K - 0: HeapAlloc
1048506   cycles -   7986K - 7: malloc - fixed
65802     cycles -   7986K - 8: malloc - auto
2025293   cycles -   7986K - 9: malloc - dynamic

result:
   178673 cycles - code(544) 8: malloc - auto
  2759290 cycles - code(325) 7: malloc - fixed
  3572067 cycles - code( 53) 0: HeapAlloc
  4608038 cycles - code(496) 9: malloc - dynamic
hit any key to continue...
C:\Users\Grincheux\Downloads\malloc>

Grincheux

Quote from: Grincheux on January 04, 2016, 08:58:44 AM
Here are my results. What can you conclude from them? Which kind of memory allocation should we use? Is there one algorithm (HeapAlloc, malloc, VirtualAlloc...) for Intel and another for other CPUs, and/or one to use if the user has less than 8 GB of memory or more than 16 GB? What about the size of the memory allocated? Same question there.
Is it possible that the RAM vendor, or the quality of the RAM, affects the algorithm?

I hope you are patient; this is a very hard analysis.

nidud

#8
deleted

hutch--

It would be worth adding the old API GlobalAlloc() with the GMEM_FIXED flag to the list of allocation techniques, as it is a fair bit faster than the older style of GlobalAlloc() followed by GlobalLock(). Very few of the memory allocation strategies are well suited to large counts of small allocations; VirtualAlloc() is usually the slowest, followed by HeapAlloc(), depending on the flags used. A language like BASIC pays a penalty for its string allocation, as it does a separate allocation for each string and a reallocation for each string change.

Depending on your needs, a single fixed-size allocation that is addressed in slices through a pointer array is far faster if you can live with fixed-size blocks. It also has the additional advantage that you can clear it with a zero fill of the entire allocation while retaining the same set of pointers. If you need individually sized blocks that must be resized as needed, you are stuck with individual allocations for each item, and you pay a speed penalty both when allocating and when freeing each block that is no longer required.
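
The single-allocation approach can be sketched in portable C as follows. This is only an illustration under assumptions: the SlicePool type and the pool_* functions are hypothetical names, and calloc stands in for whichever Windows allocator you would actually use for the backing block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; none come from the thread. */
typedef struct {
    unsigned char *base;  /* the single backing allocation */
    void **slot;          /* pointer array: slot[i] addresses slice i */
    size_t slice_size;    /* bytes per slice */
    size_t count;         /* number of slices */
} SlicePool;

/* Allocate the backing block once and point each slot at its slice.
   Returns 1 on success, 0 on failure. */
static int pool_init(SlicePool *p, size_t slice_size, size_t count)
{
    p->base = NULL;
    p->slot = NULL;
    p->slice_size = 0;
    p->count = 0;
    if (slice_size == 0 || count == 0)
        return 0;
    /* calloc zero-fills and rejects count * size overflow. */
    p->base = calloc(count, slice_size);
    p->slot = calloc(count, sizeof *p->slot);
    if (p->base == NULL || p->slot == NULL) {
        free(p->base);
        free(p->slot);
        p->base = NULL;
        p->slot = NULL;
        return 0;
    }
    for (size_t i = 0; i < count; i++)
        p->slot[i] = p->base + i * slice_size;
    p->slice_size = slice_size;
    p->count = count;
    return 1;
}

/* Fetch slice i; no per-item allocation takes place. */
static void *pool_get(const SlicePool *p, size_t i)
{
    assert(i < p->count);
    return p->slot[i];
}

/* Zero the whole block in one pass; the pointer array stays valid. */
static void pool_clear(SlicePool *p)
{
    memset(p->base, 0, p->slice_size * p->count);
}

/* Release everything with two frees, however many slices exist. */
static void pool_free(SlicePool *p)
{
    free(p->base);
    free(p->slot);
    p->base = NULL;
    p->slot = NULL;
    p->slice_size = 0;
    p->count = 0;
}
```

After pool_init(&pool, 64, 1000), a thousand 64-byte slices are available through pool_get() at the cost of two allocations. pool_clear() wipes every slice while each pointer keeps its value, and pool_free() releases the lot at once. The trade-off is the one described above: no slice can grow beyond slice_size.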

nidud

#10
deleted