News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

Hash tables for ultra fast dictionaries

Started by Biterider, January 10, 2022, 01:09:58 AM

Previous topic - Next topic

LiaoMi

Quote from: Biterider on January 24, 2022, 06:03:08 AM
Hi
I hacked something together using CRC32. I don't care about bit reflection at the moment since the goal is to get a hash value.
The results differ from my previous attempts because the polynomial used for the CRC32 is also different.

The results beat the FNV-1a for the first time, which is great.

Hi Biterider,

is it a graphical representation of graphs from python?  :rolleyes: On what principle does your Hash search work? Looking at the code, a simple search of values?

; ??????????????????????????????????????????????????????????????????????????????????????????????????
; Method:     Dictionary.Search
; Purpose:    Search in the Dictionary for a given Key.
; Arguments:  Arg1: -> Key
; Return:     xax -> Item or NULL if not found.

Method Dictionary.Search, uses xbx xdi xsi, pKey:PSTRING
;  DbgText "Dictionary.Search"
  SetObject xsi
  OCall xsi.Hash, pKey
  mov xdi, xax                                            ;Remember the hash
  mov ebx, [xsi].dMask
  and xbx, xax                                            ;ebx is used as index into a pointer array

  ;Search for the correct item where the hash and the Key match
  .while TRUE
    mov xdx, [xsi].pBuckets
    mov xax, POINTER ptr [xdx + sizeof(POINTER)*xbx]
    .if xax == HTL_BUCKET_EMPTY
      DbgText "Not found"
      ExitMethod
    .endif
    .if xax != HTL_BUCKET_TOMBSTONE
      invoke StrComp, pKey, [xax].DIC_ITEM.pKey
      .break .if eax == 0                                 ;Found => exit loop
    .endif
    shr xdi, DIC_PERTURB_SHIFT                            ;edi = Perturbation
    lea ebx, [4*ebx + ebx + 1]
    add ebx, edi
    and ebx, [xsi].dMask
  .endw
  mov xdx, [xsi].pBuckets
  mov xax, POINTER ptr [xdx + sizeof(POINTER)*xbx]
MethodEnd


Optimizing HashMap's Performance - https://www.baeldung.com/java-hashmap-optimize-performance

3. Tree Instead of LinkedList
Starting from Java 8, one optimization is built-in in HashMap: When buckets are getting too large, they're transformed into trees, instead of linked lists. That brings the pessimistic time of O(n) to O(log(n)), which is much better. For that to work, the keys of HashMap need to implement the Comparable interface.

That's a nice and automatic solution, but it's not perfect. O(log(n)) is still worse than desired constant time, and transforming and storing trees takes additional power and memory.

4.3. Custom hashCode
Usually, we want to override the equals method, and then we also need to override hashCode. Sometimes, we can take advantage of the specific identity of the class and easily make a very fast hashCode method.

Let's say our object's identity is purely based on its integer id. Then, we can just use this id as a hash function:

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    MemberWithId that = (MemberWithId) o;

    return id.equals(that.id);
}

@Override
public int hashCode() {
    return id;
}



High performance hashmap(wen load factor is high), using AVL tree instead linked list to resolve hash collision - https://github.com/LYKZZzz/HashMap
Using red-black tree and hash-map to implement a spelling corrector - https://github.com/nirnicole/Spelling_Corrector-C

OPTIMIZING HASHMAPS EVEN MORE - https://blog.yoshuawuyts.com/optimizing-hashmaps-even-more/
However we've known for a long time about a faster hashmap than Swisstable: the perfect hashtable. The idea is that if you know all your keys ahead of time, you can create a lookup table which never creates collisions, never has to re-index the hashmap, or grow in size2. Though it's not always possible to know all keys ahead of time, so there are limits to when it can be used.

Even so, let's take a look at when it does make sense to use perfect hashtables, and show examples of how to implement them.

Sorting a HashMap - https://www.interviewkickstart.com/learn/sort-hashmap
HashMap is a very popular data structure used in the programming paradigm. Many programming problems in software engineering interviews can be tackled by the simple use of a HashMap.

In this article, we will cover:

The Internal Workings of a HashMap
Problem Statement
Solution #1: Sorting a HashMap Using a LinkedHashMap
Solution #2: Sorting a HashMap Using a TreeMap
Time and Space Complexity of Sorting a HashMap
FAQs on Sorting a HashMap

six_L

Hi, LiaoMi
as you said, For short data/strings, simplicity of FNV or djb2 are hard to beat, they are very performant on short data as well.

draw_PointThread proc USES rdi rbx rsi

lea rsi,org_str
invoke GetSystemTimePreciseAsFileTime, addr StartTime
@@:

invoke RtlZeroMemory,addr NameA,sizeof NameA
lea rdi,NameA
invoke iRandom_1,3,15
mov _rndstrlen,rax ;NameALen

xor rbx,rbx
.repeat
invoke iRandom_1,0,org_StrlenMax
mov al,[rsi+rax]
mov [rdi+rbx],al
inc rbx
.until rbx == _rndstrlen

invoke lstrlen,addr NameA
invoke XXH3_64bits,addr NameA,rax
xor rdx,rdx
mov rcx,x_max
div rcx ;hash % x_max
mov hHash_IndexA,edx

invoke RtlZeroMemory,addr NameB,sizeof NameB

lea rdi,NameB
invoke iRandom_1,3,15
mov _rndstrlen,rax ;NameBLen

xor rbx,rbx
.repeat
invoke iRandom_1,0,org_StrlenMax
mov al,[rsi+rax]
mov [rdi+rbx],al
inc rbx
.until rbx == _rndstrlen

invoke lstrlen,addr NameB
invoke XXH3_64bits,addr NameB,rax
xor rdx,rdx
mov rcx,y_max
div rcx ;hash % y_max
mov hHash_IndexB,edx

invoke InvalidateRect,Main_hWnd,0,FALSE
invoke UpdateWindow,Main_hWnd

lea rdi,TxtColorTable
mov rax,dqColorCnt
mov rcx,[rdi+rax*4]
mov point_cor,ecx
inc dqColorCnt
.if dqColorCnt == 6
mov    dqColorCnt,0
.endif

invoke Sleep,1
inc dqTotalCalcHash
.if dqTotalCalcHash >= 10000
mov StopThreadFlag,TRUE
.endif
cmp StopThreadFlag,TRUE
jnz @B

invoke GetSystemTimePreciseAsFileTime, addr EndTime

xor rax,rax
xor rdx,rdx
mov eax,StartTime.dwHighDateTime
shl rax, 32
mov edx,StartTime.dwLowDateTime
add rax,rdx
mov rcx,rax

xor rax,rax
xor rdx,rdx
mov eax,EndTime.dwHighDateTime
shl rax, 32
mov edx,EndTime.dwLowDateTime
add rax,rdx

sub rax,rcx
invoke wsprintf,addr szBuffer,CStr("SpendTime: %Id"),rax
invoke SetWindowText,Main_hWnd,addr szBuffer
ret

draw_PointThread endp


1669717931(us)   XXH3_64bits
1669303733(us)   FNV-1A

FNV-1A is faster 414198(us)than XXH3_64bits on calculating  20000 hash values
Say you, Say me, Say the codes together for ever.

jj2007

On a side not, six_L: I had never seen GetSystemTimePreciseAsFileTime - interesting, thanks. It seems, though, that it is not ideal for benchmarking ("QueryPerformanceCounter will never speed up or slow down") :cool:

Biterider

Hi LiaoMi
Quote from: LiaoMi on January 24, 2022, 08:24:02 AM
is it a graphical representation of graphs from python?  :rolleyes: On what principle does your Hash search work? Looking at the code, a simple search of values?
I coded my own testbed and dumped the results into an Excel spreadsheet to create the charts.
The HashTable and Dictionary are very similar to what Python does. They use "open addressing", ie no trees or linked lists are used in the event of a collision. Instead, the next available bucket is used to store the pointer to the dictionary data. The search pattern to find this bucket comes from Python. It uses a perturbation to take advantage of the unused part of the hash number. The concept is somewhat similar to double hashing, but with much less computational effort.
The strategy is described in the following source
https://github.com/python/cpython/blob/main/Objects/dictobject.c#:~:text=%23define%20PERTURB_SHIFT%205

By the way, if there is no need to delete information from the dictionary, the tombstones can be removed.

Biterider

LiaoMi

Quote from: six_L on January 24, 2022, 06:24:30 PM
Hi, LiaoMi
as you said, For short data/strings, simplicity of FNV or djb2 are hard to beat, they are very performant on short data as well.

Hi six_L,

thanks for the stats and tests! In general, this is not surprising if we compare the code and the input data  :icon_idea:

Could you please check these functions? - XXH32 and XXH64, XXHL64_default_sse2, XXHL64_default_avx2, XXH3_64bits_dispatch ...

six_L

Hi,jj2007
thanks you for the following useful information.
QuoteAs long as your computer's clock is not currently undergoing SystemTimeAdjustment, the elapsed intervals measured by:

  •    GetSytemTimeAsPreciseFileTime
  •    QueryPerformanceCounter
will both:

  •     be in sync
  •     be within a few tenths of a microsecond of each other
  •     have an ultimate resolution of 100ns (0.1 us)
For accurate time measurements these are the two functions you should use. The question, which one do you care for the task:
    GetSystemTimeAsPreciseFileTime: high resolution measurement of "now"

  •     good for log files
  •     timestamping of events
  •     synchronized to UTC for cross-machine agreement of "now"
    QueryPerformanceCounter: high resolution measurement of "elapsed time"

  •     good for bench-marking
  •     measuring duration of processes
the GetSystemTimePreciseAsFileTime is not ideal for benchmarking.

Hi,LiaoMi
QuoteCould you please check these functions? - XXH32 and XXH64, XXHL64_default_sse2, XXHL64_default_avx2, XXH3_64bits_dispatch ...
i'll test them lately
Say you, Say me, Say the codes together for ever.

LiaoMi

#51
XXH3: a new hash speed record holder - https://sudonull.com/post/31491-XXH3-New-Hash-Rate-Record-GlobalSign-Blog

Some algorithms from Google:
Сityhash - https://github.com/google/cityhash
Fast strong hash functions: SipHash/HighwayHash - https://github.com/google/highwayhash

https://en.wikipedia.org/wiki/SipHash
SipHash is an add–rotate–xor (ARX) based family of pseudorandom functions created by Jean-Philippe Aumasson and Daniel J. Bernstein in 2012. SipHash computes a 64-bit message authentication code from a variable-length message and 128-bit secret key. It was designed to be efficient even for short inputs, with performance comparable to non-cryptographic hash functions, such as CityHash.

C library to calculate the FNV Hash function in 32, 64, 128, 256 or 1024 bits. The FNV-1a version is supported - https://github.com/fnvhash/libfnv
(C language FNV hash function library with all supported bit lengths of FNV-1a: 32, 64, 128, 256, 512, and 1024)

t1ha - One of the fastest hash functions - https://github.com/erthink/t1ha
WangYi´s wyhash - The FASTEST QUALITY hash function, random number generators (PRNG) and hash map - https://github.com/wangyi-fudan/wyhash
Git mirror of the Fowler Noll Vo (FNV) hash algorithm original C source - https://github.com/catb0t/fnv-hash or http://www.isthe.com/chongo/tech/comp/fnv/index.html

Biterider

Hi
Here is a simple description of how xxHash works.
The secret of its speed is the parallel processing of 4 streams.  :icon_idea:

https://create.stephan-brumme.com/xxhash/#git1

Biterider

six_L

Hi,LiaoMi

there was the better benchmarking algos in our forum,  e.g. get the CPU info, set the priority level of thread. but it's the x86 style. i need time to convert into x64.
sorry for testing more slowly.

;=================================================
QuoteWhat are the longest words in the Oxford English Dictionary?

    Pneumonoultramicroscopicsilicovolcanoconiosis (45 letters)
    Hippopotomonstrosesquippedaliophobia (36 letters)
    Supercalifragilisticexpialidocious (34 letters)
    Pseudopseudohypoparathyroidism (30 letters)
    Floccinaucinihilipilification (29 letters)
    Antidisestablishmentarianism (28 letters)

we'll use the hash to create and determine a data structures position, not to checksum. so The test datas are limited to less than 64 characters.

regard.
Say you, Say me, Say the codes together for ever.

HSE

Hi!

Quote from: LiaoMi on January 24, 2022, 08:24:02 AM
Sorting a HashMap - https://www.interviewkickstart.com/learn/sort-hashmap
HashMap is a very popular data structure used in the programming paradigm. Many programming problems in software engineering interviews can be tackled by the simple use of a HashMap.

Then:

    1) for most purposes HashTable and HashMap are exactly same thing?

    2) You never sort the HashMap, You create a new ordered HashMap?

   3) can be sorted the HashTable under design?

Thanks in advance, HSE.
Equations in Assembly: SmplMath

Biterider

Hi HSE
The example of the linked article only refers to "open hashing" or "closed addressing", which usually means chaining.
As far as I can tell in this context, the difference between a HashTable and a HashMap is more sematic. The map is a special case of the table where the bucket contains a pointer to the data, while the more general case of the table has the complete data stored in the bucket.

In the case of the article, there are many roads that lead to Rome and it is up to the programmer to choose the best solution. For example, you can use superhero strength as the key value and put the superhero names in a sorted lists attached to the buckets. In this case, the Items are sorted by design.

One thing must be clear, hash tables are not primarily intended for sorting. The goal is to be as close to O(1) as possible.

Biterider

HSE

Hi Biterider!!

A little more clear now.  :thumbsup:

In lockdown I begin translation of a very complex program in Java. Mostly everything was easy translated to ObjAsm32, but there was a problem with HashTables and HahsMaps. :biggrin:

Quote from: Biterider on January 27, 2022, 07:38:05 AM
For example, you can use superhero strength as the key value and put the superhero names in a sorted lists attached to the buckets. In this case, the Items are sorted by design.

I'm reading that Java project now, and become clear that is something like that.

Look like using HashTables like tables can add a lot of flexibility at a not so high price, but I skipped all that and maked simple tables  :cool:.

Thanks, HSE.   
Equations in Assembly: SmplMath

Biterider

Hi
Looking for the hardware implementation of the CRC calculation I found this extremely easy to understand video.
I highly recommend it.  :thumbsup:

https://www.youtube.com/watch?v=sNkERQlK8j8

Biterider

six_L

QuoteHash tables are a pupular way to implement the dictionary data type. Theoretically they are intuitive to understand. However,real-life implementations of hash tables could be complex.
hash tables waste 40% of memory in 1024 max item, when user items reach 1000. maybe the wasted 40% of memory never use in future.
Say you, Say me, Say the codes together for ever.

Biterider

Hi six_L
You're absolutely right, but don't forget what the HashTable is for. From my perspective, it is not intended for memory optimization, but for quick data access using a key.
Imagine a compiler where you put the symbols into a hash table. It doesn't matter how much memory you waste because after compilation the application will be closed and all memory will be released to the operating system. What you really want is a fast compilation process!

Regards, Biterider