The MASM Forum

Miscellaneous => Hardware & Software Corner => Topic started by: Vortex on January 22, 2023, 07:55:13 PM

Title: xxHash
Post by: Vortex on January 22, 2023, 07:55:13 PM
Quote
xxHash is an Extremely fast Hash algorithm, processing at RAM speed limits. Code is highly portable, and produces hashes identical across all platforms (little / big endian). The library includes the following algorithms :

XXH32 : generates 32-bit hashes, using 32-bit arithmetic
XXH64 : generates 64-bit hashes, using 64-bit arithmetic
XXH3 (since v0.8.0): generates 64 or 128-bit hashes, using vectorized arithmetic. The 128-bit variant is called XXH128.

https://github.com/Cyan4973/xxHash
Title: Re: xxHash
Post by: Biterider on January 22, 2023, 08:15:15 PM
Thanks Vortex
Very useful information!  :thumbsup:
The benchmark data is very interesting.
Let's see if we can use them...

Biterider
Title: Re: xxHash
Post by: jack on January 22, 2023, 10:03:42 PM
had a look at the collisions test
Quote
The test requires a very large amount of memory. By default, it will generate 24 billion of 64-bit hashes, requiring 192 GB of RAM for their storage
that's a bit more memory than what my PC has  :biggrin:
one thing that has always bothered me about hash tables is collisions: how do you deal with a collision?
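
One common answer to the collision question is separate chaining: each bucket holds a small list of (key, value) pairs, and a lookup compares the full key inside the bucket. The sketch below is only an illustration of that idea, not how xxHash or any particular library does it. The class name ChainedHashTable and its methods are made up for this example, and it uses Python's built-in hash() rather than xxHash.

```python
class ChainedHashTable:
    """Toy hash table that resolves collisions by separate chaining."""

    def __init__(self, bucket_count=8):
        # One list per bucket; colliding keys simply share a list.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # Assumption: built-in hash() stands in for a real hash such as XXH64.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # same key: overwrite the value
                return
        bucket.append((key, value))  # new key (possibly a collision): append

    def get(self, key, default=None):
        # The hash only selects the bucket; the full key comparison
        # is what tells colliding entries apart.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


# With a single bucket every key collides, yet lookups still work.
table = ChainedHashTable(bucket_count=1)
table.put("alpha", 1)
table.put("beta", 2)
print(table.get("alpha"), table.get("beta"), table.get("gamma"))
```

The other well-known approach is open addressing (probing for the next free slot); either way the hash only narrows the search, and the stored key is compared in full.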
Title: Re: xxHash
Post by: LiaoMi on January 23, 2023, 02:42:40 AM
:tongue: :thup:
Hash tables for ultra fast dictionaries - http://masm32.com/board/index.php?topic=9754.msg107371#msg107371
Title: Re: xxHash
Post by: jj2007 on January 23, 2023, 05:45:32 AM
Have you noticed that there is a big difference between searching with Google vs using Forum search (http://masm32.com/board/index.php?action=search;advanced;search=)?

Google finds 12 matches for "exgetsel" (with quotes) in the whole Internet.
Forum search finds 18 matches, in this forum only.

How could this be improved using hash tables?
Title: Re: xxHash
Post by: NoCforMe on January 23, 2023, 06:00:02 AM
So maybe the mighty vaunted Google ain't the miraculous collection of algorithms that it's cracked up to be, eh? Who woulda thunk it?

(OTOH, it's a hell of a lot better than Duck Duck Go, which I no longer use: I love its privacy protections, but its search results are quite inferior to Google's.)
Title: Re: xxHash
Post by: jj2007 on January 23, 2023, 06:02:43 AM
The explanation is that Google uses fast hash tables, while forum search uses slower algos.
Title: Re: xxHash
Post by: NoCforMe on January 23, 2023, 07:18:19 AM
So the tortoise wins the race ...
Title: Re: xxHash
Post by: jj2007 on January 23, 2023, 07:50:22 AM
Yep. Google uses hash tables, and therefore can find only full words. No partial search :cool:
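
A minimal sketch of the point about full words: an index keyed by whole words answers exact-word queries with a single hash lookup, but it cannot match a fragment of a word, while a plain substring scan can. The sample posts and the function names here are invented for illustration, and nothing below claims to reflect how Google or the forum software actually work.

```python
import re

# Hypothetical sample data, not real forum posts.
posts = {
    1: "Forum search finds exgetsel in this forum",
    2: "xxHash generates 64-bit hashes",
}


def build_word_index(docs):
    """Map each whole word to the set of post ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index.setdefault(word, set()).add(doc_id)
    return index


def hash_lookup(index, term):
    """Exact-word lookup: one dict (hash table) probe."""
    return index.get(term.lower(), set())


def substring_scan(docs, term):
    """Slow path: check every post for the term as a substring."""
    term = term.lower()
    return {doc_id for doc_id, text in docs.items() if term in text.lower()}


index = build_word_index(posts)
print(hash_lookup(index, "exgetsel"))   # whole word: found via the index
print(hash_lookup(index, "exget"))      # partial word: the index has no such key
print(substring_scan(posts, "exget"))   # partial word: found by scanning
```

The scan touches every post on every query, which is the price paid for partial matches.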
Title: Re: xxHash
Post by: mineiro on January 23, 2023, 12:27:35 PM
You can check this link; it sounds like the authors themselves are discussing it there:
https://encode.su/threads/2556-Improving-xxHash
Title: Re: xxHash
Post by: jj2007 on January 23, 2023, 07:09:21 PM
Quote from: mineiro on January 23, 2023, 12:27:35 PM
You can check this link; it sounds like the authors themselves are discussing it there:
https://encode.su/threads/2556-Improving-xxHash

Bulat Ziganshin definitely knows what he is talking about, he is the author of FreeArc, the best archiver ever :thumbsup: