## Wednesday, September 28, 2011

### Better Bit Mixing - Improving on MurmurHash3's 64-bit Finalizer

Austin Appleby's superb MurmurHash3 is one of the best general purpose hash functions available today. Still, as good as it is, there are a couple of minor things about it that make me uneasy. I want to talk about one of those things today.

Here is the 64-bit finalizer from MurmurHash3:

```c
#include <stdint.h>

uint64_t MurmurHash3Mixer( uint64_t key )
{
  key ^= (key >> 33);
  key *= 0xff51afd7ed558ccdULL;
  key ^= (key >> 33);
  key *= 0xc4ceb9fe1a85ec53ULL;
  key ^= (key >> 33);

  return key;
}
```


The goal of a finalizer (sometimes called a "bit mixer") is to take an intermediate hash value that may not be thoroughly mixed and increase its entropy to obtain both better distribution and fewer collisions among hashes. Small differences in input values, as you might get when hashing nearly identical data, should result in large differences in output values after mixing.

Ideally, flipping a single bit in the input key results in every output bit changing with a probability of 0.5. This is called the avalanche effect. Cryptographically secure hashes are willing to spend an impressive number of CPU cycles getting this probability close to 0.5. Hash functions that don't have to be cryptographically secure trade away some avalanche accuracy in favor of performance. MurmurHash3 gets close to 0.5 with a remarkable economy of effort: just two multiplies, three shifts, and three exclusive-ors. Austin reports that MurmurHash3 avalanches all bits to within 0.25% bias.
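To make the avalanche criterion concrete, here is a minimal sketch (my own test harness, not Austin's) that estimates the worst-case per-bit bias of the finalizer over low-entropy counting-number keys:

```c
#include <stdint.h>

/* The MurmurHash3 64-bit finalizer, as given above. */
static uint64_t murmur_mix(uint64_t key)
{
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    key *= 0xc4ceb9fe1a85ec53ULL;
    key ^= key >> 33;
    return key;
}

/* Estimate worst-case avalanche bias over `trials` counting-number
   keys: for every (input bit, output bit) pair, count how often
   flipping the input bit flips the output bit.  An ideal mixer flips
   each output bit with probability 0.5; the bias is |p - 0.5|. */
static double max_avalanche_bias(uint64_t (*mix)(uint64_t), int trials)
{
    static long flips[64][64];  /* flips[input bit][output bit] */
    for (int i = 0; i < 64; i++)
        for (int j = 0; j < 64; j++)
            flips[i][j] = 0;

    for (int t = 0; t < trials; t++) {
        uint64_t key  = (uint64_t)t;   /* low-entropy input */
        uint64_t base = mix(key);
        for (int i = 0; i < 64; i++) {
            uint64_t diff = base ^ mix(key ^ (1ULL << i));
            for (int j = 0; j < 64; j++)
                flips[i][j] += (long)((diff >> j) & 1);
        }
    }

    double worst = 0.0;
    for (int i = 0; i < 64; i++)
        for (int j = 0; j < 64; j++) {
            double bias = (double)flips[i][j] / trials - 0.5;
            if (bias < 0.0)   bias = -bias;
            if (bias > worst) worst = bias;
        }
    return worst;
}
```

The error figures in the tables below were computed with a larger and more varied low-entropy key set, so a harness like this will only approximate them.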

The overall structure of the finalizer (shifts and multiplies) was designed by Austin. The constants were chosen by "a simple simulated-annealing algorithm" (his description). The part that troubles me is that the simulation was driven with random numbers. That doesn't seem right: if the inputs were truly random, what would be the point of mixing? It seems to me that a good mixer should be able to take a series of, say, counting numbers and produce output that is virtually indistinguishable from random.

First question: Could I build a better mixer by training the simulation on low-entropy inputs such as counting numbers and high-likelihood bit patterns? Here are the results for MurmurHash3 and the top 14 mixers my simulation found:

## 4,853,184 Low-entropy keys

| Mixer | Maximum error | Mean error |
|---|---|---|
| MurmurHash3 mixer | 0.005974634384355 | 0.000265202074121 |
| Mix01 | 0.000791851287732 | 0.000190873655461 |
| Mix02 | 0.000885192071844 | 0.000197317606821 |
| Mix03 | 0.000828116139837 | 0.000191444267889 |
| Mix04 | 0.000991926125199 | 0.000199204707592 |
| Mix05 | 0.000932171539344 | 0.000195036565653 |
| Mix06 | 0.000819874127995 | 0.000199337714668 |
| Mix07 | 0.000805656657567 | 0.000194049828213 |
| Mix08 | 0.000906415252337 | 0.000191828700595 |
| Mix09 | 0.000927020281943 | 0.000194149835046 |
| Mix10 | 0.000877742611860 | 0.000194585176663 |
| Mix11 | 0.000955867323390 | 0.000194238221367 |
| Mix12 | 0.000932377589640 | 0.000193450189656 |
| Mix13 | 0.000789996835068 | 0.000190778327016 |
| Mix14 | 0.000800917500758 | 0.000191264175101 |

The maximum error tends to be about seven times lower; the mean error is lower too, but only slightly. The answer, then, is yes: training on low-entropy keys does produce better mixers for those keys. But that leads to a second question: does training on low-entropy keys result in better or worse performance on high-entropy keys? To find out, I tested the mixers with 100 million cryptographic-quality random numbers:

## 100,000,000 Random keys

| Mixer | Maximum error | Mean error |
|---|---|---|
| MurmurHash3 mixer | 0.000212380000000 | 0.000040445117187 |
| Mix01 | 0.000177410000000 | 0.000040054211426 |
| Mix02 | 0.000179150000000 | 0.000039797316895 |
| Mix03 | 0.000170070000000 | 0.000040068117676 |
| Mix04 | 0.000185470000000 | 0.000039775007324 |
| Mix05 | 0.000192510000000 | 0.000039626535645 |
| Mix06 | 0.000195660000000 | 0.000040216433105 |
| Mix07 | 0.000193810000000 | 0.000039834248047 |
| Mix08 | 0.000196590000000 | 0.000039063793945 |
| Mix09 | 0.000174280000000 | 0.000039541943359 |
| Mix10 | 0.000181790000000 | 0.000039569926758 |
| Mix11 | 0.000181140000000 | 0.000039501779785 |
| Mix12 | 0.000183920000000 | 0.000039622690430 |
| Mix13 | 0.000175610000000 | 0.000039437180176 |
| Mix14 | 0.000182000000000 | 0.000040092158203 |

Yes, we can do slightly better than MurmurHash3 even on random test keys.

On average this is a small but measurable improvement that comes at no performance cost. For the worst-case, low-entropy keys that concern us the most, it provides a significant improvement.

Here's a handy tip: a great use for a mixer like this is "purifying" untrustworthy input hash keys. If a calling function provides poor-quality hashes (even counting numbers!) as input to your code, running them through one of these mixers first will ensure they do no harm.
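A minimal sketch of that idea, assuming a hypothetical hash table (`bucket_index` and `num_buckets` are illustrative names, not from any particular library):

```c
#include <stddef.h>
#include <stdint.h>

/* The MurmurHash3 64-bit finalizer from the post. */
static uint64_t mix(uint64_t key)
{
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    key *= 0xc4ceb9fe1a85ec53ULL;
    key ^= key >> 33;
    return key;
}

/* Purify a caller-supplied hash before using it, so that weak inputs
   (even plain counting numbers) still spread across the table. */
static size_t bucket_index(uint64_t untrusted_hash, size_t num_buckets)
{
    return (size_t)(mix(untrusted_hash) % num_buckets);
}
```

Since the mixer is bijective, purifying a caller's hash never introduces new collisions; it only redistributes the values.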

Here are the parameters for the mixers I tested:

## Mixer parameters

| Mixer | Shift 1 | Constant 1 | Shift 2 | Constant 2 | Shift 3 |
|---|---|---|---|---|---|
| MurmurHash3 mixer | 33 | 0xff51afd7ed558ccd | 33 | 0xc4ceb9fe1a85ec53 | 33 |
| Mix02 | 33 | 0x64dd81482cbd31d7 | 31 | 0xe36aa5c613612997 | 31 |
| Mix03 | 31 | 0x99bcf6822b23ca35 | 30 | 0x14020a57acced8b7 | 33 |
| Mix04 | 33 | 0x62a9d9ed799705f5 | 28 | 0xcb24d0a5c88c35b3 | 32 |
| Mix06 | 31 | 0x69b0bc90bd9a8c49 | 27 | 0x3d5e661a2a77868d | 30 |
| Mix07 | 30 | 0x16a6ac37883af045 | 26 | 0xcc9c31a4274686a5 | 32 |
| Mix08 | 30 | 0x294aa62849912f0b | 28 | 0x0a9ba9c8a5b15117 | 31 |
| Mix09 | 32 | 0x4cd6944c5cc20b6d | 29 | 0xfc12c5b19d3259e9 | 32 |
| Mix10 | 30 | 0xe4c7e495f4c683f5 | 32 | 0xfda871baea35a293 | 33 |
| Mix11 | 27 | 0x97d461a8b11570d9 | 28 | 0x02271eb7c6c4cd6b | 32 |
| Mix12 | 29 | 0x3cd0eb9d47532dfb | 26 | 0x63660277528772bb | 33 |
| Mix13 | 30 | 0xbf58476d1ce4e5b9 | 27 | 0x94d049bb133111eb | 31 |
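Since every mixer shares the same skeleton, the whole table can be driven by one parameterized function (a sketch; `mix_with` is my naming, not the author's). Mix13 is shown as an example; its shifts and constants are, notably, the ones that later appeared in SplitMix64's finalizer.

```c
#include <stdint.h>

/* Every mixer in the table is the same skeleton with five parameters:
   shift 1, constant 1, shift 2, constant 2, shift 3. */
static uint64_t mix_with(uint64_t key,
                         unsigned s1, uint64_t c1,
                         unsigned s2, uint64_t c2,
                         unsigned s3)
{
    key ^= key >> s1;
    key *= c1;
    key ^= key >> s2;
    key *= c2;
    key ^= key >> s3;
    return key;
}

/* Mix13 from the table above: shifts 30/27/31. */
static uint64_t mix13(uint64_t key)
{
    return mix_with(key, 30, 0xbf58476d1ce4e5b9ULL,
                         27, 0x94d049bb133111ebULL, 31);
}
```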

1. Thanks for sharing this.

I was wondering: could the finalization mix in the 32-bit MurmurHash3 be improved in a similar fashion?

Cheers!

2. The 32-bit mixer has the same structure as the 64-bit mixer and both were generated by simulated annealing so, while I can't be certain without trying it, I strongly suspect the answer is yes.
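For reference, the 32-bit finalizer has the same skeleton; these shifts and constants are quoted from memory of the MurmurHash3 source, so double-check them there before relying on this sketch:

```c
#include <stdint.h>

/* MurmurHash3's 32-bit finalizer: the same xorshift/multiply skeleton
   as the 64-bit version, with 32-bit shifts and constants. */
static uint32_t murmur_mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}
```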

3. That's very interesting, thanks for sharing it! Can you provide more details on how you found those new mixers?

4. Nice work!

Coincidentally, I have been using a quite similar construction that seems to behave more like a pseudo-random permutation. At least, I have so far not found any obvious measurements that clearly distinguish it from a PRP:
http://mostlymangling.blogspot.com/2018/07/mathjax.html

5. This seems to have become a "go to" page, referenced even in the Java 8 source code. I am interested in seeing more about mix-function properties. Specifically, I'd like to know whether it is a 1-to-1 function or whether it can produce collisions, i.e., given y = mix(x), can two different values of x produce the same y?

1. Since the individual steps are bijective, the whole function is bijective or 1-to-1.

See http://mostlymangling.blogspot.com/2018/07/on-mixing-functions-in-fast-splittable.html for a longer analysis of MurmurHash3, Variant 13 (above) and an attempt to remedy some of the obvious (statistical) shortcomings of MurmurHash3 and friends. Included is the inverse for a similar function.
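A concrete check of that bijectivity argument (a sketch; `unxorshift` and `mulinv` are my names): each xor-shift step can be inverted by repeated substitution, and each multiply by an odd constant has an exact inverse modulo 2^64, so the whole finalizer can be run backwards.

```c
#include <stdint.h>

/* Invert y = x ^ (x >> s): iterating x = y ^ (x >> s) converges to the
   unique preimage, since each pass fixes at least s more high bits. */
static uint64_t unxorshift(uint64_t y, unsigned s)
{
    uint64_t x = y;
    for (int i = 0; i < 64; i++)
        x = y ^ (x >> s);
    return x;
}

/* Multiplicative inverse of an odd constant mod 2^64 via Newton's
   iteration; each step doubles the number of correct low bits. */
static uint64_t mulinv(uint64_t c)
{
    uint64_t inv = c;              /* correct to >= 3 bits for odd c */
    for (int i = 0; i < 5; i++)
        inv *= 2 - c * inv;
    return inv;
}

/* The forward MurmurHash3 64-bit finalizer, for a round-trip check. */
static uint64_t murmur_mix(uint64_t key)
{
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    key *= 0xc4ceb9fe1a85ec53ULL;
    key ^= key >> 33;
    return key;
}

/* Exact inverse of the finalizer: undo each step in reverse order. */
static uint64_t unmix(uint64_t h)
{
    h = unxorshift(h, 33);
    h *= mulinv(0xc4ceb9fe1a85ec53ULL);
    h = unxorshift(h, 33);
    h *= mulinv(0xff51afd7ed558ccdULL);
    h = unxorshift(h, 33);
    return h;
}
```

Because an exact inverse exists, the finalizer can produce no collisions on 64-bit inputs.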

2. Stupid me. Indeed. This would have struck me as obvious some 20 years ago, but one thing about math is that if you don't use it, you lose it :)