Universal Hashing

Both the remainder and multiplication hashing methods have a common vulnerability. If an attacker knows the details of our hash function (table size and any constant values), he/she could devise an input key sequence resulting in a collision on every item, turning our hash table into a linked list and slowing down our program. To address this problem, a hashing technique called universal hashing can be used.

Universal hashing works by choosing a random function from a universal set of hash functions at the start of execution. This makes it difficult for an attacker to guess the exact workings of the hashing technique used. By using this technique, the same sequence of keys will produce a different sequence of hash values on every execution.

A set of hash functions, H, with size n, where each function maps a universe of keys ∪ to a fixed range of [0, s), is said to be universal for all pairs, where a, b ∈ ∪, a ≠ b and the probability that h(a) = h(b), h ∈ H is less than or equal to n/s.

We can construct our set of universal hash functions by using two integer variables, i in a range of [1, p), and j in a range of [0, p), where p is a prime number larger than any possible value of the input key universe. We can then generate any hash function from this set using:

Where s is the size of the hash table and x is the key.

The following code snippet shows a Java implementation of universal hashing suitable for integer-type keys. Note how our implementation makes use of the BigInteger class to work out the hash key. This is needed because multiplying a long Java numeric type with a large integer might result in a big enough value that exceeds the maximum capacity of a Java long. The choice of p in this method is such that any integer key input will always have a smaller value, since in Java an integer only has a maximum value of 2^31:

public UniversalHashing() {
  j = BigInteger.valueOf((long) (Math.random() * p));
  i = BigInteger.valueOf(1 + (long) (Math.random() * (p -  1L))); 
}
public int hashKey(Integer key, int tableSize) {
  return i.multiply(BigInteger.valueOf(key)).add(j)
    .mod(BigInteger.valueOf(p))

    .mod(BigInteger.valueOf(tableSize))
    .intValue();
}

Snippet 3.7: Universal hashing for integer keys. Source class name: UniversalHashing

Go to https://goo.gl/5Kv7qG to access this code.

Java provides hash tables and built-in hashing mechanisms using the Object.hashcode() method. As a result of this, it is very difficult to implement a universal hashing table which integrates with Java's existing hashcode() method, since the i and j variables in the preceding code would have to be shared between different objects being inserted in the same table.

For more information and mathematical proofs about why we pick a larger than key prime number, refer to Carter and Wegman, Universal Classes of Hash Functions, Journal of Computer and System Sciences: https://doi.org/10.1016/0022-0000(79)90044-8.

Universal hashing provides us with good results, minimizing collisions, and is immune to malicious attacks, since the function parameters are chosen at random.

Table of Contents for Universal Hashing

Create new playlist

Sign In

Sign Up

Table of Contents for
Universal Hashing