A hash is a mathematical function that assigns a number to an element. Say you work at a university administration and you have to tell if Wilkinson is a student at your class. You can store the names on small papers in envelopes one for each starting letter. Instead of searching through the 10 thousand students, you can look at the papers in the envelope titled W. This very simple hash function assigns the first letter of the name to the name (or the ordinal number of the letter, as we said that a hash function results a number). This is not really a good hash function because it puts only a few elements, if any, into the envelope denoted X and many to A for example.
A good hash function results each possible ordinal number with similar probability. In hash tables, we usually have more buckets (envelopes in the previous example) than the number of elements to be stored. Therefore, when an element is searched for, it is likely that there is only one element there. At least that is what we would like to have. If there are multiple elements in a single bucket, it is called collision. A good hash function has as little collisions as possible.
Hash tables are arrays that use the result of the hash function to index the array. Usually, linked lists manage collisions. Hash table implementations also implement a strategy to resize the array when the number of elements to be stored becomes too high and the likelihood of collisions increase. This operation may take considerable time and, during this, the individual elements are moved between the buckets.
HashSet and HashMap use the hash function provided by the Object that is stored in the collection. The Object class implements the hashCode and equals methods. You can override them and if you do, you should override both in a consistent manner. First, we will see what they are and then how to override them consistently.