Explanation of the constants used when computing the hash value in java.util.HashMap
Can someone explain what these constants mean and why they were chosen?
    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
Source: Java SE 6 library
Solution
Understanding what makes a good hash function is tricky, because many different functions are in use and they serve slightly different purposes.
Java's hash tables work like this:
1. They ask the key object to produce its hash code. The quality of hashCode() implementations varies a great deal (in the worst case a method returns a constant value!), and an implementation is never tuned to the particular hash table you are using.
2. They then use the function above to mix the bits, so that information present in the high bits is also moved into the low bits. This is important because next...
3. They take the mixed hash code modulo the number of entries in the hash table array to get the index into the array of chains. The array size is a power of 2, so only the low bits select the index, and the mixing in step 2 is what keeps the high bits from simply being discarded.
4. They then traverse the chain until they reach the entry whose key is equal (according to the equals() method).
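The lookup steps above can be sketched roughly as follows. This is a minimal illustration, not the actual java.util.HashMap source: the class name TinyChainedMap, the Node type, and the fixed table length of 16 are assumptions made for the example, and only the hash(int) function is taken from the question.

```java
import java.util.Objects;

// Illustrative sketch only; names and the fixed table size are assumptions,
// not the real java.util.HashMap implementation.
class TinyChainedMap<K, V> {
    // One link in a bucket's chain.
    private static final class Node<K, V> {
        final K key;
        final int hash;
        V value;
        Node<K, V> next;

        Node(K key, int hash, V value, Node<K, V> next) {
            this.key = key;
            this.hash = hash;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed length for brevity; it must be a power of 2 for indexFor to work.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Step 2: the mixing function quoted in the question (Java SE 6).
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Step 3: for a power-of-2 length, masking keeps only the low bits,
    // which is the "mod table length" step.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    V put(K key, V value) {
        int h = hash(Objects.hashCode(key));          // steps 1 and 2
        int i = indexFor(h, table.length);            // step 3
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.hash == h && Objects.equals(n.key, key)) {
                V old = n.value;                      // key already present
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, h, value, table[i]); // prepend to the chain
        return null;
    }

    V get(K key) {
        int h = hash(Objects.hashCode(key));
        // Step 4: walk the chain until equals() finds the key.
        for (Node<K, V> n = table[indexFor(h, table.length)]; n != null; n = n.next) {
            if (n.hash == h && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

Comparing the stored hash before calling equals() is a cheap filter, since equals() can be expensive for some key types.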
To complete the picture, the number of entries in the hash table array is not fixed. If the chains get too long, the array is replaced by a new, larger array and everything is rehashed into it. (In java.util.HashMap this happens once the number of entries exceeds the capacity multiplied by the load factor.) This is relatively fast and performs well for normal usage patterns (for example, lots of put()s followed by lots of get()s).
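The rehashing step can be sketched like this. It is a simplified illustration under stated assumptions: the class name ResizeSketch is hypothetical, the buckets hold only already-mixed hash values instead of key/value entries, and the table is assumed to double in size.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; real hash tables move whole key/value entries.
class ResizeSketch {
    // Returns a table twice as long, with every hash moved to its new bucket.
    // Assumes old.size() is a power of 2.
    static List<List<Integer>> resize(List<List<Integer>> old) {
        int newLength = old.size() * 2;
        List<List<Integer>> bigger = new ArrayList<>();
        for (int i = 0; i < newLength; i++) {
            bigger.add(new ArrayList<>());
        }
        for (List<Integer> chain : old) {
            for (int h : chain) {
                // One more low bit now takes part in choosing the bucket,
                // so each old chain is split across at most two new chains.
                bigger.get(h & (newLength - 1)).add(h);
            }
        }
        return bigger;
    }
}
```

Because the length stays a power of 2, doubling it only exposes one extra bit of each hash, which is another reason the mixing function needs to push useful information into the low bits.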
The actual constants used are fairly arbitrary (and were probably chosen by experimenting with some simple corpora, including things like large numbers of Integer and String values), but their purpose is not: they spread the information in the whole value into most of the low bits, so that as much as possible of the information present in the output of hashCode() is actually used.
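A small experiment makes the effect visible. The sketch below is an assumption-laden demo, not part of the JDK: the class name HighBitMixingDemo, the choice of keys (16 hash codes that differ only in bits 16 to 19), and the table length of 16 are all picked for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative demo only; keys and table length are chosen for the example.
class HighBitMixingDemo {
    // The mixing function quoted in the question (Java SE 6).
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Counts the distinct buckets hit by the hash codes i << 16 for i = 0..15,
    // with or without the mixing step. tableLength must be a power of 2.
    static int distinctBuckets(boolean mix, int tableLength) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < 16; i++) {
            int code = i << 16;               // differs only in high bits
            int h = mix ? hash(code) : code;
            buckets.add(h & (tableLength - 1));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("without mixing: " + distinctBuckets(false, 16));
        System.out.println("with mixing:    " + distinctBuckets(true, 16));
    }
}
```

For these particular keys the low four bits are all zero, so without mixing every key lands in bucket 0, while with mixing the 16 keys are spread over all 16 buckets. Other key sets will of course distribute differently.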
(You wouldn't use perfect hashes or cryptographic hashes for this, despite the similar names; they have very different implementation strategies. The former needs to know the space of keys in order to avoid or reduce collisions, and the latter needs to move information around in all directions, not just into the low bits.)