Detecting collisions in a Java HashMap

Is there a way to detect collisions in a Java HashMap? Can anyone point out situations where a lot of collisions take place? Of course, if you override hashCode() for an object and simply return a constant value, collisions are sure to occur. I'm not talking about that. I want to know in which situations, other than the one just mentioned, a large number of collisions occur without modifying the default hashCode() implementation.

Solution

I created a project to benchmark these kinds of things: http://code.google.com/p/hashingbench/ (for hash tables with chaining, open addressing, and Bloom filters).

Apart from the hashCode() of the key, you need to know the "smearing" (or "scrambling", as I call it in that project) function of the hash table. From this list, HashMap's smearing function is equivalent to:

public int scramble(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

So, for a collision to occur in a HashMap, the necessary and sufficient condition is: scramble(k1.hashCode()) == scramble(k2.hashCode()). This is always true if k1.hashCode() == k2.hashCode() (otherwise the smearing/scrambling function would not be a function), so equal hash codes are a sufficient, but not necessary, condition for a collision.

Edit: actually, the necessary and sufficient condition above should have been compress(scramble(k1.hashCode())) == compress(scramble(k2.hashCode())). The compress function takes an integer and maps it to {0, ..., N-1}, where N is the number of buckets, so it basically selects a bucket. Usually this is implemented simply as hash % N or, when the hash table size is a power of two (which is actually a motivation for power-of-two table sizes), as hash & (N - 1), which is faster. ("Compress" is the name Goodrich and Tamassia use for this step in Data Structures and Algorithms in Java.) Thanks to ilmtitan for spotting my sloppiness.
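A minimal sketch of that check, assuming a power-of-two table of 16 buckets and the scramble function listed above. The class name BucketIndexDemo and the helpers compress, bucketOf and collide are made up for illustration and are not part of the JDK. Newer JDKs use a different spreading step inside HashMap, so treat the bucket numbers as illustrative only:

```java
class BucketIndexDemo {
    // Smearing function quoted earlier in this answer (older HashMap versions).
    static int scramble(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Compression for a power-of-two table: keep only the low-order bits.
    static int compress(int hash, int n) {
        return hash & (n - 1);
    }

    // Hypothetical helper: the bucket a key would select in a table of n buckets.
    static int bucketOf(Object key, int n) {
        return compress(scramble(key.hashCode()), n);
    }

    // Two keys collide exactly when they select the same bucket.
    static boolean collide(Object k1, Object k2, int n) {
        return bucketOf(k1, n) == bucketOf(k2, n);
    }

    public static void main(String[] args) {
        int n = 16; // assumed table size; must be a power of two
        System.out.println(collide("Aa", "BB", n)); // true: equal hashCode() values
        System.out.println(collide(0, 17, n));      // true: different hash codes, same bucket
        System.out.println(collide(0, 1, n));       // false
    }
}
```

"Aa" and "BB" share the same String hashCode(), so they collide in any table size. The Integer keys 0 and 17 have different hash codes but still select the same bucket when there are 16 buckets, which shows why equal hash codes are sufficient but not necessary.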

Other hash table implementations (ConcurrentHashMap, IdentityHashMap, etc.) have other needs and use different smearing/scrambling functions, so you need to know which one you are talking about.

(For example, HashMap's smearing function was put in place because people were using HashMap with objects that have the worst kind of hashCode() for the old power-of-two-table implementation without smearing: objects that differ only a little, or not at all, in the low-order bits used to select a bucket, such as new Integer(1 * 1024), new Integer(2 * 1024), etc. As you can see, HashMap's smearing function tries its best to make all bits affect the low-order bits.)
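To make the 1024 example concrete, here is a small sketch that counts how many distinct buckets the keys 1 * 1024, 2 * 1024, ... select, with and without the scramble function listed earlier. SmearingDemo and distinctBuckets are hypothetical names, and the table size of 16 is an assumption:

```java
import java.util.HashSet;
import java.util.Set;

class SmearingDemo {
    // Smearing function quoted earlier in this answer (older HashMap versions).
    static int scramble(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Counts the distinct buckets selected by the Integer keys
    // 1 * 1024, 2 * 1024, ..., count * 1024 in a table of n buckets.
    static int distinctBuckets(int count, int n, boolean smear) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 1; i <= count; i++) {
            int h = Integer.valueOf(i * 1024).hashCode(); // Integer's hash is its value
            if (smear) {
                h = scramble(h);
            }
            buckets.add(h & (n - 1)); // power-of-two compression
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        int n = 16; // assumed table size
        System.out.println("without smearing: " + distinctBuckets(8, n, false));
        System.out.println("with smearing:    " + distinctBuckets(8, n, true));
    }
}
```

Without smearing every key lands in one bucket, because the low-order bits of each multiple of 1024 are zero. With smearing the higher bits are folded down, so the keys spread over several buckets, although some collisions can remain.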

All of them, though, are meant to work well in common cases; one particular case is objects that inherit the system's hashCode().

PS: actually, the absolutely ugly case that prompted the implementers to insert the smearing function is the hashCode() of Floats/Doubles combined with using values such as 1.0, 2.0, 3.0, 4.0, ... as keys. All of them have the same (zero) low-order bits. This is the related old bug report: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4669519
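A quick way to see this for yourself, as a sketch: print the hashCode() of a few whole-number Double keys and the bucket their raw low-order bits would select. DoubleKeysDemo and rawBucket are illustrative names, and the 16-bucket table is an assumption:

```java
class DoubleKeysDemo {
    // Bucket selected from the raw hashCode(), with no smearing applied.
    static int rawBucket(double key, int n) {
        return Double.valueOf(key).hashCode() & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16; // assumed power-of-two table size
        for (double d = 1.0; d <= 4.0; d += 1.0) {
            int h = Double.valueOf(d).hashCode();
            System.out.printf("%.1f -> hashCode 0x%08X, raw bucket %d%n",
                    d, h, rawBucket(d, n));
        }
    }
}
```

Double.hashCode() XORs the upper and lower 32 bits of the IEEE 754 representation, and for small whole numbers only the high bits are set, so every key here selects bucket 0 unless the table smears the hash first.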
