Java – What hashing techniques should be used when building Bloom filters in Clojure?
I want to build a Bloom filter in Clojure, but I don't know much about the hashing libraries that may be available to a JVM-based language.
What should I use for the fastest (as opposed to the most accurate) Bloom filter implementation in Clojure?
Solution
So the interesting thing about Bloom filters is that they need multiple hash functions to work effectively.
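To make the role of the multiple hash functions concrete, here is a minimal sketch of a Bloom filter backed by java.util.BitSet. The names make-filter, add! and maybe-contains? are hypothetical, chosen for illustration only, and the hash functions are supplied by the caller rather than prescribed here.

```clojure
(import 'java.util.BitSet)

(defn make-filter
  "Creates an empty filter with m bits and a collection of hash functions.
  Hypothetical helper: each hash function takes a value and returns an integer."
  [m hash-fns]
  {:bits (BitSet. (int m)) :m m :hash-fns hash-fns})

(defn- positions
  "One bit position in [0, m) per hash function."
  [{:keys [m hash-fns]} x]
  (map #(mod (% x) m) hash-fns))

(defn add!
  "Sets every bit that x hashes to. Mutates the underlying BitSet."
  [f x]
  (doseq [p (positions f x)]
    (.set ^BitSet (:bits f) (int p)))
  f)

(defn maybe-contains?
  "False means x was definitely never added; true means it probably was."
  [f x]
  (every? #(.get ^BitSet (:bits f) (int %)) (positions f x)))
```

Each element sets one bit per hash function, so a lookup only answers "probably present" when all of those bits are set. The more independent the hash functions are, the fewer false positives you should see.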
Java strings already have a built-in hash function that you can use: String.hashCode() returns a 32-bit integer hash. For most purposes this is a decent hash code, and it may be sufficient: if you split it into 2 separate 16-bit hash codes, for example, that may be enough for your Bloom filter to work. You will probably get some collisions, but that's fine; Bloom filters are expected to have some collisions.
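As a rough sketch of that splitting idea (split-hash is a hypothetical name, not a library function), the high and low 16 bits of the hash code can be extracted with a shift and a mask:

```clojure
(defn split-hash
  "Splits the 32-bit String.hashCode() of s into two 16-bit values,
  returned as [high low], each in the range 0..65535."
  [^String s]
  (let [h (.hashCode s)]
    [(bit-and (bit-shift-right h 16) 0xFFFF)   ; upper 16 bits
     (bit-and h 0xFFFF)]))                     ; lower 16 bits
```

Each half can then be reduced to a bit position with mod against the filter size. Note that two 16-bit values can only address up to 65536 distinct bits, so this approach assumes a fairly small filter.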
If that isn't good enough, you will probably want to roll your own. In that case, I recommend using String.getChars() to access the raw char data and using it to calculate multiple hash codes.
Some Clojure code to get you started (it just sums the character values):
(let [s "Hello"
      n (count s)
      cs (char-array n)]
  (.getChars s 0 n cs 0)
  (areduce cs i v 0 (+ v (int (aget cs i)))))
=> 500
Note the use of Clojure's Java interop to call getChars, and the use of areduce to give you a very fast iteration over the character array.
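Building on the same getChars/areduce pattern, one possible way to get several hash codes out of a single string is a polynomial hash parameterised by a multiplier. This is only a sketch under that assumption: char-hash and bit-indexes are hypothetical names, and the multipliers are arbitrary illustrative choices, not tuned values.

```clojure
(defn char-hash
  "Polynomial hash over the chars of s using the given multiplier.
  The unchecked-* operations wrap around on overflow instead of throwing."
  [^String s ^long multiplier]
  (let [n  (.length s)
        cs (char-array n)]
    (.getChars s 0 n cs 0)
    (areduce cs i h 0
             (unchecked-add (unchecked-multiply h multiplier)
                            (int (aget cs i))))))

(defn bit-indexes
  "Returns one bit position in [0, m) per multiplier."
  [s m multipliers]
  (mapv #(mod (char-hash s %) m) multipliers))
```

For example, (bit-indexes "Hello" 1024 [31 37 41]) yields three positions for a 1024-bit filter. With a multiplier of 1 the function reduces to the plain character sum shown above.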
You may also be interested in a Java Bloom filter implementation I found on GitHub: https://github.com/MagnusS/Java-BloomFilter. The hash code implementation looks fine at first glance, but it uses a byte array. I think that is slightly less efficient than using chars, because it has to deal with the overhead of character encoding.