Locality-sensitive hashing (LSH) is a method for probabilistic dimensionality reduction of high-dimensional data. The basic idea is to hash the input items so that similar items are mapped to the same bucket with high probability, where the number of buckets is much smaller than the universe of possible input items. Unlike a conventional hash function, which tries to avoid collisions, an LSH function is chosen so that collisions between similar items are likely. In this respect LSH resembles data clustering: both group similar items together.
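As a rough illustration of the bucketing idea, here is a minimal sketch using random hyperplane hashing, one common LSH family for cosine similarity. This family is an assumption chosen for brevity, not something the text above prescribes, and the names `make_hyperplanes`, `lsh_key`, and `build_index` are hypothetical helpers invented for this example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a vector to a bit string: one bit per hyperplane, set by the
    side of the hyperplane on which the vector falls."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def build_index(items, hyperplanes):
    """Group item names into buckets keyed by their LSH bit string."""
    buckets = defaultdict(list)
    for name, vector in items.items():
        buckets[lsh_key(vector, hyperplanes)].append(name)
    return buckets


# Illustrative data (assumed, not from the source): two vectors pointing in
# the same direction and one pointing the opposite way.
planes = make_hyperplanes(num_bits=8, dim=3)
items = {
    "a": [1.0, 2.0, 3.0],
    "a_scaled": [2.0, 4.0, 6.0],
    "a_flipped": [-1.0, -2.0, -3.0],
}
index = build_index(items, planes)
for key, names in index.items():
    print(key, names)
```

Vectors with a small angle between them agree on most bits, so they tend to share a bucket. Here `a` and `a_scaled` point in the same direction and always collide, while `a_flipped` lands in a different bucket. Using 8 bits gives at most 256 buckets regardless of how many distinct input vectors exist.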
References
- http://en.wikipedia.org/wiki/Locality_Sensitive_Hashing
- http://nandaro.tistory.com/entry/min-hash-locality-sensitive-hashing
- http://people.csail.mit.edu/indyk/mmds.pdf
- Min Hash
- http://blog.daum.net/jchern/13756922
- Hamming Metric Space & Locality Sensitive Hashing
- http://blog.naver.com/rupy400/130149421185
- https://github.com/andrewclegg/sketchy