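minH and LSH are both used in similarity search, in Python as elsewhere, and they are closely related: minH is one specific locality-sensitive hash family, while LSH is the general framework it belongs to.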

minH (commonly written MinHash, short for minwise hashing) is a locality-sensitive hashing technique used to quickly estimate the Jaccard similarity between two sets. A hash function maps each element of a set to an integer, and the minimum of those integers is kept as one entry of the set's signature. Repeating this with k independent hash functions gives a signature of k minima. For any single hash function, the probability that two sets share the same minimum equals their Jaccard similarity, so the fraction of matching signature entries estimates that similarity without comparing the full sets.
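To make the signature idea concrete, here is a minimal sketch in plain Python. The function names (`make_hash_params`, `minhash_signature`, `estimate_jaccard`, `exact_jaccard`) are made up for this illustration, and the hash family used, `(a * x + b) % PRIME` applied to a CRC32 of each element, is just one common simple choice and only approximately min-wise independent. It also assumes the sets are non-empty.

```python
import random
import zlib

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1, used as the modulus


def make_hash_params(k, seed=0):
    """Draw k random (a, b) pairs, one per hash function (illustrative helper)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minhash_signature(items, params):
    """Return one minimum per hash function; assumes `items` is non-empty."""
    # zlib.crc32 gives a stable integer per element (built-in hash() on str
    # changes between Python processes).
    base = [zlib.crc32(str(x).encode("utf-8")) for x in items]
    return [min((a * v + b) % PRIME for v in base) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions where the two minima agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity |A & B| / |A | B| for comparison."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    params = make_hash_params(k=128)
    a = {"the", "quick", "brown", "fox"}
    b = {"the", "quick", "red", "fox"}
    est = estimate_jaccard(minhash_signature(a, params), minhash_signature(b, params))
    print("estimated:", est, "exact:", exact_jaccard(a, b))
```

A larger k makes the estimate less noisy at the cost of a longer signature.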

LSH, or locality-sensitive hashing, is a general technique for approximate nearest neighbor search in high-dimensional spaces. It maps data points to hash signatures chosen so that similar points are more likely to receive the same signature, and therefore land in the same bucket, than dissimilar points. A query then only needs to be compared against the points in its own buckets, which avoids computing the exact distance to every point in the data set.
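As an illustration of the general idea, the sketch below uses random hyperplanes, one well-known LSH family for cosine similarity. It is not mentioned in the answer above and is chosen here only as an example; the names `make_hyperplanes`, `lsh_signature` and `hamming` are hypothetical. Each bit of the signature records which side of a random hyperplane the vector falls on, so vectors pointing in similar directions tend to agree on most bits.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random normal vectors of length dim (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vec, planes):
    """One bit per hyperplane: 1 if the dot product is non-negative, else 0."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(sig_a, sig_b):
    """Number of bit positions where two signatures differ."""
    return sum(1 for x, y in zip(sig_a, sig_b) if x != y)


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=16, dim=3)
    u = [1.0, 2.0, 3.0]
    v = [1.1, 1.9, 3.2]    # points in nearly the same direction as u
    w = [-3.0, 0.5, -1.0]  # points in a very different direction
    print("u vs v:", hamming(lsh_signature(u, planes), lsh_signature(v, planes)))
    print("u vs w:", hamming(lsh_signature(u, planes), lsh_signature(w, planes)))
```

Using the signature, or pieces of it, as a dictionary key turns this into a bucket lookup.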

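Though both techniques involve hashing, they answer different questions: minH produces compact signatures for estimating the Jaccard similarity between sets, while LSH is the general strategy of hashing items so that similar ones collide. The two are often combined. A common approach splits each minH signature into bands and hashes every band into a bucket, so that sets agreeing on at least one band become candidate pairs. This works well when the data are naturally sets, such as the tokens or shingles of a document, and Jaccard similarity is the measure of interest.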
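One way to use minH inside an LSH scheme is the banding approach: cut each signature into bands of a few rows and treat two sets as candidates if any band matches exactly. The sketch below is a minimal version under stated assumptions. The helper names, the toy `docs` data, and the choice of 8 bands of 4 rows are all made up for the example, and the hash family is the same simple `(a * x + b) % PRIME` construction, not a tuned implementation.

```python
import random
import zlib
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1


def minhash_signature(items, params):
    """One minimum per (a, b) hash function; assumes `items` is non-empty."""
    base = [zlib.crc32(str(x).encode("utf-8")) for x in items]
    return [min((a * v + b) % PRIME for v in base) for a, b in params]


def candidate_pairs(signatures, bands, rows):
    """Return pairs of keys whose signatures agree on at least one band."""
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        for band in range(bands):
            chunk = tuple(sig[band * rows:(band + 1) * rows])
            # The band index is part of the bucket key, so bands never mix.
            buckets[(band, chunk)].add(key)
    pairs = set()
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


rng = random.Random(0)
params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(32)]

# Toy data, invented for this example.
docs = {
    "a": {"the", "quick", "brown", "fox"},
    "b": {"the", "quick", "brown", "fox"},  # identical to "a"
    "c": {"lorem", "ipsum", "dolor", "sit"},
}
signatures = {key: minhash_signature(words, params) for key, words in docs.items()}
pairs = candidate_pairs(signatures, bands=8, rows=4)
print(pairs)
```

More bands with fewer rows each make the scheme more permissive (more candidates, fewer misses), while fewer bands with more rows make it stricter. Candidate pairs are usually verified afterwards with the signature estimate or the exact Jaccard similarity.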