r/MachineLearning - [Project] pgANN Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database.


Hi, we did experiment with ES, using range queries on the vectors combined with boolean queries, and we also tried LSH/MinHash to store a signature for each vector. Did you have a different approach in mind? You're also correct that L1 and L2 distances are poor metrics at this dimensionality, but our goal was to fetch a subset of (say) a few thousand "good enough" results from a pool of tens of millions, which can then be re-ranked with cosine or a similar metric. Unfortunately, there are no easy wins in ANN, and this works well enough for us. We hope others can benefit as well.
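To make the two-stage idea concrete, here is a minimal sketch in plain Python. It is not the pgANN code: the function names (`coarse_candidates`, `rerank_by_cosine`, `two_stage_search`) are made up for illustration, and the brute-force L2 sort is only a stand-in for whatever index-backed query the database would actually run in the first stage.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coarse_candidates(query, pool, k):
    """Stage 1 (stand-in): keep the k pool items closest to the query by L2.

    Assumption: in a real setup this would be an indexed database query,
    not a full sort over the pool.
    """
    ranked = sorted(pool.items(), key=lambda item: l2_distance(query, item[1]))
    return ranked[:k]


def rerank_by_cosine(query, candidates, top_n):
    """Stage 2: re-rank the small candidate set by cosine similarity."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def two_stage_search(query, pool, k, top_n):
    """Fetch k 'good enough' candidates, then return the top_n after re-ranking."""
    return rerank_by_cosine(query, coarse_candidates(query, pool, k), top_n)


if __name__ == "__main__":
    # Hypothetical toy data: ids mapped to 2-d vectors.
    pool = {"a": [1.0, 0.0], "b": [10.0, 0.0], "c": [0.0, 1.0], "d": [0.9, 0.1]}
    print(two_stage_search([1.0, 0.0], pool, k=3, top_n=2))
```

Note the trade-off this makes visible: a vector that points in the same direction as the query but is far away in L2 terms (like `b` in the toy data) never reaches the re-ranking stage. That is the "good enough" compromise: the first stage only has to be cheap and roughly right, and the second stage fixes the ordering within what it was given.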
