I'm facing the problem you describe daily. It's especially bad because it's very difficult for me to predict if the set of filters will reduce the dataset by ~1% (in which case following the original vector index is fine) or by 99.99% (in which case you just want to brute force the remaining vectors).
Tried a million different things, but haven't heard of Turbopuffer yet. Any references on how they perform with such additional filters?
Lucene and ES implement a shortcut for filters that are restrictive enough. Since it's already optimized for figuring out if something falls into your filter set, you first determine the size of that. You traverse the HNSW normally, then if you have traversed more nodes than your filter set's cardinality, you just switch to brute forcing your filter set distance comparisons. So worst case scenario is you do 2x your filter set size vector distance operations. Quite neat.
Oh that's nice! Any references on this shortcut? How do you activate that behavior? I was playing around with ES, but the only suggestion I found was to use `count` on filters before deciding (manually) which path to take.
Tried a million different things, but haven't heard of Turbopuffer yet. Any references on how they perform with such additional filters?