Hacker News

I agree with the author - introducing a vector database often isn't worth the extra complexity.

Personally, I can vouch for ParadeDB: https://www.paradedb.com/

It adds extensions to PostgreSQL that enable vector indexing, full-text search, and BM25. It works great, and the developers are helpful!

The major difference is that you must generate the embeddings yourself, but I consider that an upside - to each their own :)



> I consider it an upside

I'm curious why you consider it an upside. Hypothetically speaking, wouldn't it be better if the embeddings could be updated automatically whenever you want them to be? Is the problem that it's not easy to automate this based on the specific rules for when you want updates to happen?


It's easier to handle edge cases. Some real examples:

- What if certain rows in a table don't need to be embedded?

- What if we use a single API key for embedding database rows and user queries and it hits a rate limit - how to prioritize user queries?

- What if some rows should be vectorized using a different model, depending on an external configuration?
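
To make those three cases concrete, here is a rough sketch of what handling them in application code could look like. Everything in it is hypothetical: the `skip_embedding` and `kind` row fields, the model names, and the `EmbeddingQueue` class are made up for illustration, and no real embedding API is called. It only shows the decision logic (skip some rows, pick a model per row from config, serve user queries before backfill work when the rate-limit budget is tight).

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Lower number = served first when the shared API key is rate limited.
USER_QUERY = 0
BACKFILL = 1


@dataclass(order=True)
class EmbeddingJob:
    priority: int
    seq: int  # tie-breaker so jobs of equal priority stay in FIFO order
    text: str = field(compare=False)
    model: str = field(compare=False)


class EmbeddingQueue:
    """Hypothetical in-process queue; a real system might use a job table."""

    def __init__(self) -> None:
        self._heap: list[EmbeddingJob] = []
        self._counter = itertools.count()

    def submit(self, text: str, model: str, priority: int) -> None:
        job = EmbeddingJob(priority, next(self._counter), text, model)
        heapq.heappush(self._heap, job)

    def drain(self, budget: int) -> list[EmbeddingJob]:
        """Pop at most `budget` jobs (e.g. the requests left in this
        rate-limit window), user queries before backfill rows."""
        jobs = []
        while self._heap and len(jobs) < budget:
            jobs.append(heapq.heappop(self._heap))
        return jobs


def pick_model(row: dict, config: dict) -> Optional[str]:
    """Return the model name for a row, or None if it should not be embedded.

    `skip_embedding` and `kind` are assumed column names, and `config`
    stands in for whatever external configuration decides the model.
    """
    if row.get("skip_embedding"):
        return None
    return config.get(row.get("kind"), config["default"])


def enqueue_rows(queue: EmbeddingQueue, rows: list[dict], config: dict) -> int:
    """Queue every embeddable row as backfill work; return how many were queued."""
    queued = 0
    for row in rows:
        model = pick_model(row, config)
        if model is not None:
            queue.submit(row["body"], model, BACKFILL)
            queued += 1
    return queued
```

With this shape, a user query submitted with `USER_QUERY` priority jumps ahead of any pending row backfill, and the caller decides how many jobs to `drain` per rate-limit window. None of this is hard, but it is the kind of policy that is awkward to express when embedding happens inside the database.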


We could add support for something like `pg_vectorize` in order to generate embeddings directly from the database. We simply haven't seen enough demand yet. Perhaps we haven't listened hard enough :')



