Wow, actually a good point I haven't seen anyone make. Taking raw embeddings and...

choilive · on Oct 29, 2024

Been using pgvector for a while, and to me it was kind of obvious that the source document and its embeddings are fundamentally linked, so we always stored them "together". Basically anyone doing embeddings at scale is doing something similar to what Pgai Vectorizer does, and it's certainly a nice abstraction.

jdthedisciple · on Oct 29, 2024

I used FAISS as it also allowed me to trivially store them together.

Idk how well it scales though, it's just doing its job at my hobby-project scale.

For my few hundred thousand embeddings, I must say the performance was satisfactory.

spmurrayzzz · on Oct 30, 2024

This is how most modern vector DBs work: you can usually store much more than just the raw embeddings (full text, metadata fields, secondary/named vectors, geospatial data, relational fields, etc.).
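
To make the "store them together" idea concrete, here is a rough, library-free sketch in plain Python. It is not how pgvector, FAISS, or any particular vector DB implements this; the `Record` class, the `search` helper, and the toy 2-dimensional vectors are all made up for illustration. The only point is that text, metadata, and embedding sit in one record, so a search can filter on metadata and hand back the source text directly.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical record type: source text, metadata, and embedding kept together.
    doc_id: str
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query, k=1, where=None):
    # Apply an optional exact-match metadata filter, then rank by brute force.
    candidates = [
        r for r in records
        if where is None
        or all(r.metadata.get(key) == val for key, val in where.items())
    ]
    ranked = sorted(candidates, key=lambda r: cosine(r.embedding, query), reverse=True)
    return ranked[:k]


# Toy data: the vectors are invented, not produced by a real embedding model.
records = [
    Record("a", "pgvector stores vectors in Postgres", [1.0, 0.0], {"topic": "db"}),
    Record("b", "FAISS is a similarity search library", [0.0, 1.0], {"topic": "lib"}),
    Record("c", "Postgres supports JSON columns", [0.9, 0.1], {"topic": "db"}),
]

best = search(records, [1.0, 0.0], k=1)[0]
print(best.doc_id, best.text)
```

Because the text travels with the vector, there is no second lookup to map a hit back to its document, which is roughly the convenience people are describing upthread.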