Wow, actually a good point I haven't seen anyone make.
Taking raw embeddings and storing them in a vector database on their own would be like taking raw n-grams of your text and putting them into a database for search.
Been using pgvector for a while, and to me it was kind of obvious that the source document and the embeddings are fundamentally linked, so we always stored them "together". Basically anyone doing embeddings at scale is doing something similar to what Pgai Vectorizer does, and it's certainly a nice abstraction.
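In case it helps anyone, here's roughly what that "together" pattern looks like with pgvector. A minimal sketch, assuming a local Postgres where you can create the extension; embed() is just a stand-in for whatever embedding model you actually call, and the table/column names are made up:

```python
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model call; returns a toy 1536-dim vector.
    return np.random.rand(1536).astype(np.float32)

conn = psycopg2.connect("dbname=app")  # assumed local database
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.commit()
register_vector(conn)  # teach psycopg2 about the vector type

with conn.cursor() as cur:
    # The source text and its embedding live in the same row, so they can't drift apart.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id        bigserial PRIMARY KEY,
            content   text NOT NULL,
            embedding vector(1536)
        )
    """)
    doc = "PostgreSQL is a relational database."
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        (doc, embed(doc)),
    )
    # Nearest-neighbour search returns the documents themselves, not just vector ids.
    cur.execute(
        "SELECT content FROM documents ORDER BY embedding <=> %s LIMIT 5",
        (embed("what is postgres?"),),
    )
    print(cur.fetchall())
conn.commit()
```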
This is how most modern vector dbs work: you can usually store much more than just the raw embeddings (full text, metadata fields, secondary/named vectors, geospatial data, relational fields, etc.).
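To make that concrete, here's a tiny sketch with Qdrant's Python client, picked only as one example of this pattern; most other vector DBs have an equivalent. The full text and metadata ride along as payload next to the vector:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# In-memory instance for illustration; point this at a real server in practice.
client = QdrantClient(":memory:")

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=4, distance=Distance.COSINE),
)

# The payload carries the full text and arbitrary metadata alongside the vector.
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3, 0.4],  # toy embedding; use a real model's output
            payload={"text": "full document text here", "source": "blog", "lang": "en"},
        )
    ],
)

# Results come back with their payloads, so you get documents back, not just ids.
hits = client.search(collection_name="docs", query_vector=[0.1, 0.2, 0.3, 0.4], limit=3)
for hit in hits:
    print(hit.score, hit.payload["text"])
```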
Storing the documents alongside the vectors makes much more sense than raw embeddings alone.