
Probably what you want is not an LLM but just the embeddings for clustering. It's much lighter and would work well with new material as well.

I've tested it out for filtering RSS feeds and it has worked pretty well [1].

[1] https://github.com/m0wer/rssfilter



I toyed with embeddings, and still have an implementation lying around in a local branch, but that would require a readily available dataset of embedded movies and shows to compare against, right?

I'm pretty new to embeddings so my understanding may be a bit off.


The “cool” thing about embeddings is that you can use a generic model, and some even support multiple languages. With just the plot/description of each movie you can find “similar” ones along different aspects, such as movies that also talk about dogs, or more complex relations. Here is a short post about how it works: https://blog.sgn.space/posts/embeddings_based_recommendation...
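
To make the idea concrete, here is a minimal sketch of the comparison step. It assumes each plot/description has already been turned into a vector by some generic embedding model, so no pre-embedded dataset is needed beyond the descriptions themselves. The titles, the 3-dimensional vectors, and the `most_similar` helper are all made up for illustration; real embeddings have hundreds of dimensions and this is not the code from the linked post or repo.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors standing in for real embeddings of plot descriptions.
catalog = {
    "Dog road trip": [0.9, 0.1, 0.0],
    "Puppy adoption drama": [0.8, 0.2, 0.1],
    "Space heist": [0.0, 0.1, 0.9],
}

def most_similar(query_vec, catalog, top_k=2):
    """Rank catalog titles by cosine similarity to the query vector."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical embedding of something the user liked (a dog-themed plot).
liked = [0.85, 0.15, 0.05]
recommendations = most_similar(liked, catalog)
for title, score in recommendations:
    print(f"{title}: {score:.3f}")
```

New material only needs its own description embedded once and added to the catalog; nothing has to be retrained.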



