Try Hey Duggee - it's not as explicitly British-coded, but there's a ton of stuff in there if you were watching Spaced in your late teens and now find yourself a parent…
Seconded, Hey Duggee is a fantastic show. In a way it's the anti-Bluey - same delightful vibes, just as playfully animated, but intentionally ridiculous (and, to me, hilarious) stories.
Ha, same here! It really helped my imposter syndrome, as I overheard a couple of guys talking about the ARM assembly they were doing on their Archimedes on the first day…and I hadn't written anything fancier than QuickBASIC at the time…
I no longer work there, but Lucidworks has had embedding training as a first-class feature in Fusion since January 2020 (I know because I wrapped up adding it just as COVID became a thing). We definitely saw that with even slightly out-of-band use of language - e.g. e-commerce queries like "RD TSHRT XS" - embedding search with open (and closed) models would fall below bog-standard* BM25 lexical search. Once you trained a model, performance would kick up above lexical search…and if you combined lexical _and_ vector search, things were great.
Also, a member of our team developed an amazing RNN-based model that still beats the pants off most embedding models when it comes to speed, and is no slouch on CPU either…
(* I'm being harsh on BM25 - it is a baseline that people often forget in vector search, but it can be a tough one to beat at times)
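To make the "lexical _and_ vector" combination concrete, here's a toy sketch of hybrid scoring: compute a BM25 score and a cosine similarity per document, min-max-normalize each, and blend with a weight. The documents, the fake embeddings, and the 0.5 blend weight are all made up for illustration - this isn't Fusion's actual implementation, just the general idea.

```python
import math

# Toy corpus; in practice doc_vecs come from a (trained) embedding model.
docs = ["red tshirt xs", "blue jeans", "red dress small"]
doc_vecs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.4]]

def bm25(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Standard Okapi BM25 score of one document against a query."""
    avgdl = sum(len(d.split()) for d in corpus) / len(corpus)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d.split())
        if df == 0:
            continue
        idf = math.log((len(corpus) - df + 0.5) / (df + 0.5) + 1)
        tf = doc_tokens.count(t)
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc_tokens) / avgdl))
    return score

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def hybrid(query, qvec, alpha=0.5):
    """Rank docs by a weighted blend of normalized lexical + vector scores."""
    lex = [bm25(query.split(), d.split(), docs) for d in docs]
    vec = [cosine(qvec, v) for v in doc_vecs]
    def norm(xs):  # min-max normalize so the two score scales are comparable
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]
    lex, vec = norm(lex), norm(vec)
    return sorted(range(len(docs)),
                  key=lambda i: -(alpha * lex[i] + (1 - alpha) * vec[i]))

ranking = hybrid("red tshirt", [0.85, 0.15])  # doc 0 ranks first
```

Min-max normalization is the crude-but-simple choice here; real systems often use rank-based fusion (e.g. RRF) instead, since raw BM25 and cosine scores live on very different scales.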
Totally. And this has even happened in search: open-source search engines like Elasticsearch did this, Google did this in the early Web days, and so on :)
Well, why wouldn't they sell (license) the rights to make Transformers films (which as far as I know is just extending their existing contract with Paramount)?
They still own the underlying IP[1], so as long as the contract is a decent one, Paramount has to deal with actually making and distributing the film, and Hasbro just gets the money, plus a toy line off the back of the film. Feels like an easier setup than taking the risk of movie-making yourself (which they did attempt with eOne for other properties, but they seem to have decided it wasn't a good deal for them)
[1] yes, yes, it's a bit more complicated with Takara in the mix too, but you can essentially view it as a Hasbro-owned property
That paper does a terrible job of making Lucene look useful, though. 10 QPS from a server with 1 TB of RAM is not great (and I know Lucene HNSW can perform better than that in the real world, so I am somewhat mystified that this paper is being pushed by the community).
It definitely depends on your use case. If you are just brute-force searching through the entire set of vectors every time, then this is certainly an acceptable option (you could even push it all onto a GPU).
But when you start to require filtering, or combining the vector search with a lexical search, then something like Pinecone, Vespa, Qdrant, or the Lucene-based options (e.g. Solr and ES) becomes a lot more practical than building all that functionality yourself.
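For reference, the brute-force option mentioned above really is just one matrix multiply plus a top-k selection. A minimal sketch with NumPy (corpus size, dimensionality, and random data are arbitrary illustrations):

```python
import numpy as np

# Brute-force (exact) nearest-neighbor search: score every stored vector
# against the query and keep the top-k. No index structure needed.
rng = np.random.default_rng(0)
index = rng.normal(size=(10_000, 64)).astype(np.float32)  # corpus embeddings
index /= np.linalg.norm(index, axis=1, keepdims=True)     # unit-normalize once

def search(query, k=5):
    q = query / np.linalg.norm(query)
    scores = index @ q                       # cosine similarity via one matmul
    top = np.argpartition(scores, -k)[-k:]   # O(n) partial selection of top-k
    return top[np.argsort(scores[top])[::-1]]  # then sort just those k

hits = search(rng.normal(size=64).astype(np.float32))
```

The same pattern runs well on a GPU (swap NumPy for CuPy or Torch), but notice there's no hook here for filtering or lexical scoring - that's exactly the functionality the dedicated engines provide on top.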