
Yeah, it's about 5x slower than realtime with the current configuration. The good news is that diffusion models and transformers are constantly benefitting from new acceleration techniques. This was a big reason we wanted to take a bet on those architectures.

Edit: If we generate videos at a lower resolution and with fewer diffusion steps than the public configuration uses, we can reach 20-23 fps, which is just about real-time. Here is an example: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/fast...
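For anyone wanting to sanity-check the "5x slower" and "just about real-time" numbers, here is a rough sketch of the arithmetic. The 25 fps playback rate is my assumption (the thread doesn't state the output frame rate), and `realtime_factor` is just a made-up helper name, not anything from the model's code.

```python
def realtime_factor(generated_fps: float, playback_fps: float) -> float:
    """Seconds of compute needed per second of output video.

    1.0 means exactly real-time; values above 1.0 are slower than real-time.
    """
    if generated_fps <= 0:
        raise ValueError("generated_fps must be positive")
    return playback_fps / generated_fps


# Assumption: the video plays back at 25 fps (not stated in the thread).
PLAYBACK_FPS = 25.0

# 5 fps would match "about 5x slower than realtime" under that assumption;
# 20 and 23 fps are the figures quoted for the low-res, fewer-steps setup.
for fps in (5.0, 20.0, 23.0):
    factor = realtime_factor(fps, PLAYBACK_FPS)
    print(f"{fps:4.0f} fps generated -> {factor:.2f}x the clip duration")
```

Under that assumed playback rate, 20-23 fps lands within roughly 10-25% of real-time, which fits the "just about" wording.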



Woah that's a good find Andrew! That low-res video looks pretty good


Wowww.. can you buy more hardware and make a realtime websocket API?


It's something we're thinking about. Our main focus right now is to make the model as good as it can be. There are still many edge cases and failure modes.



