I also thought about this video while reading the HN post. The Computerphile video completely omits the latent space, right? Instead, it spends a lot of time on the iterative denoising. Even though I like Computerphile a lot, I don't think this was the best tradeoff.