Is there any evidence that the reasons Stable Diffusion struggles with these things resemble the reasons humans struggle with them? It seems like another case of recognizing an apparent pattern and anthropomorphizing it. Would it not be just as likely that such technical gaps are simply a reflection of human limitations in building these systems, not some inherent property of the model?
> It's not a technological brain, but IMO it's the closest we've come.
This may be the case, but “the closest we’ve come” seems like a weak argument for anthropomorphizing that progress, especially when you look at how far the gap still is to human-like consciousness.
Faces are probably hard because they're full of detail, and humans are great at telling when there's something 'wrong' with a face.
Hands and feet are probably due to their intricacy, which to me seems more like the same issue humans would have. Again, I'm not arguing that it's intelligent, just that it's...close. I'm no longer so sure that the gap to human-like consciousness is all that wide. If you were to somehow make one of these models more general, rather than limited to text or images, and instead "train" it the same way a human child gets trained, I'm not entirely sure we'd be able to tell the difference.
Obviously we aren't there yet, but the rapid advances of these past few years make it feel like it might at least happen in my lifetime.