I'm not so sure that's the reason. I'm in the field, and trust me, I'm VERY frustrated[0]. But isn't the saying to not attribute to malice what can be attributed to stupidity? I think the problem is that they're blinded by the hype but don't use that passion to drive deeper understanding. It's a belief that the black box can't be opened, so why bother?
I think it comes from the ad hoc nature of evaluation in young fields. It's like you need an elephant but obviously you can't afford one, so you put a dog in an elephant costume and call it an elephant, just to get going in the right direction. It takes a long time to get that working, and progress can still be made by upgrading the dog's costume. But at some point people forgot that we need an elephant, so everyone is focused on the intricacies of the costume and some will try dressing up the "elephant" as another animal. Eventually the dog in a costume isn't "good enough" and leads us in the wrong direction. I think that's where we are now.
I mean, do we really think we can measure language with entropy? Fidelity and coherence with FID? We have no mathematical description of language, artistic value, aesthetics, and so on. The biggest improvement has been RLHF, where we just use Justice Potter Stewart's metric: "I know it when I see it."
I don't think it's malice. I think it's just easy to lose sight of the original goal. ML certainly isn't the only field to have done this, but it's also hard to bring rigor in and I think the hype makes it harder. Frankly, I think we still aren't ready for a real elephant yet, but I'd just be happy if we openly acknowledged the difference between a dog in a costume proxying as an elephant and an actual fucking elephant.
[0] seriously, how do we live in a world where I have to explain what covariance means to people publishing works on diffusion models and working for top companies or at top universities‽