I assume it's not possible unless you know the in-memory state of all the other gateway routers on the internet, no? You can know what they advertise, but that's not the same thing as a full description of their internal state and how they will choose to update if a route gets withdrawn.
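To make that concrete, here's a toy model (plain Python, not real BGP; AS names invented): two routers advertise identical best paths, but because their backup paths are internal state, the same withdrawal sends them to different routes.

```python
class Router:
    def __init__(self, name, known_paths):
        self.name = name
        self.known_paths = known_paths

    def best_path(self):
        # Local policy: prefer the shortest AS path (real BGP has many
        # more tie-breakers, all invisible to outside observers).
        return min(self.known_paths, key=len) if self.known_paths else None

    def withdraw(self, first_hop):
        # Drop every path that goes through the withdrawing neighbor.
        self.known_paths = [p for p in self.known_paths if p[0] != first_hop]

r1 = Router("r1", [("AS1",), ("AS2", "AS4")])
r2 = Router("r2", [("AS1",), ("AS3", "AS5", "AS6")])

# Both routers advertise the same best path, so they look identical
# from the outside...
assert r1.best_path() == r2.best_path() == ("AS1",)

# ...but after AS1 withdraws, their hidden backups diverge.
r1.withdraw("AS1")
r2.withdraw("AS1")
print(r1.best_path())  # ('AS2', 'AS4')
print(r2.best_path())  # ('AS3', 'AS5', 'AS6')
```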
If you do a lot of driving, $99/mo. seems like a decent price to have the car drive itself, especially if it got to the Waymo point where absolutely no driver attention was needed and you could watch Netflix the whole time. The issue with FSD isn't the price; it's that no matter what Elon and his fanboys say, it doesn't bloody work, and Waymo is blowing them out of the water in capability.
> What we need is an open and independent way of testing LLMs
I mean, that's part of the problem: as far as I know, no claim of "this model has gotten worse since release!" has ever been validated by benchmarks. Obviously benchmarking models is an extremely hard problem, and you can try and make the case that the regressions aren't being captured by the benchmarks somehow, but until we have a repeatable benchmark which shows the regression, none of these companies are going to give you a refund based on your vibes.
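For illustration, here's a minimal sketch of what such a repeatable benchmark could look like; `query_model` is a hypothetical stand-in for whatever API client you use, and the test cases are invented:

```python
import hashlib
import json

# Pinned test cases: same prompts, same expected substrings, every run.
PROMPTS = [
    {"prompt": "What is 17 * 23?", "expect": "391"},
    {"prompt": "What is the capital of Australia?", "expect": "Canberra"},
]

def query_model(prompt: str) -> str:
    # Placeholder: swap in a real call to the model under test.
    return "stub answer"

def run_suite():
    results = []
    for case in PROMPTS:
        answer = query_model(case["prompt"])
        results.append({
            "prompt_sha": hashlib.sha256(case["prompt"].encode()).hexdigest()[:12],
            "pass": case["expect"].lower() in answer.lower(),
        })
    return results

if __name__ == "__main__":
    results = run_suite()
    pass_rate = sum(r["pass"] for r in results) / len(results)
    # Store this JSON per run; a regression claim becomes a diff of
    # pass rates over time instead of an argument about vibes.
    print(json.dumps({"pass_rate": pass_rate, "cases": results}, indent=2))
```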
It's like the entire software industry is gambling on "LLMs will get better faster than human skills will decay, so they will be good enough to clean up their own slop before things really fall apart".
I can't even say that's definitely a losing bet-- it could very well happen-- but boy does it seem risky to go all-in on it.
On one hand, it’s extremely tiring having to put up with that section of our industry.
On the other, if a large portion of the industry goes all in, and it _doesn't_ pay off and craters them, maybe the overhyping will move on to something else and we can go back to having an interesting, actually-nice-to-be-in industry!
The first round of people paid way more for their solar panels though, and those higher prices helped bootstrap the industry. Should people who paid much less for panels get the same reward? I'm having trouble getting outraged about this, it seems to be incentives working exactly as they should.
I agree, and maybe my "ladder pull" comment comes off as too negative. Most early solar buyers were either in it for environmental reasons or for a modest return on investment. I don't think many were expecting a windfall.
I mean I don't exactly have great news for you about the human rights situations in major oil-producing countries either. Not to do whataboutism, but if your energy source is going to implicate you in human rights abuses either way, you might as well take the clean renewable one.
Kinda gives the whole game away, doesn’t it? “It doesn’t actually matter if the citations are hallucinated.”
In fairness, NeurIPS is just saying out loud what everyone already knows. Most citations in published science are useless junk: it’s either mutual back-scratching to juice h-index, or it’s the embedded and pointless practice of overcitation, like “Human beings need clean water to survive (Franz, 2002)”.
Really, hallucinated citations are just forcing a reckoning which has been overdue for a while now.
> Most citations in published science are useless junk:
Can't say that matches my experience at all. Once I've found a useful paper on a topic, I primarily navigate the literature from there by traveling up and down the citation graph. It's extremely effective in practice, and it's continued to get easier as the digitization of metadata has improved over the years.
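For what it's worth, this kind of traversal is easy to script these days. A sketch using the public Semantic Scholar Graph API (endpoints per its documentation; the response shapes here are my assumption from those docs, so verify before relying on them):

```python
import requests

BASE = "https://api.semanticscholar.org/graph/v1/paper"

def neighbors(paper_id: str, direction: str, limit: int = 20):
    # direction: "references" (papers it cites, i.e. traveling "down")
    # or "citations" (papers that cite it, i.e. traveling "up").
    resp = requests.get(
        f"{BASE}/{paper_id}/{direction}",
        params={"fields": "title,year", "limit": limit},
    )
    resp.raise_for_status()
    key = "citedPaper" if direction == "references" else "citingPaper"
    return [item[key] for item in resp.json().get("data", [])]

# Example: start from "Attention Is All You Need" via its arXiv id.
seed = "arXiv:1706.03762"
for paper in neighbors(seed, "references")[:5]:
    print("cites:", paper.get("year"), paper.get("title"))
for paper in neighbors(seed, "citations")[:5]:
    print("cited by:", paper.get("year"), paper.get("title"))
```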
It's tough because some great citations are still hard to find or procure. I sometimes refer to papers that aren't on the Internet (e.g., wonderful old books and journals).
But that actually strengthens those citations. The "I scratch your back, you scratch mine" ones are the ones I'm getting at, and that is quite hard to do with old and wonderful stuff; the authors there are probably not in a position to reciprocate, by virtue of observing the grass from the other side.
I think it's a hard problem. The semanticscholar folks are doing the sort of work that would allow them to track this; I wonder if they've thought about it.
A somewhat-related parable: I once worked in a larger lab with several subteams submitting to the same conference. Sometimes the work we did was related, so we each cited the other's paper, which was also under review at the same venue. (These were flavor citations in the "related work" section for completeness, not material to our arguments.) In the review copy, the reference listed the other paper as written by "anonymous (also under review at XXXX2025)," with a footnote to explain the situation to reviewers. When it came time to submit the camera-ready copy, we either removed the anonymization or replaced it with an arXiv link if the other team's paper got rejected. :-) I doubt this practice improved either paper's chances of getting accepted.
Are these the sorts of citation rings you're talking about? If authors misrepresented the work as if it were accepted, or pretended it was published last year or something, I'd agree with you, but it's not too uncommon in my area for well-connected authors to cite manuscripts in process. I don't think it's a problem as long as they don't lean on them.
No, I'm talking about the ones where the citation itself is almost or even completely irrelevant and is used as a way to inflate the citation count of the authors. You could find those by checking whether the value of the citation as a reference (i.e., whether it contributes to the understanding of the paper you are reading) is exceeded by the value of the linkage itself.
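As a rough illustration of that check, here's a toy sketch (all data and the threshold invented) that flags author pairs purely on reciprocal citation volume; scoring actual "value as a reference" would take far more work:

```python
from collections import Counter

# Invented toy data: (citing_author, cited_author) pairs pulled from a
# citation graph. Real detection would also weigh relevance, venue,
# and co-authorship; this only counts reciprocal volume.
citations = [
    ("alice", "bob"), ("bob", "alice"), ("alice", "bob"),
    ("bob", "alice"), ("carol", "dave"), ("alice", "carol"),
]

counts = Counter(citations)
THRESHOLD = 2  # flag pairs with at least this many citations each way

flagged = [
    (a, b)
    for (a, b), n in counts.items()
    if a < b and n >= THRESHOLD and counts[(b, a)] >= THRESHOLD
]
print(flagged)  # [('alice', 'bob')]
```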