Hacker News | lqstuart's comments

> What is a compiler?

Might be worth skipping to the interesting parts that aren’t in textbooks


“Principal” at Microsoft (or Oracle) is “Senior” or “Staff” everywhere else, just FYI


So it’s some brittle crap built on verl, which is already pretty much train by config (and makes breaking changes with _every single commit_), with no documentation, no examples, and no clear purpose? Heck yeah Microsoft


Hahaha I would have led with this honestly.

I’m the chief legal officer but at the end of the day I’m just like bruh, chill, who gives a shit


The biggest problem with “agents” as originally described is that they don’t exist, will most likely not exist in our lifetimes if ever, and meanwhile the bullshit peddlers keep quietly changing their separate definition of “agent” to the point where now it literally just means “an LLM.”


Like, for example, NVIDIA investing billions back into OpenAI so that they can buy more of NVIDIA’s hardware


That money will still flow elsewhere.


Your house is in flames? Don't worry, that energy will flow elsewhere.


Money circulates but resources do not. A human hour spent constructing a data center can’t then be used to build an apartment building.


Yes, and??? How is that relevant? We were talking about money, explicitly. Not the resources. When the economy still has room, i.e. people are available to do more, then any missing resources can be obtained sooner or later by employing people.

This is not about the body or the land, but about the blood or the water flowing through them.


> Yes, and??? How is that relevant

Because it describes exactly the point which GGP tried to make (source: that was me). The assumption is that AI growth is great because without it, look at how low the non-AI growth is! But that argument is flawed because the resources (manpower, materials, manufacturing, energy, ...) absent the AI hype would not vanish but be used for something else, so the growth in those other areas would be bigger. Granted, perhaps not as big (marginal gains and all that), but the painted picture is still skewed.


I even quoted it! I responded to what I quoted exactly. Again:

> since investor money is spent exactly once

In addition, I even pointed out that I was not posting about the main argument!

Quoting myself, again:

> So I'm nitpicking here


"AI" is a $100B business, which idiot tech leaders who convinced themselves they were visionaries when interest rates were historically low have convinced themselves will save them from their stagnating growth.

It's really cool. The coding tools are neat, they can somewhat reliably write pain-in-the-ass boilerplate and only slightly fuck it up. I don't think they have a place beyond that in a professional setting (nor do I think junior engineers should be allowed to use them--my productivity has been destroyed by having to review their 2000-line opuses of trash code), but it's so cool to be able to spin up a hobby project in some language I don't know, like Swift or React, and get to a point where I can learn the ins and outs of the ecosystem. ChatGPT can explain stuff to me that I can't find experts to talk to about.

That's the sum total of the product though, it's already complete and it does not need trillions of dollars of datacenter investment. But since NVIDIA is effectively taking all the fake hype money out of one pocket and putting it in another, maybe the whole Ponzi scheme will stay afloat for a while.


> That's the sum total of the product though, it's already complete and it does not need trillions of dollars of datacenter investment

What sucks is that there's probably some innovation left in figuring out how to make these monstrosities more efficient, and how to ship a "good enough" model that can do a few key tasks (jettisoning the fully autonomous coding agent stuff) on some arbitrary laptop without having to jump through a bunch of hoops. The problem is nobody in the industry is incentivized to do this, because the second it happens, all their revenue goes to 0. It's the final boss of the everything-is-a-subscription business model.


I've been saying this since I started using "AI" earlier this year: If you're a programmer, it's a glorified manual, and at that, it's wonderful. But beyond asking for cheat sheets on specific function signatures, it's pretty much useless.


I’d disagree. I think there is still so much value it can offer if you really open your mind. For instance, I’ve met very few “programmers” that I’d consider even moderately competent at front-end, so the ability of a programmer to build and iterate a clean and responsive UI is just one example of a huge win for AI tools.


How do I save comments on HN? This sums up everything I feel. Beautiful.


Click the comment timestamp, then “favorite” it


Man you’re really good at that lol


Wait, this isn’t over yet.


I like Chris Lattner but the ship sailed for a deep learning DSL in like 2012. Mojo is never going to be anything but a vanity project.


Nah. There's huge alpha here, as one might say. I feel like this comment could age even more poorly than the infamous Dropbox comment.

Even with Jax, PyTorch, HF Transformers, whatever you want to throw at it--the DX for cross-platform GPU programming that's compatible with large language models' requirements, specifically, is extremely bad.

I think this may end up being the most important thing that Lattner has worked on in his life (and yes, I am aware of his other projects!).


Comments like this view the ML ecosystem in a vacuum. New ML models are almost never written—all LLMs for example are basically GPT-2 with extremely marginal differences—and the algorithms themselves are the least of the problem in the field. The 30% improvements you get from kernels and compiler tricks are absolutely peanuts compared to the 500%+ improvements you get from upgrading hardware, adding load balancing and routing, KV and prefix caching, optimized collective ops etc. On top of that, the difficulty of even just migrating Torch to the C++11 ABI to access fp8 optimizations is nigh insurmountable in large companies.
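
To make the caching point concrete, here's a toy sketch of why KV caching is such a big lever at decode time (hypothetical names, plain PyTorch, not any particular serving stack's API): the cached keys/values mean each new token pays for one attention row instead of recomputing the whole prefix.

    import torch

    def attend(q, K, V):
        # plain scaled dot-product attention over whatever is cached
        w = torch.softmax(q @ K.T / K.shape[-1] ** 0.5, dim=-1)
        return w @ V

    d = 16
    K = torch.empty(0, d)  # the "KV cache": grows one row per decoded token
    V = torch.empty(0, d)
    for step in range(4):
        q = torch.randn(1, d)  # query for the newest token only
        k, v = torch.randn(1, d), torch.randn(1, d)
        K, V = torch.cat([K, k]), torch.cat([V, v])  # append, never recompute
        out = attend(q, K, V)  # earlier tokens' K/V are reused as-is

Prefix caching is the same trick across requests: if two prompts share a prefix, the cached rows get shared too.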

I say the ship sailed in 2012 because that was around when it was decided to build TensorFlow around legacy data infrastructure at Google rather than developing something new, and the rest of the industry was hamstrung by that decision (along with the baffling declarative syntax of TensorFlow, and the requirement to use Blaze to build it, precluding meaningful development outside of Google).

The industry was so desperate to get away from it that they collectively decided that downloading a single giant library with every model definition under the sun baked into it was the de facto solution to loading Torch models for serving, and today I would bet you that easily 90% of deep learning models in production revolve around either TensorRT, or a model being plucked from Huggingface’s giant library.

The decision to halfass machine learning was made a long time ago. A tool like Mojo might work at a place like Apple, which operates in a vacuum (and is lightyears behind the curve in ML as a result), but it just doesn't work on Earth.

If there’s anyone that can do it, it’s Lattner, but I don’t think it can be done, because there’s no appetite for it nor is the talent out there. It’s enough of a struggle to get big boy ML engineers at Mag 7 companies to even use Python instead of letting Copilot write them a 500 line bash script. The quality of slop in libraries like sglang and verl is a testament to the futility of trying to reintroduce high quality software back into deep learning.


Thank you for the kind words! Are you saying that AI model innovation stopped at GPT-2 and everyone has performance and GPU utilization figured out?

Are you talking about NVIDIA Hopper or any of the rest of the accelerators people care about these days? :). We're talking about a lot more performance and TCO at stake than traditional CPU compilers.


I’m saying actual algorithmic (as in not data) model innovation has never been a significant part of the revenue generation in the field. You get your random forest, or ResNet, or BERT, or MaskRCNN, or GPT-2-with-One-Weird-Trick, and then you spend four hours trying to figure out how to preprocess your data.

On the flipside, far from figuring out GPU efficiency, most people with huge jobs are network bottlenecked. And that’s where the problem arises: solutions for collective comms optimization tend to explode in complexity because, among other reasons, you now have to package entire orchestrators in your library somehow, which may fight with the orchestrators that actually launch the job.
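
To illustrate where the orchestrators sneak in: even the hello-world of collective comms needs every rank to rendezvous before a single all_reduce can run. A minimal sketch with the real torch.distributed API (torchrun, or your cluster's orchestrator, has to set the rendezvous env vars for every rank):

    import torch
    import torch.distributed as dist

    # rendezvous: every rank needs matching MASTER_ADDR/MASTER_PORT/RANK/
    # WORLD_SIZE env vars; torchrun sets these, which is exactly where the
    # "who launches the launcher" fights begin
    dist.init_process_group(backend="gloo")  # "nccl" on GPU clusters
    t = torch.ones(1) * dist.get_rank()
    dist.all_reduce(t, op=dist.ReduceOp.SUM)  # sum the tensor across ranks
    print(f"rank {dist.get_rank()} sees {t.item()}")
    dist.destroy_process_group()

Run it with, e.g., torchrun --nproc_per_node=4 demo.py (demo.py being whatever you saved it as) and every rank prints the same sum.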

Doing my best to keep it concise, but Hopper is like a good case study. I want to use Megatron! Suddenly you need FP8, which means the CXX11 ABI, which means recompiling Torch along with all those nifty toys like flash attention, flashinfer, vllm, whatever. Ray, jsonschema, Kafka and a dozen other things also need to match the same glibc and libstdc++ versions. So using that as an example, suddenly my company needs C++ CICD pipelines, dependency management, etc. when we didn't before. And I just spent three commas on these GPUs. And most likely, I haven't made a dime on my LLMs, or autonomous vehicles, or weird cyborg slavebots.
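
For what it's worth, the ABI half of that is at least cheap to detect before the link errors start: Torch exposes a real flag for it (torch.compiled_with_cxx11_abi), so a minimal sanity check in CI looks something like:

    import torch

    # True if this Torch build uses the C++11 ABI (_GLIBCXX_USE_CXX11_ABI=1);
    # every C++ extension compiled against it (flash attention, vllm, ...)
    # has to match, or you get unresolved-symbol errors at import time
    print(torch.__version__, torch.compiled_with_cxx11_abi())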

So what all that boils down to is just that there’s a ton of inertia against moving to something new and better. And in this field in particular, it’s a very ugly, half-assed, messy inertia. It’s one thing to replace well-designed, well-maintained Java infra with Golang or something, but it’s quite another to try to replace some pile of shit deep learning library that your customers had to build a pile of shit on top of just to make it work, and all the while fifty college kids are working 16 hours a day to add even more in the next dev release, which will of course be wholly backwards and forwards incompatible.

But I really hope I’m wrong :)


Lattner's comment aside (which I'm fanboying a little bit at), I do tend to agree with your pessimism/realism for what it's worth. It's gonna be a long long time before that whole mess you're describing is sorted out, but I'm confident that over the next decade we will do it. There's just too much money to be made by fixing it at this point.

I don't think it's gonna happen instantly, but it will happen, and Mojo/Modular are really the only language platform I see taking a coherent approach to it right now.


I tend to agree with you, but I hoped the field would start collectively figuring out how to be big boys with CICD and dependency management back in 2017--I thought Google's awkward source release of BERT was going to be the low point, and we'd switch to Torch and be saved. Instead, it's gotten so much worse. And the kind of work that the Python core team has been putting into package and dependency management is nothing short of heroic, and it still falls short, because PyTorch extends the Python runtime itself--torch.compile now intercepts Py_FrameEval, and NVIDIA is releasing its own Python CUDA bindings.
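
For anyone who hasn't watched this happen: from user land the interception is invisible, one call, but underneath torch.compile swaps in a custom frame evaluator through the PEP 523 hook that Py_FrameEval exposes. A minimal sketch with the real API:

    import torch

    def f(x):
        # plain Python; Dynamo traces this bytecode via the frame-eval hook
        return torch.sin(x) ** 2 + torch.cos(x) ** 2

    compiled = torch.compile(f)      # installs the custom frame evaluator
    print(compiled(torch.randn(8)))  # first call triggers trace + compile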

It's just such a massive, uphill, ugly moving target to try to run down. And I sit here thinking the same as many of these comments--on the one hand, I can't imagine we're still using Python 3 in 2035? 2050?? But on the other hand, I can't envision a path leading out of the mess while these companies are making money, or at least continuing to pretend they'll start to soon.


And comments like this forget that there is more to AI and ML than just LLMs or even NNs.


PyTorch didn't even start until 2016, taking a lot of market share from TensorFlow.

I don't know if this is a language that will catch on, but I guarantee there will be another deep learning focused language that catches on in the future.


Now that NVIDIA has finally gotten serious about Python tooling and JIT compilers for CUDA, I also see it becoming harder--and those tools I can use natively on Windows, instead of having to live in WSL land.


To be fair, Triton is in active use, and this should be even more ergonomic for Python users than Triton. I don't think it's a sure thing, but I wouldn't say it has zero chance either.


You could have said the same about MLX on Apple Silicon, yet here we are.


Tritonlang itself is a deep learning DSL.
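
For concreteness, this is what that DSL looks like: roughly the canonical vector-add kernel from the Triton tutorials (assumes a CUDA device; the names are mine):

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
        pid = tl.program_id(axis=0)              # one program per block
        offs = pid * BLOCK + tl.arange(0, BLOCK)
        mask = offs < n                          # guard the ragged tail
        x = tl.load(x_ptr + offs, mask=mask)
        y = tl.load(y_ptr + offs, mask=mask)
        tl.store(out_ptr + offs, x + y, mask=mask)

    x = torch.randn(4096, device="cuda")
    y = torch.randn(4096, device="cuda")
    out = torch.empty_like(x)
    add_kernel[(triton.cdiv(4096, 1024),)](x, y, out, 4096, BLOCK=1024)

You write Python, Triton JITs it to PTX, and you never touch CUDA C++--which is exactly the ergonomic bar Mojo has to clear.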


> I like Chris Lattner but the ship sailed for a deep learning DSL in like 2012.

Nope. There's certainly room for another alternative that's more performant and portable than the rest, without the hacks needed to get there.

Maybe you caught the wrong ship, but Mojo is a speedboat.

> Mojo is never going to be anything but a vanity project.

I'll come back in 10 years and we'll see if your comment needs to be studied like the one about Dropbox.


Any actual reasoning for that claim?


The fact that there's even a debate about banning smartphones in classrooms tells you all you need to know. Cell phones were de facto banned in school in like 2002--not sure when it became the norm, but it seems like a no-brainer.


This is what I thought of immediately as well. I remember being shocked to learn that phones were allowed. Of course that's not going to work out well.

There are so many factors behind the negative education outcomes, but this policy is just obvious. I guess it's actually the parents who insist on being able to reach their kid at any moment?


To some extent this is one of the recommendations of the PISA 2022 report, but it comes with a big caveat:

> 4. Limit the distractions caused by using digital devices in class

> Students who spent up to one hour per day on digital devices for learning activities in school scored 14 points higher on average in mathematics than students who spent no time. Enforced cell phone bans in class may help reduce distractions but can also hinder the ability of students to self-regulate their use of the devices.

I don't think a simple blanket ban on smartphones in schools is likely to solve much.


Agreed. The argument "I want to know if my child is safe in an emergency" is incredibly flawed. God forbid there is a shooting: kids should be listening to their teachers to be guided to safety, not distracted. And their phones should not make noise when they are hiding.

If the emergency is outside of school, parents still need to go through the main office to pull kids out of school, so contacting them directly is also unnecessary. These helicopter parents, smh.

