
The article is okay, but the writing reads like bad ChatGPT. Em dashes in almost every sentence instead of commas. Out-of-place analogies and metaphors that barely make sense. Randomly sprinkled slang for an audience of teenage redditors.


These are 'en' dashes. Those of us who care about (micro)typography have been using them forever.

Presumably the author is one of them. Or they simply use a text editing or blogging software that takes care of it.

E.g. Markdown with the SmartyPants feature turned on generates them automatically from '--'; 'em' dashes require '---'.

Coincidentally Rust's `cargo doc` does this for you -- just for example.
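For anyone unfamiliar, the substitution is trivial; here's a toy version of what SmartyPants-style processors do (my own sketch, not the real implementation, which also handles quotes and ellipses):

```python
def smarty_dashes(text: str) -> str:
    # Longest token first: '---' becomes an em dash, then '--' an en dash.
    return text.replace("---", "\u2014").replace("--", "\u2013")

smarty_dashes("pages 10--20 --- a range")
# → 'pages 10–20 — a range'
```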

The conclusion that a text containing such micro-typographic niceties must be LLM-generated is thus a fallacy.

Your other 'evidence' sounds like an interpretation to me. Maybe you should quote the sections you mean?

Otherwise your critique seems limited to form, not content -- an ad hominem in disguise, one may be tempted to conclude.


If you care about en dashes then you'll know that they're not supposed to be used to separate ideas in a sentence; that's what em dashes are for. That makes me think it's not LLM-generated, because LLMs know how to use em dashes.


Indeed. In English it is an em dash, not bracketed by regular spaces, although you should bracket it with thin spaces to get some optical separation, and to allow line breaks in apps that don't know the dash is a separator.

In many other languages ideas are separated by en-dashes (surrounded by spaces).

In the case at hand the use of en-dashes in English text instead of the correct em-dashes could also be a sign of a non-native speaker caring about micro-typography and just doing the wrong thing. :)

The LLM conclusion doesn't seem well supported either way.


> But the thing is – nalgebra isn't an isolated example. It’s cultural.

classic llmism


It's just the Pacific Ring of Fire https://en.wikipedia.org/wiki/Ring_of_Fire


Pretty sure this is just vectorization. You can pack some 8-bit ints into a 32-bit machine word and add them together; that is vectorization.


I don't think that's true when the add overflows. You wouldn't want a lane's overflow to carry into an adjacent lane.
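A quick sketch of the point (pure-Python SWAR, with masks I've chosen for four 8-bit lanes in a 32-bit word): the naive add lets a lane's carry spill into its neighbour, while masking the high bit of each lane first suppresses cross-lane carries.

```python
MASK_HI = 0x80808080  # high bit of each 8-bit lane
MASK_LO = 0x7F7F7F7F  # low 7 bits of each lane

def add_naive(a: int, b: int) -> int:
    # Plain 32-bit add: lane 0's overflow carries into lane 1.
    return (a + b) & 0xFFFFFFFF

def add_swar(a: int, b: int) -> int:
    # Add the low 7 bits of each lane, then patch the high bits back in.
    # Each lane wraps modulo 256; carries never cross lane boundaries.
    low = (a & MASK_LO) + (b & MASK_LO)
    return (low ^ ((a ^ b) & MASK_HI)) & 0xFFFFFFFF

# 0xFF + 0x01 in lane 0:
# add_naive → 0x00000100 (carry corrupted lane 1)
# add_swar  → 0x00000000 (lane 0 wrapped, lane 1 untouched)
```

Real SIMD instructions (e.g. packed-byte adds) do this lane isolation in hardware, which is exactly what a plain 32-bit add lacks.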


Can you back that up with anything? I’ve gotten this as a vague sense, but it seems hard to find much actual background about how he manages to continuously fail upward.


I don’t see how the distinction makes any sense when the Verus project you linked requires you to write correctness specs. It sounded like intrinsic techniques were preferred because they would not require you to write and maintain a separate spec, but this is not the case.


I prefer intrinsic techniques because they prevent the model from being out of sync with the implementation.

The thing that's never made any sense to me about using a model checker for anything but concurrency issues (which are hard enough to warrant it) is that once you've validated your model you still have to go implement it, and that's usually the most error-prone part of the process.

If the correctness spec has to be written manually but prevents you from diverging from the spec in your implementation, that's a huge step up from extrinsic model checkers.
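To illustrate the distinction with a toy sketch of my own (not Verus or any real verifier): an intrinsic spec lives in the same artifact as the implementation, here as plain pre/postcondition assertions, so it cannot silently drift from the code the way a separately maintained model can.

```python
def binary_search(xs: list[int], target: int) -> int:
    """Return the leftmost insertion point for target in sorted xs."""
    assert xs == sorted(xs)      # precondition: input must be sorted
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    assert 0 <= lo <= len(xs)    # postcondition: a valid insertion index
    return lo
```

A real tool like Verus checks such contracts statically rather than at runtime, but the "spec next to code" shape is the same.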


Lamport's rationale is that after an architect designs a building, the builders may still put electrical sockets in the wrong place and make other mistakes. But that's not a reason to start construction without a plan at all.


That rationale assumes that writing software has a design stage and a build stage. It doesn't—software is the design, the building is done by the compiler or interpreter at runtime. So what's really being proposed is subdividing the design stage into a pre-design and a design.

Pre-design makes sense to me in certain limited circumstances. A limited amount of architecture planning can be valuable (though in most cases formal methods aren't useful for that), and for certain kinds of concurrent algorithms it could even be worth it to validate the design in a different language. But most of the time it's not worth doing the design twice when you can get pretty good guarantees from static analysis on the design (the code) itself.


Agreed. To stretch the analogy, if I'm just replacing a fence panel or putting up a shelf then I'm not going to get an architect to create a blueprint. I'll know if it's right from the execution.

I sometimes work in areas where the error budget is essentially zero, with an element of concurrency, and for those there is a design stage before the build stage. I could see the value of formal methods there. At least I could execute a model with a model checker, which makes it one step closer to the code than a design doc or RFC.

Full disclosure: I haven't actually used formal methods myself, I've just been interested in the idea for a while and have done some reading on it.


I wonder if there's another reason for Sam brokering deals with lots of major news outlet. Surely OpenAI won't try to exert any type of editorial control over these publications. That would be absurd!


Pretty sure they had something in the contract saying OpenAI can train on their data, so this is after-the-fact legitimation of something already done (allegedly).


I'd be more worried that OpenAI becomes the default method of accessing their stories, and then Sam has complete editorial control.


This is not going to age well. The current round of AI accelerators are going to flop hard because there is a deep hardware software mismatch. All the accelerators target GEMM and CONV, and get bottlenecked when most of the other extremely common tensor operators get mixed in. It turns out that Nvidia GPUs are already pretty close to the ideal type of chip you need to execute models people actually want to use.

Nobody in the AI chip hypespace seems to understand this, it’s just stupid money running around trying to eat Nvidia’s margins. Sam Altman understands this less than plenty of people.

It’s becoming harder for me to see him as anything besides someone who is very talented at growing power, but not much else. Perhaps he will succeed in misallocating a trillion dollars along the way.


Ok I guess the team with the largest LLM workload in the world and billions in funding won't understand how to optimise a chip for the exact workload they have and near future ones.


Exactly. Present success means the ability to forecast what’s needed for future success — see the Pierce-Arrow Motor Car Company and their dominance in the market to this very day


This person is not saying success -> more success. I think they’re just pointing out that Altman is smart and is surrounded by smart people and a company that understands the demand because they make up the majority of the demand (and they have a strong thesis).


Is he raising for OpenAI or for another venture? If he is using deep knowledge from OpenAI to raise money for another venture, this sounds wrong.


He is rich and powerful, of course it isn’t wrong

/s


Or broke and powerful? Because of spending a fortune on WorldCoin, working at a nonprofit and heavily investing into early AI startups?


No way OpenAI makes up even a plurality of chip demand


Not OpenAI itself, but Microsoft is.

For 2022 and 2023, Microsoft bought a significant portion of NVIDIA's available hardware. They spent much of 2023 just figuring out how to power their multiple fleets of GPUs. Only now, with adoption of Azure OpenAI running from mild to the expected wild, are they getting around to servicing all their (potential) customers.


[citation needed]

Seriously, this is an outlandish claim just from looking at Microsoft's and Nvidia's market caps.

I am sure that Microsoft is gonna be one of Nvidia's largest customers, but I sincerely doubt it's even a double-digit percentage of their revenue.


All of this is public information: it's estimated from old reports that Microsoft bought ~150k H100s, and we also know today that Meta actually bought 500k units.

To reach a double-digit percentage of NVIDIA's 2023 revenue of $26.97 billion, you'd only need to hit ~$2.7B in sales.

H100s are priced anywhere between $20k and $35k, so you'd need to purchase ~77k to ~135k units.

That is just H100s; Microsoft also offers lower-end compute, and they have the rest of Azure to service with a variety of solutions.

Being at #1 or #2 market cap worldwide, it is not a farfetched position to be a significant controller of chips, especially since they work directly in the space as a platform.
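The arithmetic above is easy to sanity-check (the revenue and prices are the figures quoted in this comment):

```python
nvidia_2023_revenue = 26.97e9            # NVIDIA's 2023 revenue, as quoted
target = 0.10 * nvidia_2023_revenue      # ~$2.7B for a double-digit share

# H100 street prices quoted above: $20k to $35k per unit
units_at_20k = target / 20_000           # ≈ 135k units at the low price
units_at_35k = target / 35_000           # ≈ 77k units at the high price
```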


This ignores Google's in house chips and their internal usage. They've been at this much longer. I doubt we have the visibility to know how they compare in terms of available flops and the unit costs


> They've been at this much longer.

.. but is that true?

MSR has been putting out research in all derivatives of modern large neural network architectures (NLP, CV, etc.) for the same amount of time that Google has. If there is a gap between their timelines, it's not large, IMO.

What you could argue is that Google historically was more successful in their research outputs.

However, historical consumption of resources may not compare to current resource consumption.

> I doubt we have the visibility to know how they compare in terms of available flops and the unit costs

Completely agreed, unfortunately, this is all guesswork at best


Perhaps. I have no idea and am not purporting to know.


Can you elaborate on this? Per ChatGPT:

> Using Pierce-Arrow Motor Car Company as an example of such success is historically inaccurate. Pierce-Arrow was an American automobile manufacturer based in Buffalo, New York, which was known for producing luxury cars. It was indeed a dominant and prestigious brand in the early 20th century. However, the company did not manage to maintain its success and ultimately failed to adapt to changing market conditions. It faced financial difficulties during the Great Depression and eventually went bankrupt in 1938. Pierce-Arrow's inability to forecast and adapt to the economic changes and shifts in consumer preferences of the time led to its decline.


From the very answer ChatGPT gave you, it's evident that GP is saying that current success does not imply future success, using that company as an example. What needs elaboration?


It's pretty clear he is trying to make the opposite point, see "dominance in this market to this very day"


While vertical integration is a great boon for a company, it's hard to pull off. Being an expert in industry X doesn't mean you'll do great in industry Y, even if they are complementary.

Training and designing LLMs doesn't mean you understand the semiconductors business.


Vertical integration? It may not be an OpenAI project, going by the reporting when he was ousted. I won't be surprised if the plan is a Muskian incestuous/I-swear-it's-not-self-dealing setup, with Altman being the CEO of both entities.


Correct. They're an LLM team, not chip designers.


Yeah, it's not even like they're running the datacenters where the training and tuning are happening. I would hope some of the people understand what current compute requirements are and perhaps they know better than most what future requirements will be. However, MS has been doing most of the backend for OpenAI and they've been in discussions with actual silicon architecture people (not just NVidia), but those are the folks who would do any implementation.

Perhaps they'll pull off an Apple (for ARM) and do their own architecture (either for training/tuning or inference) that will have a significant effect on the industry, but it seems unlikely. They haven't hired the right people.

The real advantage they might have is insight into how the algorithms can be adapted to reduce power consumption and latency while improving performance. It would seem odd to me if there weren't more than an order of magnitude to be gained from new algorithms for LLMs. You're not going to get 10x the transistors or speed from silicon, but you might get an efficient architecture for a significant algorithmic improvement (one that might not just be CUDA).


"I know how machine learning and statistical computing works, therefore I am an expert in hardware design" fallacy.


> "I know how machine learning and statistical computing works, therefore I am an expert in hardware design" fallacy.

A typical case of engineer's disease.


I am guessing an incredibly talented team that is incredibly networked and incredibly well funded and proven agile in the tech hub of the world can find hardware experts. Don’t know why anyone would bet against that.


We would have heard if they had hired/bought a team of the size necessary to design a system large enough to have a significant impact. Modern (even sub-28nm, much less 2nm) design is hugely complex, and the range of things an AI compute engine needs to do is very broad.

Perhaps they could design a core and license it out? I'm trying to come up with a way they can do something significant without 100 people. Just the memory and serial connections are complex enough ignoring the GPU or heat/power issues.


It took Apple like 10 years to go from their first chips to actually using them in laptops, and they are literally the most well-capitalized company on the planet. Sorry if I'm skeptical that some relative upstarts with a billion in compute from Microsoft can compete with trillion-dollar companies that have been around for decades.


Nobody can even define what AI is, why we need it, or how to achieve it. Usually it makes sense to seek funding to execute on a plan. Making a fancy chat bot that scrapes the web to synthesize sometimes accurate and sometimes useful information is not worth trillions of dollars.

What is essentially happening in my opinion is technical innovation has slowed so silicon valley is seeking money to prop up a house of cards that doesn't make much new that is useful or needed.

Can anyone specifically say what trillions of dollars invested in "AI" would buy for society?

It seems to me there are so many higher priorities.


I wouldn't bet against it but that approach has a remarkably low rate of success. We hear about the winners - survivorship bias is real.


How about something along the lines of AWS and their Graviton?


Graviton - you mean the poorly performing solution that only has a place in the market because Amazon sells it at a subsidized cost as part of a larger effort to put pricing pressure on AMD/Intel? That Graviton?


Was Google a chip designer before the first TPU?


Yes. Google had a number of chip products before that. Some made it to A1 and worked. Just cause they don’t advertise it doesn’t make it not so.


> Yes. Google had a number of chip products before that.

Is that true? I can't find anything suggesting it is. In fact, the little I can find suggests you are incorrect. I'll link them for the sake of referencing sources but they're both pretty awful ad-ridden sites...

A 2016 Tech Radar interview [0] with Norm Jouppi has him quoted as saying:

> [The] Tensor Processing Unit (TPU) is our first custom accelerator ASIC [application-specific integrated circuit] for machine learning [ML], and it fits in the same footprint as a hard drive.

And a 2023 Tom's hardware post [1] begins:

> Google has made significant progress in its endeavor to develop its own data center chips, according to a new report. The Information says that a key milestone has just been reached, which means that Google can plan to roll out server systems powered by the new chips starting from 2025. This is not the first processor that Google has successfully put through R&D - the company has previously made an ASIC for servers and an SoC for mobile devices. The search giant started using its internally developed Tensor Processing Unit (TPU) as far back as 2015.

[0]: https://www.techradar.com/news/computing-components/processo...

[1]: https://www.tomshardware.com/news/google-reaches-self-develo...


I guess it depends on what you are defining as a chip and what you are defining as "Google" -- as in, if they have contractors design/build to their needs, does that count?

1/ https://www.wired.com/2012/03/google-microsoft-network-gear/

2/ I believe they had a few custom chips designed for the youtube workloads that predate the TPU.

I remember in 2010 there was a building in MV that focused on custom chips.


Said the horse factory when automobiles were being built.


I don't remember LLMs claiming to replace GPUs. This is more like arguing with a landowner about why your assembly line is so innovative and needs to be built on their land for free. They need the land; the land doesn't necessarily need them yet.


Pullman Company will disagree with you.


Absolutely terrible analogy.


A LLM might "believe" that horses are built in factories.


It makes sense to ASIC-ify the thing to get lower latencies and make the whole thing cheaper, so MS can run GPT-(n+1) cheaper. But this bet only pays off if the LLM industry gets into the mature stage where costs dominate, not innovation.


The workload they have is already optimized for something like an Nvidia GPU.


I apologise if my response was a little snarky.

Even granted that OpenAI are not able to build a chip that is competitive with NVidia's latest GPUs for running LLMs right away (which is an opinion - not backed by any direct evidence, but I agree that it is plausible as they are going up against a lot of prior R&D) is it not possible that:

a) The unit economics could be so much better that the result is still a major win, e.g. 50% of the performance at 20% of the price.

b) OpenAI is decoupled from existing supply constraints and is able to grow faster and deliver more value. A "worse" chip that you can actually get (in insane volume) may be strategically better than a "superior" chip that is limiting your growth.

c) That the plan might include some elements you are not expecting - at the $trillions investment level they might be looking at doing some surprising things e.g. (I am just making this up but there are a lot of possibilities) buy a memory manufacturer and work directly on increasing memory bandwidth.


From a lay observer point of view of the semiconductor industry of the last two decades, it seems entirely implausible they could do that quickly without just buying a company that was already working on it. And then, unless that company was big enough to already have a significant defensive patent portfolio, it's likely their efforts would be stymied in court for years if it was remotely successful.

The idea that even with expertise, the wins would be so much over what other companies that have hired/bought these companies have been designing for the last 10 years based on very similar requirements (the ones that wrote so much of the foundational research) also seems implausible.

c) It's not actually possible to plan investments at that level with anything more than a very vague direction you're aiming. If it is long term, then everything is changing in unpredictable ways before you get even 25% there, but if you throw so much money at the problem in order to try to solve it much more quickly you are disrupting global economic and geopolitical forces in ways that also can't be planned for.


"50% of the performance at 20% of the price" is wildly implausible even if you can somehow start fabbing perfect chips for openai's workloads tomorrow. Especially if they don't have access to the fabrication processes that nvidia, amd etc are using, since more modern (read: expensive) processes reduce power draw and enable higher clocks. 80% of nv's datacenter die space is not wasted, not close to that much.

It seems more likely to me they'd get 20% of the performance at 50% of the price, and that might still work out for them if it allows them to scale faster without being bottlenecked on supply of existing GPUs. But there's no magic bullet here.

They also still need to source a bunch of other stuff, like RAM, even if they can source their own processors.


Nobody is able to build a chip that is competitive with NVidia's latest GPUs, not even AMD who would be next in line. Look at Google's TPU for a glimpse at a likely outcome of such an endeavor.

What it tells me is that Altman seems to believe that OpenAI can only make the next step if they can throw even more compute at the problem but that that isn't feasible at today's prices.


"The current round" of AI accelerators you are referring to are things that were designed 2015-2022; There are a number of startups (including my own) that are actually designing for the real bottlenecks that differentiate Transformers (plus SSMs and other emerging architectures) from "old" CNNs, RNNs, etc.

Obviously I think my company is doing this in an unique and "correct" way, but I know of half a dozen other companies founded in the past ~18 months that are focused on the memory capacity and bandwidth bottlenecks that exist... the massive failures of the previous decade do not mean that they are going to be repeated.


What can you actually do hardware-wise about a memory bottleneck, except use faster memory?


Is there any startup which is ready to compete with this: https://www.redsharknews.com/nvidia-wants-to-increase-comput... ?


It is well known among electronics designers that specialized circuits outperform GPUs by a few times.

Before Tensor cores appeared, GPUs were about 4 times worse (in speed and power consumption).

With Tensor cores, GPUs became better, but they still have to carry video hardware (RAMDAC, video connectors, 3D processing units, the interconnect tying all this together), so they still lag.

Really, GPUs are only interesting because current AI applications don't yet generate enough revenue to pay for large-scale production of special chips.

I don't know whether Altman has something big enough to generate the revenue to pay for special chips.

There is speculation that GPT-5 will be good enough to replace humans at work. If that is real, AI chips will be worth it.


We are indeed talking about a 10^6 factor here... It's not just 10x or 100x, or even 1000x... If NVIDIA strips away everything not required from their chips and adds more SRAM and HBM, it won't improve performance by 100x; maybe they'll make it 10x-15x this way. But they claim they are going to achieve a 10^6x improvement in performance. Even if they end up delivering an ARM-compatible CPU with a built-in Tensor core, built-in HBM, and vast SRAM, without DDR RAM at all, how fast can it be? This promise of a 10^6x performance improvement is a paradigm shift. Either they know something that we do not, or they are just bluffing.


On the technical questions: you asked the right questions, but you're missing context.

The main bottlenecks in NN hardware are neither number crunching nor memory.

The real bottleneck is that GPT-2 was perhaps the last LLM that could be trained on one machine (even on one card).

For GPT-3 people usually talk about 32-GPU installations (which can fit in one machine); at GPT-4 scale they talk about clouds.

And modern clouds are NUMA beasts. I could say modern cloud networking is slow, but those are not the right words: it is slow as hell.

What all this means: NNs are a good target for parallel processing in clouds, but not good enough. Real benchmarks show that the 32-card machine mentioned above is only about 10 times faster than one card with that amount of memory, and when things were scaled up for GPT-4, the benchmarks got much worse. So just improve the network, moving the bottleneck to something else, and you get an additional 50-100x improvement.

And with a good team of AI scientists, it is more realistic to build specialized networking hardware for NN processing, or to tune the algorithms, than it is with a team specialized in GPU video processing.


> GPT-2 was perhaps the last LLM

This is not true. There are tons of models that are even better than GPT-3.5 and really close in performance to GPT-4, and you can still train them on a single GPU with 24GB of video memory. There was a hint last year at yet better models which you can train on a single GPU to get a model comparable in performance to LLaMA2 34B. The horizontal scaling you appeal to here may fit into a 10^6 performance increase, but in general I expect a single node to be at least 1000 times faster than now. It is entirely feasible that you can't scale at 0.99 efficiency vertically, and certainly not horizontally, but I honestly expect per-GPU scaling to get better than 0.75 in the next 5 years.


> you can still train them on a single GPU with 24GB of video memory

It depends on the target. For pure science (or for fun), I could train a GPT-4-class model on a C64, but that method won't survive in a competitive market, where you need to test hypotheses and deliver tuned models fast.

- A competitive market is very sensitive to speed - for example, if MS presents something on December 10, Google after New Year has to present something not merely equal but significantly better just to look equal to customers.

So horizontal scaling is a must, not just my wish, even when the speed increase is far from linear.

> I honestly expect per-GPU scaling to get better than 0.75 in the next 5 years

Could you give an explanation, or even a speculation, for how that is possible when we have already hit silicon limits (about 5GHz cores, 1nm, etc.)?


> Could you give an explanation, or even a speculation, for how that is possible

Nope. But I'm so desperate to give you a hint right now that it is almost impossible to hold myself back... Stop looking at horizontal scalability. The vertical one is not exhausted yet. Btw, that was not the hint.


> Stop looking at horizontal scalability.

Sure. A B-747 officially takes about 700 man-years to assemble; let's build them with small but highly motivated teams under the classic two-pizza rule - the world will wait :)


BTW, I was not joking when I talked about training an LLM on a C64. I have often seen scientists run their tasks on a desktop, waiting days or even weeks for results. But they usually have reasons for such behavior - for example, to keep secret from colleagues what they are working on and what the calculations show. Or to run something so original that the higher-ups would not be happy to see it on the dedicated number-crunching machine.


There is one important thing many people are not aware of. When a good, smart team (business or not is not that important) focuses on one task and has the corresponding resources, it really can do things that are impossible for a general-purpose team aimed at a broad outcome.

From what I see, NVIDIA is a good, strong team; they bet very high stakes with their great acquisitions in the 2000s, and they won. But NVIDIA makes a broadly targeted product; they cannot take a very narrow focus on just neural nets. So it is possible to make an NN product better than NVIDIA's.

The real question is predicting whether Altman's team can achieve economics good enough to pay the expenses of hardware development.


> But they claim they are going to achieve a 10^6x

It is a classic of management to ask people for more than they can deliver - they will then do the most that is possible - so I don't put much weight on such claims.

It is also team-building BS: motivating people by claiming impossible targets.

We will see how Jensen Huang uses all his diplomatic skill and rhetorical art to round the corners when it becomes clear that the claimed things are impossible.

And this is not the first time such things have happened; there is a near-infinite number of examples. Just a few days ago I read about the IBM 7030 failure, which delivered ~1/10 of what was claimed, and yesterday people reminded me about the Itanium and the i960.


Will your arch work for SSMs?


Yes; Mamba was a very easy match, with Hyena also being a good match, but could be greatly optimized with some minimal changes to the model architecture or hardware design.


NVIDIA's margin is someone else's money. I wouldn't say they don't understand it. I would say they need good enough competition to bring the margin down.

E.g. FB saying they want to buy 350k H100s. That's a whopping $14B price tag. With a >85% profit margin. While a fab is $20B.

Trillion? Sounds like anchoring to me. Nvidia has a market cap of $1.7T. You could literally buy NVIDIA for that. I read that as "a billion won't cut it, we need quite a few billions".
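For what it's worth, the $14B figure above is consistent with an average selling price of about $40k per H100 (the price here is my assumption, not from the comment):

```python
units = 350_000                  # FB's reported H100 order
asp = 40_000                     # assumed average selling price per card
revenue = units * asp            # $14B, matching the figure above

margin = 0.85                    # the ">85% profit margin" quoted above
gross_profit = revenue * margin  # ≈ $11.9B of that would be margin
```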

But it's not unreasonable that those hyperscalers throw in a few billion each.

Usually it's horrible business not to be the best (see Intel/AMD), because the margins are at the top. In this case, though, they want a whole range of products to go down in margin. Even a slightly worse chip might be worth it if it comes at a significant cost reduction. Especially if the optimal design is known!

In a sense the whole thing can fail at reaching the top or making lots of money and still succeed in bringing total cost down, potentially by 50% or more.


There are probably a lot of optimizations in the silicon and software to find. It's not necessarily obvious what corners can be cut, or where the tradeoffs of taping out new chips are worth it. Yet Another Matrix Multiplication Chip is not going to set the world on fire; Nvidia has that market pretty well captured.

But perhaps it turns out that subnets can be trained independently or swapped with semantically equivalent but qualitatively different ones. The routing network would effectively "standardize" and could in principle be well enough understood to "hand optimize" into hardware. Or maybe backpropagation has some novel physical analogue that can be exploited at scales we can access. The real question is whether Altman is capable of finding the right path in the notoriously dead-end-filled field of chip design. His backing of Helion [1] didn't bode well in my view. But with enough R&D maybe he will flail into something useful; trillions is enough for a lot of flailing.

[1] https://youtu.be/3vUPhsFoniw Edit: more derisive link


Could it be that today's workloads are perfect for Nvidia GPUs not because theirs is an ideal chip, but because, given their availability, current workloads were built to take advantage of Nvidia GPUs' architecture?


Most of the workloads have not yet caught up with Nvidia's Hopper optimizations. The key is the Tensor Cores.

Google came up with the TPU (2015) for GEMM. Nvidia just took the idea and ran with it (Turing 2018). So it wasn't that Nvidia had a head start on this.

Now Nvidia Hopper is ahead of everybody else by far. They have things like async memory management for the tensor cores (Tensor Memory Accelerator), mixed precision, and even FP8 support.

Most of the software out there has not yet caught up with that. And even Nvidia's own Tensor Engine software is not making the best use of it (Microsoft Research October 2023, backward pass and cross-device communication).

Last year FlashAttention was a game changer for performance by doing memory load optimizations. Nobody was optimizing properly for Nvidia in Transformer models.


Systolic arrays for matrix multiplication go back farther than TPU.


The scale of this should tell us it's not just about building an alternative to Nvidia.

$7 trillion is like adding TSMC, Intel and AMD together, and multiplying that combination by seven.

This is about sheer capacity, not just circumventing CUDA.


Why not just give like a fraction of that to NVidia and tell them "make us more please, we will buy in bulk"?


What they are highly optimized for is mixed-precision GEMM (like all other accelerator manufacturers). What distinguishes Nvidia for now (imo) is that CUDA cores are also quite good at normal code (with control flow etc). I used to think that being close to optimal in one of them would contradict being close to optimal in the other but it turns out they share a lot of resources (SRAM) and the overhead in chip surface if one or the other is laying dormant seems negligible. I'm pretty sure that AMD et al will be sufficiently successful at blatantly copying the CUDA API that we will see serious competition in the next years. The bigger source of uncertainty might actually be fabbing capacity.

I find it hard to argue that this moat supports a $1.7T valuation. I find it hard to believe that, for a couple of billion plus TSMC credits, no one could recreate the CUDA ecosystem plus hardware in the medium term.


Doesn't Nvidia have huge margins? So if someone just makes a clone of the Nvidia GPU, they can erode those margins and drive down the cost of compute.


AMD will succeed at this as long as they keep it together.


Every time I'm tempted to think software is easy compared to hardware, I just remember that AMD is leaving about a trillion dollars of market cap on the table because they haven't figured out a good alternative to CUDA.


They are definitely putting a lot of effort into ROCm & HIP, and the pace is definitely accelerating.

ROCm 6 was out Dec 16, 2023; 5.5 in May 2023; 5 on Feb 10, 2022; 4 on Dec 19, 2020.


Fred Brooks wrote in The Mythical Man-Month that it's harder (more time-consuming) to produce the software that corresponds to a given hardware. In 1975.


Hardware was much simpler and less complex then than now. I wonder how or if that's changed by going from hundreds or thousands of transistors to billions.


They'll need to either reverse engineer CUDA or incentivize reimplementing everything out there on ROCm/OpenCL, forgoing all the workload optimization done for Nvidia GPUs. I think that's a non-trivial moat.


This has been my perception of AMD for the past 20 years. First against Intel, then ARM, now NVIDIA. "If only ..."


The real bitch is you also need to replicate both the software and convince some large projects (eg, pytorch) to use and support your implementation, and it’s just all rough, very complicated, very fine-grained stuff. The hurdles here are very high.

And if you fuck that part up in any one of a dozen places, no one will use it, because the adoption cost is too high, or your implementation was 20% slower and so everything costs 20% more to use and no one uses it.

This is why you see things like TPUs never really damage NVIDIA, but why basically everyone is focused on open standards and open software. Basically the entire tech industry is using this approach as a way to slowly peel away the layers of this software until enough has been removed that NVIDIA can no longer use it as a moat.


While I doubt OpenAI will be a good fit for semiconductors, my understanding is PyTorch and TensorFlow have been really good at embracing new accelerators, largely due to XLA.

PyTorch, TF, and JAX work great on TPUs. Adoption is low because they're not really available outside Google Cloud.


AWS uses tricks to accelerate PyTorch with Inferentia/Trainium. Haven’t used it, but I have tried the equivalent for Apple silicon and rage quit after wasting half a day.


I mean, it took almost a decade to get there.


Right, but that was for XLA, no? I think (not an expert) that it compiles code from frameworks into a lower-level IR.

That's gotta be way easier, no?


If you are going to go vertical then do it properly.

OpenAI could just build their own framework for internal use that works well on their silicon (see Jax+tpu)

Their starting point? Triton plus some Triton libs. JAX chipped away at TF like this, and there's no reason Triton can't do the same to PyTorch.


Competitors don't have access to the process node. You'll get competitors, but they won't be as fast or able to run the latest models. That means they'll compete with older versions of NVIDIA's chips.


Agreed. Commoditizing the complement of OpenAI's models.


As far as I know, Sam has no technical expertise besides taking money from other non-experts who happen to be rich. It is unclear to me why existing GPU manufacturers are not up to the challenge of meeting the needs of "AI" software, as you said.


Accelerators have nothing to do with it, as we're mostly memory-bound by HBM <> SRAM data transfer rather than compute-bound.


It depends. Right now once we hit 6-8 bit precision inference, H100s/A100s are not memory-bound, but compute-bound.


This is wrong, being memory bound or not has to do with the dimensions of the matrices being multiplied (if you’re on tensor cores). https://docs.nvidia.com/deeplearning/performance/dl-performa...

Some of the things being done to improve quality of 6-8 bit inference use extra compute and push it a little in the other direction but it’s still pretty memory intense until the batch size gets quite large
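Whether a given GEMM is memory- or compute-bound can be roughed out from its arithmetic intensity (FLOPs per byte moved) against the machine's balance point, which is what the linked Nvidia guide describes. A back-of-the-envelope sketch; the H100 figures below are approximate published specs, not measurements:

```python
def gemm_intensity(M, N, K, bytes_per_elt=2):
    """FLOPs per byte for C[M,N] = A[M,K] @ B[K,N], counting each
    operand and the output as moved exactly once (ideal caching)."""
    flops = 2 * M * N * K
    bytes_moved = bytes_per_elt * (M * K + K * N + M * N)
    return flops / bytes_moved

# Roughly: H100 SXM ~990 TFLOP/s dense FP16 against ~3.35 TB/s HBM3,
# so ~295 FLOPs must amortize every byte fetched to stay compute-bound.
BALANCE = 990e12 / 3.35e12

decode = gemm_intensity(1, 4096, 4096)     # batch-1 decode GEMV: ~1 FLOP/byte
train = gemm_intensity(4096, 4096, 4096)   # big square GEMM: ~1365 FLOP/byte
print(decode < BALANCE, train > BALANCE)   # memory-bound vs compute-bound
```

This is why batch size dominates the discussion: small-M matmuls (single-stream inference) sit far below the balance point, while training-shaped GEMMs sit far above it.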


It'll help, but GPU crunch isn't caused by people running 6-8bit inference on a single card, but by all the large scale pre-training + fine-tuning runs.


Can you link to an actual performance analysis on this?


Easy. I ran tests on a desktop Core i7-7700 with 64 GB of DDR4-2400, trying 13B, 30B, and 70B models, and as you can imagine, it's easy to control how many CPU cores are used.

The answer is: it really works, but slowly (about 0.5-1 tokens per second, with near 100% CPU usage).

The i7-7700 is a well-balanced machine, but I have hit its memory-bandwidth limits a few times before with highly optimized software, and that looks very different: when using all cores, I only got to about 50% CPU usage.

BTW, llama.cpp is very good software.
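Those numbers are consistent with a simple bandwidth model: batch-1 decoding has to stream every weight once per generated token, so tokens/s is capped near memory bandwidth divided by model size. A rough sketch (the dual-channel DDR4-2400 figure is the theoretical peak, not a measurement, so real throughput lands below these ceilings):

```python
def max_tokens_per_sec(n_params, bytes_per_param, mem_bw_bytes_s):
    """Bandwidth ceiling for batch-1 decoding: every weight must be
    read from RAM once per generated token."""
    return mem_bw_bytes_s / (n_params * bytes_per_param)

bw = 2 * 2400e6 * 8  # 2 channels x 2400 MT/s x 8 bytes = 38.4 GB/s peak
print(max_tokens_per_sec(13e9, 2, bw))    # 13B fp16: ~1.5 tok/s ceiling
print(max_tokens_per_sec(70e9, 0.5, bw))  # 70B 4-bit: ~1.1 tok/s ceiling
```

Observed 0.5-1 tok/s sits plausibly under these ceilings, which supports the grandparent's point that this workload is bandwidth-limited, not compute-limited.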


If I’m not mistaken, for parallel inference requests and for prompt preprocessing it’s compute bound.

Also, if you have just a single model you want to optimise (and not the training), you could build an array of asics that do specific matrix computations - then you don’t need to read weights from memory at all.


> Perhaps he will succeed in misallocating a trillion dollars along the way.

Must be really hard, being only a half-billionaire and trying to keep up with Elon's "success"...


It is always a weird take, it happens with Elon Musk all of the time too. Clearly, some people believe they should both be consulting hacker news before making any decisions, because we know better.


This is a PR piece for Casey Newton's Substack.


Do you even know how many years “an earth’s worth of biomass” could power our civilization? What if it’s like 1,000 years? I think you are just doomposting.


I am getting sick of seeing the same edgy fatalistic comments every time a climate article is posted. If you are really such a master of the universe that you can predict the future and know it’s gonna be so horrible, then you must also know that the only morally justifiable action for you to take is to end your life right away to become a net-zero hero.

Otherwise why don’t you calm down and entertain the thought that maybe the future won’t suck, and might even be normal or good, just like the present is right now.

Why are you working on technology if you don’t think the future will be great?


In my opinion, every person is entitled to some sane amount of resource usage on Earth; that is not the problem. The Earth is rich and has enough to provide for the current population.

But we should not have people traveling by private jet at all, nor should we ship locally producible goods across the world for salary arbitrage; we should build small, walkable cities and communities.

There are plenty of appliances and tools we could easily share instead of each buying a cheap, short-lived one: how often do you use your drill, or your power washer? Wouldn't renting or lending between a few houses be more environmentally responsible?

Multi-generational houses are also not a bad thing if the relationships between people are good: the elderly get to experience the joy of their grandchildren, parents get "free babysitters", and later on the parents can be cared for in turn. It's how humanity has lived since forever, and I believe it would have positive effects on everyone, not just ecology.

Also, there should be barely any need for single-use plastic packaging outside medicine: create standard container dimensions for various kinds of products, and put dispensers in supermarkets that take in your previous container and give you your product in a new one. If you break it or don't bring it back, you have to buy it again. Yogurts, drinks, huge swathes of supermarkets could be handled that way.

The problem is the huge inequality: my Western ass lives in huge luxury while my African counterpart is hungry, with barely any potable water. Don't they also deserve equal treatment? But if everyone were to live a Western lifestyle, we would be even worse off. We should cut back a bit on the Western side and bring up the rest of the world in a sustainable way. We have the tech for that, mostly.


I agree with some of your points about excesses of American lifestyle (where I live), but there is another side to this. A pretty good proxy for wealth is basically how much stuff you can waste and not care about. I can leave all the lights on in my house because electricity is cheap, or I can afford to eat so much delicious food I become overweight and get heart disease.

You could argue that I should turn off the lights, and on that point you might have some ground. But more generally there are many things in my life that I save time on by basically throwing a little money in the trash, and I am very fortunate to have this opportunity. This is on a spectrum and we should obviously not cause obscene long term damage so I can save 2 seconds per day, but we are not at that point yet.

And regarding inequality, it is also on a spectrum, and at some point it is definitely too much, but I don't think we have necessarily crossed that line. 99% of the people in my city have enough food to eat. And the right amount of inequality is certainly not zero; the only way to achieve that is by killing all life on earth. Some inequality is a natural consequence of the different branches of possibility that people explore with their lives. To the extent that we have anything in our society, it's because we are able to share ideas and cooperate. A society will necessarily always have the haves and the have-nots.


Also: why are they having kids, why are they buying things, etc.? If someone is truly concerned: vote that way, go work on floating houses, thermal insulation, and carbon recapture, and leave the rest of us who live normal lives with average carbon usage alone.

Any time someone who has had kids mentions reducing other people's carbon output, they should just be ignored. You just created a potentially forever chain of carbon usage that could grow exponentially. Meanwhile, here I am with my "constant" carbon usage against your O(2^n) in perpetuity.


What if you actually can’t predict the future and climate change is actually good? What if having kids is actually a great thing for them and for the world? These are reasonable ideas if you don’t just spend your time doomscrolling on twitter all day.

People talk about the AI singularity but the whole world is a singularity all the time. To model the entire world and predict how everything is going to play out on a social and economic level is obviously impossible. To make impoverishing top-down prescriptions based on that seems criminal. To utilize this storyline as a political mechanism is cruel.


> What if you actually can’t predict the future and climate change is actually good?

Is this how rational people react, or is this a spiritual response speaking from fear?

How can the depletion of biodiversity, the rise of temperatures, and the disappearance of ecosystems that we need to survive be "good"?

As a community we do not have a crystal ball to predict the future, but we have science and technology, and the predictions from there are clear: it is not good for us, and it is not good for the current species.

In the far future, the very far one, sure: there are good chances that new ecosystems will appear, adapted to the new environments, but those will not be "nice" by our current expectations.

A do-nothing-and-hope-for-the-best strategy is a guarantee of massive wars, hunger, and suffering, as has happened many times in the past (but never at the scale of a population of 7 billion).


Your entire framing of the world is engineered by pessimistic news articles which only tell a small part of the story.

Is loss of biodiversity bad? Maybe. Will we have resurrected most extinct species using Jurassic Park-style DNA within the next 500 years? I dunno, but if that happened, it would make the current loss of biodiversity more of a blip than an apocalyptic thing.

The science and technology enterprise is economically motivated. It is pretty good at creating value out of fewer and fewer resources. It is not good at making godlike insights into the far future about the late stage interactions between itself and the physical world. Any definite predictions provided to you are more likely driven by short-term incentives of some political figures.

There is nothing indicating massive wars are coming due to climate change; most of the world is lifting out of poverty, not slipping back into it. If the world does heat by 5-10 degrees, we should be focused on making sure Indians and Africans have enough economic resources to afford air conditioning, like we do in rich countries.


What an absolutely bullshit take, honestly I have a hard time reasonably reacting to it, it is so dumb.

We can’t predict the future down to a point, but we can make good, large-scale predictions with good accuracy — global average temperatures will increase and by every prediction, that will have catastrophic results. Period. That’s not some doomer news, that is reality. We can’t predict how individual countries will react, but that is not the question.


I'm sorry, man, but based on the stuff I've worked on and the situations I've seen, it just seems more plausible to me that a scientific enterprise tasked with scaring the shit out of everyone is not sampling an unbiased distribution.


What scientific enterprise? You do realize there are plenty of countries/institutions with widely different monetary/political incentives that finance absolutely separate groups to do research?

So unless you believe in some Secret Government that controls everything, it is just a completely naive take with no basis in reality. We can't even coordinate a single country's various independent incentives, even in very authoritarian governments; and you believe that every study made is financed with the same incentives? If not, you can surely list reputable studies that call out the fake research, right?


> you can surely list reputable studies

Dude, neither of us has ever read a scientific study about climate change. We are both going on bullshit we read online, related to experiences we had in corporate land. Maybe I read more right-leaning stuff, so I have more of a counter-opinion. Don't pretend either of us is fact-checking the magical climate science; we aren't.

The one thing I actually have firsthand experience with is computational complexity. And after thinking about it for a while, it seems plausible to me that we cannot know the scientific prescriptions we see in negative climate coverage with nearly the definitiveness that they claim.

And it also seems plausible that a lot of people gradually managed to form a doom and gloom committee to find doom and gloom in a noisy world where there aren’t that many definitive answers.


> The one thing I actually have firsthand experience with is computational complexity.

If that's the case then I expect you're well across the Dzhanibekov effect and the fact that the long term arc trajectory of (say) a spinning wingnut can be extremely predictable as its CoG follows the usual equations of motion while its short term tumblings are utterly unpredictable and chaotic.

The key point being that the coarse climate model is pretty damn simple in terms of basic thermodynamics.

Heat from below (core), light from above (sun), energy absorption in sea and land, energy transformed to heat, heat radiated outwards, some heat entrapment by insulation.

Increased insulation ==> greater heat entrapment, etc.

To be sure, the fine details of the interplay within and between climatic cells are challenging .. but the long-term arc of more and more energy being trapped, leading to more heat, more storm energy, etc., is straightforward enough that it was first done as a back-of-an-envelope calculation more than a century ago.

If you're demanding an exact timetable of what and where will reach what temperature when .. then you'll be sorely disappointed.

Otherwise it's a simple case of: we're in the direct path of a massive, fully laden train that is ever so slowly derailing.


This makes sense and gives me some things to learn about, I really appreciate it.

If climate science is correct, the one hole I still see is that it doesn't take into account future improvements in technology, which I think might offer some solutions, especially once the problem actually threatens an economic player like a trillion-dollar company. It is basically a choice to believe something good like this can/will happen, so you could call this semi-religious.

Thank you for engaging with me in the end.


You're welcome, although to be honest I hadn't been paying attention to individual names, just watching the newest-comment scroll and responding to various comments re: climate science (I've been in exploration geophysics for some time).

Looking at some of your responses to my comments:

> The problem is that you are recommending we turn off capitalism

I made no direct recommendation, although I suggested various approaches. I'm not sure where on the globe fully informed free-market capitalism is actually practiced, but I'd certainly want to regulate it in the same manner as we regulate engines of power, to avoid them becoming unbalanced and walking across the floor just as first-generation unregulated steam engines did.

From a complex dynamics PoV Adam Smith was a first order basic bitch (as I believe the young people of the day might say).

> we will doom the world’s poor people to lives of certain poverty with no hope of upward mobility.

Mighty white of you to say so, maybe you might want to ask some of the worlds poor what they want.

The people I know that have nothing (I grew up in very outback Australia) want their land back, they want the "developed advanced nations" to stop dumping waste and shit on the land they've cared for the past 40+ thousand years, and other such novel ideas. eg: [1] [2] [3]

Not one has mentioned wanting yourself or others to speak on their behalf.

Of course there are many people across the planet, I wouldn't assume to speak on their behalf - although I did gain some perspective travelling through roughly 2/3rds of the worlds countries in many of the more undeveloped areas.

[1] https://www.youtube.com/watch?v=_UKu3bCbFck

[2] https://www.youtube.com/watch?v=J7h9V4aKlJw

[3] https://www.youtube.com/watch?v=YaoSi6RFqsc

It's a bit of a bone of contention that so many of these massive copper and lithium deposits are on indigenous lands.

Perhaps the question you best dwell on is whether endless growth and increasing consumption is really all that certain groups of people seem to think it is.

> If climate science is correct ...

It's 2023, the time for "if" was 40 years ago - are you questioning "If GPS science is correct", "If Magnetotellurics is correct", "If the James Webb telescope actually works", "If the Finite Element Method is real", etc.

> ... it doesn’t take into account future improvement in technology

Nor should it, but of course climate models can be tweaked with "what if" scenarios: what if the sun's input could be reduced (outward-reflecting bubbles in space twixt earth and sun), what if X million tonnes per annum of CO2 could be captured and sequestered "somehow", what if we build out thousands of acres of solar PV and mass-produce green ammonia to offset the climate effects of the Haber-Bosch process, etc.

'Climate science' contains multitudes after all: https://phys.org/news/2023-07-family-trees-relationships-cli...

Still, it's good that you seem to want to learn more.

Nobody has ever realistically thought that real world problems are easy.


Okay, so I'll try to explain how global (tropospheric) warming works:

The earth must radiate all the energy it gets from the sun (else we break physics).

About a third of the sun's radiation is reflected directly.

The rest has to be re-radiated. Calculation shows that the effective temperature needed to radiate this energy is about 255 K, and Planck's law makes that radiation mostly infrared.

The atmosphere (the GHG concentration especially) means that the effective point of emission is not the earth's surface, but high up in the troposphere. That point needs to reach about 255 K; to sustain it, the layer below it has to be a bit warmer, the one below that warmer still, and so on down to the surface (it's not really discrete values, and it's really energy rather than temperature, but I'm both simplifying and explaining in a language I've never used for physics or math before).

So raising the number of CO2 molecules at that altitude (and especially higher) moves the point of emission up. But the new point of emission isn't at 255 K yet! So for a while, the earth will absorb more energy than it emits, until the new point of emission reaches 255 K.

It's high-school physics, tbh.
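The emission temperature in question can be checked with the Stefan-Boltzmann balance between absorbed sunlight and outgoing infrared; the standard textbook value is about 255 K:

```python
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
S = 1361.0               # solar constant at Earth, W/m^2
ALBEDO = 0.30            # roughly a third reflected, as above

# Absorbed flux averaged over the whole sphere = emitted flux:
#   S * (1 - albedo) / 4 = sigma * T_eff^4
T_eff = (S * (1 - ALBEDO) / (4 * SIGMA)) ** 0.25
print(T_eff)   # about 255 K, i.e. roughly -18 C at the emission level
```

The gap between that ~255 K emission temperature and the ~288 K surface average is exactly the greenhouse effect the comment describes.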


So basically the balancing point at which CO2-based heating stops is set by that effective emission temperature, which I'm assuming translates into something really hot and bad at the surface. This means that with just this model we will all die.

Can we not just seed clouds to reflect lots of energy back into space, though? Maybe that's a stupid idea, but I could come up with 10 ideas like that, and maybe someone could come up with an idea like that which actually works.


The temperature at the surface depends on the altitude at which the surplus heat is emitted. This altitude is variable but constantly increasing as the CO2 concentration at high altitude rises (it rises by mixing; basically, the concentration up there now is what was at the surface 20 years ago).

We have two issues:

First, we don't know how the feedback loops work: we are jumping into the unknown, with no data to predict the risks. The only thing we know is that between the last ice age (a 1 km ice sheet over Canada and Sweden) and the climate we had for the 10,000 years until 1850, the difference in global temperature was 3.5 K (a roughly 2% increase).

The second issue: that transition lasted 10,000 to 20,000 years. We are on a really fast track, making the same transition in 100-200 years.

Now my opinion: we aren't doomed as a species. In fact, I'm pretty sure mammals can thrive in Jurassic-era temperatures. But I do expect a lot of deaths, because the transition is way, way too fast. We are unprepared, and a lot of species will disappear. Also, zooplankton are dying because the marine CO2 concentration is too high and the ocean is too acidic for zooplankton to form shells from CaCO3, an issue caused by CO2 but unrelated to tropospheric warming.

Climate engineering is much harder than that (we want to avoid acid rain; reducing direct sunlight would reduce photosynthesis, so as long as we have famine we want to avoid that too; raising the surface albedo is an OK option in hot countries).


> To model the entire world and predict how everything is going to play out on a social and economic level is obviously impossible.

Sure.

But that's a trite non sequitur, entirely orthogonal to the simplicity of the thermodynamics underpinning the cause of climate change.

Ever-increasing levels of trapped energy that will increase at an even faster rate once methane and water vapor get seriously involved in the mix, unless action is taken to either block the sun or reduce the insulating effects of atmospheric gases.
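The "more insulation, more trapped energy" arc can be put in rough numbers with the standard logarithmic forcing approximation (Myhre et al.); the sensitivity factor below is an assumed round value in the commonly cited range, not a precise one:

```python
import math

def co2_forcing(c_ppm, c0_ppm=280.0):
    """Myhre et al. approximation: radiative forcing in W/m^2
    grows with the log of the CO2 concentration ratio."""
    return 5.35 * math.log(c_ppm / c0_ppm)

F_doubling = co2_forcing(560.0)  # doubled CO2: about 3.7 W/m^2
dT = 0.8 * F_doubling            # assumed ~0.8 K per W/m^2 -> about 3 K
print(F_doubling, dT)
```

The logarithm is why each additional ppm matters a bit less than the last, and why emissions doublings (not increments) are the natural unit for warming estimates.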


I agree with your last statement, although it is describing a very simple interaction and I don’t feel certain that we are observing that in action yet. I also don’t know if there are counter-balancing interactions that we aren’t aware of yet.

If humanity does end up accidentally terraforming the planet to be too hot through a positive feedback loop, I believe we will be within line-of-sight of terraforming it back in the other direction.

Technology has solved all our physical problems so far. Maybe we will seed clouds in the upper atmosphere to reflect a lot of energy back into space. Maybe we will find something useful to do with captured carbon (which doesn’t just release it). It is insane to suggest capitalism won’t solve this problem by the time it actually needs solving, and therefore needs to be shut down (it’s the thing that actually works).


If it takes > 100 years and all the energy of tens of billions of tonnes of fossil fuel to put us in a state of extreme peril, then yes, it will at least* take another 100+ years and all the energy of tens of billions of tonnes of fossil fuel to reverse out of that position.

If that's your suggested strategy then I would suggest that we can do better by not going there in the first instance.

* Thanks to the arrow of time, the issue of un-breaking a glass, methane release, and other factors, it could very realistically take more time and energy to get out of the hole we seem intent on driving into.

> Technology is for solving problems.

It's not 'magic' though and I have little time for green washing or praying to the technology fairy.


> I would suggest that we can do better by not going there

The problem is that you are recommending we turn off capitalism, which will have serious consequences that are easier to predict than climate change: we will doom the world’s poor people to lives of certain poverty with no hope of upward mobility.


Who recommends turning off capitalism?

But unrestricted capitalism is literally the stereotypical paperclip AI. It will seek maximal profit at any cost. As per Adam Smith himself, it only works well in small, well-regulated markets. Let the government do its job, and capitalism its own.


> Who recommends turning off capitalism

The article is about degrowth.

The paperclip scenario is a stupid thought experiment. An AI with the executive functioning to improve its entire substrate (e.g., rewrite the entire TSMC-Nvidia chip and software supply chain) would require the same higher-order agency that would stop it from destroying the world to turn it into paperclips. It only seems plausible if you spend all your time disembodied from reality, browsing LessWrong.


> It is insane to suggest capitalism won’t solve this problem by the time it actually needs solving

That time is now, and it has failed.

But it is crystal clear that we should decrease our current CO2 output as much as we can, even at the (IMO small) price of personal luxury. Even if some new technical advance can solve the problem, we should at least stop accelerating toward the cliff and brake as much as possible.

You are forgetting just how much bigger the planet is than us.


> That time is now

We diverge on this point and I talked about it in other comments.

> we should decrease our current CO2 usage as much as we can, even at a small price of personal luxury

It’s luxury because we live in the first world. In the 3rd world where most people live (and most people are poor), these policies have catastrophic consequences, such as the Sri Lankan organic farming affair.

