
The bar for the AI ecosystem to get models running is git clone + <20 lines of Python.
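For reference, here's roughly what that bar looks like today with Hugging Face Transformers (the model id and prompt below are just placeholders I picked, not anything specific):

    # Rough sketch: run an open model in a handful of lines of Python.
    # "gpt2" is a placeholder model id; swap in whatever you cloned/downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("Hello, world", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If that snippet only works out of the box on one vendor's hardware, that vendor wins by default.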

If AMD can’t make that work on consumer GPUs, then the only option they have is to undercut Nvidia so deeply that it makes sense for the big players to hire a team to write AMD’s software for them.



> undercut Nvidia so deeply that it makes sense for the big players to hire a team

At least for some use cases, I think that has already happened. nVidia forbids using GeForce GPUs in data centers (in the driver EULA), while AMD allows it. The cost-efficiency difference between AMD's high-end desktop GPUs and the nVidia GPUs allowed in data centers is about an order of magnitude now. For example, an L40 card costs $8,000-9,000 and delivers performance similar to a $1,000 AMD RX 7900 XTX.
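Back-of-the-envelope on those numbers, using the prices quoted above and taking the "similar performance" claim at face value:

    # Cost-efficiency ratio, assuming roughly equal throughput per card.
    l40_price = 8500          # midpoint of the $8,000-9,000 quoted above
    rx_7900_xtx_price = 1000  # quoted price for the AMD RX 7900 XTX
    print(l40_price / rx_7900_xtx_price)  # 8.5, i.e. close to an order of magnitude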

For this reason, companies that run large models at scale are spending ridiculous amounts of money on compute, often by renting nVidia's data-center GPUs from IaaS providers. OpenAI’s CEO once described the compute costs of running ChatGPT as “eye watering”.

For companies like OpenAI, I think investing in the development of vendor-agnostic ML libraries makes a lot of sense in terms of ROI.


But you can't run most models on consumer AMD GPUs, so even though AMD "allows" it, nobody except supercomputer clusters uses AMD GPUs for compute, because all the expensive data scientists you hired will bitch and moan until you get them something they can just run standard CUDA stuff on.


> expensive data scientists

Various estimates put ChatGPT's compute costs between $100k and $700k per day. Compared to those numbers, data scientists aren't that expensive.
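To put that in annual terms (per-day figures as quoted above; the per-head cost is my own rough assumption, not from this thread):

    # Annualizing the quoted per-day compute estimates.
    low_per_day, high_per_day = 100_000, 700_000
    print(low_per_day * 365, high_per_day * 365)  # ~$36.5M to ~$255M per year
    # Even at, say, $500k/year fully loaded per data scientist (my assumption),
    # a whole team is small next to that range.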

> just run standard CUDA stuff

I doubt data scientists have the skills to write CUDA, or any other low-level GPGPU code. It's relatively hard to do, and it takes years of software development experience to become proficient.

Pretty sure most of these people are only capable of using higher-level libraries like TensorFlow and PyTorch. For this reason, I don't think these data scientists need or care about standard CUDA stuff; they only need another backend for these Python libraries, which is a much easier problem to solve.
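You can see that in what typical PyTorch user code looks like; a rough sketch (and, as far as I know, the ROCm builds of PyTorch even expose themselves through the same "cuda" device name, so code like this doesn't have to change):

    # Sketch: typical high-level PyTorch code never calls CUDA directly;
    # the backend is whatever the installed build provides for the "cuda"/"cpu" device.
    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = nn.Linear(128, 10).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(32, 128, device=device)
    y = torch.randint(0, 10, (32,), device=device)

    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()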

And one more thing: it could be that most of ChatGPT's costs are unrelated to data scientists and are instead driven by end users running inference. In that case, the data scientists won't even notice, because they will keep using these CUDA GPUs to develop new versions of their models.



