It shocks me how much payroll and cap-ex is spent on the M1 and how little is invested in getting TensorFlow/PyTorch to work on it. I could 10x my M1 purchases for our business if we could reliably run TensorFlow on it. Seems pretty shortsighted.
The GPU claims wouldn't even need to be at parity with NVIDIA; it would just need to offer a vertically integrated alternative to having to use EC2.
Having beaten my head against this for a while (and shipped the first reasonably complete ML framework that runs on Metal): Apple's opinion, as expressed by its priorities, is that this just isn't important.
We've followed five different instruction and documentation pages to make it happen, and none of them produce a consistent install. Throw in a corporate system where you need IT for root access to make changes, and it's game over. So I've got a fully loaded M1 Max and can't get TF running on it.
Now I've got a team of data scientists in an all-MBP shop, and we're holding off on upgrading to M1 until this all gets resolved.
On my personal M1 I managed to make it work, but it's hard to know which layers of changes I made and what exactly allowed it to work.
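For what it's worth, a minimal smoke test along these lines (assuming Apple's tensorflow-macos and tensorflow-metal wheels are installed in an arm64 Python environment; your install path may differ) at least tells you whether the Metal GPU is visible to TF:

    # Rough sketch, assuming the wheels were installed with something like:
    #   pip install tensorflow-macos tensorflow-metal
    import tensorflow as tf

    # Should list a GPU device if the Metal plugin registered correctly
    print(tf.config.list_physical_devices('GPU'))

    # Quick smoke test: run a matmul explicitly placed on the GPU
    with tf.device('/GPU:0'):
        a = tf.random.normal((1024, 1024))
        b = tf.random.normal((1024, 1024))
        print(tf.matmul(a, b).shape)

If the device list comes back empty, the Metal plugin didn't register, which in my experience is where most of the flaky installs fail.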
Deep learning support for the Mac is not going to happen at a level of quality you can rely on for research and dev work (i.e., PyTorch and TensorFlow). The underlying problem is that no big company cares about the Mac platform, and the work to maintain framework support for a specific piece of hardware is way beyond a hobby project. If you want your own on-prem hardware, just buy Nvidia.