DXVK implements D3D9 and D3D11 on top of Vulkan; it's essential software for the Steam Deck and Wine. There are also rumors that the modern Windows GPU drivers for Intel Arc GPUs use it for their D3D implementation.
MoltenVK implements Vulkan on top of Metal; MoltenGL implements OpenGL ES 2.0 on top of Metal.
They give these reasons for its existence at the GitHub link...
---
Intermediate Graphics Library (IGL) is a cross-platform library that commands the GPU. It encapsulates common GPU functionality with a low-level cross-platform interface. IGL is designed to support multiple backends implemented on top of various graphics APIs (e.g. OpenGL, Metal and Vulkan) with a common interface.
There are a lot of good options for abstracting GPU APIs; each makes different trade-offs. We designed IGL around the following priorities:
- Low-level, forward-looking API. IGL embraces modern abstractions (command buffers, state containers, bindless, etc.) and is designed to give more control than OpenGL's state-machine API. As a result, IGL can have leaner backends for modern APIs (e.g. Metal, Vulkan).
- Minimal overhead for C++. IGL supports new or existing native rendering code without overhead of language interop or the need for other language runtimes.
- Reach + scale in production. IGL has been globally battle-tested for broad device reliability (especially the long-tail of Android devices as well as Quest 2/3/Pro compatibility for OpenGL/Vulkan) and performance-tuned on our apps.
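To make the "state machine vs. command buffer" contrast above concrete, here's a toy Python sketch. This is illustrative only: the class and parameter names are made up, and it is not IGL's (or any real API's) actual interface.

```python
# Toy illustration of OpenGL's state-machine style vs. a command-buffer style.
# All names here are hypothetical; this is not IGL's real interface.

class GLStateMachine:
    """OpenGL style: draw calls read hidden, globally bound state."""
    def __init__(self):
        self.shader = None
        self.vertex_buffer = None

    def bind_shader(self, shader):
        self.shader = shader            # mutates global state

    def bind_vertex_buffer(self, buf):
        self.vertex_buffer = buf        # mutates global state

    def draw(self, count):
        print(f"draw {count} verts using {self.shader} + {self.vertex_buffer}")

class CommandBuffer:
    """Modern style: state is passed explicitly, commands are recorded then submitted."""
    def __init__(self):
        self._commands = []

    def draw(self, pipeline, vertex_buffer, count):
        # Nothing hidden: the full pipeline state travels with the command,
        # so recording can happen on any thread and be replayed later.
        self._commands.append((pipeline, vertex_buffer, count))

    def submit(self):
        for pipeline, buf, count in self._commands:
            print(f"draw {count} verts using {pipeline} + {buf}")

# OpenGL style: the order of binds matters, and state leaks between draws.
gl = GLStateMachine()
gl.bind_shader("pbr_shader")
gl.bind_vertex_buffer("mesh0")
gl.draw(36)

# Command-buffer style: each command is self-contained and explicit.
cb = CommandBuffer()
cb.draw("pbr_pipeline", "mesh0", 36)
cb.submit()
```

The practical difference: in the state-machine style a draw call depends on whatever happens to be bound globally, while in the command-buffer style every command carries its full state, which makes multithreaded recording and validation much easier.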
Having researched WebGPU a little, I'm not seeing anything there that explains why you'd pick this over WebGPU. As far as I can tell, those reasons apply equally to WebGPU.
Meta has bet the farm on their VR/AR/XR platform and smartly wants control over the tooling that developers will use to target it.
The last thing they want is for some Quest innovation to be hamstrung waiting for a standards body to approve support for some new feature, or for a third-party open-source maintainer to incorporate some fix/addition/compatibility PR.
Owning a project like this, if they can get developers to use it, is a huge business security asset and competitive advantage for them. That's why this exists and why it's being promoted.
Whether developers see it as worth using instead of other options is a different question. Meta would probably say the answer to "why" for developers is: "to make sure your project gets the most out of the Quest products (as well as other Android platforms), now and in the future."
Because projects have all sorts of requirements. Shipping and running an embedded browser engine, or running in a user's active browser session, is not going to fit the requirements of many, many projects.
Yes, it is weird that WebGPU is not even mentioned in the README. Maybe this project was initiated before WebGPU was mature enough (for the record, MDN still labels WebGPU an "experimental technology": https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API).
The GPU and ML libraries, frameworks, abstractions, and drivers have long felt to me like the famed "Linux audio jungle" of old. At least I understand why FFmpeg, GStreamer, OSS, ALSA, PulseAudio, and PipeWire all exist.
CUDA, OpenCL, Vulkan, Metal, PyTorch, TensorFlow: those I don't understand. They all build on CUDA, and AMD/Intel/CPUs are treated as second-class citizens, right?
CUDA, OpenCL, Vulkan, and Metal are all different GPU APIs provided by hardware vendors. GPU vendors could implement all four of them if they liked, but in practice CUDA is an NVIDIA exclusive, Metal is an Apple exclusive (and the only low-level GPU API Apple supports), and OpenCL is kind of old and unused (it also started out as an Apple technology). There are also other APIs provided by non-GPU hardware vendors.
PyTorch and TensorFlow are training and inference libraries. In theory, PyTorch at least (probably also TF) is properly abstracted and can work with all APIs. In practice CUDA gets the most testing and has all the operator fusion kernels you need for good performance, so everyone trains on CUDA. Inference is less resource-intensive than training, and there's a lot of vendors that provide inference-only hardware (e.g. "Edge TPUs") that's better supported, so on the inference end there's more variety in API usage.
You can of course train and run inference on a CPU, but that means terrible performance.
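As a concrete sketch of what that backend abstraction looks like from user code (a minimal PyTorch example; the CUDA-then-MPS-then-CPU fallback order is just one common convention):

```python
import torch

# Pick the best available backend: CUDA (NVIDIA) first, then Apple's MPS
# (Metal Performance Shaders), then plain CPU as the last resort.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# The same tensor/model code then runs unchanged on whichever backend was
# found; only the speed differs (dramatically, in the CPU case).
x = torch.randn(1024, 1024, device=device)
w = torch.randn(1024, 1024, device=device)
y = x @ w
print(f"ran a 1024x1024 matmul on: {y.device}")
```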
When innovative hardware/firmware introduces a new feature that wasn't captured in a previous hardware abstraction layer (if any such HAL exists for that class of hardware), the manufacturer will somehow need to make that feature available to their customers.
Depending on the scope of the feature and the investment the manufacturer wants to make, that's going to be through their driver, through some opaque handle-passing API in the HAL, or through a whole SDK. This not only lets them show off their new functionality in a way they couldn't otherwise, but gives them an opportunity to "lock in" customers (as NVIDIA did quite effectively with CUDA).
Between crypto, ML, and the already-immense gaming industry, a huge amount of money has gone into "graphics" hardware over the last 10+ years, and it's led to a lot of innovation. And since those innovations weren't anticipated by existing HALs like DirectX and OpenGL, there are a lot of these new manufacturer-specific tools, as well as numerous attempts to fight "lock-in" by wrangling them into new repurposable abstractions.
It's just a cyclical process. It'll gradually pass as innovations in this sector plateau again, but that still looks to be some ways out.
Nope, they don't all build on CUDA. OpenCL and Vulkan are cross-platform standards developed by the Khronos Group, the same group that developed OpenGL. OpenCL aimed to be roughly the equivalent of CUDA, but in practice CUDA's developer experience has far outpaced it. Vulkan is targeted at graphics, unlike CUDA. Metal is a similar API to Vulkan, i.e. a low-overhead graphics API targeted at modern GPUs, but only on Apple platforms. PyTorch and TensorFlow are high-level APIs targeted at deep learning, with multiple execution backends; one of the most popular is CUDA, but they can also run on CPUs or other processors such as TPUs.
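For example, here's a small TensorFlow sketch of that "multiple execution backends" point; it assumes only that TensorFlow is installed, with or without a GPU:

```python
import tensorflow as tf

# TensorFlow enumerates whatever execution backends it was built to see:
# CPUs always, plus GPUs (on CUDA/ROCm builds) or TPUs where available.
print("devices:", tf.config.list_physical_devices())

# The same code can be pinned to a specific backend explicitly...
with tf.device("/CPU:0"):
    a = tf.random.normal((512, 512))
    b = tf.random.normal((512, 512))
    c = tf.matmul(a, b)

# ...or left to TensorFlow's placer, which prefers a GPU when one exists.
d = tf.matmul(a, b)
print("explicit placement:", c.device)
print("default placement:", d.device)
```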
You are mixing up several levels of abstraction. It probably helps to realize that GPU-accelerated ML was originally built on top of dedicated graphics libraries/APIs but diverged as it has different needs. You have three somewhat distinct groups of things here, though the boundaries are a bit fuzzy:
- Graphics interfaces: Vulkan, Metal (and the older OpenGL and D3D)
- Non-graphics GPU compute interfaces: CUDA (and ROCm, etc.)
- Computational frameworks: PyTorch, TensorFlow, etc.
The graphics interfaces are somewhat distinct from the ML/compute stuff, but as they use the same kind of underlying hardware to do work it's not that simple.
The frameworks try to abstract the computation away from the underlying hardware; e.g., you can run your TensorFlow/PyTorch code against the CPU only. In practice, though, for "serious" work NVIDIA GPUs are the likely target.
The GPU interfaces are trying to do the same thing, roughly, but NVIDIA is far ahead in terms of hardware and dev support, so mostly that is what is deployed.
Well, you're basically correct that the ML GPU libraries all build on CUDA, with non-NVIDIA hardware as a second-class citizen. But OpenGL, Vulkan, DirectX, and Metal are more general "OS-level", low-level GPU graphics and compute abstractions. This project, like the popular Rust project `wgpu` that Firefox's WebGPU implementation is built on, abstracts over these lower-level APIs, which may or may not be available on a given device, so that the same (often somewhat simpler) code is usable across multiple types of devices. Basically the opposite of needing CUDA.