WebGPU is the new WebGL. That means it is the new way to draw 3D in web browsers. It is, in my opinion, very good actually. It is so good I think it will also replace Canvas and become the new way to draw 2D in web browsers. In fact it is so good I think it will replace Vulkan as well as normal OpenGL, and become just the standard way to draw, in any kind of software, from any programming language. This is pretty exciting to me. WebGPU is a little bit irritating— but only a little bit, and it is massively less irritating than any of the things it replaces.
WebGPU goes live… today, actually. Chrome 113 shipped in the final minutes of me finishing this post and should be available in the "About Chrome" dialog right this second. If you click here, and you see a rainbow triangle, your web browser has WebGPU. By the end of the year WebGPU will be everywhere, in every browser. (All of this refers to desktop computers. On phones, it won't be in Chrome until later this year; and Apple I don't know. Maybe one additional year after that.)
If you are not a programmer, this probably doesn't affect you. It might get us closer to a world where you can just play games in your web browser as a normal thing like you used to be able to with Flash. But probably not because WebGL wasn't the only problem there.
If you are a programmer, let me tell you what I think this means for you.
Sections below:
- A history of graphics APIs (You can skip this)
- What's it like?
- How do I use it?
- Typescript / NPM world
- I don't know what a NPM is I Just wanna write CSS and my stupid little script tags
- Rust / C++ / Posthuman Intersecting Tetrahedron
A history of graphics APIs (You can skip this)
Let's talk about shaders. The original OpenGL was a "fixed function renderer", meaning someone had written down the steps in a 3D renderer and it performed those steps in order.
Each box in the "pipeline" had some dials on the side so you could configure how each feature behaved, but you were pretty much limited to the features the card vendor gave you. If you had shadows, or fog, it was because OpenGL or an extension had exposed a feature for drawing shadows or fog. What if you want some other feature the ARB didn't think of, or want to do shadows or fog in a unique way that makes your game look different from other games? Sucks to be you. This was obnoxious, so eventually "programmable shaders" were introduced. Notice some of the boxes above are yellow? Those boxes became replaceable. The (1) boxes got collapsed into the "Vertex Shader", and the (2) boxes became the "Fragment Shader"². The software would upload a computer program in a simple C-like language (upload the actual text of the program, you weren't expected to compile it like a normal program)³ into the video driver at runtime, and the driver would convert that into configurations of ALUs (or whatever the card was actually doing on the inside) and your program would become that chunk of the pipeline. This opened things up a lot, but more importantly it set card design on a kinda strange path. Suddenly video cards weren't specialized rendering tools anymore. They ran software.
The way out came from, this is how I see it anyway, a short-lived idea called "AZDO". This technically consisted of a single GDC talk⁵, and I have no reason to believe the GDC talk originated the idea, but what the talk did do is give a name to the idea that underlies Vulkan, DirectX 12, and Metal. "Approaching Zero Driver Overhead". Here is the idea: By 2015 video cards had pretty much standardized on a particular way of working and that way was known and that way wasn't expected to change for ten years at least. Graphics APIs were originally designed around the functionality they exposed, but that functionality hadn't been a 1:1 map to how GPUs look on the inside for ten years at least. Drivers had become complex beasts that rather than just doing what you told them tried to intuit what you were trying to do and then do that in the most optimized way, but often they guessed wrong, leaving software authors in the ugly position of trying to intuit what the driver would intuit in any one scenario. AZDO was about threading your way through the needle of the graphics API in such a way your function calls happened to align precisely with what the hardware was actually doing, such that the driver had nothing to do and stuff just happened.
So this is all very cool. There is only one problem, which is that with all this fine-grained complexity, Vulkan winds up being basically impossible for humans to write. Actually, that's not really fair. DX12 and Metal offer more or less the same degree of fine-grained complexity, and by all accounts they're not so bad to write. The actual problem is that Vulkan is not designed for humans to write. Literally. Khronos does not want you to write Vulkan, or rather, they don't want you to write it directly. I was in the room when Vulkan was announced, across the street from GDC in 2015, and what they explained to our faces was that game developers were increasingly not actually targeting the gaming API itself, but rather targeting high-level middleware, Unity or Unreal or whatever, and so Vulkan was an API designed for writing middleware. The middleware developers were also in the room at the time, the Unity and Epic and Valve guys. They were beaming as the Khronos guy explained this. Their lives were about to get much, much easier.
My life was about to get harder. Vulkan is weird— but it's weird in a way that makes a certain sort of horrifying machine sense. Every Vulkan call involves passing in one or two huge structures which are themselves a forest of other huge structures, and every structure and sub-structure begins with a little protocol header explaining what it is and how big it is. Before you allocate memory you have to fill out a structure to get back a structure that tells you what structure you're supposed to structure your memory allocation request in. None of it makes any sense— unless you've designed a programming language before, in which case everything you're reading jumps out to you as "oh, this is contrived like this because it's designed to be easy to bind to from languages with weird memory-management techniques" "this is a way of designing a forward-compatible ABI while making no assumptions about programming language" etc. The docs are written in a sort of alien English that fosters no understanding— but it's also written exactly the way a hardware implementor would want in order to remove all ambiguity about what a function call does. In short, Vulkan is not for you. It is a byzantine contract between hardware manufacturers and middleware providers, and people like… well, me, are just not part of the transaction.
Khronos did not forget about you and me. They just made a judgement, and this actually does make a sort of sense, that they were never going to design the perfectly ergonomic developer API anyway, so it would be better to not even try and instead make it as easy as possible for the perfectly ergonomic API to be written on top, as a library. Khronos thought within a few years of Vulkan⁸ being released there would be a bunch of high-quality open source wrapper libraries that people would use instead of Vulkan directly. These libraries basically did not materialize. It turns out writing software is work and open source projects do not materialize just because people would like them to⁹.
With Apple out, the scene looked different. Suddenly there was a next-gen API for Windows, a next-gen API for Mac/iPhone, and a next-gen API for Linux/Android. Except Linux has a severe driver problem with Vulkan and a lot of the Linux devices I've been checking out don't support Vulkan even now after it's been out seven years. So really the only platform where Vulkan runs natively is Android. This isn't that bad. Vulkan does work on Windows and there are mostly no problems, though people who have the resources to write a DX12 backend seem to prefer doing so. The entire point of these APIs is that they're flyweight things resting very lightly on top of the hardware layer, which means they aren't really that different, to the extent that a Vulkan-on-Metal emulation layer named MoltenVK exists and reportedly adds almost no overhead. But if you're an open source kind of person who doesn't have the resources to pay three separate people to write vaguely-similar platform backends, this isn't great. Writing Vulkan your code can technically run on all platforms, but you're writing in the least pleasant of the three APIs to work with and you get the advantage of using a true-native API on neither of the two major platforms. You might even have an easier time just writing DX12 and Metal and forgetting Vulkan (and Android) altogether. In short, Vulkan solves all of OpenGL's problems at the cost of making something that no one wants to use and no one has a reason to use.
The way out turned out to be something called ANGLE. Let me back up a bit.
But nobody actually wants to use WebGL, right? We want a "modern" API, one of those AZDO thingies. So a W3C working group sat down to make Web Vulkan, which they named WebGPU. I'm not sure my perception of events is to be trusted, but my perception of how this went from afar was that Apple was the most demanding participant in the working group, and also the participant everyone would naturally by this point be most afraid of just spiking the entire endeavor, so reportedly Apple just got absolutely everything they asked for and WebGPU really looks a lot like Metal. But Metal was always reportedly the nicest of the three modern graphics APIs to use, so that's… good? Encouraged by the success with ANGLE (which by this point was starting to see use as a standalone library in non-web apps¹¹), and mindful people would want to use this new API with WebASM, they took the step of defining the standard simultaneously as a JavaScript IDL and a C header file, so non-browser apps could use it as a library.
I feel like WebGPU is what I've been waiting for this entire time.
What's it like?
I can't compare to DirectX or Metal, as I've personally used neither. But especially compared to OpenGL and Vulkan, I find WebGPU really refreshing to use. I have tried, really tried, to write Vulkan, and been defeated by the complexity each time. By contrast WebGPU does a good job of adding complexity only when the complexity adds something. There are a lot of different objects to keep track of, especially during initialization (see below), but every object represents some Real Thing that I don't think you could eliminate from the API without taking away a useful ability. (And there is at least the nice property that you can stuff all the complexity into init time and make the process of actually drawing a frame very terse.) WebGPU caters to the kind of person who thinks it might be fun to write their own raymarcher, without requiring every programmer to be the kind of person who thinks it would be fun to write their own implementation of malloc
.
The Problems
There are three Problems. I will summarize them thusly:
- Text
- Lines
- The Abomination
Text and lines are basically the same problem. WebGPU kind of doesn't… have them. It can draw lines, but they're only really for debugging– single-pixel width and you don't have control over antialiasing. So if you want a "normal looking" line you're going to be doing some complicated stuff with small bespoke meshes and an SDF shader. Similarly with text, you will be getting no assistance– you will be parsing OTF font files yourself and writing your own MSDF shader, or more likely finding a library that does text for you.
This (no lines or text unless you implement it yourself) is a totally normal situation for a low-level graphics API, but it's a little annoying to me because the web browser already has a sophisticated anti-aliased line renderer (the original Canvas API) and the most advanced text renderer in the world. (There is some way to render text into a Canvas API texture and then transfer the Canvas contents into WebGPU as a texture, which should help for some purposes.)
Then there's WGSL, or as I think of it, The Abomination. You will probably not be as annoyed by this as I am. Basically: One of the benefits of Vulkan is that you aren't required to use a particular shader language. OpenGL uses GLSL, DirectX uses HLSL. Vulkan used a bytecode, called SPIR-V, so you could target it from any shader language you wanted. WebGPU was going to use SPIR-V, but then Apple said no¹². So now WebGPU uses WGSL, a new thing developed just for WebGPU, as its only shader language. As far as shader languages go, it is fine. Maybe it is even good. I'm sure it's better than GLSL. For pure JavaScript users, it's probably objectively an improvement to be able to upload shaders as text files instead of having to compile to bytecode. But gosh, it would have been nice to have that choice! (The "desktop" versions of WebGPU still keep SPIR-V as an option.)
How do I use it?
You have three choices for using WebGPU: Use it in JavaScript in the browser, use it in Rust/C++ in WebASM inside the browser, or use it in Rust/C++ in a standalone app. The Rust/C++ APIs are as close to the JavaScript version as language differences will allow; the in-browser/out-of-browser APIs for Rust and C++ are identical (except for standalone-specific features like SPIR-V). In standalone apps you embed the WebGPU components from Chrome or Firefox as a library; your code doesn't need to know if the WebGPU library is a real library or if it's just routing through your calls to the browser.
Regardless of language, the official WebGPU spec document on w3.org is a clear, readable reference guide to the language, suitable for just reading in a way standard specifications sometimes aren't. (I haven't spent as much time looking at the WGSL spec but it seems about the same.) If you get lost while writing WebGPU, I really do recommend checking the spec.
Most of the "work" in WebGPU, other than writing shaders, consists of the construction (when your program/scene first boots) of one or more "pipeline" objects, one per "pass", which describe "what shaders am I running, and what kind of data can get fed into them?"¹³. You can chain pipelines end-to-end within a queue: have a compute pass generate a vertex buffer, have a render pass render into a texture, do a final render pass which renders the computed vertices with the rendered texture.
Here, in diagram form, are all the things you need to create to initially set up WebGPU and then draw a frame. This might look a little overwhelming. Don't worry about it! In practice you're just going to be copying and pasting a big block of boilerplate from some sample code. However at some point you're going to need to go back and change that copypasted boilerplate, and then you'll want to come back and look up what the difference between any of these objects is.
Some observations in no particular order:
- When describing a "mesh" (a 3D model to draw), a "vertex" buffer is the list of points in space, and the "index" is an optional buffer containing the order in which to draw the points. Not sure if you knew that.
- Right now the "queue" object seems a little pointless because there's only ever one global queue. But someday WebGPU will add threading and then there might be more than one.
- A command encoder can only be working on one pass at a time; you have to mark one pass as complete before you request the next one. But you can make more than one command encoder and submit them all to the queue at once.
- Back in OpenGL when you wanted to set a uniform, attribute, or texture on a shader, you did it by name. In WebGPU you have to assign these things numbers in the shader and you address them by number.¹⁴
- Although textures and buffers are two different things, you can instruct the GPU to just turn a texture into a buffer or vice versa.
- I do not list "pipeline layout" or "bind group layout" objects above because I honestly don't understand what they do. I've only ever set them to default/blank.
- In the Rust API, a "Context" is called a "Surface". I don't know if there's a difference.
Getting a little more platform-specific:
TypeScript / NPM world
The best way to learn WebGPU for TypeScript I know is Alain Galvin's "Raw WebGPU" tutorial. It is a little friendlier to someone who hasn't used a low-level graphics API before than my sandbag introduction above, and it has a list of further resources at the end.
Since code snippets don't get you something runnable, Alain's tutorial links a completed source repo with the tutorial code, and also I have a sample repo which is based on Alain's tutorial code and adds simple animation as well as Preact¹⁵. Both my and Alain's examples use NPM and WebPack¹⁶.
If you don't like TypeScript: I would recommend using TypeScript anyway for WGPU. You don't actually have to add types to anything except your WGPU calls, you can type everything "any". But building that pipeline object involves big trees of descriptors containing other descriptors, and it's all just plain JavaScript dictionaries, which is nice, until you misspell a key, or forget a key, or accidentally pass the GPUPrimitiveState table where it wanted the GPUVertexState table. Your choices are to let TypeScript tell you what errors you made, or be forced to reload over and over watching things break one at a time.
I don't know what a NPM is I Just wanna write CSS and my stupid little script tags
If you're writing simple JS embedded in web pages rather than joining the NPM hivemind, honestly you might be happier using something like three.js¹⁷ in the first place, instead of putting up with WebGPU's (relatively speaking) hyper-low-level verbosity. You can include three.js directly in a script tag using existing CDNs (although I would recommend putting in a subresource SHA hash to protect yourself from the CDN going rogue).
But! If you want to use WebGPU, Alain Galvin's tutorial, or renderer.ts from his sample code, still gets you what you want. Just go through and anytime there's a little : GPUBlah
wart on a variable delete it and the TypeScript is now JavaScript. And as I've said, the complexity of WebGPU is mostly in pipeline init. So I could imagine writing a single <script>
that sets up a pipeline object that is good for various purposes, and then including that script in a bunch of small pages that each import¹⁸ the pipeline, feed some floats into a buffer mapped range, and draw. You could do the whole client page in like ten lines probably.
Rust
So as I've mentioned, one of the most exciting things about WebGPU to me is you can seamlessly cross-compile code that uses it without changes for either a browser or for desktop. The desktop code uses library-ized versions of the actual browser implementations so there is low chance of behavior divergence. If "include part of a browser in your app" makes you think you're setting up for a code-bloated headache, not in this case; I was able to get my Rust "Hello World" down to 3.3 MB, which isn't much worse than SDL, without even trying. (The browser hello world is like 250k plus a 50k autogenerated loader, again before I've done any serious minification work.)
If you want to write WebGPU in Rust¹⁹, I'd recommend checking out this official tutorial from the wgpu project, or the examples in the wgpu source repo. As of this writing, it's actually a lot easier to use Rust WebGPU on desktop than in browser; the libraries seem to mostly work fine on web, but the Rust-to-wasm build experience is still a bit rough. I did find a pretty good tutorial for wasm-pack here²⁰. However most Rust-on-web developers seem to use (and love) something called "Trunk". I haven't used Trunk yet but it replaces wasm-pack as a frontend, and seems to address all the specific frustrations I had with wasm-pack.
I do have also a sample Rust repo I made for WebGPU, since the examples in the wgpu repo don't come with build scripts. My sample repo is very basic²¹ and is just the "hello-triangle" sample from the wgpu project but with a Cargo.toml added. It does come with working single-line build instructions for web, and when run on desktop with --release
it minimizes disk usage. (It also prints an error message when run on web without WebGPU, which the wgpu sample doesn't.) You can see this sample's compiled form running in a browser here.
C++
If you're using C++, the library you want to use is called "Dawn". I haven't touched this but there's an excellently detailed-looking Dawn/C++ tutorial/intro here. Try that first.
Posthuman Intersecting Tetrahedron
I have strange, chaotic daydreams of the future. There's an experimental project called rust-gpu that can compile Rust to SPIR-V. SPIR-V to WGSL compilers already exist, so in principle it should already be possible to write WebGPU shaders in Rust, it's just a matter of writing build tooling that plugs the correct components together. (I do feel, and complained above, that the WGSL requirement creates a roadblock for use of alternate shader languages in dynamic languages, or languages like C++ with a broken or no build system— but Rust is pretty good at complex pre-build processing, so as long as you're not literally constructing shaders on the fly then probably it could make this easy.)
I imagine a pure-Rust program where certain functions are tagged as compile-to-shader, and I can share math helper functions between my shaders and my CPU code, or I can quickly toggle certain functions between "run this as a filter before writing to buffer" or "run this as a compute shader" depending on performance considerations and whim. I have an existing project that uses compute shaders and answering the question "would this be faster on the CPU, or in a compute shader?"²² involved writing all my code twice and then writing complex scaffold code to handle switching back and forth. That could have all been automatic. Could I make things even weirder than this? I like Rust for low-level engine code, but sometimes I'd prefer to be writing TypeScript for business logic/"game" code. In the browser I can already mix Rust and TypeScript, there's copious example code for that. Could I mix Rust and TypeScript on desktop too? If wgpu is already my graphics engine, I could shove in Servo or QuickJS or something, and write a cross-platform program that runs in browser as TypeScript with wasm-bindgen Rust embedded inside or runs on desktop as Rust with a TypeScript interpreter inside. Most Rust GUI/game libraries work in wasm already, and there's this pure Rust WebAudio implementation (it's currently not a drop-in replacement for wasm-bindgen WebAudio but that could be fixed). I imagine creating a tiny faux-web game engine that is all the benefits of Electron without any the downsides. Or I could just use Tauri for the same thing and that would work now without me doing any work at all.
Could I make it weirder than that? WebGPU's spec is available as a machine-parseable WebIDL file; would that make it unusually easy to generate bindings for, say, Lua? If I can compile Rust to WGSL and so write a pure-Rust-including-shaders program, could I compile TypeScript, or AssemblyScript or something, to WGSL and write a pure-TypeScript-including-shaders program? Or if what I care about is not having to write my program in two languages and not so much which language I'm writing, why not go the other way? Write an LLVM backend for WGSL, compile it to native+wasm and write an entire-program-including-shaders in WGSL. If the w3 thinks WGSL is supposed to be so great, then why not?
Okay that's my blog post.
¹ 113 or newer
² "Fragment" is OpenGL for "Pixel".
³ I am still trying to figure out whether modern video cards are simply based on the internal architecture of Quake 3.
⁴ And those coordinates HAD to describe triangles, now. Want to draw a rectangle? Fuck you, apparently!
⁵ (And a series of OpenGL techniques and extensions no one seems to have really got the chance to use before OpenGL was sunset.)
⁶ Why is a "push constant" different from a "uniform", in Vulkan/WebGPU? Well, because those are two different things inside of the GPU chip. Why would you use one rather than the other? Well, learn what the GPU chip is doing, and then you'll understand why either of these might be more appropriate in certain situations. Does this sound like a lot of mental overhead? Well, sometimes, but honestly, it's less mental overhead than trying to understand whatever "VAO"s were. (EDIT: It turns out WebGPU doesn't have push constants. I only thought they did because the Rust WebGPU library offers them as a desktop-specific extension. Anyway, it's an example.)
⁷ Require
⁸ By the way, have you noticed the cheesy Star Trek joke yet? The companies with seats on the Khronos board have a combined market capitalization of 6.1 trillion dollars. This is the sense of humor that 6.1 trillion dollars buys you.
⁹ There are decent Vulkan-based OSS game engines, though. LÖVR, the Lua-based game engine I use for my job, has a very nice pared-down Lua frontend on top of its Vulkan backend that is usable by beginners but exposes most of the GPU flexibility you actually care about. (The Lua API is also itself a thin wrapper atop a LÖVR-specific C API, and the graphics module is designed to be separable from LÖVR in principle, so if I didn't have WebGPU I'd actually probably be using LÖVR's C frontend even outside Lua now.)
¹⁰ This made OpenGL's fragmentation problem even worse, as the "final" form of OpenGL is basically version 4.4-4.6 somewheres, whereas Apple got to 4.1 and simply stopped. So if you want to release OpenGL software on a Mac, for however longer that's allowed, you are targeting something that is almost, but not quite, the final full-featured version of the API. This sucks! There is some important stuff in 4.3.
¹¹ Microsoft shipped ANGLE in Windows 11 as the OpenGL component of their Android compatibility layer, and ANGLE has also been shipped as the graphics engine in a small number of games such as, uh… [checking Wikipedia] Shovel Knight?! You might see it used more if ANGLE had been designed for library reuse from day one like WebGPU was, or if anyone wanted to use OpenGL.
¹² If I were a cynical, paranoid conspiracy theorist, I would float the theory here that Apple at some point decided they wanted to leave open the capability to sue the other video card developers on the Khronos board, so they are aggressively refusing to let their code touch anything that has touched the Vulkan patent pool to insulate themselves from counter-suits. Or that is what I would say if I were a cynical, paranoid conspiracy theorist. Hypothetically.
¹³ If you pay close attention here you'll notice something weird: Pipelines combine buffer interfaces with specific shaders, so you can use a single pipeline with many different buffers but only one shader or shader pair. What early users of both WebGPU and Vulkan have found is that you wind up needing a lot of pipeline objects in a fair-sized program, and although the pipeline objects themselves are lightweight, creating the pipeline objects can be kind of slow, especially if you have to create more than one of them on a single frame. So this is an identified pain point, having to think ahead to all the pipeline objects you'll need and cache them ahead of time, and Vulkan has already tried to address this by introducing something called "shader objects" like one month ago. Hopefully the WebGPU WG will look into doing something similar in the next revision.
¹⁴ This annoys me, but I've talked to people who like it better, I guess because they had problems with typo'ing their uniform names.
¹⁵ This sample is a little less complete than I hoped to have it by the time I posted this. Known problems as of this second: It comes with a Preact Canvas wrapper that enforces aspect ratio and integer-multiple size requirements for the canvas, but it doesn't have an option to run full screen; there are unnecessary scroll bars that appear if you open the sample in a non-WebGPU browser (and possibly under other circumstances as well); there is an unused file named "canvas2image.ts", which was supposed to be used to let you download the state as a PNG and ought to be either wired up or removed; if you do add canvas2image back in it doesn't work, and I don't know if the problem is at my end or Chrome's; the comments refer to some concepts from 2021 WebGPU, like swapchains.
¹⁶ If you don't like WebPack, that implies you know enough about JavaScript you already know how to replace the WebPack in the example with something else.
¹⁷ Not a specific three.js endorsement. I've never used it. People seem to like it. There (BabylonJS) are (RedGPU) alternatives (PlayCanvas, which by the way is incredibly cool).
¹⁸ Wait, do JS modules/import
just work in browsers now? I don't even know lol
¹⁹ If you're using Rust, it's quite possible that you are using WebGPU already. The Rust library quickly got far ahead of its Firefox parent software and has for some time now already been adopted as the base graphics layer in emerging GUI libraries such as Iced. So you could maybe just use Iced or Bevy for high-level stuff and then do additional drawing in raw WebGPU. I haven't tried.
²⁰ Various warnings if you go this way: If you're on Windows I recommend installing the wasm-pack binary package instead of trying to install it through cargo. If you're making a web build from scratch instead of using my sample, note the slightly alarming "as of 2022-9-20" note here in the wgpu wiki.
²¹ This sample also has as of this writing some caveats: It can only fill the window, it can't do aspect ratios or integer-multiple restrictions; it has no animation; in order to get the fill-the-window behavior, I had to base it on a winit PR, so the version of winit used is a little older than it could be; there are outstanding warnings; I am unclear on the license status of the wgpu sample code I used, so until I can get clarification or rewrite it you should probably follow the wgpu MIT license even when using this sample on web. I plan to eventually expand this example to include controller support and sound.
²² Horrifyingly, the answer turned out to be "it depends on which device you're running on".