The Untold Revolution Beneath iOS 26: WebGPU Is Coming Everywhere


In the above code, device.importExternalTexture() takes a descriptor whose source is a VideoFrame (which WebCodecs can provide by decoding video, or which can be constructed from a media element) and creates a GPUExternalTexture that WebGPU shaders can read from. This allows a near zero-copy pipeline: the video frame stays in GPU memory. The WebGPU shader might, for example, sample this texture, apply a filter, and render to another texture or canvas. After processing, you can construct a new VideoFrame from the canvas the shader rendered to and feed it to a WebCodecs VideoEncoder for output (the complementary GPUQueue.copyExternalImageToTexture() covers the other direction, copying an image source into a GPUTexture). This tight integration enables high-performance video editors, AR effects, and other media apps on the web platform, rivaling native software.
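
To make the flow concrete, here is a minimal sketch of that pipeline. It assumes several things not shown above: an existing GPUDevice, a configured canvas context, a render pipeline whose fragment shader samples a texture_external at binding 1 (with a sampler at binding 0) and whose vertex shader emits a full-screen triangle, and a VideoEncoder that is already configured. Names like processFrame are illustrative.

```ts
// Sketch of the frame path described above; error handling omitted.
function processFrame(device: GPUDevice,
                      canvasContext: GPUCanvasContext,
                      pipeline: GPURenderPipeline,
                      sampler: GPUSampler,
                      frame: VideoFrame,
                      encoder: VideoEncoder) {
  // Wrap the decoded frame as a GPUExternalTexture (no copy; it stays in GPU memory).
  const external = device.importExternalTexture({ source: frame });

  // Bind it to the filter pipeline (binding layout is an assumption, see lead-in).
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: sampler },
      { binding: 1, resource: external },
    ],
  });

  // Run the filter pass into the canvas.
  const commands = device.createCommandEncoder();
  const pass = commands.beginRenderPass({
    colorAttachments: [{
      view: canvasContext.getCurrentTexture().createView(),
      loadOp: 'clear',
      storeOp: 'store',
    }],
  });
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.draw(3); // full-screen triangle generated by the (assumed) vertex shader
  pass.end();
  device.queue.submit([commands.finish()]);

  // Re-wrap the processed pixels as a VideoFrame and hand it to WebCodecs.
  const processed = new VideoFrame(canvasContext.canvas, { timestamp: frame.timestamp });
  encoder.encode(processed);
  processed.close();
  frame.close(); // release the decoded frame promptly
}
```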

Relation to FFmpeg

To appreciate how far web media processing has come, it’s worth mentioning FFmpeg – the ubiquitous open-source multimedia framework that underpins much of the internet’s video and audio. FFmpeg (and its libraries such as libavcodec) is essentially the backbone of video processing in many environments, providing encoding and decoding for virtually every format. WebCodecs is, in spirit, a browser-friendly manifestation of FFmpeg’s capabilities; many of the codec implementations that browsers expose through WebCodecs are influenced by, or build directly on, the decades of codec development that FFmpeg represents.

WebCodecs is fundamentally based on the same principles as FFmpeg, offering low-level access to encode/decode operations for video and audio. This means web developers can now do, in the browser, many tasks that previously required native applications or server-side processing with FFmpeg. For example, YouTube’s and Netflix’s servers use FFmpeg to transcode videos; with WebCodecs, a web app could transcode or manipulate video streams on the client side, using the user’s hardware acceleration.
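
As a rough illustration of that client-side transcoding idea, the sketch below decodes H.264 chunks and re-encodes them as VP9. It assumes the compressed chunks were already demuxed from their container (WebCodecs does not parse MP4/WebM itself, so a demuxer such as mp4box.js would supply the EncodedVideoChunks), that the source is 1280×720, and it ignores backpressure; codec strings and bitrate are illustrative.

```ts
// Sketch of a client-side transcode: decode H.264, re-encode as VP9.
async function transcode(chunks: EncodedVideoChunk[],
                         decoderConfig: VideoDecoderConfig,
                         onOutput: (c: EncodedVideoChunk) => void) {
  const encoder = new VideoEncoder({
    output: (chunk) => onOutput(chunk),       // hand encoded chunks to a muxer
    error: (e) => console.error('encode error', e),
  });
  encoder.configure({
    codec: 'vp09.00.10.08',                   // VP9 profile 0
    width: 1280,                              // assumed to match the source
    height: 720,
    bitrate: 2_000_000,
    framerate: 30,
  });

  const decoder = new VideoDecoder({
    output: (frame) => {                      // decoded frames stay on the GPU when possible
      encoder.encode(frame);
      frame.close();
    },
    error: (e) => console.error('decode error', e),
  });
  decoder.configure(decoderConfig);           // e.g. { codec: 'avc1.64001f', ... }

  for (const chunk of chunks) decoder.decode(chunk);
  await decoder.flush();
  await encoder.flush();
}
```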

FFmpeg’s power lies in its support for an enormous array of codecs and formats, making it a universal toolkit for multimedia. Similarly, WebCodecs provides a standardized way to access common codecs (like H.264, VP8/VP9, AV1 for video, AAC/Opus for audio, etc.) directly. By leveraging the browser’s built-in media codecs (often hardware-accelerated), WebCodecs can approach FFmpeg’s efficiency, but within the sandboxed, portable context of the web.
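
Which of those codecs is actually hardware-accelerated varies by device, so a common pattern is to feature-detect a configuration before committing to it. A small sketch (codec strings and the fallback choice are illustrative):

```ts
// Ask the browser whether it can hardware-accelerate a given encode config
// before using it.
const candidate: VideoEncoderConfig = {
  codec: 'av01.0.04M.08',                     // AV1 main profile, level 3.0, 8-bit
  width: 1920,
  height: 1080,
  bitrate: 5_000_000,
  hardwareAcceleration: 'prefer-hardware',
};

const { supported } = await VideoEncoder.isConfigSupported(candidate);
if (!supported) {
  candidate.codec = 'avc1.640028';            // fall back to widely accelerated H.264
}
```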

The maintainers of FFmpeg are often called unsung heroes of the modern digital world – they have encoded into software an incredible wealth of multimedia knowledge: highly optimized codec implementations (many written in low-level C or assembly for performance), algorithms for filtering and processing, and more. It’s worth acknowledging that without FFmpeg’s legacy, WebCodecs likely wouldn’t be as robust as it is. Many codec implementations in browsers are built on the same concepts or even share lineage with FFmpeg’s codec libraries. (Of course, codecs come with licensing considerations, which the web platform navigates by using system-provided codecs when possible.)

With WebGPU and WebCodecs now in the browser, many capabilities traditionally associated with FFmpeg can be achieved on the client side. Developers can build web apps that do video editing, format conversion, streaming with adaptive bitrates, etc., all in-browser. This reduces server load and allows for more interactive, low-latency experiences. For instance, a web-based video editor might use WebCodecs to decode a video file to frames, allow the user to trim or add effects via WebGPU shaders, then encode the result – entirely without uploading the footage to a server or requiring any installation. The performance is boosted by hardware acceleration at every step, akin to what a native app using FFmpeg might do with a GPU-enabled codec.

Moreover, WebCodecs gives developers granular control: you can get each raw frame as a VideoFrame object, and even access the bytes if needed (for advanced custom processing or feeding into WebAssembly, etc.). This granularity means custom effects or filters can be implemented by the app (using WebGPU for speed, or WASM if needed), rather than being limited to what a video element can do. It truly puts the power in developers’ hands, analogous to what FFmpeg offers to native developers.
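
For instance, copying a frame’s raw bytes out for WebAssembly or custom CPU-side processing takes only a few calls on the VideoFrame itself. A minimal sketch – note the copy arrives in the frame’s native pixel format (often a planar YUV layout such as NV12), which the downstream code must interpret:

```ts
// Pull a decoded frame's raw pixel data into an ArrayBuffer, e.g. to feed a
// WebAssembly filter.
async function frameToBytes(frame: VideoFrame): Promise<Uint8Array> {
  const size = frame.allocationSize();        // bytes needed for a full copy
  const buffer = new Uint8Array(size);
  const layout = await frame.copyTo(buffer);  // per-plane offsets and strides
  console.log(frame.format, layout);          // e.g. 'NV12', [{offset, stride}, …]
  return buffer;
}
```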

In summary, the synergy between WebCodecs and WebGPU empowers web developers to create rich, media-intensive applications that are both powerful and efficient. They build on proven concepts from FFmpeg’s world, but bring those to the client side. By doing so, the web becomes a competitive platform for multimedia processing, capable of meeting the performance demands of modern video and audio applications.

Framework Support: Three.js and Beyond

The arrival of WebGPU is also a boon for popular web graphics frameworks like Three.js, Babylon.js, PlayCanvas, and others. These libraries, which previously ran on top of WebGL, are in the process of adopting WebGPU to leverage its advanced features and performance. In fact, Safari’s developers noted that many frameworks already work great with WebGPU on iOS 26, including Three.js, Babylon.js, Unity’s web export, PlayCanvas, and even AI libraries like Transformers.js and ONNX Runtime.

Here’s how WebGPU enhances these frameworks and what it means for developers and users:

  1. Enhanced 3D Rendering (Three.js, Babylon.js): WebGPU opens the door for more detailed and realistic graphics in the browser. Three.js can now utilize compute shaders and more flexible rendering pipelines. For example, advanced techniques like real-time global illumination, physics-based rendering with many lights, or even ray tracing effects can be more feasible. With WebGL, doing these at scale was hard due to API limitations; WebGPU’s flexibility (GPUs can be used for general computation tasks like light bouncing calculations) means Three.js scenes can reach new levels of fidelity. Early demos show complex scenes with tens of thousands of objects or high-resolution textures running more smoothly under WebGPU. And because WebGPU gives more control over memory and scheduling, frameworks can optimize resource usage better, leading to higher frame rates.
  2. Improved Performance and Efficiency: WebGPU’s API is closer to how modern GPUs natively operate, which means less overhead and more consistent performance. Under heavy workloads, a WebGPU-based app can maintain smoother animations and lower CPU usage compared to WebGL. For users, this means richer visuals that don’t drain the battery as fast (especially on mobile devices where efficiency is critical). For developers, it means the headroom to add more features – for instance, a game can have more complex physics or AI running alongside rendering, since the GPU can take more of the rendering burden.
  3. Expanded Capabilities (Compute & GPGPU): Unlike WebGL, WebGPU is not limited to drawing triangles. It treats the GPU as a general compute device. Frameworks can now incorporate GPGPU (general-purpose GPU computing) for things like particle systems, physics simulations, or AI right into the graphics pipeline. For example, a physics engine could run a compute shader to update particle positions or rigid body simulations each frame entirely on the GPU, then the rendering step just reads the results. This was possible in WebGL with hacks (using textures as data buffers), but WebGPU makes it straightforward and much faster with proper storage buffers and compute passes (a minimal sketch of this pattern follows the list). This also benefits non-graphics frameworks – for instance, TensorFlow.js can offload neural network ops to WebGPU, speeding up AI in the browser significantly.
  4. Cross-Platform Consistency: With all major browsers moving toward WebGPU (and now Safari joining), developers using frameworks like Three.js can expect consistent behavior across devices. A Three.js app utilizing WebGPU features should run similarly on Chrome/Edge (with Direct3D or Vulkan under the hood), Firefox (when enabled), and Safari (with Metal under the hood). This unified target is a relief – no more downgraded experience on iOS. It also means the community can focus on WebGPU optimizations and not maintain parallel WebGL code for iOS. As an example, Three.js’s upcoming TSL (Three.js Shading Language), a new node-based system for generating shaders, is being designed with WebGPU in mind (outputting WGSL shaders). This will simplify writing custom shaders that work across platforms, and provide TypeScript type-checking for shader code. WebGPU’s adoption ensures these efforts are worthwhile since it will be widely supported.
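
As a concrete, engine-agnostic illustration of the GPGPU point (item 3), the sketch below runs a compute pass that advances particle positions in a storage buffer each frame; a render pass can then read the same buffer with no CPU round trip. It assumes particleBuffer was created with STORAGE usage and a matching layout; names are illustrative.

```ts
// Minimal GPGPU sketch: advance particles on the GPU each frame.
const particleWGSL = /* wgsl */ `
  struct Particle { pos: vec2f, vel: vec2f };
  @group(0) @binding(0) var<storage, read_write> particles: array<Particle>;

  @compute @workgroup_size(64)
  fn step(@builtin(global_invocation_id) id: vec3u) {
    if (id.x >= arrayLength(&particles)) { return; }
    var p = particles[id.x];
    p.pos += p.vel * 0.016;          // one 60 Hz timestep
    particles[id.x] = p;
  }
`;

function makeParticleStep(device: GPUDevice, particleBuffer: GPUBuffer, count: number) {
  const pipeline = device.createComputePipeline({
    layout: 'auto',
    compute: { module: device.createShaderModule({ code: particleWGSL }), entryPoint: 'step' },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: particleBuffer } }],
  });
  // Returns a function that records the compute pass into an existing encoder,
  // so it can run right before the frame's render pass.
  return (commands: GPUCommandEncoder) => {
    const pass = commands.beginComputePass();
    pass.setPipeline(pipeline);
    pass.setBindGroup(0, bindGroup);
    pass.dispatchWorkgroups(Math.ceil(count / 64));
    pass.end();
  };
}
```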

In addition to graphics engines, game engines and other tools that export to the web (like Unity or Unreal via WebAssembly) are leveraging WebGPU to get near-native performance. Unity’s web export in 2024 has an option for WebGPU, yielding much better performance than the old WebGL path in complex 3D scenes. We can expect web content to increasingly blur the line with native apps: richer 3D games, VR/AR experiences (with WebXR + WebGPU), and heavy-duty applications like CAD or video editing apps running in-browser with the responsiveness users expect from native software.

Interoperability Between WebGPU Libraries: TypeGPU and the Cross-Library Puzzle

With multiple frameworks and libraries now running on WebGPU (for graphics, AI, etc.), a new challenge arises: interoperability. How can these different libraries work together efficiently on the GPU? For example, imagine you want to use the output of a Three.js rendering (a texture) as the input to a TensorFlow.js model (which expects a tensor). Both operate on the GPU via WebGPU, so in theory this should be smooth – just share the data in GPU memory. However, in practice, their internal representations and APIs might not line up, so connecting them is tricky.

Traditionally, there have been two ways to make disparate libraries interoperate:

  • Copy data through the CPU: Read the GPU data back to JavaScript (CPU memory), then hand it to the other library to upload again to its GPU context. This is simple from a developer standpoint (just use library APIs to get and set data) but extremely slow, often hundreds of times slower than staying on the GPU. For high-bandwidth data like images or video frames, this is usually a non-starter for real-time performance (a sketch of this readback path follows the list).
  • Write custom GPU glue code: Write a WebGPU compute shader that takes one library’s GPU buffer/texture and converts or feeds it into the format needed for the other library, without ever leaving VRAM. This is fast but hard to do in practice – it requires low-level knowledge of both libraries’ data structures, dealing with untyped memory buffers, and writing custom shader code. It’s also error-prone, because you’re essentially doing manual interoperability with little help from the type system or tooling.
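
The first option looks roughly like the following in raw WebGPU: copy the data into a mappable staging buffer, wait for the GPU, and read it back into JavaScript. It works, but the synchronization stall and the extra copies are exactly why it is usually too slow for per-frame data. (The source buffer is assumed to have COPY_SRC usage, and the data is assumed to be f32 values.)

```ts
// The "copy through the CPU" path: read a GPU buffer back into JavaScript so
// another library can re-upload it. Avoid for per-frame data.
async function readbackToJS(device: GPUDevice, src: GPUBuffer, byteLength: number) {
  // Staging buffer that the CPU is allowed to map.
  const staging = device.createBuffer({
    size: byteLength,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  });

  const commands = device.createCommandEncoder();
  commands.copyBufferToBuffer(src, 0, staging, 0, byteLength);
  device.queue.submit([commands.finish()]);

  await staging.mapAsync(GPUMapMode.READ);             // waits for the GPU to finish
  const bytes = new Float32Array(staging.getMappedRange().slice(0));
  staging.unmap();
  return bytes;                                        // now re-upload via the other library…
}
```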

Enter TypeGPU, a new TypeScript library designed to address this problem. TypeGPU aims to act as a bridge between JavaScript and WebGPU, making it easier to write GPU-side code in a type-safe, ergonomic way. One of its killer features is the ability to write “glue code” in TypeScript that gets compiled to WebGPU’s shading language (WGSL) behind the scenes. This means you could, for instance, write a function in TypeScript that takes a texture from Three.js and produces a buffer suitable for TensorFlow.js, and TypeGPU will convert that to a GPU shader that does the transfer – no CPU round trip, and with compile-time checks that you’re using the correct types/formats.

TypeGPU essentially treats the GPU like another “platform” similar to how one might treat a server in fullstack development. The analogy given by its creators is writing server and client in one language – here it’s CPU and GPU in one language (TypeScript). With modern build tooling (esbuild, Vite, etc.), TypeGPU provides a plugin where you mark certain functions with a “kernel” annotation or similar (as shown in their examples) to indicate they should run on the GPU. At build time, those functions are extracted and compiled to WGSL, and the library provides the runtime hooks to call them as GPU kernels.

Why is this powerful? Because it gives developers type-checked, high-level access to writing GPU glue code. Instead of manually aligning bytes and indices, TypeGPU can automatically handle data layout, ensure that the TypeScript view of a structure matches the shader’s view, and catch mistakes early. It’s even possible to unit test these “GPU functions” on the CPU (since they can also run as plain JS for testing). Ultimately, this lowers the barrier to having different WebGPU-using libraries interoperate. A library like TensorFlow.js could expose a TypeGPU-powered API to accept textures directly, or a game engine could let you inject custom GPU logic via TypeGPU functions without requiring you to write raw WGSL.

TypeGPU also emphasizes incremental adoption and compatibility. You don’t have to rewrite your whole app to use it – you can wrap one buffer or one part of your pipeline in TypeGPU’s typed abstractions and get immediate benefits. Under the hood, its objects correspond 1:1 with WebGPU objects (you can always “unwrap” a TypeGPU object to get the underlying GPUBuffer, GPUTexture, etc.). This means it plays nicely with existing frameworks: you can take a GPUTexture from Three.js, wrap it (or just pass it) into a TypeGPU function, do something with it, and then return a result that TensorFlow.js can consume.

A concrete scenario that TypeGPU addresses was described by its developers: pulling a texture out of Three.js and using it as a tensor in TensorFlow.js. With TypeGPU, one could write a kernel function in TS that knows the texture format from Three (say an RGBA8 unorm) and the tensor format needed (maybe an array of floats), and it could perform the conversion on GPU. The TypeGPU system would ensure the shader is correctly parameterized and that the data fed in/out matches the expected types, catching errors at compile time rather than during runtime. This approach provides the best of both worlds – the efficiency of GPU-to-GPU transfer and transformation, and the productivity of writing in a high-level language with type safety.
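
For a sense of what that glue involves when written by hand, here is roughly the kernel one would otherwise write in raw WGSL: read an RGBA8 texture and pack its RGB channels into a flat f32 storage buffer. The binding layout, the decision to drop alpha, and the row-major packing are illustrative assumptions; generating and type-checking this kind of kernel from TypeScript is precisely the boilerplate TypeGPU aims to take over.

```ts
// Hand-written glue: RGBA8 texture (e.g. a Three.js render target with
// TEXTURE_BINDING usage) -> flat f32 buffer a tensor library could consume.
const convertWGSL = /* wgsl */ `
  @group(0) @binding(0) var src: texture_2d<f32>;
  @group(0) @binding(1) var<storage, read_write> dst: array<f32>;

  @compute @workgroup_size(8, 8)
  fn main(@builtin(global_invocation_id) id: vec3u) {
    let dims = textureDimensions(src);
    if (id.x >= dims.x || id.y >= dims.y) { return; }
    let texel = textureLoad(src, vec2u(id.xy), 0);   // unorm texels read back as 0..1 floats
    let base = (id.y * dims.x + id.x) * 3u;          // row-major RGB packing, alpha dropped
    dst[base + 0u] = texel.r;
    dst[base + 1u] = texel.g;
    dst[base + 2u] = texel.b;
  }
`;
```

Dispatching this follows the same compute-pass pattern shown earlier; the point is that none of the data ever leaves GPU memory.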

Moreover, TypeGPU opens the door to a more modular GPU ecosystem: libraries can become more customizable by allowing user-defined GPU behavior. For example, a plotting library might allow users to pass in a TypeGPU kernel to define a custom point-size or color calculation (as shown in the TypeGPU demo). This way, instead of just offering static configuration options, a library could say “give me a function that computes this attribute,” and that function will run on each vertex/pixel in the GPU pipeline. This is akin to dependency injection, but for GPU code – enabling powerful extensibility without sacrificing performance.

In summary, TypeGPU and similar efforts (like Shadeup, another project for writing JS and WGSL together) are making it possible for the burgeoning world of WebGPU libraries to interoperate smoothly. They provide the glue that allows, for example, a WebGPU-based game engine, an AI library, and a video processing toolkit to all share data in GPU memory and call each other’s capabilities. As these tools mature, we can expect a WebGPU ecosystem to form where different libraries can be mixed and composed to build complex applications (think of combining a physics engine with a graphics engine with an AI decision-making module, all in the browser). And importantly, developers can do this without diving into the lowest-level details for each connection, because the type-safe layers handle it.

TypeGPU is still evolving (with an active roadmap for things like full shader language integration in TypeScript), but its existence signals that the community is aware of the interoperability challenge and is actively solving it. The end goal is that you, as a developer, can focus on your application’s logic and let these libraries handle the boilerplate of GPU communication and data marshaling. In other words, you get to use plain WebGPU and WGSL where it matters, and rely on TypeGPU to ensure everything connects correctly and efficiently.

Unlocking New Possibilities

The convergence of WebGPU, WebCodecs, and on-device AI is unlocking possibilities that were pure fantasy in the past. Here are just a few examples of what’s now feasible to build on the web, given these advancements:

  • Real-Time Video Editing: Imagine a full-featured video editor running in the browser. With WebGPU and WebCodecs, such an app can provide real-time previews of edits, apply complex filters or transitions instantly via GPU shaders, and export the final video – all without server support. This could democratize video editing by making it available on any device with a modern browser, no installations needed.
  • Interactive Live Streaming: Live streaming platforms can move beyond passive video. Using WebGPU, a streamer (or the platform) could embed real-time graphics and stats into the stream, do greenscreen background removal locally, or even incorporate viewer interactions (like showing live poll results or animations triggered by chat) rendered via GPU. Since WebCodecs allows low-latency access to frames and encoding, the streamer’s browser could handle mixing these sources and producing a single encoded stream to send out – effectively doing the job of a native streaming software like OBS, but in a web app.
  • AI-Enhanced Media Experiences: All sorts of media apps can integrate AI enhancements on the fly. For instance, a web photo editor could include an “AI auto-tune” that, with one click, analyzes and enhances a photo using a model – done locally with WebGPU. Video players might offer on-the-fly subtitle generation or translation through AI models (no more waiting for the content creator to provide subtitles). Educational videos could have an integrated AI that answers questions or generates quizzes about the video content. Augmented reality (AR) web apps can use AI to recognize objects or environments in camera input and overlay information, without needing cloud vision APIs.
  • Collaboration and Cloud Offload: Paradoxically, bringing more power to the client can also improve cloud applications. With powerful client-side processing, we might see architectures where heavy tasks are split: the user’s browser does GPU-accelerated processing on local data, while cloud services coordinate and handle lighter-weight operations. For example, a cloud gaming service could potentially use the client’s GPU for some simple prediction or input processing, reducing latency. Or a collaborative design tool could let each client run AI analyses on their piece of the project and then share results via WebRTC data channels – keeping raw data local but insights global.

In conclusion, the integration of WebGPU into all major platforms – crowned by iOS 26’s adoption – and its combination with APIs like WebCodecs and libraries like Transformers.js and TypeGPU represents a significant leap forward for web technology. We are witnessing the browser evolve into a platform capable of high-performance computing tasks that used to be reserved for native apps or powerful servers. This evolution not only enhances what existing web apps can do but also democratizes access to advanced tools: anyone with a browser can leverage GPU computing, AI, and rich media processing without special hardware or software installs.

The web community is rapidly building on these capabilities. As developers, we now have the tools to create a new class of web-native media applications that are both powerful and accessible. For users, it means the convenience of the web (instant access, no installs, cross-device availability) with the power and richness of native experiences. The puzzle pieces – WebGPU, WebCodecs, on-device AI, and interoperability frameworks – have finally come together, paving the way for innovative digital experiences that will define the next era of the web.
