Volumetric 3D video streaming explained for game developers

What is volumetric video?

Imagine capturing not just the visual appearance of a person or object, but its complete three-dimensional form in space. That’s the essence of volumetric video—a technology that records subjects from multiple angles simultaneously, creating true 3D representations that viewers can experience from any perspective.

Unlike traditional 3D or 360° video, which locks viewers to fixed camera positions, volumetric video enables six degrees of freedom (6DoF), allowing users to move around, behind, or even through captured content in virtual environments. This creates a fundamentally different experience from conventional video formats.

Think of it like the difference between watching a basketball game on TV versus having the ability to teleport anywhere in the arena during live play—you’re no longer bound to the camera director’s choices but free to explore the space however you wish.

As explained by AdTech, “Volumetric video captures the full three-dimensional space, including depth and spatial data, enabling truly dynamic perspective shifts that respond to viewer movement.”

How volumetric video capture works

Volumetric video creation involves several sophisticated technical processes:

  1. Multi-camera capture - Arrays of synchronized cameras (typically 30-200) record subjects from multiple angles simultaneously. These cameras must be precisely calibrated to ensure spatial accuracy.

[Illustration: a circular multi-camera array on tripods, with depth sensors, surrounding a subject on a stage.]

  2. Depth sensing - Specialized sensors (LiDAR, infrared, structured light, etc.) capture spatial information by measuring distances to surfaces, creating point clouds that represent the 3D structure of the scene.

  3. Data processing - Specialized algorithms reconstruct 3D geometry and textures from the combined camera and depth data. This process involves complex computational techniques like photogrammetry, stereo reconstruction, and surface meshing.

  4. Compression - Raw data (which can exceed 1 Tbps) is compressed to manageable streaming rates using specialized codecs designed for volumetric content. This is where much of the innovation in the field is happening today.

[Illustration: a data pipeline from raw volumetric point clouds through compression to cloud delivery on a game console and VR headset.]

The Volumetric Format Association notes that properly compressed volumetric streams typically range from 30-80 Mbps per subject—still significantly higher than conventional video but feasible for modern networks.
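These figures can be sanity-checked with back-of-envelope arithmetic. The sketch below estimates the raw capture rate of a camera array and the compression ratio needed to reach a streamable bitrate; the 100-camera, 4K, 60 fps rig is an illustrative assumption, not a specific product:

```python
def raw_capture_rate_gbps(cameras, width, height, fps, bits_per_pixel=24):
    """Raw (uncompressed) RGB capture rate of a camera array, in Gbit/s."""
    return cameras * width * height * fps * bits_per_pixel / 1e9

def required_compression_ratio(raw_gbps, target_mbps):
    """How much the raw feed must shrink to hit a target streaming bitrate."""
    return raw_gbps * 1000 / target_mbps

# Illustrative rig: 100 synchronized 4K cameras at 60 fps, 24-bit RGB.
raw = raw_capture_rate_gbps(cameras=100, width=3840, height=2160, fps=60)
print(f"raw capture: {raw:.0f} Gbps")    # roughly 1194 Gbps, i.e. over 1 Tbps
print(f"needed ratio: {required_compression_ratio(raw, target_mbps=50):.0f}:1")
```

The result makes the scale of the problem concrete: reaching a 50 Mbps stream from this rig requires shrinking the data by a factor in the tens of thousands, which is why volumetric codecs are such an active area of work.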

Key challenges in volumetric streaming

Data volume and bandwidth

The sheer amount of data in volumetric video presents the biggest challenge. A single uncompressed volumetric capture can generate over 1 terabit per second of raw data. Even after compression, streams require 30-80 Mbps per subject—much higher than the 5-15 Mbps needed for high-quality 2D video.

For game developers, this means carefully considering:

  • Network requirements for multiplayer experiences where volumetric content might be shared
  • Storage needs for volumetric assets, which can quickly consume gigabytes for even short sequences
  • Optimization techniques to reduce data footprint without sacrificing visual quality
  • Streaming strategies that prioritize data for the user’s current viewpoint
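For storage planning, asset size scales linearly with bitrate and duration. A minimal estimator using the per-subject bitrates quoted above (the clip lengths are illustrative):

```python
def asset_size_gb(bitrate_mbps, duration_s):
    """Approximate on-disk size of a volumetric clip at a given stream bitrate."""
    return bitrate_mbps * duration_s / 8 / 1000  # Mbit -> MByte -> GByte

# A two-minute performance at 60 Mbps already approaches a gigabyte...
two_min = asset_size_gb(bitrate_mbps=60, duration_s=120)   # 0.9 GB
# ...and a ten-minute sequence at the same rate is several gigabytes.
ten_min = asset_size_gb(bitrate_mbps=60, duration_s=600)   # 4.5 GB
print(two_min, ten_min)
```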

Real-time rendering demands

Rendering volumetric content in real-time requires significant computational resources. As users change perspective, the engine must continuously recalculate what’s visible from their viewpoint, managing complex 3D data structures.

For technical artists, this necessitates:

  • Optimized mesh and texture compression to reduce memory footprint
  • Level-of-detail (LOD) systems for volumetric assets that gracefully reduce complexity at distance
  • Performance profiling to maintain frame rates while handling dynamic volumetric content
  • Balancing visual fidelity against performance constraints, especially on mobile devices
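A distance-based LOD policy is one common way to shed complexity gracefully. This sketch picks a discrete LOD tier from viewer distance; the thresholds are illustrative, not engine defaults:

```python
def select_lod(distance_m, thresholds=(5.0, 15.0, 40.0)):
    """Return an LOD index: 0 = full detail, rising as the subject recedes.

    thresholds[i] is the far edge (in meters) of LOD i; beyond the last
    threshold the asset drops to its coarsest representation.
    """
    for lod, limit in enumerate(thresholds):
        if distance_m < limit:
            return lod
    return len(thresholds)

assert select_lod(2.0) == 0    # close-up: full-resolution mesh and textures
assert select_lod(20.0) == 2   # mid-distance: reduced mesh complexity
assert select_lod(100.0) == 3  # far away: coarsest proxy
```

In practice you would add hysteresis around each threshold so that a viewer hovering near a boundary does not see the asset "pop" between detail levels every frame.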

Latency concerns

For interactive applications, latency between user movement and visual updates must be minimized to prevent discomfort. This is especially critical in VR/AR environments, where motion sickness can occur if visuals lag behind head movement by more than roughly 20 milliseconds, the commonly cited motion-to-photon budget.

When streaming volumetric content over networks, this challenge compounds as both network and rendering latency must be minimized simultaneously—a significantly harder problem than traditional video streaming.
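One way to reason about the compounded problem is as a single motion-to-photon budget that the network, decode, and render stages must share. A sketch assuming the commonly cited ~20 ms VR comfort budget (the per-stage timings are illustrative):

```python
MOTION_TO_PHOTON_BUDGET_MS = 20.0  # commonly cited comfort threshold for VR

def within_budget(network_ms, decode_ms, render_ms,
                  budget_ms=MOTION_TO_PHOTON_BUDGET_MS):
    """True if the end-to-end pipeline fits the motion-to-photon budget."""
    return (network_ms + decode_ms + render_ms) <= budget_ms

# Local rendering: no network hop sits in the motion-to-photon path.
assert within_budget(network_ms=0.0, decode_ms=4.0, render_ms=11.0)
# Naive remote rendering: even a modest round trip blows the budget,
# which is why edge servers and local reprojection matter (see below).
assert not within_budget(network_ms=30.0, decode_ms=4.0, render_ms=11.0)
```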

Solutions and approaches

Adaptive streaming techniques

Modern volumetric streaming platforms use adaptive bitrate techniques similar to video streaming services, but adapted for 3D data:

  • Delivering higher resolution for areas in the user’s direct field of view while reducing detail for peripheral content
  • Dynamically adjusting mesh complexity and texture resolution based on available bandwidth
  • Using predictive algorithms to anticipate viewer movement and pre-fetch relevant data
  • Progressively loading detail as bandwidth allows, starting with rough representations that refine over time

This approach, similar to techniques used in game level streaming, prioritizes the data most relevant to the current viewpoint.
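View-dependent prioritization boils down to measuring the angle between the user's gaze and each content chunk, then mapping that angle to a quality tier. A minimal sketch; the 30°/90° cutoffs are illustrative:

```python
import math

def view_angle_deg(view_dir, to_chunk):
    """Angle in degrees between the gaze direction and the chunk direction."""
    def normalize(v):
        mag = math.sqrt(sum(c * c for c in v))
        return tuple(c / mag for c in v)
    a, b = normalize(view_dir), normalize(to_chunk)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))

def quality_tier(angle_deg):
    """Map angular offset from gaze to a streaming quality tier."""
    if angle_deg < 30.0:   # direct field of view
        return "high"
    if angle_deg < 90.0:   # periphery
        return "medium"
    return "low"           # behind the viewer: rough proxy or skip entirely

# A chunk straight ahead streams at full quality...
assert quality_tier(view_angle_deg((1, 0, 0), (1, 0, 0))) == "high"
# ...while one 45 degrees off-axis gets a reduced tier.
assert quality_tier(view_angle_deg((1, 0, 0), (1, 1, 0))) == "medium"
```

A real implementation would combine this angular score with distance and predicted head movement when ordering chunk requests.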

Edge computing integration

To reduce latency, processing can be distributed between client devices and edge servers:

  • Heavy reconstruction calculations happen on powerful edge servers positioned in data centers close to end users
  • Local devices handle final rendering based on current viewpoint, requiring much less processing power
  • Pre-cached common viewpoints reduce computational needs for predictable viewing scenarios
  • Hybrid approaches can fall back to locally-stored simplified versions when network conditions deteriorate

Edge computing effectively moves the most demanding computational work closer to users without requiring it to run on their devices.
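The hybrid fallback described above can be expressed as a simple policy: stream from the edge while measured network conditions support it, otherwise drop to a locally stored simplified asset. The thresholds below are illustrative:

```python
def choose_source(rtt_ms, bandwidth_mbps,
                  max_rtt_ms=50.0, min_bandwidth_mbps=30.0):
    """Pick a content source from measured network conditions.

    30 Mbps matches the low end of the per-subject bitrates quoted earlier;
    the RTT ceiling is an illustrative interactivity threshold.
    """
    if rtt_ms <= max_rtt_ms and bandwidth_mbps >= min_bandwidth_mbps:
        return "edge_stream"     # full-quality volumetric stream
    return "local_fallback"      # pre-installed simplified version

assert choose_source(rtt_ms=20.0, bandwidth_mbps=80.0) == "edge_stream"
assert choose_source(rtt_ms=120.0, bandwidth_mbps=80.0) == "local_fallback"
assert choose_source(rtt_ms=20.0, bandwidth_mbps=10.0) == "local_fallback"
```

A production system would re-evaluate this continuously and blend between sources rather than switching abruptly, but the decision structure is the same.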

Optimized compression formats

The industry is developing specialized formats for volumetric content. Canon’s Volumetric Video System, for example, uses proprietary compression to enable real-time streaming for applications like sports broadcasting with multiple viewing angles.

These formats typically employ:

  • Separating geometry from texture data for more efficient compression of each
  • Temporal compression that identifies and tracks changes over time rather than storing complete frames
  • Perceptual optimization that preserves detail where human vision is most sensitive
  • Hardware-accelerated decompression for real-time playback on consumer devices
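Temporal compression exploits frame-to-frame coherence: most of a mesh barely moves between frames. A toy sketch of the idea, delta-encoding flattened vertex positions against the previous frame (real codecs also quantize and entropy-code the deltas, which this omits):

```python
def delta_encode(frames):
    """Store the first frame whole, then only per-vertex changes."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded

def delta_decode(encoded):
    """Rebuild frames by accumulating deltas onto the keyframe."""
    frames = [list(encoded[0])]
    for deltas in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], deltas)])
    return frames

# Three frames of a tiny "mesh" whose vertices barely move: the deltas are
# mostly zeros, which is exactly what compresses well downstream.
clip = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.1], [0.1, 1.0, 2.1]]
assert delta_decode(delta_encode(clip)) == clip
```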

Affordable options for indie creators

While professional volumetric capture studios can cost millions, indie developers have increasingly accessible options:

DIY capture solutions

  • RGB-D camera arrays - Systems using multiple depth cameras (like Azure Kinect) can be assembled for under $10,000, providing decent quality for smaller subjects
  • Photogrammetry-based approaches - Using standard cameras with specialized software like RealityCapture or Agisoft Metashape can achieve volumetric results at lower cost
  • Hybrid solutions - Combining depth sensors with standard cameras offers a middle ground, improving quality while managing costs
  • Mobile phone-based captures - New techniques leveraging smartphones with LiDAR (like recent iPhones) enable basic volumetric capture with equipment you may already own

Software tools

Several platforms now offer end-to-end solutions for smaller studios:

  • Capture software - Tools for synchronizing multiple cameras and processing data into volumetric formats
  • Compression utilities - Optimizing volumetric assets for streaming while preserving visual quality
  • Playback SDKs - Integration with game engines like Unity and Unreal Engine to render volumetric content in real-time
  • Asset marketplaces - Pre-captured volumetric content available for licensing, allowing developers to use professional captures without creating their own

According to HoloEdit, “The democratization of volumetric video tools is enabling indie creators to experiment with the medium without enterprise-level budgets.”

Real-world applications

Gaming and interactive media

Volumetric video enables unprecedented realism for character performances in games:

  • NPCs that appear genuinely three-dimensional from any angle, with authentic human performances
  • Realistic human movements without uncanny valley issues often seen in traditional animation
  • Dynamic environmental storytelling through volumetric scenes that players can explore freely
  • Mixed reality game elements that blend seamlessly with the physical world in AR applications

[Illustration: a VR user moving around three volumetric 3D characters, viewing their lifelike gestures from different angles.]

For game developers, this means creating more convincing characters with less animation work, as natural performances can be captured rather than painstakingly animated.

AR/VR experiences

For immersive applications, volumetric video creates compelling presence:

  • Virtual guides that appear physically present in space, making educational content more engaging
  • Historical reenactments viewable from any perspective, bringing the past to life in unprecedented ways
  • Training simulations with realistic human demonstrations that trainees can walk around and observe from any angle
  • Social applications where distant users appear as three-dimensional presences rather than flat video feeds

The sense of presence volumetric video creates is particularly valuable in VR, where the brain expects to be able to move around objects naturally.

Live events and broadcasting

Canon’s system demonstrates how volumetric video is transforming broadcasting:

  • Sports events viewable from any seat in the stadium, or even from positions impossible for physical spectators
  • Concert performances that viewers can experience from multiple perspectives, including onstage with performers
  • Live events with “virtual attendance” capabilities that far exceed traditional streaming
  • Fashion shows where viewers can examine garments from any angle, revolutionizing virtual shopping experiences

Difference from motion capture

While both technologies digitize movement, they differ fundamentally:

Motion capture records only movement data (skeletal animations) that must be applied to pre-existing 3D models. It captures how something moves, but not its appearance. Think of it as recording the movements of a puppet, but not the puppet itself.

Volumetric capture records both movement and complete visual appearance, creating self-contained 3D video that doesn’t require separate models. It captures both how something looks and how it moves as a unified dataset, like recording the entire puppet along with its movements.

This distinction is crucial for game developers deciding which technology to use. Motion capture requires 3D character models and animation rigging but offers more flexibility for modification. Volumetric capture delivers instant photorealism but with less ability to modify the performance after capture.

Getting started with volumetric video

For developers looking to explore this technology:

  1. Start with existing assets - Several marketplaces now offer pre-captured volumetric videos that you can experiment with before investing in capture equipment
  2. Experiment with playback - Integrate volumetric video SDKs into your game engine to understand the rendering requirements and creative possibilities
  3. Test streaming performance - Evaluate how different compression settings affect quality and bandwidth in your target environment
  4. Consider hybrid approaches - Combine traditional animation with volumetric elements for key moments to balance file size, quality and production costs
  5. Begin with small-scale capture - Test simple setups before committing to complex multi-camera arrays, using even smartphone-based capture to understand the workflow

The future of volumetric streaming

As we move toward the spatial internet and more immersive digital experiences, volumetric video will become increasingly central to how we create and consume content. For game developers and technical artists, understanding this technology now provides a competitive advantage in creating next-generation experiences.

The convergence of AI-powered optimization, improved compression techniques, and more powerful consumer hardware is rapidly making volumetric video more accessible. What once required specialized studios can increasingly be achieved with consumer-grade equipment and the right software tools.

As you explore the possibilities of volumetric 3D video streaming, consider how this technology might enhance your projects’ immersion, realism, and interactivity—opening new creative possibilities beyond what traditional animation and video can achieve.