When we talk about “Massive Models,” we basically mean 3D models that are too large to drop right into Cesium as glTF. The model may have too many triangles to render at interactive rates on a common GPU, or the model may be so large that it won’t even fit in memory, never mind bog down rendering. Since Cesium is built for the web, we also need to incrementally stream massive models over the Internet and minimize bandwidth.
As part of cesium.com, we have been working on a tiling pipeline for converting massive models from glTF into 3D Tiles. The vision is to be able to fly into a city, walk into a building, and inspect the individual bolts on the superstructure or admire a high resolution 3D capture of the interior marble work.
This tileset of the San Miguel Hacienda model provides a glimpse at what we can do with our current tiling pipeline. The source model as untextured glTF takes up approximately 200 megabytes and would be impractical to download and difficult to render on a lot of common hardware, but as a tileset it can be streamed and rendered interactively for a tour of the building.
We accomplished this by recursively subdividing the model’s 7.8 million triangles into a spatial data structure called an octree. Each interior tile in the octree contains a simplification of all the triangles it encompasses, with subsequent child tiles covering smaller areas at higher detail. These simplifications are the first things we see as the model starts streaming in, or when we view the model from far away.
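To make the subdivision concrete, here is a minimal sketch of recursive octree construction in Python. It is an illustration under stated assumptions, not our pipeline's code: the names `OctreeNode` and `build_octree`, the `max_tris` and `max_depth` limits, and the choice to assign each triangle to an octant by its centroid are all hypothetical. Interior nodes are left empty here, whereas in a real tileset they would hold the simplified geometry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


@dataclass
class OctreeNode:
    lo: Vec3  # minimum corner of this node's box
    hi: Vec3  # maximum corner of this node's box
    triangles: List[Triangle] = field(default_factory=list)  # filled at leaves only
    children: List["OctreeNode"] = field(default_factory=list)


def centroid(tri: Triangle) -> Vec3:
    return tuple(sum(v[i] for v in tri) / 3.0 for i in range(3))


def build_octree(tris: List[Triangle], lo: Vec3, hi: Vec3,
                 max_tris: int = 1000, max_depth: int = 8) -> OctreeNode:
    node = OctreeNode(lo, hi)
    # Stop when the node is small enough or the depth budget is spent.
    if len(tris) <= max_tris or max_depth == 0:
        node.triangles = list(tris)
        return node
    mid = tuple((lo[i] + hi[i]) / 2.0 for i in range(3))
    # Assumption: a triangle belongs to the octant containing its centroid.
    buckets: Dict[Tuple[int, int, int], List[Triangle]] = {}
    for tri in tris:
        c = centroid(tri)
        key = tuple(int(c[i] >= mid[i]) for i in range(3))
        buckets.setdefault(key, []).append(tri)
    # Only non-empty octants become children.
    for key, bucket in sorted(buckets.items()):
        child_lo = tuple(mid[i] if key[i] else lo[i] for i in range(3))
        child_hi = tuple(hi[i] if key[i] else mid[i] for i in range(3))
        node.children.append(
            build_octree(bucket, child_lo, child_hi, max_tris, max_depth - 1))
    return node


def count_leaf_triangles(node: OctreeNode) -> int:
    if not node.children:
        return len(node.triangles)
    return sum(count_leaf_triangles(child) for child in node.children)
```

Splitting by centroid keeps every triangle in exactly one leaf, at the cost of triangles poking slightly outside their box. A production tiler would have to either clip such triangles or grow the bounding volumes to cover them.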
For simplification we used a variant of vertex clustering that supports arbitrary vertex attributes for broader glTF support (including textured models, stay tuned!). Vertex clustering algorithms generally minimize how much of the original data needs to be in memory at any given moment. That improves the efficiency of our pipeline, and it also makes it realistic to build a 3D Tileset from data that must be read straight from disk. This is called out-of-core processing, and it will vastly expand the limits of what we can simplify and tile.
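For readers unfamiliar with vertex clustering, the sketch below shows the textbook uniform-grid form of the idea in Python, not the variant used in our pipeline. Vertices that fall in the same grid cell are merged into one representative, and triangles that collapse are dropped. The function name `cluster_vertices`, the `cell_size` parameter, and the choice to average every attribute are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cluster_vertices(vertices: Sequence[Tuple[float, ...]],
                     triangles: Sequence[Tuple[int, int, int]],
                     cell_size: float):
    """Simplify a mesh by merging all vertices that share a grid cell.

    Each vertex is (x, y, z, attr0, attr1, ...). The trailing attributes
    are averaged together with the position (an assumption; attributes
    such as normals would need renormalizing afterwards).
    """
    cell_to_new: Dict[Tuple[int, int, int], int] = {}
    sums: List[List[float]] = []   # per-cluster running sums
    counts: List[int] = []         # per-cluster vertex counts
    remap: List[int] = []          # old vertex index -> cluster index

    # Single pass over the vertices: only the per-cell sums stay in memory.
    for v in vertices:
        cell = tuple(math.floor(v[i] / cell_size) for i in range(3))
        idx = cell_to_new.get(cell)
        if idx is None:
            idx = len(sums)
            cell_to_new[cell] = idx
            sums.append([0.0] * len(v))
            counts.append(0)
        for k, value in enumerate(v):
            sums[idx][k] += value
        counts[idx] += 1
        remap.append(idx)

    new_vertices = [tuple(s / n for s in acc) for acc, n in zip(sums, counts)]

    # Keep only triangles whose three corners land in three different cells.
    new_triangles = []
    for a, b, c in triangles:
        a, b, c = remap[a], remap[b], remap[c]
        if a != b and b != c and a != c:
            new_triangles.append((a, b, c))
    return new_vertices, new_triangles
```

A larger `cell_size` merges more vertices and yields a coarser mesh, so coarser levels of an octree could plausibly use larger cells. Because the vertex pass only accumulates per-cell sums, the input could in principle be streamed from disk rather than held in memory all at once.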
Rendering low detail versions of data for far-off viewers and cutting up the high detail data for nearby viewers together form a technique called Hierarchical Level of Detail (HLOD), which is the foundation for visualizing any kind of very large dataset as a 3D Tileset.
Above, the blue volume represents the camera frustum, and the red boxes are bounding volumes enclosing individual tiles. Observe how the density of bounding boxes around the plants increases for the closer camera frustum.
Below, note that tiles of the model are culled when the camera moves forward: bounding volumes in 3D Tiles are used for view frustum culling. Because tiles outside the view are never rendered, we can do things like comfortably examine individual place settings at original resolution if we want to.
Above, as the camera zooms in, more detailed tiles are streamed.
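The two behaviors described in these captions, culling tiles outside the frustum and refining to more detailed tiles as the camera approaches, can be sketched together as a single tree traversal. The Python below is a simplified illustration, not CesiumJS code. The `Tile` class, the plane representation, and `select_tiles` are hypothetical names. The screen-space error formula is a common formulation for a perspective camera, and every tile is assumed to replace its parent when refined.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
# A plane is (normal, d); a point p is inside when dot(normal, p) + d >= 0.
Plane = Tuple[Vec3, float]


@dataclass
class Tile:
    name: str
    lo: Vec3                 # axis-aligned bounding box, minimum corner
    hi: Vec3                 # axis-aligned bounding box, maximum corner
    geometric_error: float   # error, in world units, of showing this tile alone
    children: List["Tile"] = field(default_factory=list)


def box_outside_plane(lo: Vec3, hi: Vec3, plane: Plane) -> bool:
    normal, d = plane
    # Test the box corner farthest along the normal; if even that corner
    # is behind the plane, the whole box is outside.
    p = tuple(hi[i] if normal[i] >= 0 else lo[i] for i in range(3))
    return sum(normal[i] * p[i] for i in range(3)) + d < 0


def distance_to_box(point: Vec3, lo: Vec3, hi: Vec3) -> float:
    return math.sqrt(sum(
        max(lo[i] - point[i], 0.0, point[i] - hi[i]) ** 2 for i in range(3)))


def select_tiles(tile: Tile, camera: Vec3, planes: List[Plane],
                 screen_height: float, fov_y: float, max_sse: float,
                 out: Optional[List[str]] = None) -> List[str]:
    if out is None:
        out = []
    # View frustum culling: skip this tile and its whole subtree.
    if any(box_outside_plane(tile.lo, tile.hi, pl) for pl in planes):
        return out
    dist = max(distance_to_box(camera, tile.lo, tile.hi), 1e-6)
    # Project the tile's geometric error to pixels on screen.
    sse = tile.geometric_error * screen_height / (
        2.0 * dist * math.tan(fov_y / 2.0))
    if sse <= max_sse or not tile.children:
        out.append(tile.name)  # detailed enough (or a leaf): render it
    else:
        for child in tile.children:  # too coarse: refine to the children
            select_tiles(child, camera, planes, screen_height,
                         fov_y, max_sse, out)
    return out
```

From far away the root's projected error is small, so only the coarse root is selected. Moving the camera closer raises that error above `max_sse`, which replaces the root with whichever children survive culling.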
There’s still work to be done on getting the San Miguel model to stream and look even better, which will always be true as long as the textured version graces the cover of Physically Based Rendering, Second Edition. In terms of triangle count, however, the San Miguel Hacienda is only the tip of the iceberg of what we can achieve today. Our tiling pipeline is hungry for large and diverse massive models. If you want to see how your massive models stream with 3D Tiles, send me a note at email@example.com or tweet to @CesiumJS.
Finally, expect upcoming blog posts diving a little deeper into our tiling pipeline. For further reading in the meantime, check out Patrick Cozzi’s master’s thesis, which laid the groundwork for this project way back before WebGL was even a thing.