Fine-Grained Metadata in 3D Tiles Next

One of the key improvements in 3D Tiles Next, which was announced late last year, is efficient streaming of semantic metadata. Much of that efficiency is tied to EXT_mesh_features, a proposed glTF extension that optimizes 3D assets for rendering on all devices, while also preserving detailed metadata needed for interactivity.

Don McCurdy, a contributor to EXT_mesh_features, provided this post to share the rationale, approach, and uses of the extension.

Problem: Performance vs. Interactivity

Real-time 3D renderers must do their work in as few GPU instructions as possible, and that often means batching things a viewer or a content creator would consider to be distinct 3D objects.

This gets challenging for 3D content where the important details aren't strictly visual. Consider a designer who needs to share a rendering of a 3D home plan with a client, including clickable details about every element. Or a researcher concerned with invisible properties of a scanned 3D surface, like its chemical composition or wear.

Many solutions have accepted this tradeoff. Either content is batched so that it is optimized for real-time display (with reduced context and interactivity), or the full structure of the original data is preserved (with dire implications for real-time performance).

Street-level photogrammetry of the San Francisco Ferry Building from Aerometrex. Left: per-texel colors showing feature classification. Right: alternate classification derived from an AI model identifying roof, windows, window frames, lamp posts, and AC units.

Solution: Mesh Features

3D Tiles was specifically designed to transmit the full structure of large datasets, including associated metadata, as efficiently as possible over the network and on the GPU. The Batched 3D Model tile format introduced two concepts: batch IDs, which allow features to be identified at per-vertex granularity within a glTF model, and the batch table, which stores per-feature metadata. This allowed multiple features to be rendered in a single draw call while still preserving interactivity, a key part of 3D Tiles runtime performance.

EXT_mesh_features generalizes this concept and brings it to the broader glTF community. With EXT_mesh_features, individual vertices and texels may be identified as belonging to particular features, assigned IDs, and given metadata.

Levels of feature and metadata granularity can include:

  • Per vertex: Vertices may be grouped into features each having associated properties.
  • Per texel: Textures storing property values and feature IDs allow detailed metadata even for optimized, low-poly geometry.

Illustration of metadata types in EXT_mesh_features, including per vertex and per texel.
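
To make the two levels of granularity concrete, here is a minimal sketch in plain Python. It does not use any glTF library, and the arrays, the tiny 4x2 "texture", and the function names are all hypothetical stand-ins for what would normally live in vertex attributes and texture images.

```python
from collections import defaultdict

# Per vertex: one feature ID per vertex (hypothetical values).
vertex_feature_ids = [0, 0, 0, 1, 1, 2, 2, 2]


def group_vertices_by_feature(feature_ids):
    """Map each feature ID to the indices of the vertices that carry it."""
    groups = defaultdict(list)
    for vertex_index, feature_id in enumerate(feature_ids):
        groups[feature_id].append(vertex_index)
    return dict(groups)


# Per texel: a tiny 4x2 "texture" whose texels hold feature IDs
# rather than colors (hypothetical values).
feature_id_texture = [
    [0, 0, 1, 1],
    [2, 2, 3, 3],
]


def feature_id_at_uv(texture, u, v):
    """Nearest-texel lookup. IDs are discrete, so they are never blended."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]
```

The per-texel lookup deliberately uses the nearest texel: interpolating between feature ID 1 and feature ID 3 would produce an ID that belongs to neither feature.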

For example, an architectural plan composed of hundreds of thousands of individual features could be optimized into just a few mesh objects for efficient rendering, with original features identified within those objects by IDs. Metadata for each of the original features (e.g. name, description, part number) is stored in a compact binary lookup table indexed by feature IDs. With those IDs, an application can select features, display feature metadata as tooltips, and work with large and complex scenes in other ways.
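
The lookup described above can be sketched roughly as follows. This is an illustration only: the column names, the sample values, and the `tooltip` helper are assumptions made for the example, and a real viewer would get the picked vertex from its own picking code.

```python
# Hypothetical columnar metadata: entry i of each column belongs to feature i.
property_table = {
    "name": ["Door", "Window", "Beam"],
    "part_number": ["D-100", "W-220", "B-310"],
}

# One feature ID per vertex of a single batched mesh (hypothetical values).
vertex_feature_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]


def describe_picked_vertex(vertex_index):
    """Resolve a picked vertex to the metadata of the feature it belongs to."""
    feature_id = vertex_feature_ids[vertex_index]
    return {column: values[feature_id] for column, values in property_table.items()}


def tooltip(vertex_index):
    """Format a short tooltip string for the picked vertex."""
    props = describe_picked_vertex(vertex_index)
    return f"{props['name']} ({props['part_number']})"
```

The mesh stays a single batched object; only the small ID array and the table are needed to recover which original feature was clicked.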

Furthermore, multiple feature layers are supported. A dataset may have any number of per-vertex or per-texel feature ID sets. This is useful, for example, when comparing the results of different classification algorithms, as shown in the San Francisco Ferry Building above.
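
As a rough sketch of why several feature ID sets are handy, the snippet below stores two classifications of the same six vertices and lists where they disagree. The set names, labels, and IDs are invented for illustration and are not taken from the Ferry Building dataset.

```python
# Two hypothetical feature ID sets over the same vertices, each with its own labels.
feature_id_sets = {
    "manual": {"ids": [0, 0, 1, 1, 2, 2], "labels": ["roof", "window", "wall"]},
    "ai_model": {"ids": [0, 1, 1, 1, 0, 0], "labels": ["roof", "window"]},
}


def label_of(set_name, vertex_index):
    """Return the class label a given feature ID set assigns to a vertex."""
    id_set = feature_id_sets[set_name]
    return id_set["labels"][id_set["ids"][vertex_index]]


def disagreements(set_a, set_b):
    """List the vertex indices where the two classifications differ."""
    vertex_count = len(feature_id_sets[set_a]["ids"])
    return [
        i for i in range(vertex_count)
        if label_of(set_a, i) != label_of(set_b, i)
    ]
```

Because both sets refer to the same geometry, comparing them needs no extra copies of the mesh.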

Feature metadata is stored in binary property tables, as defined in the companion extension EXT_structural_metadata. A property table is functionally similar to the batch table but with additional types and semantics. Feature identification and feature metadata are intentionally decoupled so that different metadata encodings can be used with EXT_mesh_features.

Illustration of feature IDs indexing into a property table defined in EXT_structural_metadata.
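
To give a feel for what a binary property table involves, here is a simplified sketch using Python's standard `struct` module. The layout (little-endian uint16 values, and strings stored as one UTF-8 blob with count + 1 uint32 offsets) is an assumption chosen for illustration; consult the draft specification for the actual encoding rules.

```python
import struct


def encode_strings(strings):
    """Pack strings as one UTF-8 blob plus (count + 1) uint32 byte offsets."""
    blob = b""
    offsets = [0]
    for text in strings:
        blob += text.encode("utf-8")
        offsets.append(len(blob))
    return blob, struct.pack(f"<{len(offsets)}I", *offsets)


def decode_string(blob, offset_bytes, feature_id):
    """Read the string for one feature using its start and end offsets."""
    start, end = struct.unpack_from("<2I", offset_bytes, feature_id * 4)
    return blob[start:end].decode("utf-8")


def decode_uint16(values, feature_id):
    """Read one fixed-size numeric value directly by feature ID."""
    return struct.unpack_from("<H", values, feature_id * 2)[0]


# Hypothetical columns for three features.
names_blob, names_offsets = encode_strings(["Door", "Window", "Beam"])
heights_cm = struct.pack("<3H", 210, 120, 30)
```

Both columns are read by feature ID alone, with no parsing of the rows that come before, which is what makes this kind of table cheap to query for a single picked feature.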

These extensions are backward-compatible. Anyone opening the asset will find an optimized scene that renders efficiently. An advanced viewer can do more, providing interactivity based on the metadata available in the scene.
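
One way a viewer might decide how much to do with an asset is sketched below. The top-level `extensionsUsed` and `extensionsRequired` arrays are part of glTF 2.0; the `plan_load` function and its return shape are hypothetical, and the sketch assumes the asset lists the metadata extensions as used but not required.

```python
MESH_FEATURES = "EXT_mesh_features"


def plan_load(gltf, supported_extensions):
    """Decide whether an asset can be rendered and whether picking is available."""
    supported = set(supported_extensions)
    used = set(gltf.get("extensionsUsed", []))
    required = set(gltf.get("extensionsRequired", []))
    missing = required - supported
    return {
        # Rendering only fails if a required extension is unsupported.
        "can_render": not missing,
        # Interactivity is an optional extra for viewers that understand the extension.
        "feature_picking": MESH_FEATURES in used and MESH_FEATURES in supported,
    }


# Hypothetical asset header that uses, but does not require, the extensions.
asset = {
    "asset": {"version": "2.0"},
    "extensionsUsed": ["EXT_mesh_features", "EXT_structural_metadata"],
}
```

With this header, a basic viewer gets `can_render` without picking, while a viewer that lists EXT_mesh_features among its supported extensions also gets `feature_picking`.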

Note that glTF does have other methods of storing details that could similarly be described as metadata or properties, including KHR_xmp_json_ld, Extras, and Extensions. Those methods associate data with discrete objects in a glTF asset (nodes, materials, and so on), whereas EXT_mesh_features is uniquely suited for properties of more granular conceptual features in subregions composed of vertices or texels.

Use Cases

Powerplant CAD model from Bentley Systems with per-feature interactivity.

An important design goal of EXT_mesh_features is flexibility, without constraining use cases to traditional design patterns like tooltips and info windows. Some examples of other use cases:

3D Tiles: Sometimes glTF assets are part of a larger scene, and that scene is too large to be loaded and rendered all at once. Here, 3D Tiles provides the necessary spatial subdivision and streaming structures around glTF content. As described in Introducing 3D Tiles Next, the same metadata sources can be shared by a 3D Tiles scene hierarchy and any number of glTF assets within it.

Architecture, Engineering and Construction (AEC): Use of 3D models in the construction industry has grown over time, and these models are often composed of countless individual parts. Each of those parts is important, and feature metadata allows glTF assets to render efficiently without losing the details.

Accessibility: Scenes might be annotated with metadata to support assistive technologies without compromising the GPU-optimized structure of the scene.

Extensibility and Semantics

When there's a strong need for standardized data types—like ARIA attributes for accessibility and Structured Data for products—it's not enough for metadata to simply offer flexibility to custom applications. Metadata should provide ways to formalize proven conventions to ensure consistency and reliable data exchange.

For that reason, EXT_structural_metadata allows properties to be associated with both low-level data types and high-level definitions called semantics. Semantics define meanings of properties, so that "manufacturer" or "price" are not just arbitrary names but identifiable characteristics. An intentionally small set of semantics (ID, NAME, and DESCRIPTION) is provided by default by the 3D Metadata Semantic Specification, with the intent that additional semantics will be defined and standardized by interested stakeholders, or inherited from other sources such as ARIA, Schema.org, or common XMP Namespaces.
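
The value of semantics is that an application can find a property by its meaning rather than by whatever name an author happened to choose. A minimal sketch, assuming a dictionary shape loosely modeled on a metadata class definition (the property names and the helper function are hypothetical):

```python
# Hypothetical class definition: authors pick property names freely,
# and optionally tag each property with a shared semantic.
class_properties = {
    "title": {"type": "STRING", "semantic": "NAME"},
    "summary": {"type": "STRING", "semantic": "DESCRIPTION"},
    "maker": {"type": "STRING"},  # no semantic: meaningful only to this dataset
}


def property_for_semantic(properties, semantic):
    """Return the name of the first property tagged with the given semantic."""
    for name, definition in properties.items():
        if definition.get("semantic") == semantic:
            return name
    return None
```

A generic viewer could then label a tooltip with whichever property carries NAME, even though this dataset calls it "title" and another might call it "label".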

Similarly, we hope to see collaboration among 3D formats and tools in defining and supporting interchangeable metadata. With 3D Tiles Next we've proposed a baseline for feature identification and metadata in glTF and 3D Tiles, and that's just the start. Compression, new encoding options, and support in other formats and tools are ahead.

Base globe with land cover classification from Maxar. Semantics such as land cover classification can be used to populate environments with procedural foliage.

Get Involved

While we believe EXT_mesh_features provides a promising solution to a widespread problem, we're still looking for help. We need your feedback, implementations, and review for this extension to become better for more people, and available sooner. Head over to KhronosGroup/glTF#2082 for the draft specification, and let us know what you think.