# Fast Hierarchical Culling

As part of adding streaming 3D buildings to Cesium, we implemented some interesting view frustum culling optimizations for bounding volume hierarchies (BVHs).

In particular, we implemented *plane masking* as described by Sýkora & Jelínek in Efficient View Frustum Culling (Section 2.5). In this article, we explain the algorithm, including pseudocode, and share performance results.

**Overview**

Generally, when frustum culling nodes in a BVH, each node is checked against the six clipping planes of the frustum to determine whether it is potentially visible, i.e., inside the frustum. For each plane (which divides 3D space in half), the node can be either completely *inside* (on the camera-visible side), completely *outside* (invisible to the camera), or *intersecting* (crossing the plane).

BVHs, and other spatial data structures such as octrees and k-d trees, are spatially coherent, meaning child nodes are fully contained by their parents. A common optimization for hierarchical view frustum culling with a spatial data structure is to exploit spatial coherence.

If a node is completely outside of the frustum, so are its children, so traversal can stop. If a node is completely inside of the frustum, so are all of its children, so they can be traversed without frustum culling checks. Frustum checks for child nodes are only needed when the parent is intersecting the frustum.

We can improve performance even further by considering individual frustum planes instead of the entire frustum. Plane masking exploits spatial coherence for each culling plane that defines the frustum. For each node which has not been eliminated completely (i.e., the node is not outside), it must either be inside or intersecting each of the six culling planes. When checking frustum intersection of a node, the inside/intersecting state for each plane is returned to be passed down to the child nodes. The child nodes use this to skip checks against any planes that the parent node has already been determined to be inside. The elimination of these additional plane intersection tests provides the speedup.

In order to pass this state efficiently down the BVH, a bit mask is used. For plane number k, the kth bit of the bit mask is 1 if the parent node is intersecting plane k, and 0 if it is inside. If the parent node is outside, the children will not be traversed at all. Since we are using JavaScript, we use a 32-bit integer as our bit mask. (A frustum has only 6 planes, so this is plenty.)

The benefit of avoiding plane intersection checks is just a few operations, but they are repeated many times (6 bounding planes times n bounding boxes) and have a significant performance impact (see the Results section below). When testing frustum intersection with oriented bounding boxes, one plane-box intersection test requires four dot products and several comparisons.

### Examples

These examples show the culling process for different camera locations in a simple scene with the following hierarchy:

In the following examples, all plane intersection checks in *italics* are those which can be avoided because the result is predetermined. Those which are struck through are irrelevant because the node is already known to be outside. Intersections are evaluated in reading order in the table.

**Parent node inside:**

In this example, the parent node is completely inside the frustum. During hierarchical view frustum culling, the frustum planes are checked only against the parent node.

**Parent node intersecting:**

In this example, the parent node is intersecting the frustum at Plane 2. During hierarchical view frustum culling, the children check only this plane.

**Parent node outside:**

In this example, the parent node is completely outside the frustum so its children are not checked.

```
// frustum: defined as a list of clipping planes.
// bvh: can be tested against a plane (outside, inside, or intersecting).
void traverse(frustum, bvh) {
bool fullyContained = true;
for (plane in frustum.planes) {
if (bvh root bound is OUTSIDE plane) {
// This entire subtree is removed by this plane
return;
} else if (bvh root bound is INTERSECTING plane) {
// The subtree intersects this plane, so it
// isn't fully contained.
fullyContained = false;
}
}
if (fullyContained) {
// The node is fully contained, so its descendants are all fully contained.
bvh.drawWithAllDescendants();
} else {
// Need to recurse to determine what is contained
for (subtree in bvh.children) {
traverse(frustum, subtree);
}
}
}
```

**Hierarchical Culling with Plane Masking**

```
// frustum: defined as a list of clipping planes.
// bvh: can be tested against a plane (outside, inside, or intersecting).
void traverse(frustum, bvh) {
bool mask = [true] * frustum.planes.length;
traverseMasked(frustum, bvh, mask.clone());
}
// mask: mask[planeIndex] is true iff the node _might_ intersect that plane.
// (False means the node is _definitely_ inside a plane.)
// In practice, this is implemented with a bit mask (int32).
void traverseMasked(frustum, bvh, mask) {
for (planeIndex, plane in frustum.planes) {
if (bvh root bound is OUTSIDE plane) {
// This entire subtree is removed by this plane
return;
} else if (bvh root bound is INSIDE plane) {
// The subtree definitely will never intersect this plane,
// so it will never need to be checked again.
mask[planeIndex] = false;
}
}
if (mask.allFalse) {
// The node is fully contained, since it is inside every plane.
bvh.drawWithAllDescendants();
} else {
// Need to recurse to determine what is contained.
for (subtree in bvh.children) {
traverseMasked(frustum, subtree, mask.clone());
}
}
}
```

**Results**

To see the performance gains of plane masking, we profiled our selectTiles function, which traverses a BVH and selects nodes for rendering based on view frustum culling and level of detail (LOD). Using the JavaScript profiler, selectTiles sees speedups of around 15% in Chrome and 20-34% in Firefox Nightly.

These results were captured at steady state in a 1920x1080 viewport, with the globe turned off to reduce profiling noise.

### Washington, DC buildings

Percentage of Total CPU Time within selectTiles | Chrome 43 | Firefox Nightly 42.0a1 |
---|---|---|

Without plane masking | 8.70% | 0.87% |

With plane masking | 7.68% | 0.65% |

speedup | 15% | 34% |

```
viewer.scene.globe = undefined;
viewer.camera.setView({
position : Cesium.Cartesian3.fromDegrees(
-77.04004663620222, 38.889234150474174, 540.0498030854308),
heading : Cesium.Math.toRadians(33.58734678346582),
pitch : Cesium.Math.toRadians(-32.33726064420708),
roll : Cesium.Math.toRadians(0.12325983351023434)
});
```

Percentage of Total CPU Time within selectTiles | Chrome 43 | Firefox Nightly 42.0a1 |
---|---|---|

Without plane masking | 6.55% | 0.60% |

With plane masking | 5.83% | 0.50% |

speedup | 13% | 20% |

```
viewer.scene.globe = undefined;
viewer.camera.setView({
position : Cesium.Cartesian3.fromDegrees(
-71.09781914618496, 42.361741066001535, 193.97270561145885),
heading : Cesium.Math.toRadians(306.33076014819653),
pitch : Cesium.Math.toRadians(-26.284861607374907),
roll : Cesium.Math.toRadians(359.8277222802287)
});
```

**Footnotes**

- Speedup is calculated in a way that eliminates factors outside of selectTiles, such as the removal of the globe rendering: for two profile percentages, the speedup is (1/x2 - 1) / (1/x1 - 1) - 1
- The difference in magnitude between Chrome and Firefox is because Firefox includes a "Graphics" component in its profiling results (~85%), while Chrome does not.
- Firefox Nightly was used for its newly improved profiling tools.