Abstract
The uniform grid is a well-known acceleration structure for ray tracing. It is fast to build, but slow to traverse. In this paper, we propose a novel micro 64-tree structure to speed up grid traversals on a GPU. A micro 64-tree is a compact 64-way full tree that summarizes the occupancy of an underlying uniform grid in a hierarchy. A node of the tree stands for a voxel, whose occupancy is represented by a single bit. A node is subdivided into a 64-subgrid that is stored in a 64-bit word. The micro 64-tree is built on the top of a uniform grid. We improve the GPU grid construction algorithm by computing precise triangle-cell intersections and precluding non-overlapping triangle-cell pairs before sorting. The micro 64-tree is then built bottom-up from the uniform grid by reductions in parallel. The top levels of the micro 64-tree are pre-loaded into the shared memory of a GPU, which support on-chip traversals across the coarse levels. The traversal algorithm navigates the ray through the 64-subgrids at different levels, with a concise context for each level stored in the GPU's registers to facilitate vertical moves. With a small overhead in memory and a small overhead in building time, the micro 64-tree can reduce traversal steps, decrease memory bandwidth consumption, and hence significantly improve the efficiency of ray tracing on a GPU.