Proceedings: GI 2013

A micro 64-tree structure for accelerating ray tracing on a GPU

Xin Liu , Jon Rokne

Proceedings of Graphics Interface 2013: Regina, Saskatchewan, Canada, 29 - 31 May 2013, 165-172

DOI 10.20380/GI2013.22

  • Bibtex

    @inproceedings{Liu:2013:10.20380/GI2013.22,
    author = {Liu, Xin and Rokne, Jon},
    title = {A micro 64-tree structure for accelerating ray tracing on a GPU},
    booktitle = {Proceedings of Graphics Interface 2013},
    series = {GI 2013},
    year = {2013},
    issn = {0713-5424},
    isbn = {978-1-4822-1680-6},
    location = {Regina, Saskatchewan, Canada},
    pages = {165--172},
    numpages = {8},
    doi = {10.20380/GI2013.22},
    publisher = {Canadian Human-Computer Communications Society},
    address = {Toronto, Ontario, Canada},
    }

Abstract

The uniform grid is a well-known acceleration structure for ray tracing. It is fast to build, but slow to traverse. In this paper, we propose a novel micro 64-tree structure to speed up grid traversals on a GPU. A micro 64-tree is a compact 64-way full tree that summarizes the occupancy of an underlying uniform grid in a hierarchy. A node of the tree stands for a voxel, whose occupancy is represented by a single bit. A node is subdivided into a 64-subgrid that is stored in a 64-bit word. The micro 64-tree is built on the top of a uniform grid. We improve the GPU grid construction algorithm by computing precise triangle-cell intersections and precluding non-overlapping triangle-cell pairs before sorting. The micro 64-tree is then built bottom-up from the uniform grid by reductions in parallel. The top levels of the micro 64-tree are pre-loaded into the shared memory of a GPU, which support on-chip traversals across the coarse levels. The traversal algorithm navigates the ray through the 64-subgrids at different levels, with a concise context for each level stored in the GPU's registers to facilitate vertical moves. With a small overhead in memory and a small overhead in building time, the micro 64-tree can reduce traversal steps, decrease memory bandwidth consumption, and hence significantly improve the efficiency of ray tracing on a GPU.