Fun with KD-Trees
I've been playing around with KD-trees at home, trying to come up with something novel. I started out trying to fit an Oddworld style "loose-kdtree" (a concept too boringly esoteric to get into here) into 4 bytes per node, but I can't convince myself that the tree can handle large models with only 14 bits of offset for the child pointers, so that's right out. It may be possible to do this with a short/long offset approach, but that's a test for another day. It requires a more convoluted packing process, and an extra conditional inside the node-traversal loop, which is definitely A Bad Thing. Still, the gains in cache-efficiency and tree size might be worth it, there should be only a small subset of nodes that have child offsets that exceed 16k.
Next up, cache-efficient subtrees, as outlined by Crister Ericson (in this talk). I was trying 3 element subtrees, so 3 nodes of the KD-tree would be packed into 16 bytes. This turns out to be a royal PITA to code nicely, dealing with internal nodes in a way to preserves traversal orders while not requiring explicit recursive function calling is too annoying for my tastes. The capper was that when I looked at the average memory cost of an internal tree node for some real test models, it wound up being almost exactly 8 bytes. So much for subtrees, I can write much easier to deal with traversal code for simple 8-byte non-packed nodes.
It's possible that I'd get proper space savings and better cache-efficiency for 7-element subtrees, but the hassle simply doesn't seem to be worth it at this point. We'll see how the 8-byte nodes turn out, I think it should be comparable in cache-efficiency, so long as the layout process is sensible. The builder is generating some fun images at this point at least, the image to the right is a section of the KD-tree built for the Bunny. (Stop groaning back there!) Click for the full deal...