Talk: The Holy Grail - A Hash Array Mapped Trie for C++

C++ has a handful of associative containers. We started with set and map, both based on node-based red-black trees. These are fine but are not the most efficient and, in particular, suffer from more cache misses than we’d like. If we want to build persistent versions of them it’s achievable but aggravates the problems even more and adds considerable extra complexity (I know - I’ve done it!)

C++11 brought the hash-map based unordered_set and unordered_map, which are generally much faster, with better cache locality - but can be less memory efficient and also don’t translate so easily into persistent versions.

But there exists another general purpose data structure that combines many of the characteristics of trees and hash tables into something that in many important ways is superior to both, and with minimal downside (they are close but not quite as fast as pure hash tables). Hash Array Mapped Tries are more memory efficient than hash tables and, as a bonus, are trivially made persistent - with big implications for concurrency, functional programming, and other applications that benefit from being able to treat them immutably (as well as share large amounts of common state in memory at once).

This talk will describe what this data structure is and how it works,, then look at an implementation I am proposing for boost, and possibly towards the standard. With this implementation we can explore some of the benefits and characteristics.