As a counterpoint, the way that Zig implements hash maps is absolutely instrumental for TigerBeetle.
Zig hashmaps _are_ quite a bit more cumbersome than, e.g., in Rust --- you need to decide whether you want the allocator to be bundled with the hashmap, and it's also up to the user of the hash map to provide the equality and hash functions (and, of course, there's manual defer instead of RAII).
However, this flexibility and verbosity come with a super-power --- you can pass an allocator to a hash map at creation time, and then _not_ pass an allocator when you actually use the hash map. This is huge. It means that we get a compile-time guarantee that we follow rule 3 of NASA's 10 rules:
> 3. Do not use dynamic memory allocation after initialization.
And we _still_ can enjoy using a hashmap from the standard library! No other language I know has an equivalent tool.
You can achieve the same thing in Rust or any other language that offers some sort of field privacy (which Zig doesn't have, btw) by wrapping the hash map in your own type and exposing only the interface you want.
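To illustrate the wrapping approach, here's a minimal Rust sketch. The `FixedMap` type and its method names are hypothetical, not from any library: all allocation happens in `new`, and because the inner `HashMap` is private, no code outside the module can call anything that grows it.

```rust
use std::collections::HashMap;

/// Hypothetical wrapper: allocation happens once, in `new`; afterwards the
/// map only exposes methods that cannot grow it past its reserved capacity.
pub struct FixedMap<K, V> {
    inner: HashMap<K, V>,
    capacity: usize,
}

impl<K: std::hash::Hash + Eq, V> FixedMap<K, V> {
    /// Allocates up front for at least `capacity` entries.
    pub fn new(capacity: usize) -> Self {
        Self { inner: HashMap::with_capacity(capacity), capacity }
    }

    /// Inserts without allocating; hands the pair back if capacity is full.
    pub fn insert(&mut self, k: K, v: V) -> Result<Option<V>, (K, V)> {
        if self.inner.len() >= self.capacity && !self.inner.contains_key(&k) {
            return Err((k, v));
        }
        Ok(self.inner.insert(k, v))
    }

    pub fn get(&self, k: &K) -> Option<&V> {
        self.inner.get(k)
    }
}

fn main() {
    let mut m = FixedMap::new(2);
    assert!(m.insert("a", 1).is_ok());
    assert!(m.insert("b", 2).is_ok());
    assert!(m.insert("c", 3).is_err()); // capacity exhausted, no reallocation
}
```

The difference from Zig is that here the "no allocation after init" guarantee rests on the privacy of `inner` and a runtime length check, rather than on methods that simply take no allocator argument.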
I don't quite understand how you change the size of a Zig hashmap at runtime by, for example, inserting 1M items at some point, while at the same time being able to enforce at compile time that such an operation will not result in dynamic allocation.
The insertion methods come in two flavors: one requires passing an allocator (and can allocate), the other doesn't take an allocator argument (and thus can't allocate). If you only pass the allocator to the `init` method of your application, and don't store it anywhere, only `init` will be able to call allocating methods; the rest of the app will be allocation-free, by construction.
If the second one can't allocate then how does it handle the case where you don't have enough capacity to insert the new (k,v) pair?
I can see that the difference between the two is the self.growIfNeeded() call, https://github.com/ziglang/zig/blob/0dffab7356685c7643aa6e3c..., which the non-allocating one indeed lacks. Does it assume that the predefined capacity will not be reached?
> Does it assume that the predefined capacity will not be reached?
Yes, the function name says exactly that as well, though admittedly what this actually means and what consequences it might have if the assumption is false are probably fairly opaque to a novice user.
> how does it handle the case where you don't have enough capacity to insert the new (k,v) pair?
Breaking the invariant results in safety-checked undefined behavior, that's what the "asserts" in the doc comment signifies[0]. Basically, if something goes awry at runtime you'll get a crash along with a nice error trace in Debug/ReleaseSafe mode or with the appropriate @setRuntimeSafety call.
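The assert-on-insert behavior being described (presumably Zig's `putAssumeCapacity`) can be mimicked in Rust. This is a hypothetical sketch, not the Zig implementation: the caller promises capacity was reserved up front, and violating that promise trips an assert instead of triggering an allocation.

```rust
use std::collections::HashMap;

/// Hypothetical analogue of a non-allocating insert: the caller asserts that
/// `reserved` slots were set aside at init time. Breaking the invariant
/// panics here, much like Zig's safety-checked crash in Debug/ReleaseSafe.
fn put_assume_capacity<K: std::hash::Hash + Eq, V>(
    map: &mut HashMap<K, V>,
    reserved: usize,
    k: K,
    v: V,
) {
    assert!(map.len() < reserved, "capacity invariant violated");
    map.insert(k, v); // stays within the reserved capacity, so no realloc
}

fn main() {
    let reserved = 4;
    let mut map: HashMap<u32, u32> = HashMap::with_capacity(reserved);
    for i in 0..4 {
        put_assume_capacity(&mut map, reserved, i, i * 10);
    }
    assert_eq!(map.len(), 4);
    // put_assume_capacity(&mut map, reserved, 5, 50); // would panic
}
```

In Zig the check compiles away in ReleaseFast; in this Rust sketch the assert always runs unless you swap it for `debug_assert!`.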
If instead you'd like to have errors that you can handle, you could use your real allocator where you expect to actually use it and pass a failing allocator[1] everywhere else (but that's sort of abusing the API IMO, I don't know if I'd actually recommend you do this).
Right, that's what I thought is happening under the hood. Thanks for confirming.
This also means there is nothing novel about this approach that cannot be achieved in other programming languages, contrary to what one of the parent comments claimed. It is simply a hashmap with a pre-allocated pool of memory.
By "pretty much", I meant that the bucket lists are already allocation-free thanks to intrusive lists; of course the hash table itself can use allocation when it's resized, or not, depending on user choice. It's trivial to preallocate it or use an arena or whatever, since it's just a flat array.