So lesson is don’t use safe Rust code when writing something like a memory alloc...

pavon · on Sept 19, 2023

I don't think Rust is the issue here. The same integer overflow can occur in any language and should be checked. Integer overflow and underflow is one of the most common security bugs after memory access errors. It is more that bump allocators are so efficient that a relatively inexpensive overflow check can be a significant fraction of their runtime.

kazinator · on Sept 19, 2023

It's not integer overflow but pointer overflow.

If you're doing small bump allocations and you're anywhere near pointer overflow, it means you're way out of bounds already; the bump allocations you already made were wrong.

You need to check whether the allocation increment is out of the zone from which you're allocating, and you need that no matter which direction you go.

If the requests are small, you will hit the end of your arena long before you worry about pointer overflow at the zero address or at address 0xFF..FF!

That said, aligning down is a shade faster than up. To align down to a boundary divisible by ALIGN_MASK + 1 we just truncate some low order bits to zero:

  addr &= ~ALIGN_MASK;

If the bits are already zero, the address doesn't move; all is cool.

but aligning up, where we don't care about overflow, requires handling the case where the address is already aligned and doesn't have to move:

  if (addr & ALIGN_MASK) {
    addr |= ALIGN_MASK;
    addr++;
  }

Or a trick like this where we bias the address with an offset of -1 during the masking calculation:

  addr = ((addr - 1) | ALIGN_MASK) + 1;

(We could get underflow here if addr is the zero address (null pointer on most systems), but it's reversible if the pointer arithmetic is done as unsigned. Zero aligns to zero: it goes to 0xFFF..FFF which stays the same after | 7, and then increments back to zero.)

Either of these is worse than just addr &= ~7.

tel · on Sept 19, 2023

I would definitely be in favor of an even faster `SmallBump` variant which assumes that the allocations are small w.r.t. usize::MAX for some additional speed.

I also wouldn't mind the a default "fast" bump allocator library to do all tricks it can without sacrificing safety.

lowbloodsugar · on Sept 19, 2023

And maybe the heap has fixed alignment so you have one that only needs 4 and one that needs 16. Indeed maybe you grow from top and bottom depending. And maybe this is all a giant premature optimization - including the panic about up vs down. My previous career was console video games and maybe this kind of handwringing isn’t required for whatever the fuck this is.

tel · on Sept 20, 2023

Eh, sure it is. Until it isn't. I'm not using this libraries, but I'm glad they exist.

wrsh07 · on Sept 20, 2023

Since we're both thinking about this...

Is rust warning you about pointer overflow? My understanding is that rust is complaining about integer overflow but perhaps that's insufficient?

(I don't actually know which solution is better/worse from perf perspective besides author's post)

lowbloodsugar · on Sept 19, 2023

Or you just add ALIGNMENT-1 and then mask. So if alignment is 16 then you add 15.

kazinator · on Sept 19, 2023

Oh right; that's how I've always done it, just forgot. Right. Still, it's an extra addition compared to just masking down.

cratermoon · on Sept 19, 2023

No, the lesson is to work with the safety mechanisms to accomplish both fast and safe result. If I'm writing code in language A, I don't choose to throw out one of the foundational aspects of that language on the grounds that it's merely inconvenient.

wrsh07 · on Sept 19, 2023

I think the lesson is "safety comes at a cost. If you bump downwards you can avoid those safety checks"

I don't think the lesson is "let's just write unsafe code"

kazinator · on Sept 19, 2023

Caller asks for a 500 megabyte downward bump allocation in a 32 bit system. Do you check for underflow or not?

bitwize · on Sept 19, 2023

You check to make sure the size requested is less than or equal to current pointer - chunk start and fail if it is bigger.

Let's face facts, the allocator, even with this additional check, is still going to be blazing fast compared to malloc() or whatever. If you're allocating so much that the handful of extra instructions is going to be a significant slowdown, maybe structure your allocations differently?

saagarjha · on Sept 20, 2023

Most allocators will do this, yes.