After trying and failing over several days to track down a squirrely segfault in a C project about 15 years ago, I taught myself Valgrind in order to debug the issue.
Valgrind flagged an "invalid write", which I eventually hunted down as a fencepost error in a dependency that overran its allocated stack array by one byte. I recall that it wrote "1" rather than "2", though, haha.
> Lesson learnt, folks: do not throw exceptions out of asynchronous procedures if you’re inside a system call!
The author's debugging skills are impressive and significantly better than mine, but I find this an unsatisfying takeaway. I yearn for a systemic approach to either prevent such issues altogether or to make them less difficult to troubleshoot. The general solution is to move away from C/C++ to memory safe languages whenever possible, but such choices are of course not always realistic.
With my project, I started running most of the test suite under Valgrind periodically. That took half an hour to finish rather than a few seconds, but it caught many similar memory corruption issues over the next few years.
Switching to memory safe languages would not necessarily prevent this issue because this is an undefined behavior issue, not a memory safety issue.
Pointers to C++ functions that can throw exceptions should not be passed to C functions as callback pointers. Throwing an exception out of the callback and through the C caller's frames is undefined behavior, since C does not support stack unwinding.
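One common way to follow that rule is to catch everything at the boundary and only rethrow once control is back in C++ code. A minimal sketch, assuming a made-up C API: `call_from_c`, `c_callback`, `CallbackState`, `trampoline`, and `run_through_c` are all hypothetical names for illustration, not anything from Windows or the article.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical C-style callback type, standing in for a real C API's callback.
extern "C" typedef void (*c_callback)(void* context);

// Hypothetical stand-in for a C function that invokes the callback.
extern "C" void call_from_c(c_callback cb, void* context) { cb(context); }

struct CallbackState {
    void (*work)();            // C++ function that may throw
    std::exception_ptr error;  // set if work() threw
};

// The only function handed to the C side. Nothing escapes it:
// any exception is captured and stored instead of unwinding through C frames.
extern "C" void trampoline(void* context) noexcept {
    auto* state = static_cast<CallbackState*>(context);
    try {
        state->work();
    } catch (...) {
        state->error = std::current_exception();
    }
}

// Runs work() via the C API, then rethrows on the C++ side of the boundary.
void run_through_c(void (*work)()) {
    assert(work != nullptr);
    CallbackState state{work, nullptr};
    call_from_c(&trampoline, &state);
    if (state.error) {
        std::rethrow_exception(state.error);
    }
}
```

The caller of `run_through_c` still sees the exception, but only after the C function has returned normally. This does not help with the APC case from the article, where the whole point was to abort the system call, so treat it as a pattern for ordinary callbacks only.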
Presumably, any language that has exception handling would have an issue on Windows when doing select() in one thread and QueueUserAPC() in another thread to interrupt it with a callback function that throws an exception. What happens then depends on how the stack unwinding works.
No programming language can avoid this, because every language needs to use the underlying operating system's system calls in order to function. On Windows they also need to go through the C functions in ntdll, since Microsoft does not support making system calls other than by calling the corresponding functions in ntdll and friends.
Similar experience: spending a week debugging memory corruption issues in production back in 2000, with customer service pinging our team every couple of hours because it involved a high-profile customer, was my lesson.
Doesn't C++ already support everything you need here? It supports the noexcept keyword which should have been used in the interface to this syscall. That would have prevented throwing callbacks from being used at compile time. My guess is that this is a much older syscall than noexcept though.
noexcept doesn’t prevent any throws at compile-time, it basically just wraps the function in a `catch(...)` block that will call std::terminate, like a failed assert. IMHO it is a stupid feature for this very confusion.
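A small sketch of that point, with hypothetical function names: the compiler accepts a `noexcept` function whose body can throw, and the `noexcept` operator only reports what the declaration promises.

```cpp
#include <stdexcept>

// Ordinary function that throws.
void helper() {
    throw std::runtime_error("thrown from helper");
}

// Compiles fine even though the body calls something that throws.
// If the exception reached this function's boundary at run time,
// std::terminate would be called instead of the exception propagating.
void claims_not_to_throw() noexcept {
    helper();
}

// No exception specification, so it is treated as potentially throwing.
void may_throw() {}

// The noexcept operator reflects the declaration, not the body.
static_assert(noexcept(claims_not_to_throw()),
              "declared noexcept, so the operator reports true");
static_assert(!noexcept(may_throw()),
              "no specifier, so the operator reports false");
```

Some compilers warn when a `throw` expression sits directly inside a `noexcept` function, but a throw hidden behind a plain call like `helper()` goes through silently.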
This was true until C++17. It was changed in C++17 to make noexcept part of the function type, meaning a noexcept(false) function can't be used in a context where a noexcept function is required, because the two are distinct types. I don't know if compilers actually implement this, but according to the standard it should be usable.
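For what it's worth, here is a sketch of what the C++17 rule gives you for function pointers. It assumes the code is compiled as C++17 or later, and the names (`register_callback`, `safe_handler`, `risky_handler`) are made up for illustration.

```cpp
#include <type_traits>

using throwing_fn = void (*)();
using nothrow_fn  = void (*)() noexcept;

// Since C++17 these are different types.
static_assert(!std::is_same_v<throwing_fn, nothrow_fn>,
              "noexcept is part of the function type");
// Dropping the guarantee is allowed...
static_assert(std::is_convertible_v<nothrow_fn, throwing_fn>,
              "noexcept pointer converts to a potentially-throwing one");
// ...but adding it implicitly is not.
static_assert(!std::is_convertible_v<throwing_fn, nothrow_fn>,
              "potentially-throwing pointer does not convert to noexcept");

// Hypothetical API that only accepts non-throwing callbacks.
void register_callback(nothrow_fn cb) { cb(); }

void safe_handler() noexcept {}
void risky_handler() {}

void demo() {
    register_callback(safe_handler);
    // register_callback(risky_handler);  // rejected under C++17: no conversion
}
```

This only checks the declared type at the point where the pointer is passed. It says nothing about whether the body of `safe_handler` can actually throw.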
Yes this helps specifically when passing functions as pointers or something like std::function (edit: or overriding methods), it will at least inform the developer that they need to add noexcept to the function declaration if they want to use it there, and hopefully due to that they recursively audit the function body and anything it calls for exceptions. And hopefully all future developers also notice the noexcept and keep up the practice. But it changes nothing about checking plain function calls. So I think adding this to the function type helps some cases but still does not move noexcept toward the behavior most people want/expect.
This just feels important to point out because this feature is 15 years old and still commonly misunderstood, and each time people are wanting the same thing (actual compile-time prevention of `throw`) which it is not.
Edit: OK I finally just went and tried it on godbolt.org. C++17 GCC, Clang, and MSVC all give 1 warning on this code for `bar` and that's all.
It is a C interface. It is implicitly noexcept. I filed bugs against both GCC and LLVM requesting warnings when someone passes a non-noexcept C++ function pointer to either a C function or to a C++ noexcept function:
No the solution isn't to rewrite it in Rust. The solution is to have the option of compiling your C/C++ program with memory safety whenever things go loopy. ASAN, MSAN, and UBSAN are one great way to do that. Another up and coming solution that promises even more memory safety is Fil-C which is being made by Epic Games. https://github.com/pizlonator/llvm-project-deluge/blob/delug...
Ubsan is fantastic, but ASAN and the rest have serious caveats. They're not suitable for production use and they have a tendency to break in mysterious, intermittent ways. For example, Ubuntu 24.04 unknowingly broke Clang <=15ish when it increased mmap_rnd_bits. ASAN on Windows will actually check if you have ASLR enabled, disable it, and restart at entry. They interact in fun ways with LD_PRELOAD too.
I'm not in a position to look up exactly when it was merged, but I'm pretty confident that shouldn't be needed anymore. The entry point on 19 should do the same restart juggling it does on Windows if the environment isn't correct for some other reason. I can double check later if you want to provide details.
I encountered the issue when our (not Ubuntu, not 24.04) LTS upstream backported security fixes that included the mmap changes without updating universe to include a clang version with the fixes. Any developers diligent enough to update and run sanitisers locally started seeing intermittent crashes.
The solution is usually not to do a rewrite, but I think for greenfield projects we should stop using C or C++ unless there is a compelling reason to do so. Memory-safe systems languages are available today; IMO it's professionally irresponsible to not use them, without a good reason.
MSAN, ASAN, and UBSAN are great tools that have saved me a lot of time and headaches, but they don't catch everything that the compiler of a memory safe language can, at least not today.
Rust isn't standardized. Last time I checked, everyone who uses it depends on its nightly build. Their toolchain is enormous and isn't vendorable. The binaries it builds are quite large. Programs take a very long time to compile. You need to depend on web servers to do your development and use lots of third party libraries maintained by people you've never heard of, because Rust adopted NodeJS' quadratic dependency model. Choosing Rust will greatly limit your audience if you're doing an open source project, since your users need to install Rust to build your program, and there are many platforms Rust doesn't support.
Rust programs use unsafe a lot in practice. One of the greatest difficulties I've had in supporting Rust with Cosmopolitan Libc is that Rust libraries all try to be clever by using raw assembly system calls rather than using libc. So our Rust binaries will break mysteriously when I run them on other OSes. Everyone who does AI or scientific computing with Rust, if you profile their programs, I guarantee you 99% of the time it's going to be inside C/C++ code. If better C/C++ tools can give us memory safety, then how much difference does it really make if it's baked into the language syntax. Rust can't prove everything at compile time.
Some of the Rust programs I've used like Alacritty will runtime panic all the time. Because the language doesn't actually save you. What saves you is smart people spending 5+ years hammering out all the bugs. That's why old tools we depend on every day like GNU programs never crash and their memory bugs are rare enough to be newsworthy. The Rust community has a reputation for toxic behavior that raises questions about the reliability of its governance. Rust evangelizes its ideas by attacking other languages and socially ostracizing the developers who use them. Software development is the process of manipulating memory, so do you really want to be relinquishing control over your memory to these kinds of people?
Exactly why I don't use it. I don't really feel like including the source for the entire toolchain as part of my project and building it all myself. At least if I write standards conforming C++ there are multiple compiler implementations that can all handle it. I also have a reasonable expectation that a few decades from now I will be able to `apt get somecompiler` and the code will still just work (aside from any API changes at the OS level, for which compatibility shims will almost certainly exist).
If I can't build something starting from a repo in a network-isolated environment then I want absolutely nothing to do with it. (Emscripten, I am looking at you. I will not be downloading sketchy binary blobs from cloud storage to "build from source". That is not a source build, that is binary distribution, you liars.)
> One of the greatest difficulties I've had in supporting Rust with Cosmopolitan Libc is that Rust libraries all try to be clever by using raw assembly system calls rather than using libc.
I’m sorry, this is coming from Justine “the magic syscall numbers are my god given right to use” Tunney?
Seems like it depends entirely on context. I'd expect code which intends to be portable to use some sort of dynamically linked wrapper, even if that wrapper isn't libc.
You may want to refresh your familiarity with Rust. I haven't touched nightly in ages, and much of what you mention doesn't really resonate with what I've seen in practice. I'm not saying the language doesn't have issues or things that are frustrating, but in my experience, unless you're going to go to the nines in testing/validation/etc (which is the first thing that's cut when schedules/etc are in peril), I've seen Rust code scale better than C++ ever did.
More tools in the C/C++ realm are always welcome, but I've yet to see more than 50% of the projects I've worked on be able to successfully use ASAN (assuming you've got the time to burn to configure them and all their dependencies properly). I've used ASAN, CBMC and other tools to good effect but find Rust more productive overall.