GCC generates essentially the same assembly for C++'s std::optional<int>, with the exception that the result can't be returned in a register (a problem that will go away if the function is inlined).
This proves the point that a better type system and abstractions don't necessarily produce bad assembly, and if constexpr is used, there won't be anything in the final executable other than the actual value.
You do not need constexpr, just remove the "volatile" which I put into my example only to prevent the optimizer from specializing it: https://godbolt.org/z/fs8q3sdxK
But what matters is run-time behavior for any input. And sorry, C++'s complexity still has the effect that the code is terrible: https://godbolt.org/z/vWfjjf8rP
> But what matters is run-time behavior for any input. And sorry, C++'s complexity still has the effect that the code is terrible: https://godbolt.org/z/vWfjjf8rP
For what it's worth, if you actually use the same flags for the C++ example (in particular, -fsanitize-trap) the C++ compilers improve quite a bit, with Clang getting fairly close to the GCC C output (https://godbolt.org/z/M3z5EaGKj ; note that GCC doesn't seem to eliminate the std::optional::value() check unless you use -O3). But perhaps more interestingly, if you make a simplified std::optional that still uses C++-specific features both Clang and GCC produce output that is very close to that from C/GCC: https://godbolt.org/z/PhEW7eT8x
Does make me wonder what exactly it is about std::optional that confuses the optimizers, and whether a similarly complex C maybe implementation can suffer from the same issues that std::optional appears to.
Not sure what you're talking about here. The cruft in the assembly is there because you enabled UBSAN. If you disable UBSAN, the exception-throwing code paths go away.
The point is that the error code paths should go away even with UBSAN because they are dead, but in C++ the optimizer is no longer able to see this.
Except that it is much safer to use the type system than pre-processor glue based on text replacement, with more interesting error messages than templates.
I don't see how it is safer. I think this is just a random claim C++ people like to make without any evidence. In terms of error messages, the problem is that C++ often cannot produce them because, due to overloading, it is entirely unclear to the compiler what the intention of the code actually was. A macro-based solution also does not have ideal error messages, but I do not think it is worse than C++.
It isn't a random claim; it is based on years of experience fixing pre-processor code gone wrong, because whoever wrote it in the first place forgot that it is nothing more than text expansion, and then someone else, completely unaware that it is a macro, ends up passing a bad set of parameters.
I have also spent countless hours fixing template code, so no, I do not let your anecdotes count. There is certainly a lot of problematic macro code in C, but I do not think it is worse than C++ templates, and one can also write robust macros.
One can also write robust template code, if that is the reasoning we're going for: enable_if, static_assert, and concepts have been available for a couple of years now.
At least that is something we can agree on, C and C++ are both languages where developers could write robust code, and the large majority seldom does it.
I can agree with this. The difference is that the C++ approach is to solve problems with features people need to understand, while C solves problems by only providing basic building blocks and expecting people to use them correctly. In both cases it takes expertise to do this. But Rust also has this problem.
Clever, but it misses the very goal of option/maybe types: forcing the user to check the result. In this implementation nothing stops the user from omitting the "if (p.ok)" part and directly using "p.value".
It could work if "maybe(T)" is completely opaque to the user; both checking and accessing its payload must happen through helper macros; the checking macro ticks an invisible flag if ok; the accessing macro returns the payload only if the invisible flag is ticked, otherwise it triggers a runtime error/exception.
Not impossible. However, you would need to replace all "p.ok" with "maybe_check(p)", which is not unreasonable, and all "p.value" with "maybe_value(p)", which might be too much for the final user...
At this point I always wonder why people who write stuff like this don't just move to a different language. You are introducing insane amounts of hidden complexity (see also the other posts on that blog). For something that just exists in other languages. Or maybe this is just a fun puzzle for the author, in which case it's totally fine.
You don’t always get to choose your language. Especially in the embedded/firmware area of software development, C is the most widely available option, if not the only option besides ASM shrugs
The said library is a bit farther away from the C that is widely available. It relies on C23 features, GNU statement expression, GNU nested function, sanitizer runtimes, VLA types and a very niche pattern of sizeof executing its statement-expression argument; only platforms that provide latest GCC/Clang would be able to use this.
In the library I experiment with various things, and C23 and nested functions are not really required. For running the code, it only relies on GNU statement expressions. For bounds checking, you need the sanitizers.
Overall, it is still far more portable than C++ or any other new language.
Unless you are talking about PIC and similar CPUs, there is hardly a modern 16 bit CPU that doesn't have a C++ compiler available as well, assuming that we still consider 16 bit modern for whatever reason.
Heck I learned to program in C++, back when DR-DOS 5 was the latest version, and all I had available was 640 KB to play around, leaving aside MEMMAX.
Nowadays the only reason many embedded developers keep using C is religious.
GCC was written in C but changed to C++ later. A lot of the code still looks a lot like C. And as a contributor, I would much prefer it were purely C (and compilation times for GCC itself are a pain, although I think this is not because of C++ but because some files grew too big and some refactoring would be in order).
Doesn't that mean that, being written in C++ by now, it generates garbage code, as per your own words, and is thus unsuitable for consumption in C projects?
Purely guessing, but the "unstable tooling" could perhaps refer to the fact that C++ as a language has evolved a lot.
I have had trouble compiling older C++ code bases with newer compilers, even when specifying C++98 as the source standard. I gave up trying to get Scott McPeak's Elkhound C++ parser to compile, the last time I attempted it.
C is a bit more forgiving on that topic (it hasn't changed as much, for better or worse).
What specific claim do you think is extraordinary: that embedded programmers still often use C, that compilation times for C++ are often very long, that the language is less stable than C, or that code produced by C++ can be worse?
Especially the last claim (that code generation for C++ is worse) seems extraordinary. It's the same compiler backend in all major compilers. Do you have any examples?
> … there is hardly a modern 16 bit CPU that doesn't have a C++ compiler
There are quite a few besides various PICs AFAIK, how modern they are is subjective I guess, and it IS mostly the weaker chips. Keil, Renesas, NXP, STMicro (STM8 MCUs used to be C only, not sure today) all sell parts where C++ is unsupported.
> Nowadays the only reason many embedded developers keep using C is religous.
I don’t completely agree, but I see where you are coming from.
The simplest tool that gets the job done is often the best IMO.
In my experience, it is much more difficult to learn C++ well enough to code safely, compared to C.
Many embedded developers are more EE than CS, so simpler software is often preferred.
I know you don’t have to use all the C++ features, all at once, but still :)
Horses for courses. I prefer C for anything embedded.
Definitely. I still don't think you should swim against the stream. Just bite the bullet and write idiomatic C. The people who will have to debug your code in the future will thank you.
It is, but the bar for what's considered too 'clever' in embedded/firmware is usually lower than this. In fact, even the ternary conditional operator is too much.
BTW: I think what annoys me about this comment is the claim that this would be "insane amounts of hidden complexity". If you look at the article, it is four simple one-line macros:
#define maybe(T) struct maybe_##T { bool ok; T value; }
Well, yes, I find these pretty insane. It's not ergonomic. It doesn't actually offer any safety. It has no integration with anything else. If I compare this to, for example, Rust's Option or Haskell's Maybe, it's not even funny how stark the difference is. And both of them come included out of the box.
Especially the safety is something I can't step over. I don't feel it offers anything substantial over just having the struct without these macros.
Please spell out your criticism exactly. Where do you see an issue with ergonomics? What exactly is missing in integration? The safety property for such a type is clear: not being able to access the value when it does not exist. It does this. But most importantly, before I let you shift the goalposts: this was not the original criticism. You said "insane complexity".
Bad ergonomics, no integration, and no safety are all things that lead to complexity you need to manage were you to actually use this macro. The macros themselves are simple. But I don't think it's wrong to say that using the macros does introduce insane complexity. Such things are often the case in C. The language is simple; to write anything of substance you have to introduce, often large amounts of, complexity.
1. Ergonomics: Forcing the null sanitizer is quite a sledgehammer whose cost you often cannot, or don't want to, pay. You are forcing global behavior for something you really only want locally for this construct. A misuse of maybe_value is a crash, where in other languages you have case/match and compile-time errors. There is no type inference, so you have to be explicit, at every single line, that you really mean a maybe(int) or whatever.
2. Integration: No library uses this, meaning any usage is limited to your own code only, requiring manual and error-prone translation at every interface.
3. Safety: There is no check that the null sanitizer is actually on. This is a huge footgun where someone thinks "I know, I'll use this neat macro I saw here", and then of course doesn't enable the null sanitizer, and everything breaks. So now, for this to be safe, it's not enough to understand the code; you need to check that the compile options are just right. This is especially insidious since this cannot be hidden in some object file where you make damn sure the sanitizer is on: the CONSUMER of this API must enable the sanitizer.
For these reasons I would ban this macro in any code I have control over.
Ok, thanks. I still do not see where the complexity comes in. I also disagree with most of your other points; e.g., that other libraries do not use it seems rather irrelevant. I am certainly not switching to another language for this reason. Type inference works with auto in C23 (or __auto_type as an extension before), but if you mean the macros themselves, being explicit is a big reason why I prefer C to other languages. For the sanitizer, this is a choice which makes sense to me. If it does not make sense for you, calling abort() explicitly instead of relying on the null sanitizer would be a trivial change. The null sanitizer is also certainly not a big sledgehammer; in fact, in the example in the post there is no sanitizer instrumentation left.
> Here, instead of handling the error condition, I create an lvalue that points nowhere in case of an error because it then corresponds to (({ (void)0; })), relying on the null sanitizer to transform it into a run-time trap for safety.
It is only undefined behavior if it is dereferenced, in which case the null sanitizer can be used to define it to trap and thus safely terminate the program. But the example then also shows how you can make sure that this case is not even possible in the final program.
It is actually dereferencing a null pointer (there is a * at the beginning of the definition of maybe_value). It is okay if you test for p.ok first, and UB otherwise. So it's like Option::unwrap_unchecked from Rust, not Option::unwrap, which merely panics.
Relying on sanitizers to catch UB can only work on a best-effort basis, because the compiler can perform optimizations that rely on the fact that the program doesn't have UB (and produce broken code if there is UB, beyond what a sanitizer could catch).
It would be much better to also provide another macro to abort the program if the maybe is nothing.
In standard C yes. But any decent C compiler will offer stronger guarantees than the minimum that the standard requires, and presumably the "null sanitizer" they're referring to is one of them.
The null sanitizer is used to define the behavior, which is one of the ways a C implementation is allowed to handle situations whose behavior the C standard leaves undefined.
It might be more useful with a signature like maybe_divide -> maybe(int) -> maybe(int) -> maybe(int) ... and then a set of operations over maybe, and functions/macros for and_then(), or_else(), etc. It would be interesting to see how ergonomic it could get.
https://gcc.godbolt.org/z/vfzK9Toz4