
If calling the same function with a different argument would be considered 'function coloring', every function in a program is 'colored' and the word loses its meaning ;)

Zig actually also had solved the coloring problem in the old and abandoned async-await solution because the compiler simply stamped out a sync- or async-version of the same function based on the calling context (this works because everything is a single compilation unit).





In that case JS is not colored either because an async function is simply a normal function that returns a Promise.

As far as I understand, coloring is about whether async and sync functions share the same calling syntax and interface, i.e.

    b = readFileAsync(p)
    b = readFileSync(p)
share the same calling syntax. Whereas

    b = await readFileAsync(p)
    readFileAsync(p).then(b => ...)
    
    b = readFileSync(p)
are different.

If you have to call async functions with a different syntax or interface, then it's colored.
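To make that definition concrete, here is a minimal Python sketch (the thread's own example is JS-like pseudocode). The names `read_file_sync` and `read_file_async` are made up for illustration and do no real file access. It only shows that the same call syntax gives you a value in one case and a coroutine object in the other.

```python
import asyncio


def read_file_sync(path: str) -> str:
    # Hypothetical stand-in for a blocking read: returns the value directly.
    return f"contents of {path}"


async def read_file_async(path: str) -> str:
    # Hypothetical stand-in for a non-blocking read.
    await asyncio.sleep(0)
    return f"contents of {path}"


def call_both(path: str):
    direct = read_file_sync(path)      # plain call yields the value
    pending = read_file_async(path)    # same syntax yields a coroutine object
    assert asyncio.iscoroutine(pending)
    awaited = asyncio.run(pending)     # a different interface is needed to get the value
    return direct, awaited
```

By the definition above, that difference in how you get at the result is what makes the async function "colored".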


> In that case JS is not colored either because an async function is simply a normal function that returns a Promise.

Exactly. IMHO at least, JS doesn't suffer from the coloring problem because you can call async functions from sync functions (the JS Promise machinery lets you fall back to completion callbacks instead of using await). It's the 'virality' of await that causes the coloring problem, but in JS you can freely mix await and completion callbacks for async operations.


No, async and callbacks in JS are extremely viral. If a function returns a Promise or takes a callback, there is no possible way to execute it synchronously. Hence, coloring.

The reason this coloring isn't a problem for the JS ecosystem is that it's a single-threaded language by design. So async/callbacks are the only reasonable way to do anything external to the JS runtime (e.g. reading files, connecting to APIs, etc.).

(notwithstanding that node.js introduced some synchronous external operations in its stdlib - those are mostly unused in practice.)

To put it a different way - yes, JS has function coloring, but it's not a big deal because almost the entire JS ecosystem is colored red anyway.
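A rough Python asyncio analogue of both positions, as a sketch only (in JS the callback would be attached with `.then`). `fetch_value`, `sync_caller` and `main` are hypothetical names. A plain function can start async work and attach a completion callback, but it cannot have the result in hand before it returns.

```python
import asyncio


async def fetch_value() -> int:
    # Hypothetical async operation.
    await asyncio.sleep(0)
    return 42


def sync_caller(log: list) -> None:
    # A non-async function can schedule the work and register a callback...
    task = asyncio.get_running_loop().create_task(fetch_value())
    task.add_done_callback(lambda t: log.append(("callback", t.result())))
    # ...but the task has not finished by the time this function returns.
    log.append(("returned", task.done()))


async def main() -> list:
    log: list = []
    sync_caller(log)
    await asyncio.sleep(0.01)  # let the loop run the task and its callback
    return log
```

So the call itself is possible from "blue" code, while the result only ever arrives in callback-shaped code. That is the part the second comment calls viral.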


await isn't viral per se, it's a purely local transformation. The virality is from CPS/callbacks and Promise.

> If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;)

Well, yes, but in this case the colors (= effects) are actually important. The implications of passing an effect through a system are nontrivial, which is why some languages choose to promote that effect to syntax (Rust) and others choose to make it a latent invariant (Java, with runtime exceptions). Zig chooses another path not unlike Haskell's IO.


> Zig actually also had solved the coloring problem in the old and abandoned async-await solution because the compiler simply stamped out a sync- or async-version of the same function based on the calling context (this works because everything is a single compilation unit).

AFAIK this still leaked through function pointers, which were still sync or async (and this was not visible in their type)


Pretty sure the Zig team is aware of this and has plans to fix it before they re-release async.

I’m pretty sure that was an issue specifically with the old implementation, and not something still left to fix.

Let's revisit the original article[1]. It was not about arguments, but about the pain of writing callbacks and even async/await compared to writing the same code in Go. It had 5 well-defined claims about languages with colored functions:

1. Every function has a color.

This is true for the new Zig approach: functions that deal with IO are red, functions that do not need to deal with IO are blue.

2. The way you call a function depends on its color.

This is also true for Zig: Red functions require an Io argument. Blue functions do not. Calling a red function means you need to have an Io argument.

3. You can only call a red function from within another red function.

You cannot call a function that requires an Io object in Zig without having an Io in context.

Yes, in theory you can use a global variable or initialize a new Io instance, but this is the same kind of workaround you can use to call an async function from a non-async function. For instance, in C# you can write 'Task.Run(() => MyAsyncMethod()).Wait()'.

4. Red functions are more painful to call.

This is true in Zig again, since you have to pass down an Io instance.

You might say this is not a big nuisance and almost all functions require some argument or another... But by this measure, async/await is even less troublesome. Compare calling an async function in Javascript to an Io-colored function in Zig:

  function foo() {
    blueFunction(); // We don't add anything
  }

  async function bar() {
    await redFunction(); // We just add "await"
  }
And in Zig:

  fn foo() void {
    blueFunction();
  }

  fn bar(io: Io) void {
    redFunction(io); // We just add "io".
  }

Zig is more troublesome since you don't just add a fixed keyword: you need to add a variable that has to be passed along from somewhere.

5. Some core library functions are red.

This is also true in Zig: Some core library functions require an Io instance.

I'm not saying Zig has made the wrong choice here, but this is clearly not colorless I/O. And it's ok, since colorless I/O was always just hype.

---

[1] https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...


> This is also true for Zig: Red functions require an Io argument. Blue functions do not. Calling a red function means you need to have an Io argument.

I don't think that's necessarily true. Like with allocators, it should be possible to pass the IO pointer into a library's init function once, and then use that pointer in any library function that needs to do IO. The Zig stdlib doesn't use that approach anymore for allocators, not because of technical restrictions but for 'transparency' (it's immediately obvious which function allocates under the hood and which doesn't).

Now the question is, does an IO parameter in a library's init function color the entire library, or only the init function? ;P

PS: you could even store the IO pointer in a public global, making it visible to all code that needs to do IO, which makes the coloring question even murkier. It will be interesting to see, though, how the not-yet-implemented stackless coroutine (e.g. 'code-transform-async') IO system will deal with such situations.
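The "pass it once at init" idea, sketched in Python rather than Zig. `RecordingIo` and `Cache` are invented for illustration and are not modeled on any real Zig or Python API. The point is only that the io dependency moves from every method signature to the constructor.

```python
class RecordingIo:
    """Hypothetical Io implementation that records writes in memory."""

    def __init__(self) -> None:
        self.writes = []

    def write(self, path: str, data: str) -> None:
        self.writes.append((path, data))


class Cache:
    """Hypothetical library object that receives io once, at construction."""

    def __init__(self, io: RecordingIo) -> None:
        self._io = io

    def store(self, key: str, value: str) -> None:
        # No io parameter in this signature, yet the method performs IO.
        self._io.write(f"cache/{key}", value)
```

Whether `store` counts as "red" here is exactly the murky question above: its signature no longer says so, but something upstream still had to supply the io.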


In my opinion you must have function coloring, it's impossible to do async (in the common sense) without it. If you break it down one function has a dependency on the async execution engine, the other one doesn't, and that alone colors them. Most languages just change the way that dependency is expressed and that can have impacts on the ergonomics.

Look at Go or Java virtual threads. Async I/O doesn't need function coloring.

Here is an example Zig code:

    defer stream.close(io);

    var read_buffer: [1024]u8 = undefined;
    var reader = stream.reader(io, &read_buffer);

    var write_buffer: [1024]u8 = undefined;
    var writer = stream.writer(io, &write_buffer);

    while (true) {
        const line = reader.interface.takeDelimiterInclusive('\n') catch |err| switch (err) {
            error.EndOfStream => break,
            else => return err,
        };
        try writer.interface.writeAll(line);
        try writer.interface.flush();
    }
The actual loop using reader/writer isn't aware of being used in async context at all. It can even live in a different library and it will work just fine.

Not necessarily! If you have a language with stackful coroutines and some scheduler, you can await promises anywhere in the call stack, as long as the top level function is executed as a coroutine.

Take this hypothetical example in Lua:

  function getData()
    -- downloadFileAsync() yields back to the scheduler. When its work
    -- has finished, the calling function is resumed.
    local file = downloadFileAsync("http://foo.com/data.json"):await()
    local data = parseFile(file)
    return data
  end

  -- main function
  function main()
    -- main is suspended until getData() returns
    local data = getData()
    -- do something with it
  end
    
  -- run takes a function and runs it as a coroutine
  run(main)
Note how none of the functions are colored in any way!

For whatever reason, most modern languages decided to do async/await with stackless coroutines. I totally understand the reasoning for "system languages" like C++ (stackless coroutines are more efficient and can be optimized by the compiler), but why C#, Python and JS?


Uncoloured async is possible, but it involves making everything async. Crossing the sync/async boundary is never trivial, so languages like Go just never cross it. Everything is coroutines.

The subject of the function coloring article was callback APIs in Node, so an argument you need to pass to your IO functions is very much in the spirit of colored functions and has the same limitations.

In Zig's case you pass the argument whether or not it's asynchronous, though. The caller controls the behavior, not the function being called.

The coloring is not the concrete argument (Io implementation) that is passed, but whether the function has an Io parameter in the first place. Whether the implementation of a function performs IO is in principle an implementation detail that can change in the future. A function that doesn't take an Io argument but wants to call another function that requires an Io argument can't. So you end up adding Io parameters just in case, and in turn require all callers to do the same. This is very much like function coloring.

In a language with objects or closures (which Zig doesn't have first-class support for), one flexibility benefit of the Io object approach is that you can move it to object/closure creation and keep the function/method signature free from it. Still, you have to pass it somewhere.


> Whether the implementation of a function performs IO is in principle an implementation detail that can change in the future.

I think that's where your perspective differs from Zig developers.

Performing IO, in my opinion, is categorically not an implementation detail. In the same way that heap allocation is not an implementation detail in idiomatic Zig.

I don't want to find out my math library is caching results on disk, or allocating megabytes to memoize. I want to know what functions I can use in a freestanding environment, or somewhere resource constrained.


> Performing IO, in my opinion, is categorically not an implementation detail. In the same way that heap allocation is not an implementation detail in idiomatic Zig.

It seems you two are coming at this from opposing perspectives. From the perspective of a library author, Zig makes IO an implementation detail, which is great for portability. It lets library authors freely use IO abstractions if it makes sense for their problem.

This lets you, as an application developer, decide the concrete details of how such libraries behave. Don't want your math library to cache to disk? Give it an allocating writer[0] instead of a file writer. Want to use a library with async functionality on an embedded system without multithreading? Pass it a single-threaded io[1] runtime instance, or implement the io interface yourself as is best for your target.

Of course someone has to decide implementation details. The choices made in designing Zig tend to focus on giving library authors useful abstractions that give application authors meaningful control over important decisions for their application.
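The same division of labor can be sketched in Python, with the caveat that this uses Python's standard text streams and not Zig's std.Io. `write_table` plays the hypothetical library function and `render_in_memory` the application that chooses the concrete writer.

```python
import io
from typing import TextIO


def write_table(out: TextIO, rows: list) -> None:
    # Library code: writes to whatever text stream the caller supplied.
    for name, value in rows:
        out.write(f"{name}={value}\n")


def render_in_memory(rows: list) -> str:
    # Application code: picks an in-memory buffer, so nothing touches disk.
    buffer = io.StringIO()
    write_table(buffer, rows)
    return buffer.getvalue()
```

An application that does want a file would pass the result of `open(path, "w")` to the same `write_table` without the library changing.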

[0] https://ziglang.org/documentation/master/std/#std.Io.Writer....

[1] https://ziglang.org/documentation/master/std/#std.Io.Threade...


This is also why function coloring is not a problem, and is in fact desirable a lot of the time.

The problem with function coloring is that it makes libraries difficult to implement in a way that's compatible with both sync and async code.

In Python, I needed to write both sync and async API clients for some HTTP thing where the logical operations were composed of several sequential HTTP requests. Doing so meant implementing the core business logic as a generator that yields requests and accepts responses before ultimately returning the final result. I then wrote sync and async drivers that each ran the generator in a loop, pulling requests off, transacting them with their HTTP implementation, and feeding the responses back to the generator.

This sans-IO approach, where the library separates business logic from IO and then either provides or asks the caller to implement their own simple event loop for performing IO in their chosen method and feeding it to the business logic state machine, has started to appear as a solution to function coloring in Rust, but it's somewhat of an obtuse way to support multiple IO concurrency strategies.

On the other hand, I do find it an extremely useful pattern for testability, because it results in very fuzz-friendly business logic implementation, isolated side-effect code, and a very simple core IO loop without much room in it for bugs, so despite being somewhat of a pain to write I still find it desirable at times even when I only need to support one of the two function colors.
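A minimal sketch of that generator-plus-drivers shape, assuming a made-up two-request flow. `login_then_fetch`, the request strings and the `fake_transport` functions are all hypothetical stand-ins, not the actual client described above.

```python
import asyncio
from typing import Generator


def login_then_fetch(user: str) -> Generator[str, str, str]:
    # Business logic only: yields request descriptions, receives responses.
    token = yield f"POST /login?user={user}"
    profile = yield f"GET /profile?token={token}"
    return profile.upper()


def fake_transport(request: str) -> str:
    # Hypothetical stand-in for a blocking HTTP client.
    return "tok123" if request.startswith("POST /login") else "profile-data"


async def fake_transport_async(request: str) -> str:
    # Hypothetical stand-in for an async HTTP client.
    await asyncio.sleep(0)
    return fake_transport(request)


def run_sync(gen):
    # Sync driver: performs each yielded request with the blocking transport.
    try:
        request = next(gen)
        while True:
            request = gen.send(fake_transport(request))
    except StopIteration as stop:
        return stop.value


async def run_async(gen):
    # Async driver: same loop, but awaits the transport.
    try:
        request = next(gen)
        while True:
            request = gen.send(await fake_transport_async(request))
    except StopIteration as stop:
        return stop.value
```

The two drivers differ only in how they perform a request, and the generator can be tested or fuzzed by feeding it canned responses with no transport at all.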


My opinion is that if your library or function is doing IO, it should be async - there is no reason to support "sync I/O".

Also, this "sans IO" trend is interesting, but the code boils down to a less ergonomic, more verbose, and less efficient version of async (in Rust). It's async/await with more steps, and I would argue those steps are not great.


> there is no reason to support "sync I/O"

I disagree strongly.

From a performance perspective, asynchronous IO makes a lot of sense when you're dealing concurrently with a large number of tasks which each spend most of their time waiting for IO operations to complete. In this case, running those tasks in a single-threaded event loop is far more efficient than launching off thousands of individual threads.

However, if your application falls into literally any other category, then suddenly you are actually paying a performance penalty, since you need the overhead of running an event loop any time you just want to perform some IO.

Also, from a correctness perspective, non-concurrent code is simply a lot less complex and a lot harder to get wrong than concurrent code. So applications which don't need async also end up paying a maintainability, and in some cases memory safety / thread safety, penalty as well.


The beautiful thing about the “async” abstraction is that it doesn’t actually tie you to an event loop at all. Nothing about it implies that somebody is calling `epoll_wait` or similar anywhere in the stack.

It’s just a compiler feature that turns functions into state machines. It’s totally valid to have an async runtime that moves a task to a thread and blocks whenever it does I/O.
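As an illustration in Python (not Rust, so only an analogy), a coroutine can be driven by hand with no event loop anywhere. `read_request`, `word_count` and `run_blocking` are hypothetical names, and the dictionary lookup stands in for a blocking read.

```python
import types


@types.coroutine
def read_request(path: str):
    # Suspend and hand a request description to whoever drives the coroutine.
    data = yield ("read", path)
    return data


async def word_count(path: str) -> int:
    text = await read_request(path)
    return len(text.split())


def run_blocking(coro, files: dict) -> int:
    # A tiny "runtime" with no event loop: it services each request
    # immediately and resumes the coroutine with the result.
    try:
        request = coro.send(None)
        while True:
            _kind, path = request
            request = coro.send(files[path])  # stand-in for a blocking read
    except StopIteration as stop:
        return stop.value
```

Here `word_count` is written with async/await, yet the driver executes it start to finish on the calling thread.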

I do agree that async without memory safety and thread safety is a nightmare (just like all state machines are under those circumstances). Thankfully, we have languages now that all but completely solve those issues.


You surely must be referring to Rust, the only multithreaded language with async-await in which data races aren't possible.

Rust is lovely and all, but is a bad example for the performance side of the argument, since in practice libraries usually have to decide on an async runtime, so in practice library users have to launch that runtime (usually Tokio) to execute the library's Futures.


Sure, but that’s a library limitation (no widespread common runtime interface that libraries such as Tokio implement), not a fundamental limitation of async.

Thread safety is also a lot easier to achieve in languages like C#, and then of course you have single-threaded environments like JS and Python.


Exactly, there is nothing wrong with function coloring. It's a design choice.

Colored functions are easier to reason about, because potential asynchronicity is loudly marked.

Colorless functions are more flexible because changing a function to be async doesn't virally break its interface and the interface of all its callers.

Zig has colored functions, and that's just fine. The problem is the (unintentional) gaslighting where we are told that Zig is colorless when the functions clearly have colors.


As mentioned, the problem with coloring is not that you see the color, the problem is that you can't abstract over the colors.

Effectful languages basically add user-definable "colors", but they let you write e.g. a `map` function that itself turns color based on its parameter (e.g. becoming async if an async function is passed).
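Python has no effect system, so the following only imitates the idea with a runtime check instead of types. `poly_map` is a hypothetical helper. It returns a list for a plain function and an awaitable for an async one, i.e. it "takes on the color" of its argument.

```python
import asyncio
import inspect


def poly_map(fn, items):
    # If fn is async, return a coroutine the caller must await;
    # otherwise compute and return the list directly.
    if inspect.iscoroutinefunction(fn):
        async def run():
            return [await fn(x) for x in items]
        return run()
    return [fn(x) for x in items]


async def double_async(x: int) -> int:
    # Hypothetical async function used to exercise the async branch.
    await asyncio.sleep(0)
    return x * 2
```

In an effect-typed language the compiler would track this in the signature of `map`, so one definition serves both colors without a dynamic check.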


I think talking about colouring often misses the point. Sync & async code are fundamentally different; languages without coloured functions make everything async. Everything in Go (for instance) is running in an async runtime, and it's all preemptable.

> I don't want to find out my math library is caching results on disk, or allocating megabytes to memoize. I want to know what functions I can use in a freestanding environment, or somewhere resource constrained.

In that vein, I would often like to know whether the function I call is creating a task/thread/greenlet/whatever that will continue executing, concurrently, after it returns. Making that part of the signature is approximately what is called “structured concurrency”, and Zig’s design seems to conflate that with taking an io parameter. This seems a bit disappointing to me.


> A function that doesn't take an Io argument but wants to call another function that requires an Io argument can't.

Why? Can’t you just create an instance of an Io of whatever flavor you prefer and use that? Or keep one around for use repeatedly?

The whole “hide a global event loop behind language syntax” is an example of a leaky abstraction which is also restrictive. The approach here is explicit and doesn’t bind functions to hidden global state.


You can, but then you’re denying your callers control over the Io. It’s not really different with async function coloring: https://news.ycombinator.com/item?id=46126310

Scheduling of IO operations isn’t hidden global state. Or if it is, then so is thread scheduling by the OS.


Is that a problem in practice though? Zig already has this same situation with its memory allocators; you can't allocate memory unless you take a parameter. Now you'll just have to take a memory allocator AND an additional io object. Doesn't sound very ergonomic to me, but if all Zig code conforms to this scheme, in practice there will be only one way to do it. So one of the colors will never be needed, or used.

If your function suddenly requires a (currently) unconstructable instance "Magic", which you now have to pass in from somewhere at the top level, that indeed suffers from the same issue as async/await, a.k.a. function coloring.

But most functions don't. They require some POD or float, string or whatever that can be easily and cheaply constructed in place.


Colors for 2 ways of doing IO vs colors for doing IO or not are so different that it’s confusing to call both of them “function coloring problem”. Only the former leads to having to duplicate everything (sync version and async version). If only the latter was a thing, no one would have coined the term and written the blog post.

IMO the problem was never about actually doing IO or async actions or whatever. It's about not being able to call an async function from a sync function. Because in my experience you almost never wholesale move from sync to async everywhere. In fact I would consider that an extremely dangerous practice.

> If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;)

I mean, the concept of "function coloring" in the first place is itself an artificial distinction invented to complain about the incongruent methods of dealing with "do I/O immediately" versus "tell me when the I/O is done"--two methods of I/O that are so very different that it really requires very different designs of your application on top of those I/O methods: in a sync I/O case, I'm going to design my parser to output a DOM because there's little benefit to not doing so; in an async I/O case, I'm instead going to have a streaming API.

I'm still somewhat surprised that "function coloring" has become the default lens to understand the semantics of async, because it's a rather big misdirection from the fundamental tradeoffs of different implementation designs.


100% agree, but fortunately I don't think it is the "default lens". If it were nobody would be adding new async mechanisms to languages, because "what color is your function" was a self-described rant against async, in favour of lightweight threads. It does seem to have established itself as an unusually persistent meme, though.

Function coloring is the issue that arises in practice, which is why people discuss whether some approach solves it or not.

Why do you think it automatically follows that with async I/O you are going to have a streaming API? Async I/O can, just like sync I/O, return a whole complete result; the only difference is that you don't wait for it, and the called async procedure calls you back once the result is ready. I think a streaming API requires additional implementation effort, not merely async.
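To illustrate the distinction in Python asyncio terms (a sketch with made-up names and canned data, no real I/O): `fetch_all` is async but returns one complete result, while `fetch_stream` is the additional work of exposing the chunks as they arrive.

```python
import asyncio
from typing import AsyncIterator

# Canned chunks standing in for data arriving from the network.
CHUNKS = ["<a>", "<b/>", "</a>"]


async def fetch_all() -> str:
    # Async I/O, whole result: the caller gets everything at once, later.
    parts = []
    for chunk in CHUNKS:
        await asyncio.sleep(0)
        parts.append(chunk)
    return "".join(parts)


async def fetch_stream() -> AsyncIterator[str]:
    # Streaming API: each chunk is handed over as soon as it is available.
    for chunk in CHUNKS:
        await asyncio.sleep(0)
        yield chunk


async def collect_stream() -> list:
    return [chunk async for chunk in fetch_stream()]
```

Both are async, but only the second commits the consumer to an incremental design.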


My understanding of this design is that you can write the logic separately from the decision to "do I/O immediately" versus "tell me when the I/O is done"

You can write a parser that outputs a DOM and run it on a stream, or write a parser with a streaming API and run it synchronously on a buffer. You should pick the optimal tool for the situation, but there is no path dependence anymore.


Honestly I don't see how that is different than how it works in Rust. Synchronous code is a proper subset of asynchronous code. If you have a streaming API then you can have an implementation that works in a synchronous way with no overhead if you want. For example, if you already have the whole buffer in memory sometimes then you can just use it and the stream will work exactly like a loop that you would write in the sync version.

serde is a pull parser and it would take significant modification to convert it into an incremental push parser without blocking a thread.


