Nonblocking disk IO

December 20, 2011

It's natural, when writing an event-driven application, to want to perform disk operations like file reads and writes in the same non-blocking manner you use for sockets.

This turns out to be hard. In theory there's POSIX AIO, but it reportedly doesn't really work on Linux (my info might be out of date). Async libraries like node.js use an internal thread pool to simulate the event-driven behavior they want for files.
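To make the thread pool trick concrete, here is a minimal sketch in Python rather than node.js. It is not how node.js is actually implemented; it only shows the general shape: the blocking read runs on a pool thread, and the event loop is told when it finishes. The names read_file_blocking and read_file_async are made up for this example.

```python
import asyncio
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file_blocking(path):
    # An ordinary blocking read. It runs on a pool thread, not on the event loop.
    with open(path, "rb") as f:
        return f.read()


async def read_file_async(loop, pool, path):
    # Hand the blocking call to the pool; the loop stays free until it completes.
    return await loop.run_in_executor(pool, read_file_blocking, path)


async def main():
    loop = asyncio.get_running_loop()
    # Create a small scratch file so the example is self-contained.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from disk")
        path = tmp.name
    try:
        with ThreadPoolExecutor(max_workers=4) as pool:
            data = await read_file_async(loop, pool, path)
    finally:
        os.unlink(path)
    return data


result = asyncio.run(main())
```

From the caller's point of view the read looks asynchronous, but nothing about the underlying system call changed: a thread is still sitting blocked in read() on your behalf.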

Sometimes in discussions about this people point longingly at Windows, which does have an API for performing overlapped disk operations. See Ryan Dahl's nice overview for a ton of links. In fancy apps like Chrome, subsystems like the disk cache use overlapped IO on Windows and threads on non-Windows, with the hope that Windows async IO involves less overhead (less bookkeeping, fewer copies, etc.).

But it turns out that Windows async IO is just broken. In any of a number of situations, including if your disk is encrypted, Windows will silently make your async file operations synchronous. See this MSDN doc for a list of other potential reasons. Special highlights include how extending a file's length is synchronous unless you use a special API, while that API on NTFS file systems requires a special privilege that is only available to administrators by default.

I don't write this just to say "man, Windows sure sucks": the Linux situation is worse, and this may well all have been fixed in Windows 7. Rather, my point is that there is just a ton of API surface in an operating system, and any of it may block (see e.g. Brad discussing sendfile). (In Chrome's case, we found via instrumentation that real users were encountering browser hangs because Chrome's supposedly async disk interface wasn't async for seeking within a file.) In practice, for anything you want to be truly asynchronous, you probably need to use a thread. You can still use the AIO APIs, if you have some that work, from that thread.
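As a rough illustration of "use a thread", here is a sketch of one dedicated disk thread that reports completions over a socket pair, so a select-style event loop can wait on disk results the same way it waits on network sockets. This is only one possible arrangement, not Chrome's design; DiskThread, read_at, and demo are hypothetical names invented for the example.

```python
import os
import queue
import selectors
import socket
import tempfile
import threading


class DiskThread:
    """Runs blocking file operations on one thread and signals completion on a socket."""

    def __init__(self):
        self.requests = queue.Queue()
        self.results = queue.Queue()
        # The event loop watches notify_r like any other socket.
        self.notify_r, self._notify_w = socket.socketpair()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, func, *args):
        self.requests.put((func, args))

    def close(self):
        self.requests.put(None)  # sentinel: tell the worker to exit
        self._thread.join()
        self.notify_r.close()
        self._notify_w.close()

    def _run(self):
        while True:
            item = self.requests.get()
            if item is None:
                return
            func, args = item
            try:
                outcome = (func(*args), None)
            except OSError as exc:
                outcome = (None, exc)
            self.results.put(outcome)
            self._notify_w.send(b"x")  # wake the event loop


def read_at(path, offset, length):
    # Both the seek and the read may block; here they only block the disk thread.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789")
        path = tmp.name
    disk = DiskThread()
    sel = selectors.DefaultSelector()
    sel.register(disk.notify_r, selectors.EVENT_READ)
    try:
        disk.submit(read_at, path, 4, 3)
        sel.select(timeout=5)   # a real loop would also be serving sockets here
        disk.notify_r.recv(1)   # consume the wakeup byte
        value, error = disk.results.get_nowait()
    finally:
        sel.close()
        disk.close()
        os.unlink(path)
    return value, error


value, error = demo()
```

The point of the socket pair is that the main loop never calls anything that touches the disk. If a platform does have an async file API that works, the disk thread is the place to call it, since a surprise synchronous fallback then stalls only that thread.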

There are two ways to interpret that conclusion. One is that trying to make a synchronous world async in a piecemeal fashion doesn't work, and that it's better to make it easy to coordinate synchronous tasks: the Go model (see rsc's convincing slides). The other is that the problem is the synchronous system itself, and the more we can do to move away from it the better: the node model, where you replace the world.