Asynchronous I/O in C#: I/O Completion Ports

I finally got around to writing this post, which has felt like a personal debt for some time now. After introducing async I/O in C#, I spent a lot of time thinking about how to approach this topic, so let's jump right into it.

What are I/O completion ports?

As this article explains, an I/O Completion Port (IOCP from now on) is a queue-like operating system object that can be used to manage multiple I/O operations simultaneously. This is done by associating multiple file handles (not necessarily handles to files; any endpoint that supports asynchronous I/O will do) with a single IOCP and monitoring it for changes. Whenever an operation on any of the file handles completes, an I/O completion packet is queued into the IOCP.

I think the following picture does a better job at explaining the concept:

What do they have to do with asynchronous I/O in C#?

Underneath the covers, an asynchronous operation could be implemented in many different ways.

The important part about asynchrony is that it is relative to the caller.

That means that one possible alternative could be to actually spin up a new worker thread and perform a blocking call in it. Although that might not be what we want in every scenario (we are blocking a worker thread, just not the caller's thread) it is still asynchronous from a caller's perspective.
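As a minimal sketch of that first alternative (the class and method names here are hypothetical, not taken from any framework), the helper below wraps a blocking read in Task.Run. The caller gets a Task back immediately, but a thread pool worker thread sits blocked until the read finishes:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class BlockingOnWorkerThread
{
    // Hypothetical helper: asynchronous from the caller's point of view,
    // but a worker thread is blocked inside File.ReadAllBytes.
    public static Task<byte[]> ReadAllBytesOnWorkerAsync(string path)
    {
        return Task.Run(() =>
        {
            Console.WriteLine($"Blocking read on thread {Environment.CurrentManagedThreadId}");
            return File.ReadAllBytes(path); // synchronous, blocking call
        });
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        try
        {
            Console.WriteLine($"Caller is thread {Environment.CurrentManagedThreadId}");
            Task<byte[]> pending = ReadAllBytesOnWorkerAsync(path);
            // The caller is free to do other work here while the worker blocks.
            byte[] data = await pending;
            Console.WriteLine($"Read {data.Length} bytes");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```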

Another alternative that some framework components implement is using IOCP together with completion port threads. I did not know about this before, but ThreadPool threads can be either worker threads or completion port threads (this is explained in detail in this article). With this approach, the ThreadPool is in charge of monitoring IOCPs and dispatching tasks to completion port threads, which handle the completion of an operation (also shown in this article). How IOCP and the CLR interact is explained in more detail in Chapter 28 of CLR via C# 4th Edition.
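One easy way to see the two kinds of threads is to ask the ThreadPool for its limits, since each of these calls reports worker threads and completion port threads separately. This is only a sketch (the ThreadPoolKinds class is a made-up name), and the actual numbers depend on the runtime version and the operating system, so none are shown here:

```csharp
using System;
using System.Threading;

public static class ThreadPoolKinds
{
    // Returns the configured maximum for each kind of thread pool thread.
    public static (int Worker, int CompletionPort) GetMaxThreads()
    {
        ThreadPool.GetMaxThreads(out int worker, out int completionPort);
        return (worker, completionPort);
    }

    public static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minCompletionPort);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeCompletionPort);
        (int maxWorker, int maxCompletionPort) = GetMaxThreads();

        Console.WriteLine($"Worker threads:          min {minWorker}, max {maxWorker}, available {freeWorker}");
        Console.WriteLine($"Completion port threads: min {minCompletionPort}, max {maxCompletionPort}, available {freeCompletionPort}");
    }
}
```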

Some of the benefits of using IOCP are:

All I/O operations can be registered to the same completion port object (simplifying the CLR's job).
We avoid blocking any of our own threads and ThreadPool worker threads.
We get automatic thread management, which minimizes context switching and gives our main thread more CPU time.

Show me some code…

The link to the working code is in the Samples section (bottom of the post), but this is a high-level overview of how you would work with a completion port.

The first thing to do is create one and store a handle to it.
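A minimal sketch of this step, assuming Windows and a P/Invoke declaration for the Win32 CreateIoCompletionPort function. This is not the code from the repo; the CompletionPortFactory class and its method names are only for illustration. Passing INVALID_HANDLE_VALUE with no existing port asks the OS for a brand new completion port:

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class CompletionPortFactory
{
    private static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr CreateIoCompletionPort(
        IntPtr fileHandle, IntPtr existingCompletionPort,
        UIntPtr completionKey, uint numberOfConcurrentThreads);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr handle);

    // Creates a new completion port that is not yet associated with any file handle.
    public static IntPtr Create()
    {
        // 0 concurrent threads means "as many as there are processors".
        IntPtr port = CreateIoCompletionPort(InvalidHandleValue, IntPtr.Zero, UIntPtr.Zero, 0);
        if (port == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return port;
    }

    public static bool Close(IntPtr port) => CloseHandle(port);

    public static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("I/O completion ports are a Windows feature.");
            return;
        }

        IntPtr port = Create();
        Console.WriteLine($"Created completion port, handle 0x{port.ToInt64():X}");
        Close(port);
    }
}
```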

After that, whenever you create a file handle for asynchronous I/O, you associate it with the IOCP.
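The association can be sketched roughly like this, again assuming Windows and raw P/Invoke (the class, method names and the completion key value are hypothetical). The file has to be opened with FILE_FLAG_OVERLAPPED, and the same CreateIoCompletionPort function is reused, this time passing the file handle and the existing port:

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class CompletionPortAssociation
{
    private const uint GenericRead = 0x80000000;
    private const uint FileShareRead = 0x00000001;
    private const uint OpenExisting = 3;
    private const uint FileFlagOverlapped = 0x40000000;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern SafeFileHandle CreateFile(
        string fileName, uint desiredAccess, uint shareMode, IntPtr securityAttributes,
        uint creationDisposition, uint flagsAndAttributes, IntPtr templateFile);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr CreateIoCompletionPort(
        IntPtr fileHandle, IntPtr existingCompletionPort,
        UIntPtr completionKey, uint numberOfConcurrentThreads);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr handle);

    public static IntPtr CreatePort()
    {
        IntPtr port = CreateIoCompletionPort(new IntPtr(-1), IntPtr.Zero, UIntPtr.Zero, 0);
        if (port == IntPtr.Zero) throw new Win32Exception(Marshal.GetLastWin32Error());
        return port;
    }

    public static bool ClosePort(IntPtr port) => CloseHandle(port);

    // Opens an existing file for overlapped (asynchronous) reads.
    public static SafeFileHandle OpenOverlapped(string path)
    {
        SafeFileHandle file = CreateFile(path, GenericRead, FileShareRead, IntPtr.Zero,
                                         OpenExisting, FileFlagOverlapped, IntPtr.Zero);
        if (file.IsInvalid) throw new Win32Exception(Marshal.GetLastWin32Error());
        return file;
    }

    // Associates the file handle with the port; the key comes back with every completion packet.
    public static void Associate(IntPtr port, SafeFileHandle file, UIntPtr completionKey)
    {
        IntPtr result = CreateIoCompletionPort(file.DangerousGetHandle(), port, completionKey, 0);
        if (result == IntPtr.Zero) throw new Win32Exception(Marshal.GetLastWin32Error());
    }

    public static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("I/O completion ports are a Windows feature.");
            return;
        }

        string path = Path.GetTempFileName();
        IntPtr port = CreatePort();
        try
        {
            using (SafeFileHandle file = OpenOverlapped(path))
            {
                Associate(port, file, new UIntPtr(1));
                Console.WriteLine("File handle associated with the completion port.");
            }
        }
        finally
        {
            ClosePort(port);
            File.Delete(path);
        }
    }
}
```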

Finally, whenever you perform an operation you must specify a callback and get a pointer to a NativeOverlapped structure (a NativeOverlapped*):
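Here is a rough sketch of that step, assuming Windows, compilation with unsafe code enabled, and the Overlapped.Pack method from System.Threading; the OverlappedRead class and its methods are hypothetical names. To stay self-contained it does not attach the file to a completion port, so it waits for the read with GetOverlappedResult instead, and the callback that was packed is never called by anything:

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using System.Threading;
using Microsoft.Win32.SafeHandles;

public static unsafe class OverlappedRead
{
    private const int ErrorIoPending = 997;
    private const uint GenericRead = 0x80000000;
    private const uint FileShareRead = 0x00000001;
    private const uint OpenExisting = 3;
    private const uint FileFlagOverlapped = 0x40000000;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern SafeFileHandle CreateFile(
        string fileName, uint desiredAccess, uint shareMode, IntPtr securityAttributes,
        uint creationDisposition, uint flagsAndAttributes, IntPtr templateFile);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool ReadFile(
        SafeFileHandle file, IntPtr buffer, uint bytesToRead, IntPtr bytesRead, NativeOverlapped* overlapped);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetOverlappedResult(
        SafeFileHandle file, NativeOverlapped* overlapped, out uint bytesTransferred, bool wait);

    // Starts an overlapped read at offset 0 and returns the pointer that identifies the operation.
    public static NativeOverlapped* BeginRead(SafeFileHandle file, byte[] buffer, IOCompletionCallback callback)
    {
        var overlapped = new Overlapped();
        // Pack stores the callback and pins the buffer until Overlapped.Free is called.
        NativeOverlapped* native = overlapped.Pack(callback, buffer);
        IntPtr address = Marshal.UnsafeAddrOfPinnedArrayElement(buffer, 0);

        if (!ReadFile(file, address, (uint)buffer.Length, IntPtr.Zero, native))
        {
            int error = Marshal.GetLastWin32Error();
            if (error != ErrorIoPending) // ERROR_IO_PENDING just means "still in flight"
            {
                Overlapped.Free(native);
                throw new Win32Exception(error);
            }
        }
        return native;
    }

    // Safe wrapper for the demo: issues the read, waits for it, and cleans up.
    public static uint ReadAndWait(string path, byte[] buffer)
    {
        using (SafeFileHandle file = CreateFile(path, GenericRead, FileShareRead, IntPtr.Zero,
                                                OpenExisting, FileFlagOverlapped, IntPtr.Zero))
        {
            if (file.IsInvalid) throw new Win32Exception(Marshal.GetLastWin32Error());

            IOCompletionCallback callback = (errorCode, numBytes, pOverlapped) =>
                Console.WriteLine($"Callback: {numBytes} bytes, error code {errorCode}");

            NativeOverlapped* native = BeginRead(file, buffer, callback);
            try
            {
                if (!GetOverlappedResult(file, native, out uint bytesRead, true))
                    throw new Win32Exception(Marshal.GetLastWin32Error());
                return bytesRead;
            }
            finally
            {
                Overlapped.Free(native); // unpins the buffer and releases the native structure
            }
        }
    }

    public static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return;

        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        try
        {
            var buffer = new byte[16];
            uint bytesRead = ReadAndWait(path, buffer);
            Console.WriteLine($"Overlapped read finished with {bytesRead} bytes");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```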

How is the callback invoked?

You might have noticed that the previous code only provides a callback, but it is never actually invoked. This is the role of a separate component, which we are using to simulate a completion port thread:
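A rough sketch of what such a component could look like, assuming Windows (this is not the repo's implementation, and every name in it is hypothetical). To keep it self-contained, completions are simulated with PostQueuedCompletionStatus instead of real file I/O, callbacks live in a plain dictionary keyed by the overlapped pointer, and the cleanup is a simple FreeHGlobal; with real overlapped I/O the cleanup would be Overlapped.Free instead:

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Threading;

public static class CompletionPortPoller
{
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr CreateIoCompletionPort(
        IntPtr fileHandle, IntPtr existingCompletionPort, UIntPtr completionKey, uint concurrentThreads);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetQueuedCompletionStatus(
        IntPtr port, out uint bytesTransferred, out UIntPtr completionKey, out IntPtr overlapped, uint timeoutMs);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool PostQueuedCompletionStatus(
        IntPtr port, uint bytesTransferred, UIntPtr completionKey, IntPtr overlapped);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr handle);

    // Hypothetical registry: maps each pending operation to the callback to run on completion.
    private static readonly Dictionary<IntPtr, Action<uint>> Callbacks = new Dictionary<IntPtr, Action<uint>>();

    public static IntPtr CreatePort()
    {
        IntPtr port = CreateIoCompletionPort(new IntPtr(-1), IntPtr.Zero, UIntPtr.Zero, 0);
        if (port == IntPtr.Zero) throw new Win32Exception(Marshal.GetLastWin32Error());
        return port;
    }

    public static bool ClosePort(IntPtr port) => CloseHandle(port);

    // Simulates a finished I/O operation by queuing a completion packet by hand.
    public static void Post(IntPtr port, uint bytesTransferred, Action<uint> callback)
    {
        IntPtr token = Marshal.AllocHGlobal(Marshal.SizeOf<NativeOverlapped>());
        lock (Callbacks) Callbacks[token] = callback;

        if (!PostQueuedCompletionStatus(port, bytesTransferred, UIntPtr.Zero, token))
        {
            int error = Marshal.GetLastWin32Error();
            lock (Callbacks) Callbacks.Remove(token);
            Marshal.FreeHGlobal(token);
            throw new Win32Exception(error);
        }
    }

    // Dequeues packets until the port stays empty for timeoutMs; returns how many were handled.
    public static int Drain(IntPtr port, uint timeoutMs)
    {
        int handled = 0;
        while (true)
        {
            // A false return with a non-null overlapped pointer would mean a failed I/O;
            // this sketch does not report the error code to the callback.
            GetQueuedCompletionStatus(port, out uint bytes, out _, out IntPtr overlapped, timeoutMs);
            if (overlapped == IntPtr.Zero) break; // timed out: nothing left in the queue

            Action<uint> callback;
            lock (Callbacks)
            {
                Callbacks.TryGetValue(overlapped, out callback);
                Callbacks.Remove(overlapped);
            }

            try { callback?.Invoke(bytes); }
            finally { Marshal.FreeHGlobal(overlapped); } // cleanup for this operation
            handled++;
        }
        return handled;
    }

    public static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return;

        IntPtr port = CreatePort();
        Post(port, 42, bytes => Console.WriteLine($"Completed with {bytes} bytes"));

        // A dedicated thread plays the role of a completion port thread.
        var poller = new Thread(() => Drain(port, 500));
        poller.Start();
        poller.Join();
        ClosePort(port);
    }
}
```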

That component is in charge of checking the completion port for queued elements and invoking the related asynchronous callback (and also some cleanup).

Samples

Together with Mariano Converti we held a talk some time ago introducing this subject, and created a GitHub repo for the code. Under the source folder you will find three samples:

IOCompletionPorts: Shows how to create a completion port from C# code by taking advantage of DllImport. It gives a high-level idea of how things could be implemented underneath the covers (of course the code is nowhere near reusable), and provides a full working sample for the code provided above.
ThreadPoolsSample: Shows that asynchronous callbacks are invoked in thread pool threads.
CompletionPortThreadsSample: Shows that some asynchronous callbacks are invoked in completion port threads, while others are invoked in worker threads.
