Tasks aren't threads

I'm writing this post just so I have something I can link to whenever people are starting out with .NET 4.5's async and await asynchronous programming. Much of what is described in this post has been described before by far more interesting people such as Stephen Cleary on his blog.

This post serves as an introduction - nothing more. I'll attempt to describe the difference between traditional multi-threaded programming using Thread, and "modern" threading using Task in the TAP era.

There is far more to it than I could ever describe in this post, and I strongly encourage you to read the documentation and other available resources to get a good understanding of how asynchronous programming using Tasks works, and how you could benefit from using it.

All code in this post is written in C# and targets the .NET Framework 4.6.1. "Traditional" refers to the pre-4.5 threading model, and "modern" refers to the current Task-based Asynchronous Pattern (TAP).

Traditional threading

Before the introduction of the async and await keywords, whenever we wanted to run some long-running job without blocking the calling thread, we'd create a new thread, have it run some code, and use some sort of mechanism to indicate that the thread had finished doing its work.

Leaving out the code that actually runs, starting such a job would look something along the lines of:

var t = new Thread(() =>
{
    // Do some long running work..
});

t.Start();

Let's go over this code for a second. We create a new Thread, and pass it a delegate that it should run. This delegate runs for an unspecified amount of time, after which the thread exits. When calling t.Start(), we actually create a new, "physical" thread. A thread that never exits, because, for instance, it runs in a while (true) loop, prevents the application from exiting (unless its IsBackground property is set to true).

Concepts such as cancellation, completion and continuation are all absent here, and, if required, have to be implemented by the programmer working on the code. This led to thousands of (often very poorly implemented) home-brewed solutions to notify someone who was waiting for this thread's work that it had finished, was cancelled, or managed to throw an exception.

Completion was usually implemented using callbacks, which required marshalling data back and forth between threads by hand, not to mention the ton of boilerplate code associated with every asynchronous operation. I've seen cancellation being implemented through a simple call to Thread.Abort(), which led to ThreadAbortExceptions and all kinds of unspecified behaviour.
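To give an idea of the boilerplate involved, here is a minimal sketch of such a home-brewed solution. The Worker class, its callbacks and the flag-based cancellation are all made up for illustration; real-world variants differed wildly, which was exactly the problem.

```csharp
using System;
using System.Threading;

// Hypothetical helper in the pre-Task style; not a framework type.
public sealed class Worker
{
    private volatile bool _cancelRequested; // hand-rolled cancellation flag
    private readonly Thread _thread;

    public Worker(Action<long> onCompleted, Action<Exception> onFailed, Action onCancelled)
    {
        _thread = new Thread(() =>
        {
            try
            {
                long sum = 0;
                for (int i = 1; i <= 1000; i++)
                {
                    // The work has to poll the flag itself.
                    if (_cancelRequested) { onCancelled(); return; }
                    sum += i;
                }
                // Note: this callback runs on the worker thread, not the caller's.
                onCompleted(sum);
            }
            catch (Exception ex)
            {
                onFailed(ex);
            }
        });
        _thread.IsBackground = true;
    }

    public void Start() { _thread.Start(); }
    public void Cancel() { _cancelRequested = true; }
    public void Join() { _thread.Join(); }
}

public static class ThreadCallbackDemo
{
    public static void Main()
    {
        var worker = new Worker(
            result => Console.WriteLine("Done: " + result),
            error => Console.WriteLine("Failed: " + error.Message),
            () => Console.WriteLine("Cancelled"));

        worker.Start();
        worker.Join(); // block until the worker thread has exited
    }
}
```

Every one of these pieces (the flag, the three callbacks, the try/catch) had to be written again, or copied around, for each kind of background job.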

This is how code was written for years. There are countless systems running today that implement threading in this fashion.

Task-based threading

With .NET 4.5 came async and await, built on top of the Task type that the Task Parallel Library had already introduced in .NET 4.0. As far as the CLR is concerned, a Task is just like any other type; it doesn't get any special treatment. Yet, it does play a central role in the new threading model. So how does that work exactly?

First off, there's a conceptual difference. Modern async code implements the futures and promises concept, which basically boils down to "this code will be executed sometime in the future". We don't know exactly when, but then again, we don't really care either. That is a huge difference from how offloading work to another thread was done in the past. Where before we knew exactly when a thread would start doing work, now we just know that it will. We'll get into futures and promises and how they are implemented in .NET later.

Let's take a look at some code involving a Task:

public async Task DoSomething()
{
    // Do some precondition checks and stuff.
    await DoSomethingInternal();
}

private Task DoSomethingInternal()
{
    // Task.Run queues the delegate to the thread pool.
    return Task.Run(() =>
    {
        // Do some long running CPU intensive stuff.
    });
}

We have two methods, both returning a Task. DoSomething() is marked with async, which tells the compiler that the await keyword should be allowed within it; it awaits DoSomethingInternal() after checking some preconditions. DoSomethingInternal() doesn't need async, because it simply returns the Task produced by Task.Run.

The above code doesn't necessarily create a new thread. An async method runs synchronously on the calling thread until it reaches the first await of an operation that hasn't completed yet, so marking a method async doesn't move any work elsewhere by itself. It's the call to Task.Run that schedules the work using a TaskScheduler, which by default queues it to the ThreadPool, where it runs whenever a thread is available to do said work. The runtime may decide to fire up more threads if circumstances require it to (e.g. having to process a lot of work at once).
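If you want to see for yourself which thread runs what, something like the sketch below can help. The class and method names are made up for this post. It assumes a plain console application (no SynchronizationContext) and checks two things: the part of an async method before its first incomplete await runs on the caller's thread, and a delegate passed to Task.Run runs on a thread pool thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WhereDoesItRunDemo
{
    // Everything before the first await executes synchronously on the caller's thread.
    public static async Task<bool> StartsOnCallerThread(int callerThreadId)
    {
        bool sameThread = Thread.CurrentThread.ManagedThreadId == callerThreadId;
        await Task.Yield(); // from here on, the method continues elsewhere
        return sameThread;
    }

    // Task.Run hands the delegate to the default (thread pool) scheduler.
    public static Task<bool> RunsOnThreadPool()
    {
        return Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);
    }

    public static void Main()
    {
        int callerId = Thread.CurrentThread.ManagedThreadId;

        bool startedOnCaller = StartsOnCallerThread(callerId).GetAwaiter().GetResult();
        bool ranOnPool = RunsOnThreadPool().GetAwaiter().GetResult();

        Console.WriteLine("Async method started on the caller's thread: " + startedOnCaller);
        Console.WriteLine("Task.Run delegate ran on a pool thread: " + ranOnPool);
    }
}
```

Blocking with GetAwaiter().GetResult() is only done here to keep Main synchronous in a console demo; in UI or ASP.NET code you would await instead.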

So what exactly is the Task in this context? Simply put, it's a promise. It allows you to track the state of the operation, and, in combination with the await keyword, track its completion.

A Task is a promise for a better future

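If Task represents a promise, TaskCompletionSource<T> implements a future (be aware that the naming varies: C++, for example, uses the two words the other way around, with the promise being the side that sets the value):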

private Task<long> DetermineNumber()  
{
    var tcs = new TaskCompletionSource<long>();

    // We perform some IO to determine the number, and register
    // for the result of that operation. When it returns, we set the 
    // value on the TaskCompletionSource, and by doing so, we'll
    // trigger completion for anyone awaiting the Task we return here.

    return tcs.Task;
} 

In the future, we'll have a long. We may not have a long now, but at least we have a promise (Task) that at some point we'll have one, and we'll be notified when that moment arrives.

A TaskCompletionSource<T> implements the producer side of a Task<T>. It has a Task property that can be handed out to (and awaited by!) anyone interested in its result, and that result can only be set through the TaskCompletionSource<T> itself, using its SetResult, SetCanceled and SetException methods or their TrySet... counterparts (which, by the way, are thread safe). The Set... versions throw an InvalidOperationException if the Task has already completed; the TrySet... versions return false instead.
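To make that a bit more concrete, here's a sketch of what a filled-in DetermineNumber() could look like. BeginFetchNumber is a made-up, callback-based stand-in for "some IO", and the value 42 is arbitrary; the only real APIs involved are TaskCompletionSource<T> and ThreadPool.QueueUserWorkItem.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CompletionSourceDemo
{
    // Hypothetical callback-based API that pretends to do some IO.
    private static void BeginFetchNumber(Action<long> onSuccess, Action<Exception> onError)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                Thread.Sleep(50); // simulate latency
                onSuccess(42L);
            }
            catch (Exception ex)
            {
                onError(ex);
            }
        });
    }

    public static Task<long> DetermineNumber()
    {
        var tcs = new TaskCompletionSource<long>();

        // Producer side: whichever callback fires completes the Task.
        BeginFetchNumber(
            number => tcs.TrySetResult(number),
            error => tcs.TrySetException(error));

        // Consumer side: hand out the Task right away, before the result exists.
        return tcs.Task;
    }

    public static void Main()
    {
        long number = DetermineNumber().GetAwaiter().GetResult();
        Console.WriteLine(number); // prints 42
    }
}
```

The caller never sees the TaskCompletionSource<long>; it only gets the Task<long>, so it can observe the result but not set it.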

Anyone with access to the TaskCompletionSource<T> can set its result. This means that, for example, in a client-server messaging scenario, you can register your interest in a specific message and obtain a Task that represents the promise that it will arrive. You can then await that Task, and it completes when the message is received.
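A rough sketch of that messaging scenario could look like the following. PendingMessages, its method names, the integer message IDs and the string payloads are all assumptions made for the sake of the example; a real client would call OnMessageReceived from its receive loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical registry of messages someone is waiting for.
public sealed class PendingMessages
{
    private readonly ConcurrentDictionary<int, TaskCompletionSource<string>> _pending =
        new ConcurrentDictionary<int, TaskCompletionSource<string>>();

    // Consumer side: register interest and get a Task to await.
    public Task<string> WaitForMessage(int messageId)
    {
        var tcs = _pending.GetOrAdd(messageId, _ => new TaskCompletionSource<string>());
        return tcs.Task;
    }

    // Producer side: completes the Task for whoever registered for this ID.
    // Returns false if nobody was waiting.
    public bool OnMessageReceived(int messageId, string payload)
    {
        TaskCompletionSource<string> tcs;
        return _pending.TryRemove(messageId, out tcs) && tcs.TrySetResult(payload);
    }
}

public static class PendingMessagesDemo
{
    public static void Main()
    {
        var pending = new PendingMessages();

        Task<string> reply = pending.WaitForMessage(7);
        Console.WriteLine(reply.IsCompleted); // False: nothing has arrived yet

        pending.OnMessageReceived(7, "pong");
        Console.WriteLine(reply.Result);      // pong
    }
}
```

Note that no thread is parked while waiting for the message. The Task just sits there until someone completes it.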

I realize I'm doing a poor job explaining this one, but I hope you're getting an idea of what I'm getting at. Stephen Cleary does a better job explaining it in his blog post here.

Conclusion

While the above rundown is merely a brief overview, and there's much more to it, we can conclude that the introduction of async and await and the supporting runtime and compiler additions provide a completely new approach to executing code on multiple threads.

Instead of worrying about what thread runs which piece of code, what should happen when that thread finishes work or throws an exception, and various implementation details, we can now focus on the actual work itself, and have the runtime quite reliably handle all of the overhead for us.

Details such as when a piece of code is executed are far less important to us than the fact that it will be executed. We gain built-in support for cancellation, continuations and notifications of completion at the cost of fine-grained control over the thread that executes the code. I'd say that's a worthwhile trade-off.
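As a small illustration of that built-in support, the sketch below combines cancellation (CancellationToken), a continuation (the code after await) and exception handling (a regular try/catch). SumAsync and DescribeAsync are made-up names, and the summing loop is just a placeholder for real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    public static Task<long> SumAsync(int upTo, CancellationToken token)
    {
        return Task.Run(() =>
        {
            long sum = 0;
            for (int i = 1; i <= upTo; i++)
            {
                // Cooperative cancellation, provided by the framework.
                token.ThrowIfCancellationRequested();
                sum += i;
            }
            return sum;
        }, token);
    }

    public static async Task<string> DescribeAsync(int upTo, CancellationToken token)
    {
        try
        {
            long sum = await SumAsync(upTo, token);
            // Everything after the await is the continuation.
            return "Completed: " + sum;
        }
        catch (OperationCanceledException)
        {
            // Cancellation surfaces as an ordinary exception at the await.
            return "Cancelled";
        }
    }

    public static void Main()
    {
        Console.WriteLine(DescribeAsync(1000, CancellationToken.None).GetAwaiter().GetResult());

        using (var cts = new CancellationTokenSource())
        {
            cts.Cancel(); // cancel before the work even starts
            Console.WriteLine(DescribeAsync(1000, cts.Token).GetAwaiter().GetResult());
        }
    }
}
```

With the token left alone this prints "Completed: 500500"; with the token cancelled up front it prints "Cancelled". Compare that to the flags and callbacks we had to write by hand in the Thread days.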

And finally, as the title of this post implies: Tasks are not threads. I hope the various components in "modern" .NET asynchronous programming and how they work together are a bit more clear now.

Thank you for reading!