Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

A while I ago I blogged about what had changed under the hood of async between the CTP and the VS11 Preview. Well, now that the VS11 Beta is out, it's time to do it all again...

Note that the code in this post is in the Eduasync codebase, under a different solution (Eduasync VS11.sln). Many of the old existing projects won't compile with VS11 beta, but I'd rather leave them as they are for posterity, showing the evolution of the feature.

Stephen Toub has an excellent blog post covering some of this, so while I'll mention things he's covered, I won't go into much detail about them. Let's start off there though...

(EDIT: Stephen has also mailed me with some corrections, which I've edited in - mostly without indication, as the post has been up for less than seven hours, and it'll make for a better reading experience.)

Awaiter pattern changes

The awaiter pattern is now not just a pattern. The IsCompleted property and GetResult method are still "loose" but OnCompleted is now part of an interface: INotifyCompletion. Awaiters have to implement INotifyCompleted, but may also implement ICriticalNotifyCompletion and its UnsafeOnCompleted method.

The OnCompleted method is just as it was before, and needs to flow the execuction context; the UnsafeOnCompleted method is simpler, as it doesn't need to flow the execution context. All of this only matters if you're implementing your own awaiters, of course. (More details in Stephen's blog post. I've found this area somewhat confusing, so please do read his post carefully!)

Skeleton method changes

Just as I have previously, I'm using the (entirely unofficial) term "skeleton method" to mean the very short method created by the compiler with the same signature as an async method: this is the entry point to the async method, effectively, which creates and starts the state machine containing all the real logic.

There are two changes I've noticed in the skeleton method. Firstly, for some reason the state machine state numbers have changed. Whereas previously a state number of 0 meant "initial or running", positive values meant "between calls, or navigating back to the await expression" and -1 meant "finished", now -1 means "initial or running", non-negative means "between calls, or navigating back to await expression" and -2 means "finished". It's not clear why this change has been made, given that it requires an extra assignment at the start of every skeleton method (to set the state to -1).

More importantly, the skeleton method no longer calls MoveNext directly on the state machine that it's built. Instead, it calls Start<TStateMachine> on the AsyncTaskMethodBuilder<T> (or whichever method builder it's using). It passes the state machine by reference (presumably for efficiency), and TStateMachine is constrained to implement the now-public-and-in-mscorlib IAsyncStateMachine interface. I'll come back to the relationship between the state machine and the builder later on.

Task caching

(Code is in project 30: TaskCaching)

It's possible for an async method to complete entirely synchronously. In this situation, the result is known before the method returns, and the task returned by the method is already in the RanToCompletionState. If two tasks have already run to completion with the same value, they can be (apparently) regarded as equivalent... so the beta now caches a task in this situation, for some types and values. (Apparently the preview cached too, but I hadn't noticed, and the beta caches more.) According to my experiments and some comments:

  • For int, tasks with values -1 to 8 inclusive are cached
  • For bool, both values (true and false)
  • For char, byte, sbyte, short, ushort, uint, long, ulong, IntPtr and UIntPtr tasks with value 0 (or '\0') are cached
  • For reference types, null is cached
  • For other types, no tasks are cached

EDIT: After they've completed, tasks are normally immutable except for disposal - the cached tasks are tweaked slightly to make disposal a no-op.

State machine interface changes

In the VS11 preview release, each state machine implemented an interface, but that interface was internal to the generated assembly, and contained a single method (SetMoveNextDelegate). It's now a public interface with two methods:

Personally I'm not keen on the naming of "MoveNext" - I can't help but feel that if we didn't have the "naming baggage" of IEnumerator and the fact that at least early on, the code generator was very similar to that used for iterator blocks, we'd have something different. (It is moving to the next state of the state machine, but it still doesn't quite feel right.) I'd favour something like "ContinueExecution". However, it doesn't matter - it obviously does what you'd expect, and you're not going to be calling this yourself.

SetStateMachine is a stranger beast. The documentation states:

Configures the state machine with a heap-allocated replica.

... which says almost nothing, really. The implementation is always simple, just like SetMoveNextDelegate was, although this time it delegates to the builder for the real work (a common theme, as we'll see):

void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
{
    this.<>t__builder.SetStateMachine(param0);
}

Now AsyncTaskMethodBuilder.SetStateMachine is also documented pretty sparsely:

Associates the builder with the specified state machine.

Again, no real help. However, we'll see that it's the builder which is responsible for calling OnContinue now, and as it can call MoveNext on an IStateMachine, it makes sense to tell it which state machine it's associated with... but can't it do that directly?

Well, not quite. The problem (as I understand it) is around when boxing occurs. We initially create the state machine on the stack, and it contains the builder. (Both are structs.) That's fine until we need a continuation, but then we've got to be able to get back to the current state later, after the current stack frame has been blown away. So we need to box the state machine. That will create a copy of the current builder (within the box). We need the builder within the boxed state machine to contain a reference to the same box. So the order has to be:

  • Box the state machine
  • Tell the state machine about the boxed reference
  • The state machine tells its nested builder about the boxed reference

Back when the state machine was in charge of the boxing, this went via the delegate: the act of creating the box was implicit when creating the delegate, and then casting the delegate target to the interface type allowed a reference to the newly-created delegate to be set within the copy. This is similar, but using the builder instead. It's hard to follow, but of course it's not going to matter.

State machine field changes

There are various kinds of fields in the state machine:

  • Those corresponding with local variables and parameters in the async method
  • The state
  • The field(s) associated with awaiters
  • (In the preview/beta) The field associated with the "current execution stack" at the point of an await expression
  • (In the CTP) An unused "disposing" field

Of these, I believe only the awaiters have actually changed, but before we talk about that, let's revisit local variables.

Local variable hoisting

I've just noticed that the local variables are only hoisted to fields when its scope contains an await expressions, but in that case all local variables of that scope are hoisted, whether or not they're used "across" awaits. It would be possible to hoist only those which need to be maintained between executions, but then you wouldn't be able to see the others when debugging, which would be somewhat confusing. Likewise local variables of the same type which are never propagated across the same await could be aliased. For example, consider this async method:

static async Task<int> M(Random rng)
{
    int x = rng.Next(1000);
    int y = x + rng.Next(1000);
        
    await Task.Yield();
        
    int z = y + rng.Next(1000);
        
    await Task.Yield();
    return z;
}

If the compiler could be confident you didn't need to debug through this code, it could make do with one field of type "Random" and one field of type "int" - x can be a completely local variable in MoveNext (it's not used between two awaits) and y and z can be aliased (we never need the value of y after we've first written to z).

Local variable aliasing probably isn't particularly useful for "normal" methods as the JIT may be able to do it (so long as you don't have a debugger attached, potentially) but in this case we expect the state machine to be boxed at some point, so potentially does make a difference (while the stack is typically reasonably small, you could have a lot of outstanding async methods in some scenarios). Maybe in a future release, the C# compiler could have an aggressive optimization mode around this, to be turned on explicitly. (I don't think it should be  a particularly high priority, mind you.)

Awaiter fields

(Code is in project 31, AwaiterFields.)

Awaiter fields have changed a bit over the course of async's history.

In the CTPs (all of them, I believe) each await expression had its own awaiter field in the state machine, with a type corresponding to the declared awaiter type from the awaitable. (Remember that the awaitable is the thing you await, such as a task, and the awaiter is what you get back from calling GetAwaiter on the awaitable).

In the VS11 Preview, there was always a single awaiter field of type object. From what I saw, it was usually populated with a single-element array containing an awaiter. For value type awaiters (i.e. where the awaiter is a struct) this is somewhat similar to boxing, but maintaining strong typing, so calls to IsCompleted etc can still be made. It's possible that reference type awaiters were stored without the array creation, as it would serve no purpose. (I don't have any machines with just the preview installed to verify this.)

In the Beta, we have a mixture. If there are any reference type awaiters, they all end up being stored in a single field of type object, which is then cast back to the actual type when required. (Don't forget that only one awaiter can be "active" at a time, which makes this possible.) This includes awaiters of an interface type - it's only the compile-time type declared as the return type of the GetAwaiter method of the awaitable which is important.

If any of the awaiter types are value types, each of these types gets its own field. So there might be a TaskAwaiter<int> field and a TaskAwaiter<string> field, for example. However, there can still be "sharing" going on: if there are multiple await expressions all of the same value type awaiter, they will all share a single field. (This all feels a little like the JITting of generics, but it's somewhat coincidental.)

MoveNext method changes

(Code is in project 32, BetaStateMachine)

As I've mentioned earlier, the builder is now responsible for a lot more of the work than it was in earlier versions. The majority of the code remains the same as far as I can tell, in terms of handling branching, evaluating expressions with multiple await expressions and so on.  The code in the source repository shows what the complete state machine looks like, but for the sake of clarity, I'll just focus on a single snippet. If we have an await expression like this:

await x;

then the state machine code in the VS11 Preview would look something like this:

localTaskAwaiter = x.GetAwaiter();
if (localTaskAwaiter.IsCompleted)
{
    goto AwaitCompleted;
}
this.state = 1;
TaskAwaiter[] awaiterArray = { localTaskAwaiter };
this.awaiter = awaiterArray;
Action continuation = this.MoveNextDelegate;
if (continuation == null)
{
    Task<int> task = this.builder.Task;
    continuation = MoveNext;
    ((IStateMachine) continuation.Target).SetMoveNextDelegate(continuation);
}
awaiterArray[0].OnCompleted(continuation);
...
return;

(That's just setting up the await, of course - there's then the bit where the result is fetched, but that's less interesting. There's also the matter of doFinallyBodies.)

In the VS11 Beta, it's something this instead (for an awaiter type "AwaiterType" which implements ICriticalNotifyCompletion, in a state machine of type ThisStateMachine).

localAwaiter = x.GetAwaiter();
if (!localAwaiter.IsCompleted)
{
    state = 0;
    awaiterField = localAwaiter;
    builder.AwaitUnsafeOnCompleted<AwaiterType, ThisStateMachine>(ref localAwaiter, ref this);
    doFinallyBodies = false;
    return;
}

If the awaiter type only implements INotifyCompletion, it calls AwaitOnCompleted instead. Note how the calls are generic (but both type variables are constrained to implement appropriate interfaces) which avoids boxing.

The call to the builder will call back to the state machine's SetStateMachine method if this is the first awaiter that hasn't already completed within this execution of the async method. So that handles the section which checked for the continuation being null in the first block of code. Most of the rest of the change is explained by the difference in awaiter types, and obviously AwaitOnCompleted/AwaitUnsafeOnCompleted also ends up calling into OnCompleted on the awaiter itself.

Mutable value type awaiters

(Code is in project 32, MutableAwaiters)

One subtle difference which really shouldn't hurt people but is fun to explore is what happens if you have an awaiter which is a mutable value type. Due to the way awaiters were carefully handled pre-beta, mutations which were conducted as part of OnCompleted would still be visible in GetResult. That's not the case in the beta (as Stephen mentions in his blog post). Mind you, it doesn't mean that all mutations will be ignored... just ones in OnCompleted. A mutation from IsCompleted is still visible, as shown here:

public struct MutableAwaiter : INotifyCompletion
{
     private string message;

     public MutableAwaiter(string message)
     {
         this.message = message;
     }

     public bool IsCompleted
     {
         get
         { 
             message = "Set in IsCompleted";
             return false;
         }
     }

     public void OnCompleted(Action action)
     {
         message = "Set in OnCompleted";
         // Ick! Completes inline. Never mind, it's only a demo...
         action();
     }

     public string GetResult()
     {
         return message;
     }
}

What would you expect to be returned from this awaiter? You can verify that all three members are called... but "Set in IsCompleted" is returned. That's because IsCompleted is called before the awaiter value is copied into a field within the state machine. Even though the state machine passes the awaiter by reference, it's passing the local variable, which is of course a separate variable from the field.

I'm absolutely not suggesting that you should rely on any of this behaviour. If you really need to be able to mutate your awaiter, make it a reference type.

Conclusion

The main changes in the Beta are around the interactions between the AsyncTaskMethodBuilder (et al) and the state machine, including the new interfaces for awaiters. There's been quite a bit of optimization, although I still see room for a bit more:

  • When there's only a single kind of reference type awaiter, the field for storing it could be of that type rather than of type object, removing the need for an execution-time cast
  • The "stack" variable could be removed in some cases, and made into a specific type in many others
  • With appropriate optimization flags, local variables which aren't used await expressions could stay local to the state machine instead of being hoisted, and hoisted variables could be aliased in some cases.

One thing which concerns me slightly is how the C# language specification is going to change - the addition of the new interfaces is definitely going to mean more complexity from this previously "tidy" feature. I'm sure it's worth it for the sake of efficiency and the like, but part of me sighs at every added tweak.

So, is this now close to the finished version of async? Only time will tell. I haven't checked whether dynamic awaitables have finally been introduced... if they have, I'll put that in the next post.

Published Tue, Mar 6 2012 19:54 by skeet
Filed under: , , ,

Comments

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

Tasks aren't immutable; they can be disposed.

A quick look with ILSpy shows that the cached tasks are created with an undocumented flag InternalTaskOptions.DoNotDispose = (TaskCreationOptions)0x4000 that causes the Dispose() method to do nothing.

Also, your list of cached tasks is missing

- true and false

- 0 for short,ushort,decimal,IntPtr,UIntPtr

- null for all reference types and all Nullable<T>

Thursday, March 08, 2012 7:45 PM by Daniel Grunwald

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

@Daniel: Thanks, will edit those in, along with some other comments Stephen has mailed me.

Friday, March 09, 2012 12:19 AM by skeet

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

RE: "MoveNext" naming - maybe "Advance" would be a better name

Friday, March 09, 2012 5:35 AM by Joan Miro

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

`INotifyCompletion` sounds problematic. Pre-Beta one could just add a reference to an external library with the right types, set the target type to .net 4 instead of .net 4.5, and then use async-await on .net 4. Just like one could use linq on .net 2.0 .

`INotifyCompletion` might prevent that. Which would be very annoying for me, since it means that I can't use async-await on windows XP anymore, so I'll probably have to rewrite my code to no longer use async-await :(

Friday, March 16, 2012 1:15 PM by CodeInChaos

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

I've been thinking. Can the state machine be serialized so that my async method can continue the next time i launch the process?

That could be sweet for really long-running stuff:

if (await Bureaucracy.ReviewBoard.Accepts(myProposal))

{

 Yay();

}

Monday, April 02, 2012 11:45 AM by Markus Johnsson

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

@Markus: No, not that I'm aware of. It's an interesting idea though...

Tuesday, April 03, 2012 3:20 AM by skeet

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

While working on async/await decompilation support for ILSpy, I noticed that the Stream.CopyToAsync implementation in the .NET 4.5 Beta mscorlib still has those awaiter arrays, not multiple awaiter fields.

Do you know if Microsoft used some earlier version of the compiler for building the .NET 4.5 Beta BCL?

I can't get the .NET 4.5 Beta C# compiler to emit such code, it always uses multiple awaiter fields as you described.

Sunday, April 15, 2012 8:59 AM by Daniel Grunwald

# re: Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

@Daniel: No idea, I'm afraid. Maybe they've used a *later* version - one we haven't seen yet :)

Sunday, April 15, 2012 9:43 AM by skeet