Closures… Implicit or Explicit ??
Posted Wed, Apr 5 2006 17:27 by bill
Personally, I prefer to think of "closures" as being "state machines". That is, their primary purpose is to maintain state. Let's take an example using some theoretical syntax:
Sub Test()
    Dim count As Int32 = 0
    Dim del As New SimpleDelegate(){count += 1 : Console.WriteLine(count)}
    ' do other code here that might or might not change count
    ' invoke del
    del()
End Sub
In the above code I've shown a theoretical anonymous delegate syntax using the count variable from the enclosing method. Because the value of count that the delegate sees has to be that of the local variable at the moment the delegate is invoked, the compiler has to change every reference to the local into a reference to a field. So what happens is the compiler would create a class for you along the lines of:
Private Class _ClassForMethodTest 'or some weird compiler generated name
    Public count As Int32
    Public Sub AnAnonDelegate()
        Me.count += 1
        Console.WriteLine(Me.count)
    End Sub
End Class
Now the method Test would be compiled as if the code was:
Sub Test()
    Dim c1 As New _ClassForMethodTest
    c1.count = 0
    Dim del As New SimpleDelegate(AddressOf c1.AnAnonDelegate)
    ' do other code here that might or might not change count
    ' invoke del
    del()
End Sub
So we now have the state of the local variable persisting as long as the delegate is alive. Note: a delegate keeps a reference to its target, the object whose method it points to, hence keeping that object itself alive.
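To make that lifetime point concrete, here is a minimal sketch that hands the delegate back out of the method that created it. The closure class is written by hand, since the anonymous delegate syntax above is only theoretical. The names ClosureLifetimeSketch, CounterClosure, Increment and MakeCounter are made up for this illustration and are not what a compiler would actually generate.

```vbnet
Imports System

Module ClosureLifetimeSketch
    Public Delegate Sub SimpleDelegate()

    ' Hand-written stand-in for the class the compiler would generate.
    Private Class CounterClosure
        Public count As Int32

        Public Sub Increment()
            Me.count += 1
            Console.WriteLine(Me.count)
        End Sub
    End Class

    ' Returns the delegate, so it outlives the method that created it.
    Function MakeCounter() As SimpleDelegate
        Dim c1 As New CounterClosure
        c1.count = 0
        Return New SimpleDelegate(AddressOf c1.Increment)
    End Function

    Sub Main()
        Dim del As SimpleDelegate = MakeCounter()
        del()   ' prints 1
        del()   ' prints 2: the count field is still alive after MakeCounter returned
    End Sub
End Module
```

MakeCounter has long since returned by the time del is invoked, yet count keeps its value between calls because the delegate holds the only reference to the closure object.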
To understand why this is important we need to consider some of the different scenarios in which we might use this kind of code, a basic premise being that it will be with LINQ and DLINQ statements, anonymous methods and lambda expressions.
- With LINQ, DLINQ, the combination of projections
- With LINQ, DLINQ, delayed evaluation
- With a wide range of local variables including reference types, finalizable objects and disposable objects
Take the last scenario for example, and substitute "count" with an object that is disposable. You wouldn't want the object being disposed of before the delegate is invoked, as that would just end up throwing an exception of some sort (typically an ObjectDisposedException). So there we do want the local variable to be kept alive.
We definitely wouldn't want a copy to be made, as that might end up using excessive resources and actually break referential integrity.
Now imagine the scenario where we pass the anonymous delegate out to another object, and the local variable is indeed a reference type. Copying would be out of the question, and yet we need the object's lifetime to be that of the delegate. Hence the "closure" provides a means of safely maintaining state. The local variable reference needs to be stored in the closure for its lifetime to be maintained as long as the anonymous delegate's lifetime.
This of course may not always be what you'd want. For example:
Dim person as New Person("Fred")
Dim del as New SimpleDelegate(){Console.WriteLine(person.Name)}
person = New Person("Wilma")
del()
The above would print out "Wilma" even though you might want it to print out "Fred". This is basically where a lot of the comments on Paul's blog entry are currently centered.
Okay, so let's assume you want that to print out Fred. How could that possibly happen? One way might be to force evaluation of the code inside the delegate there and then, but that's not always going to work, as the delegate could have parameters passed to it and you might need those parameters to retrieve the property. So we can't evaluate it there and then. What we end up having to do is in fact a closure, a class with a field that points to the person object. But at this point we could take two different paths. One would be to have the person variable actually be c1.person, the other would be to keep it as a separate local. In either case we'd still have the same basic closure:
Private Class _ClassForMethodTest 'or some weird compiler generated name
    Public person As Person
    Public Sub AnAnonDelegate()
        Console.WriteLine(Me.person.Name)
    End Sub
End Class
The question is whether the calling code should be:
Dim c1 As New _ClassForMethodTest
c1.person = New Person("Fred")
Dim del as New SimpleDelegate(AddressOf c1.AnAnonDelegate)
c1.person = New Person("Wilma")
del()
hence printing out "Wilma", or :
Dim c1 As New _ClassForMethodTest
Dim person As New Person("Fred")
c1.person = person
Dim del as New SimpleDelegate(AddressOf c1.AnAnonDelegate)
person = New Person("Wilma")
del()
hence printing out "Fred".
So the last sample, the one that prints out "Fred", decouples the state. In the example where state is maintained you could provide the same decoupling by introducing a new local variable:
Dim person as New Person("Fred")
Dim temp as Person = person
Dim del as New SimpleDelegate(){Console.WriteLine(temp.Name)}
person = New Person("Wilma")
del()
That code would achieve the same, also printing out "Fred", even when compiled using a state-maintaining closure.
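For anyone who wants to run the two strategies side by side, here is a sketch with the closure written by hand rather than compiler generated. The Person class is a minimal stand-in I've assumed for the one used in the snippets, and CaptureSketch, PersonClosure and PrintName are illustrative names only.

```vbnet
Imports System

Module CaptureSketch
    Public Delegate Sub SimpleDelegate()

    ' Minimal stand-in for the Person type used in the snippets above.
    Public Class Person
        Public Name As String

        Public Sub New(ByVal personName As String)
            Me.Name = personName
        End Sub
    End Class

    ' Hand-written stand-in for the compiler-generated closure class.
    Private Class PersonClosure
        Public person As Person

        Public Sub PrintName()
            Console.WriteLine(Me.person.Name)
        End Sub
    End Class

    Sub Main()
        ' Stateful: the "local" is the closure field, so reassignment is seen.
        Dim c1 As New PersonClosure
        c1.person = New Person("Fred")
        Dim stateful As New SimpleDelegate(AddressOf c1.PrintName)
        c1.person = New Person("Wilma")
        stateful()    ' prints Wilma

        ' Decoupled: the closure gets a copy of the reference when it is created.
        Dim c2 As New PersonClosure
        Dim person As New Person("Fred")
        c2.person = person
        Dim decoupled As New SimpleDelegate(AddressOf c2.PrintName)
        person = New Person("Wilma")
        decoupled()   ' prints Fred
    End Sub
End Module
```

Note that in both variants only a reference is copied, never the Person object itself, so the decoupled form still sees any changes made to Fred's own properties.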
But what if you wanted it to print out Wilma? Well, in that case only the stateful approach would work. You could of course write your own closure to achieve the same, but then you'd lose the anonymous delegate encapsulation. With LINQ code it'd get incredibly frustrating and complex to write a stateful query.
So if we didn't have closures maintaining state, we'd need a simpler syntax to indicate that we wanted state maintained. The inverse, however, is not necessarily true, as we can use a temporary local variable to decouple state. So if we were to say no new special keywords, then I think the argument for closures being stateful by default pretty much wins out.
If however we say we'll use a keyword, then we have two options: one to indicate a closure is stateful, or one to indicate it is not. To indicate stateful, I think something like "Ref" or "ByRef" could be used. To indicate decoupling, perhaps something like "Eval", "ValueOf" or "ByVal" could be used.
Although I don't like them as much as the other alternatives I've listed, "ByRef" and "ByVal" would have the least impact on any existing code as they already have special keyword status.
This possibly leaves the door open to somewhat spurious arguments, such as "ByVal is the default for parameters, so it should be for all code in anonymous delegates or lambda expressions".
Perhaps some examples are needed here:
Dim person As New Person("Fred")
Dim del As New SimpleDelegate(){Console.WriteLine(ByVal(person).Name)}
person = New Person("Wilma")
del()
and:
Dim person As New Person("Fred")
Dim del As New SimpleDelegate(){Console.WriteLine(ByRef(person).Name)}
person = New Person("Wilma")
del()
These would print out Fred in the first case and Wilma in the second, with no temporary variable needed in either.
This probably raises yet another question: should the ByVal or ByRef be required to be explicit in all cases where a closure is created with that variable?
I could see ByVal being handy in places, just like how we used to use parentheses in VB6 and earlier. (Note I would NOT want to see the return of that behavior by using parentheses alone.) I could also see that ByVal would be handy when passing a property into a method where the parameter is ByRef.
One of my biggest concerns would be if ByVal was the default behavior and the keyword was optional or omitted. You could then have seemingly identical code in C# which behaved very differently from the VB code. That's not a desirable thing. It obviously shouldn't be the driving force, but it's yet another thing to consider.
I think my vote is for the default behavior to be ByRef, and that you could optionally use ByVal to decouple. That is, ByRef would be implicit. All in all, I think that would be the most desirable behavior. If ByVal was the default, I think I would like to see it be explicit, even if it means the code editor would add it for you.