The & operator (Concatenation)

Posted Sun, Apr 23 2006 23:08 by bill
In my previous blog posting I talked briefly about the + operator, and hopefully people realized that perhaps for string concatenation they should use the & operator. So I thought it probably best to also discuss the & operator in excruciating detail ;)
 
VB, unlike other lame languages, differentiates between concatenation operators and addition operators. This is useful not only for the late binding of intrinsic values, but can also be used elsewhere. For example, let's say you were writing a CodeDOM kind of language. You might find it very useful to be able to add CodeDOM elements together to return a list of CodeDOM elements, and at the same time have a concat operator that would return a qualified element via the concatenation of a namespace element to a CodeMember element. Or perhaps you might have a Product object, where the + operator again returns a List(Of Product), whereas the concat operator returns a comma-separated string ready for printing. Or perhaps serial numbers, where + will increment the serial number, whereas concat would return the numbers hyphenated (such as would be suitable for billing purposes).
 
The list is probably endless. The key thing to note is that Addition and Concatenation really are two very different operations. VB provides for this distinction.
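To make the serial number idea concrete, here is a minimal sketch. The SerialNumber structure and its members are hypothetical names made up for illustration; the only point is that + and & can be given two unrelated meanings on the same type.

```vb
Option Strict On

' Hypothetical type, for illustration only.
Public Structure SerialNumber
    Public ReadOnly Value As Integer

    Public Sub New(ByVal value As Integer)
        Me.Value = value
    End Sub

    ' Addition: move the serial number forward by a count.
    Public Shared Operator +(ByVal left As SerialNumber, ByVal right As Integer) As SerialNumber
        Return New SerialNumber(left.Value + right)
    End Operator

    ' Concatenation: build a hyphenated string from two serial numbers.
    Public Shared Operator &(ByVal left As SerialNumber, ByVal right As SerialNumber) As String
        Return left.Value.ToString() & "-" & right.Value.ToString()
    End Operator
End Structure
```

With that in place, New SerialNumber(100) + 1 gives a SerialNumber whose Value is 101, whereas New SerialNumber(100) & New SerialNumber(101) gives the string "100-101".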
 
That said, on to the implementation details…
 
In VB, arg1 & arg2 will return a string if the operands are numeric, date, char, or string. If an operand is DBNull it will be treated as String.Empty (""). That is, even with strict semantics the operands are converted to string if necessary and the result is a String.
From the language specification:

To make string concatenation simpler, for the purposes of operator resolution, all conversions to String are considered to be widening, regardless of whether strict semantics are used.
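A small sketch of what that means for intrinsic operands, compiled with Option Strict On. The module and function names are made up for the example; the values in the comments follow from the rules described above.

```vb
Option Strict On
Imports System

' Hypothetical module, for illustration only.
Public Module ConcatIntrinsics
    Public Function Samples() As String()
        Dim a As String = "Total: " & 42       ' Integer operand is converted: "Total: 42"
        Dim b As String = 1 & 2                ' both operands are converted: "12"
        Dim c As String = CStr(1 + 2)          ' + on Integers is addition: "3"
        Dim d As String = "x" & "y"c           ' Char operand is converted: "xy"
        Dim e As String = "a" & DBNull.Value   ' DBNull is treated as "": "a"
        Return New String() {a, b, c, d, e}
    End Function
End Module
```

Note that 1 & 2 and 1 + 2 give different answers, which is the whole point of having two operators.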
Although that is the general case, and was the case prior to VB.NET 2005, in VB.NET 2005 there's more to the story…
 
Prior to VB 2005, the & operator didn't work on non-intrinsic types. For late bound code where an operand was declared As Object (either explicitly or implicitly), the & operation would look at the underlying type code of the operand, and if it was not an intrinsic type then the operation would fail.
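As a rough sketch of the late bound case, the following compiles with Option Strict Off and resolves & at run time from whatever the Object operands actually hold. The module and function names are hypothetical.

```vb
Option Strict Off

' Hypothetical module, for illustration only.
Public Module LateBoundConcat
    ' Both operands are Object, so the & is late bound
    ' and the result comes back as Object.
    Public Function Join(ByVal left As Object, ByVal right As Object) As String
        Return CStr(left & right)
    End Function
End Module
```

Join("n", 5) returns "n5" and Join(1, 2) returns "12", because the underlying values are intrinsic types. Passing an instance of a class that has no concatenation support is the case that fails, as described above.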
 
In VB.NET 2005, the language introduced the ability to both consume and define overloaded operators. So the issue of what to do with custom types and concatenation arose. Prior to this it wasn't an issue because you simply couldn't do it. The dilemma facing VB was that the CLI specifications didn't differentiate between Addition and Concatenation. VB decided that the differentiation between Concatenation and Addition was important enough that the team forged ahead with their own operator overload: op_Concatenate.
 
So in VB.NET 2005, if an operand is not an intrinsic type, then VB looks for an overloaded op_Concatenate in that type.  For example, let's define a class Foo as follows:
 
 
Public Class Foo
   Public Shared Operator &(ByVal left As String, ByVal right As Foo) As String
       Return left & "-" & right.ToString
   End Operator
End Class
 
So with that overload in place, code such as:
Debug.Print("name" & New Foo) ' would work
Debug.Print(New Foo & " some string") ' would NOT work
 
If you wanted the second case to work, you'd need to provide an overload of the & operator that takes the first argument as type Foo and the second as String, and so forth.
Oh, and it should be noted that you do not have to return a String; the return type is up to you to decide.
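Here is a minimal sketch with both orderings covered. The Tag class is a hypothetical stand-in for Foo, given a name field so that ToString returns something readable.

```vb
Option Strict On

' Hypothetical class, for illustration only.
Public Class Tag
    Private ReadOnly _name As String

    Public Sub New(ByVal name As String)
        _name = name
    End Sub

    Public Overrides Function ToString() As String
        Return _name
    End Function

    ' Handles: someString & someTag
    Public Shared Operator &(ByVal left As String, ByVal right As Tag) As String
        Return left & "-" & right.ToString()
    End Operator

    ' Handles: someTag & someString
    Public Shared Operator &(ByVal left As Tag, ByVal right As String) As String
        Return left.ToString() & "-" & right
    End Operator
End Class
```

Now "name" & New Tag("x") gives "name-x", and New Tag("x") & " some string" gives "x- some string". Each ordering is picked up by the overload whose parameter types match.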
 
So this basically works like any operator overloading. The only issue is that lame languages like C# can't work out how to define op_Concatenate, so the usefulness of this is pretty much limited to within VB.
 
One more thing to be aware of is that although the language specification, language reference et al. all refer to the implicit widening to String, this is sadly not the case. The language reference says:
If the data type of expression1 or expression2 is not String but widens to String, it is converted to String. If either of the data types does not widen to String, the compiler generates an error.

Consider this code:

Public  Class  Foo
    Public  Shared  Widening  Operator  CType(ByVal  value  As  Foo)  As  String
        Return  value.ToString
    End  Operator
End  Class
 
Well, there we allow an implicit widening to String. So even with strict semantics we can write code such as:
  Dim x As String = New Foo
 
But we cannot write code such as :

   Debug.Print("name" & New Foo)
 
That code will fail both with strict semantics on and off.
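Until that changes, the conversion has to be spelled out. This sketch shows two ways of doing so that compile with Option Strict On; Caption and CaptionDemo are hypothetical names, and Caption plays the role of Foo with a text field added so the output is readable.

```vb
Option Strict On

' Hypothetical class, for illustration only.
Public Class Caption
    Private ReadOnly _text As String

    Public Sub New(ByVal text As String)
        _text = text
    End Sub

    Public Overrides Function ToString() As String
        Return _text
    End Function

    Public Shared Widening Operator CType(ByVal value As Caption) As String
        Return value.ToString()
    End Operator
End Class

Public Module CaptionDemo
    ' CStr invokes the user-defined widening conversion explicitly.
    Public Function ViaCStr(ByVal item As Caption) As String
        Return "name" & CStr(item)
    End Function

    ' Calling ToString directly sidesteps the conversion altogether.
    Public Function ViaToString(ByVal item As Caption) As String
        Return "name" & item.ToString()
    End Function
End Module
```

Both functions return "namex" for New Caption("x"). In each case the & only ever sees two String operands, so the operator resolution issue never comes up.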
 
Personally I think this is a bug and/or a design flaw.  If the implicit widening was honored, then we could implicitly use the & operator with classes written in other languages as long as they provide the Widening operator to string.  We could also dramatically simplify the definition of the & operator by only having to deal with operands of type string. So instead of writing :

 
Public Class Foo
   Public Shared Operator &(ByVal left As String, ByVal right As Foo) As String
       Return left & "-" & right.ToString
   End Operator

   Public Shared Operator &(ByVal left As Foo, ByVal right As String) As String
       Return left.ToString & "-" & right
   End Operator

   Public Shared Operator &(ByVal left As String, ByVal right As String) As String
       Return left & "-" & right
   End Operator
End Class
 
and all the other permutations, we could simply write:
 
Public Class Foo
   Public Shared Operator &(ByVal left As String, ByVal right As String) As String
       Return left & "-" & right
   End Operator
   Public Shared Widening Operator CType(ByVal value As Foo) As String
       Return value.ToString
   End Operator
End Class

or even 

Public  Class  Foo
    Public  Shared  Widening  Operator  CType(ByVal  value  As  Foo)  As  String
        Return  value.ToString
   End  Operator
End  Class
 
if we wanted to forgo the custom formatting (hyphenation in this case) that our overload of the Concatenation operator allowed us to add.
 
Of course that wouldn't limit our ability to provide more specific overloads of the & operator, but it would make it easier for us to work with custom classes and classes from other languages.
 
So there you have it folks, the & operator, bugs, warts and all ;)
 
 

 
 
 
 

Comments

# re: The & operator (Concatenation)

Monday, April 24, 2006 12:55 AM by David M. Kean

God help us all if users actually started to define these operators on reference types...

# re: The & operator (Concatenation)

Monday, April 24, 2006 1:36 AM by bill

<g> Not sure if you mean operators in general, or just on reference types.
With reference types that aren't sealed, I think instance functions are generally preferable, but that would be a broad criticism of the use of Public Shared (aka public static methods), not just operators.
On operator overloading, I think it is usually best to minimise their usage. There are very few cases where the behaviour is always clear and intuitive, conversion operators being the most common usage.
That of course leads us back to the "bug" or behaviour of & with custom types. I think it would be preferable that the behaviour does accept the widening operator to string if it does exist, as then people won't be defining the & operator rather they'll just use the & operator on the strings, e.g:

x = foo1 & "-" & foo2

# Why doesn't concatenation work with non intrinsic types ?

Monday, April 24, 2006 9:55 PM by @ Head

If you used VB.NET, you might have noticed that you can't do concatenation on non intrinsic types. For...

# Obscure language bug #29: Intrinsics and operator overloading

Tuesday, April 25, 2006 3:25 PM by Panopticon Central

# The &operator

Sunday, June 24, 2007 9:53 PM by @ Head

I just read again today the claim that you should use + for string concatenation in VB. You should NOT