April 2008 - Posts

LINQ To XML: building documents with a functional approach
Tue, Apr 29 2008 23:11

In the previous post I talked about some of the classes you'll find in this new API. Today we'll see how easy it is to create a new XML document with it. Most of the time, you'll end up working with the XElement and XAttribute classes. Let's start with an existing XML tree:

<clients>
  <client id="1">
    <name>Luis</name>
    <address>
      <street>Some place</street>
      <zipCode>YYY Somewhere</zipCode>
    </address>
  </client>
  <client id="2">
    <name>John</name>
    <address>
      <street>Some other place</street>
      <zipCode>EEE Somewhere</zipCode>
    </address>
  </client>
</clients>

Creating this kind of document on the fly is really simple, since we're talking about a new lightweight XML API based on a functional construction approach. In practice, this means that the following code builds the previous XML tree:

 

var xml = new XElement(
    "clients",
    new XElement(
        "client",
        new XAttribute("id", 1),
        new XElement("name", "Luis"),
        new XElement("address",
            new XElement("street", "Some place"),
            new XElement("zipCode", "YYY Somewhere")
        )
    ),
    new XElement(
        "client",
        new XAttribute("id", 2),
        new XElement("name", "John"),
        new XElement("address",
            new XElement("street", "Some other place"),
            new XElement("zipCode", "EEE Somewhere")
        )
    )
);

As you can see, this is much better than, say, using the traditional MS .NET DOM implementation (ie, the XmlDocument class and its friends). Notice also that in the previous example I've just created an XML tree (and not an XML document!). This kind of approach is only possible because the XElement class defines a constructor that receives a variable number of parameters, as you can see from the following snippet:

public XElement(XName name, params object[] content)

Internally, the constructor will end up calling the AddContentSkipNotify (internal) method, which will do its best to interpret the content that is being passed to it. Do notice that it might receive a simple string, or maybe an attribute...or maybe an attribute and the element's content. Who knows? The best part is that we don't need to worry about that: we only need to enjoy it.
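To make the "anything goes" nature of the content parameter a bit more concrete, here's a minimal sketch that mixes an attribute, a comment and a sequence of elements in a single constructor call. The names array and the count attribute are made up for this illustration; they aren't part of the XML tree shown earlier.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class ContentDemo
{
    static void Main()
    {
        // Hypothetical sample data, used only for this sketch.
        var names = new[] { "Luis", "John" };

        var xml = new XElement("clients",
            new XAttribute("count", names.Length),   // an attribute
            new XComment("generated on the fly"),    // a comment node
            names.Select((name, index) =>            // a sequence of elements
                new XElement("client",
                    new XAttribute("id", index + 1),
                    new XElement("name", name))));

        // The sequence is expanded: each generated <client> element
        // becomes a child of <clients>.
        Console.WriteLine(xml);
    }
}
```

Since sequences are expanded into individual children, you can feed the result of a LINQ query straight into the constructor.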

Ok, now there's more cool stuff here! If you've looked carefully at the signature of the previous constructor, you've surely noticed that it receives an XName, but we're simply passing a string. XName is a new class which represents the name of an element or attribute that exists on an XML tree. Fortunately for us, the MS team implemented an implicit conversion operator which is responsible for converting a string into an XName instance. Pretty neat, if you ask me.
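Here's a small sketch of that conversion in isolation. It only relies on the implicit string conversion, the LocalName and NamespaceName properties and the XName.Get method; the variable names are just for illustration.

```csharp
using System;
using System.Xml.Linq;

class XNameDemo
{
    static void Main()
    {
        // The implicit conversion turns a plain string into an XName.
        XName name = "client";

        Console.WriteLine(name.LocalName);                 // client
        Console.WriteLine(name.NamespaceName == "");       // True (no namespace)

        // XName.Get is the explicit way of getting the same name.
        Console.WriteLine(name == XName.Get("client"));    // True
    }
}
```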

So, by now you may be thinking that "this is all good, but I will always use a namespace to set the scope of my XML documents". Good for you! You should keep doing it! To show how you can add namespaces to the previous example, let's add all the elements to a new default namespace (http://msmvps.com/blogs/luisabreu/xmlTests). The new XNamespace class represents a namespace and also defines an implicit conversion from string, which means that you can convert a string to a namespace by simply writing the following code:

XNamespace ns = "http://msmvps.com/blogs/luisabreu/xmlTests";

And now, you can put all the elements on this namespace by just doing this:

var xml = new XElement(
                 ns + "clients",

...

Do notice that you need to keep adding ns to the names of all the other elements that belong to that namespace (if you don't, they'll end up with no namespace at all). If you try printing the xml, you'll see that you have the following:

<clients xmlns="http://msmvps.com/blogs/luisabreu/xmlTests">

...

Great! It's important to note that I don't really need to create a new XNamespace explicitly. I could just have written this:

var xml = new XElement(
                    "{http://msmvps.com/blogs/luisabreu/xmlTests}clients",

Internally, XName's implicit conversion from string will parse the expanded name and end up creating a namespace that is used to scope the element. You can also associate a prefix with a namespace. For instance, let's suppose that only the top <clients> element is defined on the previous namespace and that I want to use a prefix to show that association. Here's the code I'd need to write:

var xml = new XElement(
                  ns + "clients",
                  new XAttribute(XNamespace.Xmlns + "my", ns),

...

Really similar to the previous case, but we need to add an xmlns attribute with the name of the prefix and associate it with the correct namespace.
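Putting the namespace pieces together, here's a minimal, self-contained sketch. It checks that ns + "clients" and the "{namespace}name" string produce the same XName, and then builds a tiny tree where only the top element lives in the namespace and uses the my prefix. The single client child is a cut-down version of the earlier tree, used only to keep the sample short.

```csharp
using System;
using System.Xml.Linq;

class NamespaceDemo
{
    static void Main()
    {
        XNamespace ns = "http://msmvps.com/blogs/luisabreu/xmlTests";

        // Two ways of getting the same qualified name.
        XName viaNamespace = ns + "clients";
        XName viaString = "{http://msmvps.com/blogs/luisabreu/xmlTests}clients";
        Console.WriteLine(viaNamespace == viaString);   // True

        var xml = new XElement(
            ns + "clients",
            // Declares the "my" prefix for the namespace.
            new XAttribute(XNamespace.Xmlns + "my", ns),
            // This child has no namespace, so it's written without a prefix.
            new XElement("client", new XAttribute("id", 1)));

        Console.WriteLine(xml);
    }
}
```

When you print the tree, the top element should be serialized with the my prefix, while the client element stays unprefixed.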

Now, I'm not sure what you're thinking, but I can assure you that I'm finding this stuff really cool (just like the new stuff that C# 3.0 introduced) and I'm having lots of fun using it in my apps. More to come in the next days! Stay tuned!

by luisabreu | 1 comment(s)
LINQ To XML: I'm hooked!
Mon, Apr 28 2008 22:21

I've just started doing some LINQ To XML and I can assure you that the new API is really great! Just take a look at the new object model introduced by the System.Xml.Linq assembly:

[Figure: the LINQ To XML object model]

Today I'll just cover the basics (really a quick presentation of the most important elements you can find in the new API), but I'm thinking of writing more posts on this subject. XObject is the base class at the top of the hierarchy introduced by this new XML API. Its main objective is to let you add annotations to any node or attribute. You can see an annotation as out of band info that you can attach to any node or attribute that is defined on an XML document.

You can add a new annotation by calling the AddAnnotation method. As you'd expect, you can remove annotations by calling the RemoveAnnotations method (do notice the plural!). Interestingly, you remove annotations by passing the type of the annotations you want to remove. Btw, the Annotations method also takes a type and returns all the annotations that match it. Do keep in mind that you can pass Object as the type and that will let you go over all the existing annotations. Here's a really dumb example that shows how to use these methods:

var xml = new XElement("myElem", "Hello Xml");
xml.AddAnnotation("Note 1");
xml.AddAnnotation("Note 2");
xml.AddAnnotation(10);

After creating an element called myElem (which only has text content - in this case, it only has the "Hello Xml" text node), we add two string annotations and one integer annotation. If you want to go through all the annotations, you'll need to call the Annotations method. As I've said, you'll need to pass the type of annotation you want to get. If you want everything, you can simply pass Object, as the next snippet shows:

foreach (var note in xml.Annotations<Object>()) { Console.WriteLine(note); }

Ok, simple stuff, right? Again, removing existing annotations is really simple: just call the RemoveAnnotations method and pass it a type. The next snippet shows you how to remove all the string annotations:

xml.RemoveAnnotations<String>();

All of these methods have overloads: one that uses generics and another that receives the type as a parameter. Now, when I used Reflector to see the code, I couldn't help noticing that there is duplicate code across both implementations, which end up doing the same thing. wtf? Not sure why the generic method doesn't just redirect to the non-generic one by passing typeof(T)...I want to believe that I'm missing something...

Still on the annotations, I'm assuming that you'll be using this only for adding important info to a node or an attribute. For instance, if you're building an XML tree from an existing order object, you might find it useful to add that order object as an annotation to the top order element, so that you can reuse it later when you're processing the XML tree.
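Here's a minimal sketch of that order scenario. The Order class and its members are hypothetical (they're only here to have something to annotate); the Annotation, AddAnnotation and RemoveAnnotations methods are the ones exposed by XObject.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical domain class, used only for this illustration.
class Order
{
    public int Id;
    public string Customer;
}

class AnnotationDemo
{
    static void Main()
    {
        var order = new Order { Id = 42, Customer = "Luis" };

        var xml = new XElement("order",
            new XAttribute("id", order.Id),
            new XElement("customer", order.Customer));

        // Keep the source object around as out of band info.
        xml.AddAnnotation(order);

        // Later on, while processing the tree, get it back by type.
        var original = xml.Annotation<Order>();
        Console.WriteLine(ReferenceEquals(order, original));   // True

        // The annotation isn't part of the XML content, so removing it
        // doesn't change what gets serialized.
        xml.RemoveAnnotations<Order>();
        Console.WriteLine(xml.Annotation<Order>() == null);    // True
    }
}
```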

Going back to the model, and since today it's all about the basics, I'll just wrap this post by presenting each of the elements I've just shown you in the previous figure:

  • XAttribute: You've guessed it! You'll use this object to represent an existing attribute.
  • XNode: the base class of all of the XML nodes that might exist on an XML tree. This class adds several important methods that will let you go through some of the axes that exist on an element. For instance, you can call the Ancestors method to get a collection with all the ancestor elements of the current node;
  • XComment: used for letting you add XML comments to an XML tree;
  • XContainer: used for all nodes that can be used as containers in a document. In practice, this means an XML element or an XML document.
  • XDocumentType: Lets you add a DTD to an XML document. It's important to notice that DTD support is limited on this API (if you want to know more, read this);
  • XProcessingInstruction: as you can infer from its name, we're talking about processing instructions here.
  • XText: used to represent a text node. If you need a CDATA section, then you should use the XCData class.
  • XDocument: represents an XML document. Most of the time, you'll end up working with the XElement class. However, if you need to set up a document-level processing instruction or if you need to set a DTD, you'll necessarily have to use an XDocument instance.

Besides those classes, there are at least 2 important classes defined on the System.Xml.Linq namespace:

  • XName: used to represent an XML name. As you'll see in future posts, this class makes working with XML names quite painless (I do really love this class!);
  • XNamespace: used to represent an XML namespace. As you'll see in future posts, this and the previous class are priceless when you think about querying or building an XML tree that uses namespaces.
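To see several of these classes working together, here's a small sketch that builds a document with a declaration, a processing instruction, a comment, a text node and a CDATA section, and then walks the ancestors axis. The stylesheet name and the element names are invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class ModelDemo
{
    static void Main()
    {
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            // Hypothetical stylesheet reference.
            new XProcessingInstruction("xml-stylesheet",
                "type=\"text/xsl\" href=\"clients.xsl\""),
            new XComment("client list"),
            new XElement("clients",
                new XElement("client",
                    new XElement("name", new XText("Luis")),
                    new XElement("notes", new XCData("5 < 10 & so on")))));

        // Ancestors walks up the tree, starting with the nearest element.
        var name = doc.Descendants("name").First();
        foreach (var ancestor in name.Ancestors())
        {
            Console.WriteLine(ancestor.Name);   // client, then clients
        }

        Console.WriteLine(doc);
    }
}
```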

And I believe this is quite long for an intro post. We'll start with the cool stuff in the next post. Stay tuned.

by luisabreu | 3 comment(s)
C# and Nullable value types
Sat, Apr 26 2008 12:29

.NET 2.0 introduced Nullable value types. As you surely know by now, you cannot set a value type to null. Here's an example:

Int32 myVar = null; //error: myVar cannot be set to null

This can be easily solved by transforming myVar into a Nullable type:

Nullable<Int32> myVar = null;

If you prefer, you can apply the ? suffix to the variable type:

Int32? myVar = null;

Nullable value types are great when, for instance, you're getting a value from a table's column and that column can hold NULL. In fact, C# 2.0 even introduced the null coalescing operator (??), which is rather cool when you're checking for null:

Int32? myVar = null;
Console.WriteLine(myVar ?? 123);
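For reference, here's a slightly longer sketch of the members you'll typically use with a nullable value. The fromDb variable simply plays the role of a value read from a column that allows NULL; no real database access is involved.

```csharp
using System;

class NullableBasics
{
    static void Main()
    {
        // Pretend this came from a column that allows NULL.
        Int32? fromDb = null;

        Console.WriteLine(fromDb.HasValue);              // False
        Console.WriteLine(fromDb ?? 123);                // 123
        Console.WriteLine(fromDb.GetValueOrDefault());   // 0

        fromDb = 7;
        Console.WriteLine(fromDb.HasValue);              // True
        Console.WriteLine(fromDb ?? 123);                // 7
    }
}
```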

The main objective of this post is not to talk about nullable value types. In fact, there are already some great posts on this...for instance, check this post by ScottGu.

What most people still haven't understood is that nullable value types are, in fact, supported by the CLR. You can see this in boxing/unboxing operations. To understand what's going on, let's present an example. Suppose you have a method that expects an Object parameter and you're passing a Nullable<Int32> which is set to null. If the CLR boxed the Nullable<Int32> itself (don't forget that Nullable<T> is a struct!), then you'd get a "valid" object in the method, even though the Nullable<Int32> is set to null.

To solve this, the CLR will always check the Nullable<T> value before boxing it: if it's null, it will simply produce a null reference instead of boxing the struct; if it isn't, it will box the underlying T value. A similar thing happens with unboxing: the CLR lets you unbox a boxed T to either T or Nullable<T>. Unboxing to T is the usual operation of reading the value stored inside the boxed object. Unboxing to Nullable<T> takes a bit more work, since a boxed T has no HasValue field: the CLR has to build a new Nullable<T> instance from the boxed value, and a null reference ends up as a Nullable<T> whose HasValue is false.
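The boxing and unboxing rules are easy to check with a few lines of code. This is just a sketch with made-up variable names; it boxes one empty and one non-empty Nullable<Int32> and looks at what comes out on the other side.

```csharp
using System;

class NullableBoxing
{
    static void Main()
    {
        Int32? empty = null;
        Int32? five = 5;

        // Boxing: the empty nullable becomes a null reference,
        // the other one becomes a boxed Int32.
        object boxedEmpty = empty;
        object boxedFive = five;
        Console.WriteLine(boxedEmpty == null);     // True
        Console.WriteLine(boxedFive.GetType());    // System.Int32

        // Unboxing: a boxed Int32 can be unboxed to Int32 or to Int32?.
        Int32 asInt = (Int32)boxedFive;
        Int32? asNullable = (Int32?)boxedFive;
        Console.WriteLine(asInt);                  // 5
        Console.WriteLine(asNullable.HasValue);    // True

        // A null reference unboxes to an empty nullable.
        Int32? fromNull = (Int32?)boxedEmpty;
        Console.WriteLine(fromNull.HasValue);      // False
    }
}
```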

Another place where the CLR treats Nullable<T> in a special way is when you call the GetType method. Calling this method normally means getting the type of the instance over which you're calling it. However, calling GetType over a Nullable<T> instance that has a value will always return T instead of Nullable<T> (the call boxes the instance, so calling it over a null instance throws a NullReferenceException). The CLR also treats interfaces specially: you can cast a Nullable<T> instance to any interface implemented by T and call its methods, even though Nullable<T> itself doesn't implement that interface!
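Here's a short sketch that shows both behaviors. IComparable is used only because Int32 happens to implement it; any interface implemented by the underlying type would do.

```csharp
using System;

class NullableGetType
{
    static void Main()
    {
        Int32? five = 5;

        // GetType reports the underlying type, not Nullable<Int32>.
        Console.WriteLine(five.GetType());                  // System.Int32
        Console.WriteLine(typeof(Int32?) == typeof(Int32)); // False

        // Nullable<Int32> doesn't implement IComparable, but Int32 does,
        // so the cast boxes the underlying value and the call just works.
        Console.WriteLine(((IComparable)five).CompareTo(5)); // 0

        // On an empty nullable, GetType has nothing to box.
        Int32? empty = null;
        try
        {
            empty.GetType();
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("GetType over a null Nullable<T> throws");
        }
    }
}
```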

As you can see, this behavior is necessary so that things "work as expected", making nullable value types transparent to the programmer.

You might find it strange that I'm still writing posts about C# 2.0. After all, we've already got C# 3.0, right? I've decided to write about some 2.0 features because I've noticed that many C# programmers still don't understand how some of the 2.0 stuff works, and without getting those features right, you won't be able to get much from C# 3.0. So you can expect to see more about C# in the next couple of days (that is, if I manage to find some free time to write about it :))

by luisabreu | with no comments
Type inference in C# 3.0
Sat, Apr 26 2008 11:54

My friend Paulo published a post that talks about type inference and how it might evolve in the next years. Even though type inference's main objective is to support LINQ (in fact, most of the new stuff that C# 3.0 introduced is only there to support LINQ), I must say that I've been enjoying it a lot and I've been using it whenever possible. Initially, I wasn't really into using var because I thought that it would reduce the readability of my code, and I only started using it after installing one of the nightly builds of R#. In fact, I was one of the first to complain on the R# forums about it.

Anyway, I decided to give it a try and guess what? I've noticed that using this approach doesn't reduce my code's readability. In fact, I've noticed an improvement in my local variables' names. I know that this might look really strange at first, but believe me: give it a try and I bet you'll end up loving it (and let me tell you that this is helping me with my carpal tunnel syndrome :))

by luisabreu | with no comments
C# Gotchas
Wed, Apr 16 2008 13:47

[Update: Thanks to Mike for uncovering a bug in the sample. It should be IDummy.Increment and not Dummy.Increment. Thanks Mike, and now I think it should compile without any problems.]

In these last days I've been reading the C# spec and you can say that I've been re-discovering it :) If you really think about it, I think that it's fair to say that most people will only use the basic features of the language. And since many things work as they should (ie, they work in a logical manner), most of us don't take the necessary time to study the spec.

Let me show you a little example that will help me make a point about this:) Look at the following example and tell me what happens (this example is based on an existing example that is shown on the C# spec):

using System;

namespace CSharpGotchas
{
    public interface IDummy
    {
        void Increment();
    }

    public struct Dummy : IDummy
    {
        public Int32 Counter; 
        void IDummy.Increment()
        {
            Counter++;
            Console.WriteLine(Counter);
        } 
    }

    public class GenericDummy<T> where T : IDummy, new()
    {
        public void Increment()
        {
            var aux = new T();
            aux.Increment();
            aux.Increment();
            aux.Increment();
        }
    }

    internal class Program
    {
        private static void Main(string[] args)
        {
            var aux = new GenericDummy<Dummy>();
            aux.Increment();
        }
    }
}

I've shown this example to several guys I know (I think that most of them had at least 1 year of experience in C# programming) and only I knew the correct answer.

btw, if you've been using C# and .NET for some time, haven't read the spec and don't see anything that might make you think twice before answering, then you should be worried :)

Ok, the problem here is what should be printed on the console. Should we get 1, 1, 1 or should we get 1, 2, 3? If Dummy was a class, there really wouldn't be any problems with the code, since you would always be working with a reference type. But since Dummy is a struct, will we have boxing when calling Increment? (Do notice that I've implemented the IDummy interface explicitly, making it accessible only through an interface reference).

If you're thinking that the previous code will print 1, 2, 3, you're right! Why? Well, because the spec explicitly says that there are certain cases where it won't box a struct. One of those cases happens when a struct overrides a method inherited from the System.Object class. Invocation of that method will be made directly over that instance of the struct (ie, you won't be getting a boxed instance over which the method call is made).
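The System.Object case can be observed with a deliberately nasty struct whose ToString override changes its own state. This is only a sketch (the Counter struct is hypothetical and mutating state in ToString is not something you'd do in real code), but it makes the difference between calling the override directly and calling it through a boxed copy visible.

```csharp
using System;

// Hypothetical struct: ToString mutates the instance so that we can
// tell whether the call was made on the original or on a boxed copy.
struct Counter
{
    public int Value;

    public override string ToString()
    {
        Value++;
        return Value.ToString();
    }
}

class OverrideDemo
{
    static void Main()
    {
        var counter = new Counter();

        // Called directly over the struct: no boxing, the original changes.
        Console.WriteLine(counter.ToString());             // 1
        Console.WriteLine(counter.ToString());             // 2

        // Each cast boxes a fresh copy, so the original stays untouched.
        Console.WriteLine(((object)counter).ToString());   // 3
        Console.WriteLine(((object)counter).ToString());   // 3
        Console.WriteLine(counter.Value);                  // 2
    }
}
```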

Another place where you don't get implicit boxing is when you access a member on a variable of a constrained type parameter (that is the case shown in the previous example). On the other hand, if you box the element by casting it to the interface, you'll always get 1, since each cast boxes a new copy of the struct. Here's an example that shows this:

public class GenericDummy<T> where T : IDummy, new()
{
    public void Increment()
    {
        var aux = new T();
        ((IDummy)aux).Increment();
        ((IDummy)aux).Increment();
        ((IDummy)aux).Increment();
    }
}

btw, I'm curious: did you know this?