Cleaning up your XML literal namespaces

Posted Sat, Nov 24 2007 15:16 by bill

If you use XML literals in your code, adding one to another:

Dim e1 = <a:books></a:books>
Dim e2 = <a:book></a:book>
e1.Add(e2)

You will have the xmlns declaration repeated in each of the elements, when really it is needed only once per document, on the outer element. The problem is caused by VB adding an xmlns declaration as an attribute to the root element of each literal. It can get a bit more complex if you have duplicate namespace declarations with different prefixes. So I decided to write a CleanUpNS extension that keeps the written XML clean by removing unnecessary namespace declarations. To use it, simply add a call to CleanUpNS to the end of your literals, e.g.:

Dim e1 = <a:books></a:books>.CleanUpNS
Dim e2 = <a:book></a:book>.CleanUpNS
e1.Add(e2)

 

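Imports System.Runtime.CompilerServices
Imports System.Xml.Linq

' Extension methods must live in a Module; the module name here is arbitrary.
Module XmlExtensions

   ' Removes any xmlns declaration on el that declares el's own namespace.
   <Extension()> _
   Function CleanUpNS(ByVal el As XElement) As XElement
      ' Walk the attributes backwards, saving the previous one before any removal.
      Dim current = el.LastAttribute
      Do While current IsNot Nothing
         Dim temp = current.PreviousAttribute
         If current.IsNamespaceDeclaration AndAlso el.Name.NamespaceName = current.Value Then
            current.Remove()
         End If
         current = temp
      Loop
      Return el
   End Function

End Module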

 

I go into more detail about how this works, and how the XML is stored and emitted, in my January On VB article in Visual Studio Magazine.


Comments


# re: Cleaning up your XML literal namespaces

Saturday, November 24, 2007 9:30 AM by Avner

Hi Bill,

XML literals provide similar functionality if you use embedded expressions; however, when you use the "Add" method you simply get the semantics of the LINQ to XML API. Thus we recommend using embedded expressions instead of the "Add" method.

Regards,

Avner
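
A minimal sketch of the embedded-expression pattern recommended above, assuming the a prefix is declared with an Imports statement. The books.com URI and the module and function names are illustrative only:

```vb
Imports System
Imports System.Xml.Linq
Imports <xmlns:a="books.com">

Module EmbeddedExpressionSketch

    ' Builds the parent with the child supplied as an embedded expression,
    ' rather than calling Add on the parent afterwards.
    Function BuildBooks() As XElement
        Dim e2 = <a:book></a:book>
        Return <a:books><%= e2 %></a:books>
    End Function

    Sub Main()
        ' Print the result to inspect where the xmlns:a declaration ends up.
        Console.WriteLine(BuildBooks())
    End Sub

End Module
```

With the namespace imported this way, the compiler is expected to leave the declaration on the outer element only.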

# re: Cleaning up your XML literal namespaces

Saturday, November 24, 2007 10:13 AM by bill

Hi Avner,

I wouldn't call it the semantics of LINQ to XML, more the semantics of VB. VB adds an attribute to the element, which is different from what you get if you use the XElement constructor. It'd be an extra step. When you use an embedded expression, VB then goes about removing the attributes it thinks it might have added ;)

That works in most/some cases but not all, and relies on you defining the namespace via the Imports statement. That of course allows only one default namespace declaration per .vb code file. If you find yourself actually using default namespaces, for example defined inside the XML literals, you will get duplication even when using placeholders, e.g.:

Dim e2 = <book xmlns="abc"></book>

Dim e1 = <books xmlns="abc"><%= e2 %></books>

That's where my CleanUpNS comes into play ;)

# re: Cleaning up your XML literal namespaces

Saturday, November 24, 2007 10:25 AM by Avner Aharoni

You are right, it does require using "Imports" for the XML namespaces used in the literal. However, if you do import your XML namespaces and use embedded expressions to add elements, then VB will take care of this functionality, so users who follow this pattern do not need to use this function.

Regards,

Avner

# re: Cleaning up your XML literal namespaces

Saturday, November 24, 2007 10:44 AM by bill

Right. But considering the VB compiler keeps track of what namespaces it adds, and passes them to its internal namespace clean-up method, it should be able to include in that list any xmlns attributes defined in the literals. So really, code such as:

Dim e2 = <book xmlns="abc"></book>

Dim e1 = <books xmlns="abc"><%= e2 %></books>

should have the second xmlns removed.

:)

# re: Cleaning up your XML literal namespaces

Monday, November 26, 2007 1:48 PM by Avner Aharoni

Our design for removing namespaces was as follows:

1) We only move (bubble up), remove, and add namespaces that are declared in the "Imports" statement

2) We only remove namespaces on elements that were added to elements using embedded expressions.

The reason for this design is that if a user puts an XML namespace declaration on an element, then moving or removing it would change the output. Although it is easy to guarantee that this will not affect the semantics, some people may care that the generated XML is not similar enough to the literal that created it. However, if users declare the XML namespaces using the "Imports" statement, it means that they trust the VB compiler to output the XML in the most efficient way, so we move and eliminate namespace declarations.

One thing to remember: although we can ensure that all the namespace manipulations end up generating a completely equivalent XML document, there are a lot of applications that support only their own custom XML format, which does not follow all the XML rules (XAML is one example). With the current design we think we can get these scenarios covered as well.

Hope it helps,

Avner

# re: Cleaning up your XML literal namespaces

Monday, November 26, 2007 1:52 PM by David Schach

Hi Bill,

The compiler can certainly do what you suggest. The issue is how much it should change the user's XML. We discussed this case and made a decision to remove only the declarations that the compiler added and leave the user's declarations alone. As Avner pointed out, the workaround is to use the Imports statement.

Regards,

David

# re: Cleaning up your XML literal namespaces

Monday, November 26, 2007 6:02 PM by bill

Hi Avner and David,

I agree it's nice the way the VB compiler tracks the namespace attributes it adds (those declared as Imports), and then removes those same ones from any elements added as embedded expressions.

The question as Avner says is one of having the developer decide what the XML exactly looks like.

Let's say, for example, I do specifically add an xmlns, and then add the element as an embedded expression into an element in which I am using Imports. If the xmlns is the same, the VB compiler will remove it, even though it didn't add it, e.g.:

Imports <xmlns:a="books.com">

...

Dim book1 = <a:book xmlns:a="books.com">my first book</a:book>

Dim books = <a:books><%= book1 %></a:books>

So the compiler is not *pure* in only removing what it adds. Instead it removes anything that matches what it has added to the root node. This of course makes sense as the added element could be generated anywhere. But it is not a pure fidelity of what the developer wrote. And I'm happy with that as for pure fidelity there is the framework approach that can be used instead.

Which leads us back to the example I gave earlier of default namespaces. The problem is two-fold. If you declare the Imports at file or project level, you are stuck with only one allowed default namespace for the entire code file (or project). What if I want to create two different files, both with default namespaces? I can't, and still have the same experience of embedded expressions cleaning up the code for me.

Perhaps this removal of namespaces should have been explicit. Maybe it can still be.  If we look at the cases we have:

(a) remove only the xmlns declarations that match those Imported into the parent literal

(b) remove all duplicate namespaces

(c) remove none

So if this was a directive for an embedded expression, it could be expressed as an Enum, RemoveImportsOnly, RemoveAll, RemoveNone, where RemoveImportsOnly would be 0 or the default.
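
As a sketch only, the three options could be declared like this. The enum name NamespaceRemoval is hypothetical, and no such directive exists in the VB compiler or in LINQ to XML:

```vb
' Hypothetical enum for the proposed directive; not part of VB or LINQ to XML.
Public Enum NamespaceRemoval
    RemoveImportsOnly = 0 ' (a) remove only declarations matching the Imports; the default
    RemoveAll = 1         ' (b) remove all duplicate namespace declarations
    RemoveNone = 2        ' (c) leave every declaration as written
End Enum
```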

Personally though, I would much rather have the default be to remove all duplicates, because if I want duplicates I can do that in the framework. What I can't do easily in VB9 today is remove all duplicates when I'm working with more than one default namespace.
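
Until something like that exists, one way to approximate "remove all duplicates" by hand is to walk the descendants and drop any xmlns declaration that the parent already has in scope. This is a rough sketch rather than a tested library routine; the RemoveDuplicateNS name and its module are assumptions made for illustration:

```vb
Imports System.Linq
Imports System.Runtime.CompilerServices
Imports System.Xml.Linq

Module XmlNamespaceCleanupSketch

    ' Hypothetical helper: removes xmlns declarations on descendants of el
    ' when the parent already binds the same prefix to the same namespace.
    <Extension()> _
    Function RemoveDuplicateNS(ByVal el As XElement) As XElement
        ' Snapshot both sequences, because attributes are removed while looping.
        For Each child In el.Descendants().ToList()
            Dim parent = child.Parent
            For Each attr In child.Attributes().ToList()
                If Not attr.IsNamespaceDeclaration Then Continue For

                Dim inScope As XNamespace
                If attr.Name.NamespaceName.Length = 0 Then
                    ' xmlns="..." : compare with the default namespace in scope at the parent.
                    inScope = parent.GetDefaultNamespace()
                Else
                    ' xmlns:prefix="..." : the attribute's local name is the prefix.
                    inScope = parent.GetNamespaceOfPrefix(attr.Name.LocalName)
                End If

                ' Redundant only if the same prefix already maps to the same URI.
                If inScope IsNot Nothing AndAlso inScope.NamespaceName = attr.Value Then
                    attr.Remove()
                End If
            Next
        Next
        Return el
    End Function

End Module
```

It would be called once on the outermost element, for example e1.RemoveDuplicateNS() after building e1 from the two literals with xmlns="abc". Declarations that rebind a prefix to a different namespace are left alone.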

# re: Cleaning up your XML literal namespaces

Thursday, December 06, 2007 10:15 PM by niemguy

Thanks for putting this out there.  One question - I have a situation where I'd like to generate an XElement that will have a lot of children and then append that to a root document.  The structure would look something like so:

<root>

<child1>...lots of chitlins...</child1>

<child2>....more of the same...</child2>

...

</root>

I have delegated the responsibility of creating child1 to childN to a separate class (I'm looping through a datatable and passing the datarow to that class). The issue I have is that my namespaces get spelled out on child1 through childN, e.g.:

<child1 xmlns="blah" ...>

<child2 xmlns="blah" ...>

I tried your extension but it didn't work for this situation.

Is there any way around this?  I am wanting to switch from using a class structure that mimics my schema and serializing.  So far in my tests this is much faster than that, but I am just picky about my doc size.  Having those namespaces spelled out repeatedly substantially adds to the doc size when I'm adding hundreds or thousands of child elements.

Any insight would be appreciated!

Chris

# re: Cleaning up your XML literal namespaces

Thursday, December 06, 2007 11:53 PM by bill

My extension only removes namespace declaration attributes in the element you call it on.  If you are seeing the xmlns being output it is either because the element has a different namespace or the default namespace you are using is not declared at the root level.  The XElement will include the default xmlns where needed if it hasn't already been defined.

So the solution in your case is to declare the xmlns on the root element, either explicitly in the XML literal or by using the Imports declaration.

# More on XML Namespaces in VB....

Sunday, December 09, 2007 7:20 AM by @ Head

A couple of weeks ago I wrote about XML Namespace issues in VB: one in particular was to do with namespace

# Removing duplicate namespaces in XML Literals (Shyam Namboodiripad)

Wednesday, June 16, 2010 1:59 PM by The Visual Basic Team

A common problem that one often runs into with XML literals and the LINQ to XML API is duplicate XML