February 2008 - Posts

Rethinking “Nearly VB”
Thu, Feb 28 2008 9:56

Bill McCarthy added a comment to my blog which I wanted to answer:

 

So why not use VB for the templates but C# for the initial output rather than some "Nearly VB" . Doesn't C# address every issue you've raised ?

But I am curious as to what about issues that are language specific, such as declarative event wiring, optional parameters etc ?

C# fixes the majority of the issues I raised, except ambiguity in closing brackets. If you assume that the closing of a structure will always be at the same level and outside embedded expressions such that you maintain symmetry in relation to the evaluation stack, you can resolve the closing brackets. Retaining symmetry in closing brackets means that the following will work:

  

   Return _

   <code>

      if (x == 1)

      {

         <%= MoreStuffFunction() %>

      }

   </code>.Value

Any variation of the following will not work:

   Return _

   <code>

      if (x == 1)

      {

         <%= MoreStuffFunction() %>

         <%= "}" %>

   </code>.Value

Which means among other things you cannot do:

   Return _

   <code>

      if (x == 1)

      {

         <%= StuffFunction() %>

         <%= If(z, "}", MoreStuffFunction() & "}") %>

   </code>.Value

But you can rearrange it to:

   Return _

   <code>

      if (x == 1)

      {

         <%= StuffFunction() %>

         <%= If(z, String.Empty, MoreStuffFunction()) %>

      }

   </code>.Value

Bill believes this symmetry restriction is less onerous than the restrictions I placed on VB, especially the open/close parentheses on method calls. Another significant value to the C# first approach is that it’s much easier to recognize equals comparisons in assignment statements, and some of the null comparison problems I’m currently ignoring will be lessened because C# does not allow certain comparisons with nullable that VB allows.
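
The symmetry restriction can be checked mechanically. The following is a minimal sketch, assuming the code block is available as a plain string; `HasSymmetricBraces` is a hypothetical helper, not part of the preprocessor. It counts curly brackets in the literal output text only and skips everything between `<%=` and `%>`, because brackets produced by embedded expressions are exactly the ones a translator cannot see.

```vb
Imports System

Module BraceSymmetryCheck
    ' Hypothetical helper, not part of the preprocessor described here.
    ' Counts curly brackets in the literal output text only; anything
    ' between <%= and %> is skipped.
    Function HasSymmetricBraces(ByVal codeBlock As String) As Boolean
        Dim depth As Integer = 0
        Dim pos As Integer = 0
        While pos < codeBlock.Length
            If String.CompareOrdinal(codeBlock, pos, "<%=", 0, 3) = 0 Then
                Dim endIndex As Integer = codeBlock.IndexOf("%>", pos, StringComparison.Ordinal)
                If endIndex < 0 Then Return False ' unterminated expression
                pos = endIndex + 2
            Else
                If codeBlock(pos) = "{"c Then depth += 1
                If codeBlock(pos) = "}"c Then depth -= 1
                If depth < 0 Then Return False ' close bracket before its open
                pos += 1
            End If
        End While
        Return depth = 0
    End Function

    Sub Main()
        Dim good As String = "if (x == 1) { <%= MoreStuffFunction() %> }"
        Dim bad As String = "if (x == 1) { <%= MoreStuffFunction() %> <%= ""}"" %>"
        Console.WriteLine(HasSymmetricBraces(good)) ' True
        Console.WriteLine(HasSymmetricBraces(bad))  ' False
    End Sub
End Module
```

The sketch does not handle an embedded expression whose own string contains `%>`; that case would need a slightly smarter scan.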

Bill convinced me that the C# first template was not nearly as difficult as I imagined, by convincing me that restricting close brackets to positions symmetric with their open brackets was reasonable. However, he didn’t convince me to change my current work. VB first is the best scenario for my current client, and I think that if we have the possibility we should try to supply both, so people can write and maintain their templates with the output code they prefer and debug the first version of the output in that language. Hopefully this can happen, but the most important thing to me at the moment is getting a working version out to you to play with – I don’t want to derail that with a second template converter/preprocessor. If someone else wants to work on that… :) let me know.

by Kathleen | with no comments
Template Languages and "Nearly VB"
Mon, Feb 25 2008 8:12

The templates I’ve been talking about require very specific features of the VB compiler, and language neutral templates cannot tolerate any ambiguity in the code output by the initial template.

The template itself must be in VB because it’s required for embedded XML – the code blocks. The code blocks are essential for understanding which code to translate when creating an alternate language template in a pre-processor. Code in strings would be impossible (or nearly so) to translate at the template level and translating at the output level would have many issues including debugging and performance. There are tools available that translate normal source code, and you could do that, but I’m not sure why. It’s a lot of extra variables, when translating the template offers faster performance and more reproducible results. Sticking with template translation - the code block clearly indicates to the template preprocessor where to switch into translation mode.

The language output by the initial template must be VB, or “nearly VB.” Even if your primary interest is C#, a language neutral solution requires that the initial template have no ambiguity. Sticking with familiar and well supported languages is helpful because the initial output can be tested in VB, isolating problems in the template from any in the template translator/precompiler. This requires a non-ambiguous language I’ll call “Nearly VB”. If you’re strictly interested in C#, and have no interest in language neutrality, you can, of course, use VB’s XML literal code blocks to directly output C# code.

Ambiguity breaks the ability to build language neutral templates because the preprocessor has very little idea of the current context. It cannot understand whether a particular close curly bracket is an End If, a Next, an End Get or something else. Unfortunately, Visual Basic is not totally ambiguity free either, which forces the concept of “Nearly VB” rather than just normal VB. Nearly VB has one syntax change and a couple of extra rules when compared to VB.

VB is ambiguous on parentheses: it uses them to enclose both method parameters and indices. To solve this in templates, use square brackets to indicate indices and parentheses for normal method calls. The C# compiler will help you find the problems when your C# output files fail to compile. The VB output can easily replace the square brackets with parentheses when outputting VB files. VB is also ambiguous when it comes to case, which I cover below.
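
As a rough illustration of the square bracket rule, here is a minimal sketch of the VB side of the conversion. `ToVbOutput` is a hypothetical name, and the sketch assumes square brackets appear in the Nearly VB text only as index markers (keyword escaping is not handled).

```vb
Imports System

Module NearlyVbIndexers
    ' Hypothetical helper: turns Nearly VB index brackets back into the
    ' parentheses that real VB expects. Assumes brackets are only used
    ' for indices in the input.
    Function ToVbOutput(ByVal nearlyVb As String) As String
        Return nearlyVb.Replace("["c, "("c).Replace("]"c, ")"c)
    End Function

    Sub Main()
        Dim nearlyVb As String = "total = prices[i] + GetTax(prices[i])"
        ' Prints: total = prices(i) + GetTax(prices(i))
        Console.WriteLine(ToVbOutput(nearlyVb))
    End Sub
End Module
```

For C# output the brackets are already the index syntax, so that direction needs no change for this rule.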

At the moment I’m not convinced that the other meaning of square brackets – allowing identifiers to match keywords – needs to be supported very well. There aren’t that many keywords and simply avoiding them seems an easier solution. You can support them if you escape the character via the \x20 escape pattern and the ASCII character (/x28). OK, that’s not very pretty, so a shorter escape sequence may make sense if people run into this very often.

Case insensitive is really another way to say “case ambiguous”. Language neutral templates require that you correctly case all symbols. The preprocessor can manage the keywords it’s translating, but you’ve got to get the symbols correct. Consider a Symbols class with constants, which also provides Intellisense while you’re creating your templates.
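
A Symbols class of this kind might look like the minimal sketch below. The layout (a `Symbols` namespace holding a `Method` class of string constants) is an assumption for illustration; the constant names are borrowed from method names that appear in the template examples elsewhere in these posts.

```vb
Imports System

Namespace Symbols
    ' Hypothetical constants class: one correctly cased spelling per symbol.
    Public NotInheritable Class Method
        Private Sub New()
        End Sub

        Public Const GetPrimaryKey As String = "GetPrimaryKey"
        Public Const CanWriteProperty As String = "CanWriteProperty"
        Public Const PropertyHasChanged As String = "PropertyHasChanged"
    End Class
End Namespace

Module SymbolsDemo
    Sub Main()
        ' The template refers to the constant, so the casing is typed once.
        Console.WriteLine(Symbols.Method.GetPrimaryKey)
    End Sub
End Module
```

Because the template references the constant rather than retyping the string, a casing mistake becomes a compile error in the template instead of a mismatch in the C# output.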

VB is sloppy in not forcing you to include the open/close parentheses after a call to a method that does not have parameters. In a broader perspective this is ambiguous because in C# the presence or absence of the parentheses indicates whether you want to call the method or grab the delegate. While that particular ambiguity is resolvable because VB would require the AddressOf operator (or a lambda expression), I’m not tracking symbols, so I don’t know whether your symbol is a method, variable, or property. Thus, I don’t know whether the parentheses are needed. For language neutral templates, you add the parentheses on all method calls.

NOTE: I actually explored whether this problem is solvable, and I believe it is not. I don’t think it’s that much to ask you to include the parentheses correctly – it’s just a place we VB coders have historically been lazy.

So, to allow language neutral templates:

  • Use basic VB syntax
  • Use square brackets instead of parentheses for indexes
  • Maintain consistent case for all symbols
  • Include open/close parentheses for all method calls
  • Avoid keywords as symbol names or escape the surrounding brackets with the XML escape sequence
  • Rather obviously, avoid features unique to VB

I’ll do another post in a few weeks on issues around spots where the two languages inherently work differently. There will be more items on this list, particularly around the management of nulls in relation to operators.

I do not dream that I’ve covered everything. The only way to ensure language neutral templates is to create them, ensure the code is syntactically correct, compile and run it in VB, and then create the equivalent code in C# and make sure you have valid syntax, a clean compile, and can run the finished applications. After the upcoming preprocessor has been out for a while we will have a better idea how you can break it and can plug the holes where possible. But issues that involve ambiguity will have to be solved by the template author.

Validation Information in Metadata
Tue, Feb 19 2008 8:17

Mike asks:

Just curious if your metadata also contains validation rules or not?  Things like property is required or range of valid values.

It could include them in three possible ways – it currently uses one and I’ve had two others working in the past that may be resurrected.

The metadata that the database inherently knows is automatically transferred - this would be nulls and string length. How well nulls are handled is up to the architecture, but the metadata definitely knows what's nullable.

I've experimented with two additional approaches: using extended properties and parsing the TSQL of check constraints. The first would work for simple ranges and other predictable data sets, but it puts information in an unexpected place. I currently can’t justify it over placing validation in known places in the handcrafted code.

Using check constraints leverages existing information so is a "good" thing. Unfortunately, no one ever seemed to care about the months of work I put into that five years ago so I let it stagnate. Since I know more now, I could resurrect that work, but honestly I don't think I'll get time soon.

The problem is that most people just don't put check constraints in the database very often. I find that unfortunate for many reasons, but it becomes a chicken and egg problem. People don’t put the constraints in the database because they’ll have to restate them in the business layer for decent usability. This initiative doesn’t get the attention needed to solve that problem, because the check constraints aren’t already there. Perhaps the time is ripe now. I would love to include check constraint based validation in the Open Source version that we plan to start up on CodePlex this week or next (public within thirty days after) – at least a framework for it.

Check constraints are closely related to defaults because both require parsing TSQL. Turns out, over the years folks have been primarily interested in defaults of “now”, new guids, and raw values. Today or Now are pretty easy because it’s just a straight up translation between a SQL function and a .NET function. Any straight up translations like that can be defined in sort of a metametadata (hate that phrase) layer. I handle all three of these scenarios in my metadata extraction tool (a metadata extraction tool will be part of the CodePlex project).
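
The straight translation of defaults can be sketched as a lookup table. This is a minimal illustration, not the extraction tool itself: `TranslateDefault` and the table contents are hypothetical, and the sketch assumes the database hands back default definitions wrapped in parentheses, such as `(getdate())` or `((0))`.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Globalization

Module DefaultTranslation
    ' Hypothetical lookup table: TSQL default expression -> .NET expression.
    Private Function BuildMap() As Dictionary(Of String, String)
        Dim map As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)
        map.Add("getdate()", "DateTime.Now")
        map.Add("newid()", "Guid.NewGuid()")
        Return map
    End Function

    Private ReadOnly SqlToNet As Dictionary(Of String, String) = BuildMap()

    ' Returns the .NET expression to emit, or Nothing when the default
    ' is not one of the recognized scenarios.
    Function TranslateDefault(ByVal sqlDefault As String) As String
        Dim expression As String = sqlDefault.Trim()
        ' Assumption: stored defaults arrive wrapped in parentheses.
        While expression.Length >= 2 AndAlso expression.StartsWith("(") AndAlso expression.EndsWith(")")
            expression = expression.Substring(1, expression.Length - 2).Trim()
        End While

        Dim netExpression As String = Nothing
        If SqlToNet.TryGetValue(expression, netExpression) Then Return netExpression

        ' Raw numeric values pass through unchanged.
        Dim number As Decimal
        If Decimal.TryParse(expression, NumberStyles.Number, CultureInfo.InvariantCulture, number) Then Return expression

        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(TranslateDefault("(getdate())")) ' DateTime.Now
        Console.WriteLine(TranslateDefault("(newid())"))   ' Guid.NewGuid()
        Console.WriteLine(TranslateDefault("((0))"))       ' 0
    End Sub
End Module
```

Adding another straight translation is then a one-line change to the table, which is the appeal of keeping it in a separate layer.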

I think validation should be stated in the business layer in rules. I wasn’t doing this five years ago so the whole process of incorporating validation from check constraints will be vastly simpler. Instead of code to code, you need to recognize a category – such as a bound range (the most important) – and parse out the bounds into a structure usable by a specific rule. Then another rule is “there’s a check constraint and I think you need to validate based on it, but I can’t write it so you need to.” The architecture could enforce some code being written in response to that rule. To state the change from five years ago, the metadata wouldn’t contain code but the statement of which rule and its parameters.
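
Recognizing the bound range category could be sketched like this. It is only an illustration under stated assumptions: `RangeRule` and `TryParseRange` are hypothetical names, only integer bounds are handled, and the constraint text is assumed to look like `([Age]>=(0) AND [Age]<=(120))`. Anything that does not match falls into the "you need to write it" category by returning Nothing.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CheckConstraintRules
    ' Hypothetical structure a range rule could consume.
    Public Class RangeRule
        Public Column As String
        Public Minimum As Integer
        Public Maximum As Integer
    End Class

    ' Matches [Col] >= (min) AND [Col] <= (max) on the same column.
    Private ReadOnly RangePattern As New Regex( _
        "\[(?<col>\w+)\]\s*>=\s*\(?(?<min>-?\d+)\)?\s+AND\s+\[\k<col>\]\s*<=\s*\(?(?<max>-?\d+)\)?", _
        RegexOptions.IgnoreCase)

    ' Returns the parsed range, or Nothing if the constraint is not a
    ' simple bound range.
    Function TryParseRange(ByVal definition As String) As RangeRule
        Dim m As Match = RangePattern.Match(definition)
        If Not m.Success Then Return Nothing

        Dim rule As New RangeRule()
        rule.Column = m.Groups("col").Value
        rule.Minimum = Integer.Parse(m.Groups("min").Value, CultureInfo.InvariantCulture)
        rule.Maximum = Integer.Parse(m.Groups("max").Value, CultureInfo.InvariantCulture)
        Return rule
    End Function

    Sub Main()
        Dim rule As RangeRule = TryParseRange("([Age]>=(0) AND [Age]<=(120))")
        Console.WriteLine(rule.Column & ": " & rule.Minimum & " to " & rule.Maximum)
    End Sub
End Module
```

The metadata would then carry the rule name and these parameters rather than any generated code.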

Validation in triggers would seem, at least to my weak TSQL mind, to be exceedingly difficult.

So, the basic answer to Mike’s question is “some, but not all, of the really important scenarios are covered, and I don’t think you’ll ever cover all scenarios.”

Template PreProcessor
Sun, Feb 17 2008 9:08

I better say it up front, because it will quickly become obvious. I am not a computer science graduate. I have never written a compiler. It was quite a route to get my thinking in line with this particular problem and I’m sure it will evolve further.

As I said in my post on Friday, one part of my solving the VB/C# problem without making unreadable templates is a preprocessor. I struggled with what to call it because the real pattern is – create the VB templates, run the processor to create the C# templates, execute the C# or VB templates. So is it really a preprocessor? I am still calling it that because it runs before the C# templates, so I’m thinking of it as an optional preprocessor.

The result is modified templates. A second set of source code and a second template assembly.

The first decision I faced was how much context I was going to demand for any decision. More context, more sophisticated decisions. You could attempt to build a full syntactic tool that understands the structure of your output code and knows a great deal about what you are accomplishing. This may or may not be possible, and will certainly require restrictions on what template code is legal because evaluating multiple paths will be a nightmare and stray strings can result in legal templates, but won’t provide the same evaluation. You may be able to; I’m choosing not to tackle that and decided on least possible context.

The absolute minimum of understanding about the template being converted is which of a finite set of states you are in. Possible states are:

  • Template logic (the code that runs the template)
  • Comments
  • Code blocks
  • Expressions
  • Conditional blocks within code blocks
  • Possibly additional states around declarations, for loops and using statements

My first attempt was line based. Faster, easier to recognize comments and nearly impossible to ever restructure the line wrapping correctly. Trust me, that route did not go well.

A week ago Friday, nearly in tears, I told my son Ben “Look, I told everybody I could do this, and Carl just posted that show. And I am doing it, except I think the bugs I am facing with end of line issues are not solvable.”

My brilliant son said “Why on earth are you doing it that way – do character by character.”

“What, rewrite the whole thing?” maybe I cried.

The rewrite actually went pretty well, painful as it was to abandon nearly completed code. It was made easier by the fact I really do not care about performance. This is a template translation. The converted templates will be compiled and blazingly fast. I can take a second or two a template to do the translation. Thus I can skip all that compiler theory that I never learned about managing buffers and look aheads and all that. A bit of brute force with the simplest possible RegEx.

I’m basically looking at the entire template as a string. I step character by character through the string doing a substring check starting at the current position. I avoid the dumbest of the .NET mistakes such as copying the substrings unnecessarily and I do concatenate via a string builder so performance doesn’t suck too badly. And I do restrict what I’m looking for to what makes sense in context. But I don’t worry that I am looking at the next handful of characters an excessive number of times.
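
The scanning approach described above might be sketched as follows. This is an illustration, not the actual preprocessor: `StartsAt` and `TranslateTokens` are hypothetical names, the token list is a stand-in, and word boundary checks are left out.

```vb
Imports System
Imports System.Text

Module CharacterScan
    ' Checks for a token at the current position without copying a substring.
    Function StartsAt(ByVal text As String, ByVal pos As Integer, ByVal token As String) As Boolean
        Return pos + token.Length <= text.Length AndAlso _
               String.CompareOrdinal(text, pos, token, 0, token.Length) = 0
    End Function

    ' Walks the text one character at a time; where a token starts at the
    ' current position its replacement is appended, otherwise the single
    ' character is copied through.
    Function TranslateTokens(ByVal text As String, ByVal tokens As String(), ByVal replacements As String()) As String
        Dim output As New StringBuilder(text.Length)
        Dim pos As Integer = 0
        While pos < text.Length
            Dim matched As Boolean = False
            For i As Integer = 0 To tokens.Length - 1
                If StartsAt(text, pos, tokens(i)) Then
                    output.Append(replacements(i))
                    pos += tokens(i).Length
                    matched = True
                    Exit For
                End If
            Next
            If Not matched Then
                output.Append(text(pos))
                pos += 1
            End If
        End While
        Return output.ToString()
    End Function

    Sub Main()
        Dim tokens As String() = {"Me."}
        Dim replacements As String() = {"this."}
        ' Prints: this.Count = this.Count + 1
        Console.WriteLine(TranslateTokens("Me.Count = Me.Count + 1", tokens, replacements))
    End Sub
End Module
```

Every position is compared against every token, which is the deliberate brute force trade: simple code in exchange for repeated looks at the same few characters.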

I start off in the template logic. I output the template logic character by character until I find a character sequence that indicates a new mode. I’m keeping this simple by managing both the modes and the required stack via the call stack. Meaning, when I shift into a new mode, such as the Comment mode I just call a method called TranslateComment. Comments are easy - just change the start character and read to the end of the line for output. I need comments treated differently because a code block in a comment should not be translated.
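
Managing modes through the call stack could look like the sketch below. It assumes the comments being converted are VB comments in output code, whose start character changes from `'` to `//`; `TranslateOutputCode` is a hypothetical name, and string literals containing an apostrophe are not handled. The position is passed ByRef so that each mode method leaves it where the caller should resume.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module ModeByCallStack
    Function TranslateOutputCode(ByVal code As String) As String
        Dim output As New StringBuilder(code.Length)
        Dim pos As Integer = 0
        While pos < code.Length
            If code(pos) = "'"c Then
                ' Entering comment mode is just a method call.
                TranslateComment(code, pos, output)
            ElseIf String.CompareOrdinal(code, pos, "Me.", 0, 3) = 0 Then
                output.Append("this.")
                pos += 3
            Else
                output.Append(code(pos))
                pos += 1
            End If
        End While
        Return output.ToString()
    End Function

    ' Comment mode: change the start character, then copy the rest of the
    ' line untouched so nothing inside the comment gets translated.
    Private Sub TranslateComment(ByVal code As String, ByRef pos As Integer, ByVal output As StringBuilder)
        output.Append("//")
        pos += 1 ' skip the VB comment character
        While pos < code.Length AndAlso code(pos) <> ControlChars.Cr AndAlso code(pos) <> ControlChars.Lf
            output.Append(code(pos))
            pos += 1
        End While
    End Sub

    Sub Main()
        ' Prints: this.Save() // calls Me.Save
        Console.WriteLine(TranslateOutputCode("Me.Save() ' calls Me.Save"))
    End Sub
End Module
```

Returning from `TranslateComment` is what pops the mode, so no explicit state stack is needed.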

For now, I’m making the restriction that code blocks – blocks to output – must be exactly <code>stuff</code>. This makes parsing a bit easier than allowing any element name. If I’m in template logic and hit a code block, I know I need to start translating. I start looking for sequences that need conversion: Me as a word, If, For Each, End If, Next, etc. This list is pretty short right now; I expect the preprocessor to evolve.

If I’m in a code block and I hit an embedded expression (<%= ) I switch back to template logic mode. This is not precise but it’s close enough. Characters are output exactly until I hit another code block because this is template logic, not output code. If you concatenate strings in there, you’re toast, but you can call methods that are in the VB/C# namespaces.

There are some special cases around code constructs. I recognize an If block by searching for the Then and taking what’s between as a code expression that needs translation. Wherever I’m translating expressions I just use a simple replacement because it’s really just separate symbols.
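
The If handling can be illustrated with a short sketch. `TranslateIfLine` and `TranslateExpression` are hypothetical names, the operator list is a small stand-in, and the sketch assumes the whole `If ... Then` sits on one line with spaces around each operator.

```vb
Imports System

Module IfTranslation
    ' Simple symbol-for-symbol replacement; assumes operators are
    ' surrounded by single spaces.
    Function TranslateExpression(ByVal expression As String) As String
        Dim result As String = expression
        result = result.Replace(" <> ", " != ")
        result = result.Replace(" = ", " == ")
        result = result.Replace(" AndAlso ", " && ")
        result = result.Replace(" OrElse ", " || ")
        Return result
    End Function

    ' Finds the Then, treats everything between If and Then as the
    ' expression, and emits the C# opening of the block.
    Function TranslateIfLine(ByVal vbLine As String) As String
        Dim trimmed As String = vbLine.Trim()
        If Not trimmed.StartsWith("If ", StringComparison.Ordinal) Then Return vbLine
        Dim thenIndex As Integer = trimmed.LastIndexOf(" Then", StringComparison.Ordinal)
        If thenIndex < 3 Then Return vbLine
        Dim condition As String = trimmed.Substring(3, thenIndex - 3)
        Return "if (" & TranslateExpression(condition) & ") {"
    End Function

    Sub Main()
        ' Prints: if (mName != value) {
        Console.WriteLine(TranslateIfLine("If mName <> value Then"))
        ' Prints: if (count == 0 && mName != value) {
        Console.WriteLine(TranslateIfLine("If count = 0 AndAlso mName <> value Then"))
    End Sub
End Module
```

The matching `End If` becomes a close bracket through the ordinary token replacement, which is why the output stays symmetric.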

The preprocessor is simple and focused on what’s actually needed, not boiling the ocean. It will evolve as far as it needs to, staying well shy of both the power and usability issues of the CodeDOM – we just don’t need that for business templates in VB and C#.

Whew! I could write tons more on glitchy little details of this preprocessor that’s really eaten my last couple of weeks. It’s one of the pieces I want to get Open Source early on.

Catching up on Blogs – Conceptual Space
Fri, Feb 15 2008 10:53

I’ve been catching up on blogs and ran across this from Zlatko from Dec. 14.

His basic point is that EF is more than an OR/M mapper because it works in a conceptual space between the object layer and the database – creating a third layer.

I’m very happy that Zlatko said this. It articulates something I’ve never articulated well. The metadata is not a representation of the object layer – it is a way of thinking described in metadata that can be thought of as entities, or abstractions, or something else rather vague and fluffy – see I have problems explaining it.

Entity Framework pins down this previously mind based abstraction. It’s a subtle shift but it exposes how we think about objects, and now gives us a word for it – the conceptual model. It takes what was previously a mind cloud that we shared by implication from metadata definitions and makes it something we can visualize in a drawing.

Unless I’m entirely missing the point though, I do not buy that the existence of this layer is new. I think most or all of us that do metadata based code generation have been doing this for years.

But it is not trivial and it is important to articulate and create a visualizer for something that we’ve just been doing between our ears by implication. It’s part of what makes the implications of EF for metadata for all code generation significant.

The EF conceptual and metadata layers are important even though its current incarnation comes up a bit short in richness and in ease of access. We can fix both these with some effort – I’m loving the moment in time we’re living in and just wishing I had twice as much time to work each day.

I Hate it When I Learn from Dilbert
Fri, Feb 15 2008 8:50

Do we all live in fear of that moment when we notice that we’re the one on the other side of Dilbert? When Dilbert is wise and, well, we’re not.

Two weeks ago I was writing a long paper explaining some nuances about the state of the templates at that time and asking my client not to reject it until he had looked into it and really understood it. So, in the next morning’s Dilbert strip someone comes to Dilbert and says “I’ll tell you my idea if you promise not to reject it until thinking about it” and Dilbert says “I already rejected it because only putrid ideas come with warnings”

So I spend the better part of the weekend rationalizing that my idea really doesn’t fall into that category.

And then I spring out of bed at 6AM Monday morning (I sort of wish that part was a joke) with the solution. So, let’s look at the problem today and the solution in the next post:

Yesterday’s code was:

 

Private Function MemberGetPrimaryKey() As String
      Return OutputFunction( _
                           Symbols.Method.GetPrimaryKey, _
                            Scope.Protected, _
                            MemberModifiers.Overrides, _
                            ObjectData.PrimaryKey.NetType, _
Function() _
<code>
   Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function

 

It’s easy enough to replace the Return keyword with a constant. I put these constants in a class and imported the class, which allowed me to directly access the constant, although it was in a different file:

 

Private Function MemberGetPrimaryKey() As String
      Return OutputFunction( _
                           Symbols.Method.GetPrimaryKey, _
                            Scope.Protected, _
                            MemberModifiers.Overrides, _
                            ObjectData.PrimaryKey.NetType, _
Function() _
<code>
   <%= returnString %> m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function

 

How bad is that?

But as Bill pointed out in comments on the last post, if all I’m doing is returning a value, I don’t need the code block at all and can teach the OutputFunction method to do the job. So, switching to a more complex and common example, and remembering that I’m out to solve the C#/VB single template problem to allow a single template for any architecture, I took these concepts a few steps further. The result of a more complex method becomes:

 

   Private Function MemberPropertyAccessSet(ByVal propertyData As IPropertyData) As String
         Return _
<code>
         CanWriteProperty("<%= propertyData.Name %>", true)
         <%= OutputConditional("m" & propertyData.Name & " <> value", _
            Function() _
            <code>
            m<%= propertyData.Name %> = value
            PropertyHasChanged("<%= propertyData.Name %>")
            </code>.Value) %>
</code>.Value
   End Function

 

Which for a single language is the same as merely doing:

 

 Private Function MemberPropertyAccessSet(ByVal propertyData As IPropertyData) As String
         Return _
<code>
         CanWriteProperty("<%= propertyData.Name %>", true)
         If m<%= propertyData.Name %> &lt;> value Then
            m<%= propertyData.Name %> = value
            PropertyHasChanged("<%= propertyData.Name %>")
        End If
</code>.Value
End Function

 

Which would you rather debug? Imagine debugging the templates for an even more complex routine.

This spawned my Dilbert moment. If I have to convince you of the wisdom of this, then maybe it’s not so wise. So, what I jumped out of bed to do was build a preprocessor – converting templates that output VB into templates that output C#. By removing that technical restriction, we can find the sweet spot between reducing typos and obfuscating the code logic. I think that is about where the first and last code fragments in this post are. What do you think?

Avoiding Typos via Output Methods
Thu, Feb 14 2008 12:14

One of the issues with the code generation templates is that they do not test the syntax of the output as you type. I’m a VB coder, and that would be my fantasy, an editor that told me whether my templates produced valid output as I type. That’s nearly impossible to do, so don’t hold your breath.

In the meantime, you may have code like the following:

 

Private Function MemberGetPrimaryKey2() As String
      Return _
<code>
   Protected Overrides Function GetPrimaryKey() as <%= ObjectData.PrimaryKey.NetType %>
      Return m<%= ObjectData.PrimaryKeys(0).Name %>
   End Function
</code>.Value
End Function

 

Any typos between the <code> elements will result in dozens or hundreds of compiler errors when the output code is incorporated in your project. This is a pain in the neck to deal with, so anything we can do to have fewer typos is desirable.

When you create a UI for your users, you limit the number of mistakes the user can make via techniques like combo boxes. We can take advantage of Visual Studio’s editor to do a similar thing.

Your output code has logic within subroutines, functions and properties. While this code is trivial in the example above – just a return statement – your code will generally involve more complex logic. It’s important that you see this logic to evaluate it as you’re maintaining templates. The actual function declaration however, is not logic.

I created methods to output the enclosing declarations, as well as other non-logic based structures. This transforms the code above into:

 

Private Function MemberGetPrimaryKey() As String
      Return OutputFunction( _
                           "GetPrimaryKey", _
                            Scope.Protected, _
                            MemberModifiers.Overrides, _
                            ObjectData.PrimaryKey.NetType, _
Function() _
<code>
   Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function

 

I’ve spread this out for clarity.

You can still typo the name of the method and the word Return. While you can make a bad selection, you cannot make a typo in anything else. And the parameters of the output function remind you of the types of modifiers that make sense. Under the covers, the OutputFunction creates a FunctionInfo object. While I’m not using it here, the OutputFunction method accepts a ParamArray of ParameterInfo objects if your function needs parameters. Again, you can typo symbol names, but nothing else. Of course since the OutputFunction is within the code active in the IDE, you get full Intellisense, information blocks, background compilation and all the good stuff.

I’m using a lambda expression. In this case, it creates an in line delegate used to output the function body. If this template method becomes unduly complex, you could also use VB’s AddressOf operator to call a separate method as a delegate. In this case, the delegate signature I expect has no parameters and returns a string. Since the <code>…</code>.Value returns a string, it’s an effective delegate.
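
The two ways of supplying the body delegate can be compared in a small sketch. The `OutputFunction` here is a much simplified, hypothetical stand-in (fixed scope and modifiers, no FunctionInfo, plain strings instead of code blocks); only the delegate shape, no parameters and a String return, follows the description above.

```vb
Imports System

Module DelegateBodies
    ' Hypothetical, simplified stand-in for the OutputFunction helper.
    ' The body delegate takes no parameters and returns a string.
    Function OutputFunction(ByVal name As String, ByVal returnType As String, _
                            ByVal body As Func(Of String)) As String
        Return "Protected Overrides Function " & name & "() As " & returnType & Environment.NewLine & _
               "   " & body() & Environment.NewLine & _
               "End Function"
    End Function

    ' A separate method with the same signature, usable via AddressOf.
    Function PrimaryKeyBody() As String
        Return "Return mCustomerId"
    End Function

    Sub Main()
        Dim viaLambda As String = OutputFunction("GetPrimaryKey", "Integer", _
                                                 Function() "Return mCustomerId")
        Dim viaAddressOf As String = OutputFunction("GetPrimaryKey", "Integer", _
                                                    AddressOf PrimaryKeyBody)
        Console.WriteLine(viaLambda)
        Console.WriteLine(viaLambda = viaAddressOf) ' True
    End Sub
End Module
```

Both calls produce the same output; the AddressOf form simply moves a long body out of the argument list.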

The FunctionInfo object includes an attribute collection. Thus, any attribute you desire to place on the function can be assigned by explicitly instantiating the function info object, rather than using the helper function.

This is quite similar to the OutputClass and OutputRegion methods I’ve shown earlier, but it takes the idea of using explicit method calls in the template further to reduce the opportunity for typos in the output.

Output symbol typos are a problem, and you can avoid this through an enum or constants. You’ll use some of these constants across many templates and there will be a lot of them across your templates so I’d suggest you keep things clean by creating classes that contain your symbols. I created a namespace called “Symbols” and classes for Type, Method, Interface, etc. This gives nice clean Intellisense and makes it easier to find symbols in the constant list. Thus the code above becomes:

 

Private Function MemberGetPrimaryKey() As String
      Return OutputFunction( _
                           Symbols.Method.GetPrimaryKey, _
                            Scope.Protected, _
                            MemberModifiers.Overrides, _
                            ObjectData.PrimaryKey.NetType, _
Function() _
<code>
   Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function

 

That leaves “Return” as the only remaining opportunity for a typo – which is the subject of tomorrow’s post.

Runtime vs. Design/Compile Time
Thu, Feb 14 2008 11:21

Chris asks:

At what point with code gen / templating do you start to think about doing all this codegen at runtime instead of compile time?

And if we were to be doing it at runtime would be be better served by using a dynamic language such as ruby to program in?

That's a good point. In a perfect world, there would be no need for code generation. We would write nothing but business/domain specific code and everything else would just happen. But for well over twenty years we've been aiming for that perfect world and we seem only a few baby steps closer than in 1987.

In an imperfect world, we have two basic choices, manage an architecture and run a lot of code at design time or manage an architecture and run a lot of code at runtime. Both require a fair amount of configuration.

And both can offer the real long term benefit of switching away from coding code – which is transitory and dying before you even finish coding. We want to switch away from coding code and toward creating metadata which is a true business representation. Of course metadata changes. But it changes at the speed the business changes – not due to artificial technology shifts.

For my effort – I want the extra code run at design time to offer the best possible runtime behavior, including performance. I can extend an architecture I’m expressing in templates far easier than I can extend an architecture I am expressing via an OR/M tool. I also want to debug directly into code specific to the problem at hand – I want to debug through generated code, not an OR/M engine.

The next round of effort at making plumbing simpler from Microsoft is also code generation – Entity Framework. When someone gets the plumbing correct and we truly never need to care, we can turn off the code gen and go direct to whatever structure, however it’s done, and completely ignore the problems code gen is primarily used for today. In the long term, we shouldn’t care about anything except the business problem we’re solving.

Dynamic languages certainly change the architecture. I'm looking forward to exploring them as the overall knowledge base expands. But we've been sitting with a pretty dynamic language in our laps for many years, the majority of us have programmed in it, and with the exception of precisely one person - everyone I know varies between mild distaste and downright hatred for it. I think a lot of Javascript's problems have been related to debugging and platform issues, and I realize that there are differences between Ruby/Python and Javascript. However, if we didn't fall in love with the dynamic aspects of Javascript in the last ten years, I remain slightly skeptical about dynamic languages in the next few years. Some things are different this time, and our attitudes and skills may change as a result: the new languages have a broader platform base, and we're increasing our understanding of how to actually use them as opposed to hacking them enough to solve some trivial website detail.

But at the core, the technique for expressing metadata into a working application is not half as important as metadata at the core of the application, however it's expressed.

Comments Fixed
Thu, Feb 14 2008 10:41

For someone that writes software for a living, I have a remarkably hard time using it. I would not have expected "Filter: Ignore" to display no new comments. Ignoring a filter would be more like showing everything.

Happily I have friends that are as patient as I am confused. Thanks to Bill McCarthy and Susan Bradley (who reset my password which was lost in the bowels of my system and I wanted to switch to Live Writer) my blog is slightly more functional.

My apologies to the folks that wrote comments that I seemingly ignored for the last several weeks. They should be fixed, and please let me know if you have difficulties.

I'm still approving anonymous comments so that will sometimes take a while. Non-anonymous comments go live immediately. Unless I start getting spammed too badly.

by Kathleen | 1 comment(s)
Open Source
Wed, Feb 13 2008 9:31

It’s occurred to me that if you are following this and my DNR TV show a logical reaction would be “OK, so that’s a lot of hot air, where do I get it?” I intend for all of this to be released Open Source, on whichever site is hot when I release it. I hope I’ll start releasing pieces in just a matter of weeks. It will help a lot if it becomes “we” instead of “me”. So, if you’re interested in this stuff and you want to help, let me know and you can get this stuff directly, as it’s evolving too rapidly to post publicly right now.

My current expectation of the order of release:

  • Metadata interfaces
  • GenDotNet metadata load
  • EDMX metadata load
  • Template infrastructure – what’s needed for the language neutral templates
  • Simple harness
  • Activity based harness

If you’re interested, you can contact me directly at Kathleen@mvps.org

Isolating Metadata
Wed, Feb 13 2008 9:26

In code generation, metadata is the information about your application, generally about your database and the definitions that express your data as business objects. If you use Entity Framework, your metadata is the edmx file which is displayed via the designers. If you’re using CodeSmith, the metadata is more subtle. Metadata can also be about the process itself. CodeBreeze in particular has a very rich and extensible set of information about your application.

Since metadata itself is data – information - we can store it many ways. I’ve used XML for years. CodeSmith has used a couple of mechanisms including XML. Entity Framework uses XML. Metadata can also come directly from a database, although I think this is a remarkably bad idea and one of my code generation principles is not to do that – you need a level of indirection and isolation surrounding your database.

What I haven’t talked about before is how valuable it is to have another layer of indirection between your metadata storage structure – your XML schema – and your templates. In my XSLT templates I could provide this only through a common schema – you can morph your XML into my schema so that’s indirection – right?

No, that’s not really indirection. It’s great to be back in .NET classes with real tools for isolation and abstraction. Now I use a set of interfaces for common metadata constructs such as objects, properties and criteria. I can then offer any number of sets of metadata wrappers that implement these interfaces via a factory.

 

[Figure: MetadataIsolation]

 

The template programs only against the interfaces. The template couldn’t care less whether I am using Entity Framework, my own metadata tools, or something entirely different. I can write the same template and use it against Entity Framework’s edmx file or any other metadata format. That’s powerful stuff. Especially since you already heard that the template will output C# or VB. That means in my world the only reason to have more than one set of templates against an architecture like CSLA is that they are pushing the boundaries and actually doing different things.

But if you don’t like this new templating style, you can use classes based on exactly the same interfaces in CodeSmith (at least) and again free your framework and metadata extraction. You’ll still need VB/C# versions there, but your metadata input can use the same interfaces.

The interfaces are implemented by sets of classes that know how to load themselves from a data source. Each set uses a different metadata source: a different XML structure or some other format.
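A minimal sketch of this kind of isolation might look like the following. The interface names (IObjectData, IPropertyData), their members, the in-memory wrapper classes and the MetadataFactory module are all assumptions made for illustration; the post does not show the real interfaces. The point is only that the template function sees nothing but the interfaces.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical contracts; the actual interfaces are not shown in the post.
Public Interface IPropertyData
    ReadOnly Property Name As String
    ReadOnly Property MaxLength As Integer
End Interface

Public Interface IObjectData
    ReadOnly Property ClassName As String
    ReadOnly Property Properties As IEnumerable(Of IPropertyData)
End Interface

' One set of wrappers. Another set (for example one that reads an edmx
' file) would implement the same two interfaces.
Public Class InMemoryPropertyData
    Implements IPropertyData

    Public Sub New(name As String, maxLength As Integer)
        Me.Name = name
        Me.MaxLength = maxLength
    End Sub

    Public ReadOnly Property Name As String Implements IPropertyData.Name
    Public ReadOnly Property MaxLength As Integer Implements IPropertyData.MaxLength
End Class

Public Class InMemoryObjectData
    Implements IObjectData

    Public Sub New(className As String, properties As IEnumerable(Of IPropertyData))
        Me.ClassName = className
        Me.Properties = properties
    End Sub

    Public ReadOnly Property ClassName As String Implements IObjectData.ClassName
    Public ReadOnly Property Properties As IEnumerable(Of IPropertyData) Implements IObjectData.Properties
End Class

' Hypothetical factory: the only place that knows which wrapper set is used.
Public Module MetadataFactory
    Public Function Load(source As String) As IObjectData
        Select Case source
            Case "sample"
                Dim props As IPropertyData() = {New InMemoryPropertyData("FirstName", 50)}
                Return New InMemoryObjectData("Customer", props)
            Case Else
                Throw New ArgumentException("Unknown metadata source: " & source)
        End Select
    End Function
End Module

Module Program
    ' Stands in for a template: it depends only on the interfaces.
    Function DescribeFields(objectData As IObjectData) As String
        Dim lines = From prop In objectData.Properties
                    Select "Private m" & prop.Name & " As String ' max length " & prop.MaxLength.ToString()
        Return String.Join(Environment.NewLine, lines)
    End Function

    Sub Main()
        Console.WriteLine(DescribeFields(MetadataFactory.Load("sample")))
    End Sub
End Module
```

Swapping the metadata source then means adding a wrapper set and a factory case; DescribeFields does not change.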

Isolated metadata removes your templates from caring what the metadata source is – beyond being something that could successfully fill a set of classes that implement the data interfaces. This is a very important step and one we need to work together to get right. What do you think I've left out of the current design?

Minnesota VSTS User Group - Code Generation in 2008
Wed, Feb 13 2008 8:57

I'm stepping out of my comfy sandbox again. Not only am I going to Minnesota in February, but I'm also talking to a VSTS user group - to remember that it's not just about code.

You can get more information here.

I am really looking forward to this talk and sharing perspectives with people that I'm hoping connect on some of the process questions about code generation. While I dropped the ball on updating the abstract, this talk won't just be templating techniques - I'll spend a good bit of time on the research I've been doing moving to an activity metaphor for code generation - rising above the simple process declarations. This is very fun stuff and I'm excited to find a group interested in sharing it. If you're in the area, please come by.

by Kathleen | with no comments
New Jersey SQL Server User Group - SQL Server Stored Procedure Code Generation in 2008
Tue, Feb 12 2008 9:55

If you can make it to Parsippany, New Jersey on Tuesday Feb. 19, I'll be talking about code generation. This is a SQL group that's been kind enough to host a hard core .NET geek. The common ground is stored procedures and making them easier to write, and especially to maintain. They've said they'll be gentle, and I hope to learn a thing or two. It's always nice to step out of my comfort zone, and I think I can offer a great night.

So, if you live in the area, I'll see you at the meeting!

by Kathleen | with no comments
CurrentTypeInfo and the Context Stack
Tue, Feb 12 2008 9:46

Creating templates requires a lot of access to the thing you’re currently creating. That’s the current output type, which, as I discussed yesterday, I suffix with “Info.” The CurrentTypeInfo is thus what you’re currently outputting.

I neglected to clarify in that post that the Data and the Info classes are in entirely different assemblies. The Data classes could be used with any .NET generation mechanism, including (at least) XML Literal code generation and CodeSmith. The Info objects are relatively specific to my style of code generation.

The CurrentTypeInfo may not be the same throughout the file.

There are a few reasons to combine multiple classes or enums in a file. In some cases, that’s to nest them, and in some cases it’s just to logically combine them in a file. While FxCop has declared them unfashionable, I find nested classes tremendously valuable, especially for organizing Intellisense and keeping well structured classes and naming. If you’re working with nested classes, there is a good chance you’ll need to access not only the current type, but also the encapsulating type. I use a stack for this, and give control of pushing and popping TypeInfo objects from the stack to the templates themselves.

[Figure: TemplateInheritanceDotNetBase3]

The base item on the stack is the current outermost class in the current file. Once you’ve pulled a class off the stack, such as by popping the base and adding a new base TypeInfo, you can’t access the previous version unless you’ve saved it.

Here’s where you see the flaw I mentioned yesterday – these classes could be in separate namespaces, and I don’t allow for that – yet. I’ll fix it later.

Remember the TypeInfo is the thing you’re outputting. The entity definition it’s built from is ObjectData in my semantics.

The stack is an extremely useful construct for this scenario. You have quick access to information about the class you’re currently outputting. You’ll frequently need this for type information and perhaps calling shared/static methods. You don’t want to recreate its name every time you use it because that would be redundant and hard to maintain. You can also access any of the containing classes, which again is useful in defining types and calling shared/static methods.
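As a rough sketch of the idea, a context stack can be a thin wrapper over Stack(Of T). The TypeInfo class here is a one-property stand-in and TypeContext and QualifiedName are names invented for this example; the real classes are not shown in the post.

```vb
Imports System
Imports System.Collections.Generic

' Hypothetical stand-in for the TypeInfo described in the post.
Public Class TypeInfo
    Public Sub New(name As String)
        Me.Name = name
    End Sub

    Public ReadOnly Property Name As String
End Class

' Hypothetical holder for the stack; templates push and pop explicitly.
Public Class TypeContext
    Private ReadOnly mStack As New Stack(Of TypeInfo)

    Public Sub Push(info As TypeInfo)
        mStack.Push(info)
    End Sub

    Public Function Pop() As TypeInfo
        Return mStack.Pop()
    End Function

    ' The type currently being output is the top of the stack.
    Public ReadOnly Property CurrentTypeInfo As TypeInfo
        Get
            Return mStack.Peek()
        End Get
    End Property

    ' Name including containing types, outermost first.
    Public ReadOnly Property QualifiedName As String
        Get
            Dim names As New List(Of String)
            ' Stack(Of T) enumerates from the top (innermost type) down.
            For Each info In mStack
                names.Insert(0, info.Name)
            Next
            Return String.Join(".", names)
        End Get
    End Property
End Class

Module Program
    Sub Main()
        Dim context As New TypeContext()
        context.Push(New TypeInfo("Customer"))    ' outermost class in the file
        context.Push(New TypeInfo("Criteria"))    ' nested class

        Console.WriteLine(context.CurrentTypeInfo.Name)   ' Criteria
        Console.WriteLine(context.QualifiedName)          ' Customer.Criteria

        context.Pop()                                     ' done with the nested class
        Console.WriteLine(context.CurrentTypeInfo.Name)   ' Customer
    End Sub
End Module
```

Because the name comes from the stack each time, the template never rebuilds it by hand, which is the redundancy the paragraph above warns about.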

While I haven’t done this in CodeSmith, I expect this technique to be viable there. I’m not sure on other platforms, but it’s not specific to XML literal code generation.

Two Parallel Entities - Metadata and Info
Mon, Feb 11 2008 11:08

One of the confusing things about templating is that you are writing two programs simultaneously and there is no way around it. My brilliant son may write a templating language for a class project, and this is exactly what he wants to address – that and the issue of escaping. You can’t avoid it, he just thinks they should look different – I’m hoping he doesn’t write one in Magyar.

One of the programs you’re writing (or maintaining) is the template program. There is logic in any non-trivial template, regardless of the template style. The other program is the one you’re outputting that will eventually run on your client’s computer. In ASP.NET style templates, the output program is the overall template, and the logic is inserted. In XSLT and XML literal generation, the logic of the template program is the overall code and the output template is nested inside.

You are also working with two sets of data – the metadata you’re using as source and what you’re currently outputting. I call the elements in the XML I use as source metadata for .NET templates Object and Property for clarity in the XML. If I called them that in the templates, I’d encounter great confusion between the object definition in metadata and the class I am outputting. That’s particularly painful when it comes to properties.

I solve this by suffixing all metadata classes with the word “Data”. Thus I have ObjectData, PropertyData, CriteriaData, etc. Each of these classes contains the metadata on the corresponding item. I might have an ObjectData for a Customer that had a PropertyData for FirstName. The PropertyData would include things like the maximum length of the field and its caption.

I also need to describe the output – at least the way I’m building templates. I need information about the file I’m creating, the type I’m creating, and in certain locations in the template, the property, function, constructor, etc. I’m creating. I identify these by suffixing them with the word “Info”. Thus I have a ClassInfo, PropertyInfo, FunctionInfo, etc. I do not have a CriteriaInfo because I am never outputting a .NET thing called a Criteria. It’s strictly metadata for .NET features.
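To make the naming split concrete, here is a small sketch with one class on each side. The members of PropertyData and PropertyInfo and the CreateInfo function are assumptions for this example; only the suffix convention and the “m” field prefix come from the posts.

```vb
Imports System

' Input side: metadata about the entity (hypothetical members).
Public Class PropertyData
    Public Property Name As String
    Public Property MaxLength As Integer
    Public Property Caption As String
End Class

' Output side: the .NET property currently being emitted (hypothetical members).
Public Class PropertyInfo
    Public Property Name As String
    Public Property FieldName As String
    Public Property TypeName As String
End Class

Module Program
    ' Hypothetical mapping from the metadata to the thing being output.
    Function CreateInfo(data As PropertyData) As PropertyInfo
        Return New PropertyInfo With {
            .Name = data.Name,
            .FieldName = "m" & data.Name,
            .TypeName = "String"}
    End Function

    Sub Main()
        Dim data As New PropertyData With {
            .Name = "FirstName", .MaxLength = 50, .Caption = "First name"}
        Dim info = CreateInfo(data)

        ' Prints: Private mFirstName As String
        Console.WriteLine("Private " & info.FieldName & " As " & info.TypeName)
    End Sub
End Module
```

With the suffixes in place, a line of template code that mentions both sides stays readable: anything ending in Data is input, anything ending in Info is output.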

In the template design I’m describing in this series, the DotNetBase class contains information on the file being output. To be picky, the namespace can actually be associated with part of a file, and I may add this flexibility, but I don’t do that very often in business code, so I have included namespaces at the file level.

[Figure: TemplateInheritanceDotNetBase2]

Regardless of the template mechanism you use, you need to maintain a clear separation between the template logic and the output logic, and between the input metadata and state about what you’re creating.

Next post: CurrentTypeInfo and the Context Stack

Inheritance and Templates
Mon, Feb 11 2008 10:25

One of the most valuable things about templates written in .NET code of any style is the ability to use inheritance. This is classic inheritance where local state allows you to push functionality into base class methods that would become cumbersome and have excess dependencies if treated as external utility methods.

Inheritance also allows polymorphism, which is nothing more than a big word for exchangeability – OK, that’s actually a bigger word. Templates are called by something. Life is simpler and harnesses are practical if that something is known and predictable, thus it’s logical to have an interface in a common location as a contract for the template. If the base class implements this interface and provides functionality via MustInherit/abstract members, there’s further capacity for evolution.

[Figure: TemplateInheritanceHierarchy]

There are three assemblies involved. The template interface is in a common assembly available to all other assemblies. The base classes are in a support assembly which can be reused across many template sets. The templates themselves are in a separate assembly. For now, I have all templates associated with a project in one assembly, but I may change that to isolate the SQL and the .NET templates. In any case, these template assemblies are sets of templates that work together.
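The three layers can be sketched in a single file as follows. ITemplate, TemplateBase and EmptyClassTemplate are hypothetical names, and in practice each would live in its own assembly as described above; GenerateFile matches the override shown in the .NET Template Organization post. The sketch assumes a project that references System.Xml.Linq, which VB XML literals require.

```vb
Imports System
Imports System.Xml.Linq

' Common assembly: the contract a harness programs against (hypothetical name).
Public Interface ITemplate
    Function Generate() As String
End Interface

' Support assembly: a base class that holds state and defers the output
' to the leaf template through a MustOverride member.
Public MustInherit Class TemplateBase
    Implements ITemplate

    Protected ReadOnly mClassName As String

    Protected Sub New(className As String)
        mClassName = className
    End Sub

    Public Function Generate() As String Implements ITemplate.Generate
        Return GenerateFile()
    End Function

    Protected MustOverride Function GenerateFile() As String
End Class

' Template assembly: one leaf template per output class.
Public Class EmptyClassTemplate
    Inherits TemplateBase

    Public Sub New(className As String)
        MyBase.New(className)
    End Sub

    Protected Overrides Function GenerateFile() As String
        Return _
<code>
Public Class <%= mClassName %>
End Class
</code>.Value
    End Function
End Class

Module Program
    Sub Main()
        ' The harness only knows about ITemplate.
        Dim template As ITemplate = New EmptyClassTemplate("Customer")
        Console.WriteLine(template.Generate())
    End Sub
End Module
```

A harness written against ITemplate can run any template set without referencing the support or template assemblies at compile time.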

Inheritance lets you push common functionality into the base class. Now that we have extension methods, they offer an additional mechanism to remove common functionality from the leaf class. I know Scott Hanselman has said he’ll cut off the pinkies of anyone who uses extension methods on classes you own, and I’ll show you when I get to that in a few days why I disagree (although it’s in the DNR TV show). But that explains why my base classes are focused on state more than functionality. The key state issue is what you’re currently working on. In a template, state is metadata. In a .NET template, the important metadata is the definition for the entity you’re currently creating. In my templates, the ObjectData member of the DotNetBase contains this information.

[Figure: TemplateInheritanceDotNetBase]

Stay tuned for the distinction between the input data you’re working on and the thing you’re creating, which explains some of those other members.

.NET Template Organization
Sun, Feb 10 2008 17:55

 

So, now that you know where I’m going with this – language neutral templates – I want to step back to the basics. Even if you don’t want to build language neutral templates, there are things to learn along the way about making good XML literal templates, and why this approach might be better than your current code gen mechanism. Note, the examples in this post are NOT language neutral. Not surprisingly, there are specific requirements for language neutral templates and I want to show basic templating with XML literals, then the language neutral templates.

If you look at the XML literal sample I posted a few days ago you’ll notice that the entire contents of the output DataPortalFetch method is created within the template method “MemberDataPortalFetch.” This provides important organization, as the most common challenging task in code generation is “I have a problem in my output code right here, how do I find where that is in the template?” In a simple template this isn’t too hard, but in more complex templates such as CSLA it can be quite challenging.

Creating templates with an extremely predictable structure is very valuable in creating maintainable templates. To start with, each template should have a one to one correspondence with an output class.

Within this class, the mechanism of the predictable structure is one of the primary differences between templates that are code with template segments (XML literal code generation) and templates with code segments (ASP.NET style templates). In an ASP.NET template, the template must parallel the output, and this organizes the template. In an XML literal template, you provide the organization, with Visual Studio then providing navigation. Code is organized by regions, nested classes and members. Each is its own method, with a name that starts with Region, NestedClass or Member. This organizes the template. The entry points create a hierarchy working down to the local code (navigation mechanism #1):

   Protected Overrides Function GenerateFile() As String
      Return _
<code>
   &lt;Serializable()> _
Public Class <%= mObjectData.ClassName %>
   Inherits BusinessBase(Of <%= mObjectData.ClassName %>)
   <%= RegionBusinessMethods() %>
   <%= RegionValidationRules() %>
   <%= RegionAuthorizationRules() %>
   <%= RegionFactoryMethods() %>
   <%= RegionDataAccess() %>
   <%= RegionExists() %>
End Class
</code>.Value
   End Function

And

#Region "Business Methods"
   Private Function RegionBusinessMethods() As String
      Return _
<code>
#Region " Business Methods "

   <%= From prop In mObjectData.Properties Select MemberPropertyFields(prop) %>
   <%= From child In mObjectData.Children Select MemberChildFields(child) %>
   <%= From prop In mObjectData.Properties Select MemberPropertyAccess(prop) %>
   <%= From child In mObjectData.Children Select MemberChildAccess(child) %>
   <%= MemberIsValid() %>
   <%= MemberIsDirty() %>
   <%= MemberGetId() %>

#End Region
</code>.Value

   End Function

The carefully named methods also allow you to use alphabetical tools in Visual Studio, including the combo box in the upper right of the editor and a class diagram (navigation mechanism #2). This works only because you can predict the name of the member you want from the name of the output member, and no other members begin with “Member”, “Region” or “NestedClass”.

The third navigation approach is that the template matches the structure of the output as closely as practical. Thus if the output has a region named “Business Methods” the template does as well. The order of items in the template is top to bottom, closely paralleling the order of the output. This allows you to cruise down in the file.

Template organization is the first step to great templates.

The Punch Line
Sun, Feb 10 2008 17:21

 

I mentioned a few days ago that there was a punch line for the XML Literal Code Generation. I planned to unveil this slowly, but it just sprung out of the box when Carl posted episode #102 (which I thought was due for next Friday).

And I’m afraid that I need to add that I was sick during the taping and my brain was running at half capacity. If anything isn’t clear, please let me know.

No matter.

You get to hear sooner.

I’ll be unveiling the details in about the same time frame, so you’ll have the big picture by next week.

So, the drumroll please…

I can create excellent readable templates that output code in Visual Basic or C# - no CodeDOM involved.

I’ll show you how to do that. Watch for more here.

Sample Code for DNR TV #97 and #98
Sun, Feb 10 2008 17:14

Sometimes I drop the ball, and I did on getting these samples posted. Note that this is for the two episodes I did on 3.5 languages. I think these episodes are a good run through of the features in both languages. The title wound up with the word "Compare" in it. I'm not sure I really compare them, beyond seeing them side by side, which I think is useful to understand the underlying mechanisms.

You can get the code here.

I will not be posting code for the code generation episode which is episode #102. That code was too transitory for me to want to release. I will be releasing similar code in my blog over the next week or so.

by Kathleen | with no comments
XML Literal Code Generation - Code again again
Sun, Feb 10 2008 10:53

Crap.

 For now, I'm just removing the coloring. Paste this into VS for coloring. It's much prettier:

   Private Function MemberDataPortalFetch() As String
      ' TODO: Add special handling for timestamp
      Return _
<code>
   Private Overloads Sub DataPortal_Fetch(ByVal criteria As Criteria)
      Using cn As New SqlConnection(<%= mObjectData.ConnectionStringName %>)
         cn.Open()
         Using cm As SqlCommand = cn.CreateCommand
            cm.CommandType = CommandType.StoredProcedure
            cm.CommandText = "get<%= mObjectData.ClassName %>"
            <%= From prop In mObjectData.PrimaryKeys Select _
               <code>
            cm.Parameters.AddWithValue("@<%= prop.Name %>", criteria.<%= prop.Name %>)
               </code> %>

            Using dr As New SafeDataReader(cm.ExecuteReader)
               dr.Read()
               With dr
                  <%= From prop In mObjectData.Properties Select _
                     <code>
                  m<%= prop.Name %> = <%= GetReadMethod(prop) %>("<%= prop.Name %>")
               </code>.Value %>

                  ' load child objects
                  .NextResult()
                  <%= From child In mObjectData.Children Select _
                     <code>
                  m<%= child.Name %> = <%= child.Name %>.Get<%= child.Name %>(dr)
                     </code>.Value %>
               End With
            End Using
         End Using
      End Using
   End Sub
</code>.Value
   End Function
