A Step Too Far


Occasionally, I get a bit compulsive. When presented with a challenge, I usually won't give up until I have a working answer... and sometimes that answer gets a little crazy. Here's one of my more recent journeys down that path.

I was asked, "Hey Kevin. I have a method that accepts a type parameter, 'T'. Is there a type constraint I can add to T so that I can sum two of them?"

As an example, consider this code: 

public string FormatNumber<T>(T t1, T t2) where T:IFormattable
{
    T sum = t1 + t2;
    return String.Format("{0:N2}", sum);
}

This sample is not very useful by itself; it is just an example.

The problem is that t1 and t2 cannot be added, because the compiler cannot guarantee that type T defines an addition operator. Unfortunately, generics do not yet have a way to add a constraint for operators.

The eventual solution was that the code needed some rethinking. The need to add generic values was legitimate, but the details aren't worth going over here. Either way, I started wondering how this could be possible without doing something like this:

public T Add<T>(T t1, T t2)
{
    if (t1 is int)
        return (T) (object)(((int) ((object) t1)) + ((int) ((object) t2)));
    if (t1 is long)
        return (T) (object)(((long) ((object) t1)) + ((long) ((object) t2)));
    if (t1 is float)
        return (T) (object)(((float) ((object) t1)) + ((float) ((object) t2)));
    throw new NotImplementedException("Can't do addition.");
}

That just seems unattractive to me. So I started thinking... and thinking... and it spiraled down from there. The solution that I came up with is very unattractive, slow, and nuts.

My solution? Dynamically generate an assembly using Reflection.Emit. It works, but it's not very elegant. Here it is:

public static T Add<T>(T t1, T t2)
{
    if (!typeof(T).IsPrimitive)
    {
        throw new Exception("Type is not primitive.");
    }
    ModuleBuilder moduleBuilder = assemblyBuilder.DefineDynamicModule("MainModule");
    TypeBuilder typeProxy = moduleBuilder.DefineType("AdditionType", TypeAttributes.Class | TypeAttributes.Public | TypeAttributes.Serializable);
    MethodBuilder methodBuilder = typeProxy.DefineMethod("SumGenerics", MethodAttributes.Static, typeof(T), new[] { typeof(T), typeof(T) });
    ILGenerator generator = methodBuilder.GetILGenerator();
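    // Ldarg_0 and Ldarg_1 take no operand. Emit(OpCodes.Ldarg_S, 0) would write
    // a 4-byte int where ldarg.s expects a single byte.
    generator.Emit(OpCodes.Ldarg_0);
    generator.Emit(OpCodes.Ldarg_1);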
    generator.Emit(OpCodes.Add);
    generator.Emit(OpCodes.Ret);
    Type adder = typeProxy.CreateType();
    MethodInfo mi = adder.GetMethods(BindingFlags.Static | BindingFlags.NonPublic)[0];
    return (T)mi.Invoke(null, BindingFlags.Default, null, new object[] {t1, t2}, null);
}

On top of that, you have to tell your AppDomain how to resolve the type, and that involves a little magic with a handler for the AppDomain's AssemblyResolve event:

private static Assembly OnAssemblyResolve(object sender, ResolveEventArgs args)
{
    return args.Name == assemblyBuilder.FullName ? assemblyBuilder : null;
}
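
For comparison, the same "add two T's" trick can probably be done without a dynamic assembly by using expression trees from System.Linq.Expressions (available since .NET 3.5). This is not the approach above, just a smaller sketch of an alternative; the GenericAdder name is made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, not part of the original solution.
public static class GenericAdder<T>
{
    // Compiled once per T and cached. If T has no addition operator,
    // Expression.Add throws, which surfaces here as a TypeInitializationException.
    private static readonly Func<T, T, T> add = BuildAdd();

    private static Func<T, T, T> BuildAdd()
    {
        ParameterExpression left = Expression.Parameter(typeof(T), "left");
        ParameterExpression right = Expression.Parameter(typeof(T), "right");
        BinaryExpression body = Expression.Add(left, right);
        return Expression.Lambda<Func<T, T, T>>(body, left, right).Compile();
    }

    public static T Add(T t1, T t2)
    {
        return add(t1, t2);
    }
}
```

Calling GenericAdder<int>.Add(2, 3) goes through the cached delegate, so the cost of building the method is paid only once per type instead of on every call.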

Sometimes, it just feels good to write crazy code and get it out of your system, before you really do start writing code...

Posted by vcsjones | with no comments

XP SP3 and Internet Explorer

Perhaps it's just me, or maybe I missed something. I was a little excited about Service Pack 3, particularly because Network Level Authentication made it into Remote Desktop, which means that I can require NLA for Windows Server 2008 and Terminal Services Gateways. The added support for WPA2 is also a huge security advantage for people with wireless networks. I was happy to be able to switch my wireless router to WPA2 and still support my machine that is running XP, and this is also something network administrators look for.

Not to mention, it was pretty painless to install.

However, I also had Internet Explorer 8 Beta 1 installed. Beta 1 was interesting, but due to some stability issues, I thought it was time to remove it and hope for the best in Beta 2. When I looked at Add / Remove Programs, though, I saw that I could not remove it. The Uninstall button was missing. I did a little bit of digging, but I didn't find anything online that caught my attention.

At Tech Ed 2008, though, I talked to Jane Maliouta from the IE 8 team, and she explained to me that Service Pack 3 prevents IE 8 Beta 1 from being uninstalled. The only known workaround is to remove Service Pack 3, then remove Beta 1, then reinstall Service Pack 3.

Please read Jane's blog entry on how else XP SP3 affects Internet Explorer. Service Pack 3 also prevents you from uninstalling Internet Explorer 7.

http://blogs.msdn.com/ie/archive/2008/05/05/ie-and-xpsp3.aspx


Tech·Ed 2008

Tech Ed 2008 is the first Tech Ed to be split into two separate conventions, one for IT Professionals and one for Developers. I will be attending the IT Professional conference from June 10th to 13th in Orlando, Florida. I'll be on the convention floor with my company, Thycotic Software, showing off our flagship product, Secret Server. If you're going to be there, be sure to come check our booth out!


DocBook

Let's take a quick break from the text encoding idea so I can talk about a new tool that I recently got into.

One of the things that every product needs, regardless of how simple it is to use, is good documentation. It's not fun, it takes time, and it isn't technically intriguing. Regardless, it has to be done. The part that my team members and I have struggled with is finding a tool that makes it easy. We looked at a few commercial applications such as RoboHelp, but they always left me with the impression that we were rabbit hunting with a Barrett M107 .50 rifle. Our requirements were pretty simple:

  • Easy to use
  • Text based - This makes differentials and merging easy
  • Reasonably priced
  • Able to produce different types of documents (HTML, PDF, etc)

We finally settled on what we think is the best solution for us: DocBook (not to mention, it's open source and free). It's based on XML and is a published standard. XML is extremely flexible, and the output is generated by XSL transformations, so we can easily customize it to meet our requirements. We started using the e-novative DocBook Environment, which gives you a simple command line environment for compiling your DocBook books. It is released under the GPL, so you can customize it to your needs.

A simple book looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book
  PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "file:/c:/docbook/dtd/docbookx.dtd"
  [
    <!ENTITY % global.entities SYSTEM "file:/c:/docbook/include/global.xml">
    %global.entities;

    <!ENTITY % entities SYSTEM "entities.xml">
    %entities;
]
>
<book lang="en">
    <bookinfo>
        <title>My First Book</title>
        <pubdate>May, 2008</pubdate>
        <copyright>
            <year>2008</year>
            <holder>Kevin Jones</holder>
        </copyright>
    </bookinfo>
    <chapter>
        <title>My First Chapter</title>
        <sect1>
            <title>How to crash a .NET Application</title>
            <para>Call System.AppDomain.Unload(System.AppDomain.CurrentDomain)</para>
        </sect1>
    </chapter>
</book>

Pretty simple, right? Each book can be broken down into separate chapters, which are broken down into sections, then paragraphs. It takes care of some dirty work for you, such as maintaining a Table of Contents. It offers a lot of other standard features as well, such as embedding graphics and referencing other places in the document.

Since DocBook is capable of understanding external entities, I can place chapters, sections, any part of the document that I want into another file and create an <!ENTITY ... > for it.

Compiling it is pretty easy. From the e-novative environment, just use docbook_html.bat or docbook_pdf.bat to create your generated output, something like this:

>C:\docbook\bat\docbook_html.bat MyFirstBook

MyFirstBook is the name of the project in the projects folder, which is all automatically created for you by the docbook_create.bat script. Using the compiler, the out-of-box HTML template looks like this:

[Screenshot: out-of-box DocBook HTML output]

There you have it, a simple documentation tool. Not very pretty at the moment, but of course it's easy enough to theme it to your company or product by changing the XSL.


Text Encoding (Part-1)

I was recently on the ASP.NET Forums and a member was asking, "How can I figure out the encoding of text?" and that got me thinking. There should be a reasonable way to do this, right? It's a useful thing to know. First, we need a little background on how text is encoded into bytes.

Long ago, back when 64K of memory was a big deal, characters took up a single byte. A byte ranges from 0 - 255, which allows us to support a total of 256 characters. Seems like plenty, no? English has 26, 52 for both cases, 62 with numbers, 92 with punctuation, and a few extra for line breaks, carriage returns, and tabs. So about 100, give or take a few. So what's the problem?

Well, this worked great and all, but other languages use different characters. The Russian Cyrillic alphabet by itself has 33 letters. This is where encodings were introduced. In order to support multiple character sets, the meaning of each byte was determined by the encoding in use, so to read the text correctly you simply had to know which encoding it was written in.

In today's world, where the average calculator has more memory than PCs did long ago, we now also use 2 byte encodings. That means that we can support 256 to the second power characters, or 65,536. That is enough to support all languages in a single encoding, even though it takes up double the space. Problem solved, right? Not exactly.

While in this day and age we support double byte encodings, there are still other factors involved, such as endianness (the order of the bytes: big endian stores the most significant byte first, little endian stores the least significant byte first). Even then, there is still a lot of legacy data to support that is still single byte.
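
To make the byte order point concrete, here is a small sketch that dumps the bytes .NET produces for the same text in little endian and big endian UTF-16. The EndiannessDemo class and its Hex method are names invented for this example.

```csharp
using System;
using System.Text;

// Illustrative helper; the class and method names are assumptions.
public static class EndiannessDemo
{
    // Encodes the text and formats the resulting bytes as hyphen-separated hex.
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }
}
```

With this, EndiannessDemo.Hex(Encoding.Unicode, "A") gives "41-00" (little endian), while EndiannessDemo.Hex(Encoding.BigEndianUnicode, "A") gives "00-41". Same character, same two bytes, opposite order.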

Say I give you a big binary chunk of data, and I tell you to convert it to text. How do you know which encoding is used? How do you even know which language it is in? I could be giving you a chunk of data using IBM-Latin. So how do we figure this out? Some smarts and process of elimination. Let's start with things we know.

The Unicode (UTF) encodings can start with what's called a Byte Order Mark, or BOM for short. This is a small amount of binary data prepended to the rest of the data that identifies which encoding it is. In the .NET world, this is called the preamble. Since the BOM is defined by the Unicode standard, it is always the same for a given encoding regardless of whether you are using .NET, Python, Ruby, etc. We can look at our data and see if the BOM can tell us.

To achieve this in .NET, we will use the classes in the System.Text namespace, specifically the Encoding class. An instance of the Encoding class has a method called GetPreamble(), which gives us the BOM for that encoding. A BOM can be from 2 to 4 bytes long, depending on the encoding. Remember when I said two bytes would be plenty? Well, I fibbed, since there is an encoding called UTF-32 that uses 4 bytes per character (room for roughly 4.3 billion values).
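
As a quick illustration of what GetPreamble() returns, the sketch below formats an encoding's BOM as hex. PreambleDemo and Describe are names made up for this example.

```csharp
using System;
using System.Text;

// Illustrative helper; the class and method names are assumptions.
public static class PreambleDemo
{
    // Formats an encoding's BOM as hex, or "(none)" when it has no preamble.
    public static string Describe(Encoding encoding)
    {
        byte[] bom = encoding.GetPreamble();
        return bom.Length == 0 ? "(none)" : BitConverter.ToString(bom);
    }
}
```

For the built-in instances this gives "EF-BB-BF" for Encoding.UTF8, "FF-FE" for Encoding.Unicode, "FE-FF" for Encoding.BigEndianUnicode, "FF-FE-00-00" for Encoding.UTF32, and "(none)" for Encoding.ASCII.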

We can then check our data to see if it starts with the BOM.

private static bool DataStartsWithBom(byte[] data, byte[] bom)
{
    bool success = data.Length >= bom.Length && bom.Length > 0;
    for (int j = 0; success && j < bom.Length; j++)
    {
        success = data[j] == bom[j];
    }
    return success;
}

So let's look at this method. It takes our data and a BOM, and determines if the data starts with the BOM. There are a few assumptions:

  1. The data length is always greater than or equal to the BOM. If it is not, then there is no BOM at all, and we'll cover that in a bit.
  2. The BOM's length is always greater than zero.

So let's put it to use (assume the local data is a byte[]):

foreach (EncodingInfo encodingInfo in Encoding.GetEncodings())
{
    Encoding encoding = encodingInfo.GetEncoding();
    byte[] bom = encoding.GetPreamble();
    if (DataStartsWithBom(data, bom))
        return encoding;
}

Here, we get all of the encodings that .NET knows of, and look to see if our data byte array starts with that encoding's BOM. If the encoding has no BOM, the DataStartsWithBom method will handle that with the bom.Length > 0 on the 3rd line. Once we know the encoding, we can decode it. You have to ensure that you don't actually try to decode the BOM itself:

encoding.GetString(data, bom.Length, data.Length - bom.Length);
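
One wrinkle worth sketching: the UTF-16 little endian BOM (FF FE) is a prefix of the UTF-32 little endian BOM (FF FE 00 00), so the order in which encodings are tried matters. The version below is a variation on the loop above, not a replacement for it. It checks a fixed list of Unicode encodings, longest BOM first. BomSniffer, Detect, and Decode are hypothetical names, and the fallback encoding is whatever the caller decides to assume.

```csharp
using System;
using System.Text;

// Hypothetical helper; a sketch, not a complete encoding detector.
public static class BomSniffer
{
    // Ordered longest BOM first so UTF-32 LE (FF FE 00 00) is tried
    // before UTF-16 LE (FF FE), which shares its first two bytes.
    private static readonly Encoding[] Candidates =
    {
        new UTF32Encoding(false, true),   // UTF-32 little endian
        new UTF32Encoding(true, true),    // UTF-32 big endian
        new UTF8Encoding(true),           // UTF-8
        new UnicodeEncoding(false, true), // UTF-16 little endian
        new UnicodeEncoding(true, true)   // UTF-16 big endian
    };

    // Returns the first candidate whose BOM prefixes the data, or null if none match.
    public static Encoding Detect(byte[] data)
    {
        foreach (Encoding candidate in Candidates)
        {
            if (StartsWith(data, candidate.GetPreamble()))
                return candidate;
        }
        return null;
    }

    // Decodes the data, skipping the BOM; uses the fallback when no BOM is found.
    public static string Decode(byte[] data, Encoding fallback)
    {
        Encoding detected = Detect(data);
        if (detected == null)
            return fallback.GetString(data);
        int bomLength = detected.GetPreamble().Length;
        return detected.GetString(data, bomLength, data.Length - bomLength);
    }

    private static bool StartsWith(byte[] data, byte[] bom)
    {
        if (bom.Length == 0 || data.Length < bom.Length)
            return false;
        for (int i = 0; i < bom.Length; i++)
        {
            if (data[i] != bom[i])
                return false;
        }
        return true;
    }
}
```

This still has a blind spot: UTF-16 little endian text whose first character is a NUL also starts with FF FE 00 00 and would be read as UTF-32. That should be rare in practice, but it is a guess all the same.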

Pretty straightforward so far, right?

Yes? OK, let's move on. What about the case where we can't figure it out by the BOM? Most encodings don't have a BOM; only the UTF encodings do. ISO and OEM encodings do not.

This is where it gets tricky, and where some pretty complex algorithms can come into play. The most important piece of information that you can have at this point is knowing which language the text is in. With that, we can take a reasonable stab at which encoding it is.

.NET supports languages through the System.Globalization.CultureInfo class. This class will be very useful from here on forward. Let's take baby steps on attacking this problem, and while we don't know everything, we can use clues.

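Each language has what's called an ANSI encoding. This is the default legacy encoding (a Windows code page) used for that language. The name is historical: despite it, these code pages come from Microsoft and were not actually standardized by the American National Standards Institute. For most Western languages, the ANSI encoding is a single byte encoding. This seems like a reasonable place to start.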

We can find this encoding through cultureInfoInstance.TextInfo.ANSICodePage. This only gives us the numeric code page (an identifier), but it's simple enough to create an instance of the Encoding class from the code page by calling Encoding.GetEncoding(int codePage).

How do I figure out the language? Chances are you know what language your users are using, or at least most of them. A case where you wouldn't know is screen-scraping. There, you can usually skip the guessing by looking at the character set the server declares for the response. You can do that by looking at the CharacterSet property of the HttpWebResponse instance.

In most cases, this will probably work. By no means am I saying "this will always work"; in fact, there are a lot of bases that I haven't covered that I hope to in future blog posts. There are other code bits out there that do this already, and do a good job, but it's always good to know how it actually works, and to fully understand the problem you are trying to solve.

So what'll be in part 2? How to decode text without knowing the language, and maybe (or perhaps in part 3?) lossy decoding.


Too Little Too Late?

Adobe Logo, Flash Logo, and the Adobe name are used under fair use. I booted up my PC today and saw this nice message from Adobe telling me that there was an update for my Flash installation.

I couldn't help but notice that one of the highlighted features was support for HD content. I can't help but feel that this is in response to Silverlight's support for High Definition content. I wonder though, is it too little too late? I've heard a lot of stories from people about switching from Flash to Silverlight just for the support for HD. Now don't misunderstand me, I think this will be a huge hit for the Flash community, and definitely merits use.

This means to me that Silverlight is doing something right, and is going to be able to hold its ground.


CMAP User Group Presentation

On Tuesday, May 6th, the CMAP User Group meeting will be a Heroes Happen {here} launch event, where I will be discussing the new features in SQL Server 2008 and Steve Michelotti will be discussing the new C# 3.0 language enhancements. If you are in the Baltimore area, I encourage you to come to the presentation to learn some cool new stuff.

Take a look at the meeting details here for more information and directions.


CMAP Code Camp and Richmond Code Camp

Saturday was the Spring '08 CMAP Code Camp. Lots of good sessions there, lots of fun as well! I presented "SQL Server 2008", and also "What's new in C# 3.0" as a replacement for Jay Flowers, since he wasn't able to make it. A big congratulations to Chris Steener and Randy Hayes for putting together a fabulous code camp.

 

Also, coming up on the 26th is the Spring '08 Richmond Code Camp. I hope to see a lot of new people there, as well as all of the ones I already know. The tentative schedule has been posted, which you can see here. It's going to be a good one! I'll be re-presenting my "SQL Server 2008" session.


I'd like to report a negligence

I've always been interested in software security, and it's always been a number one priority for me. Software security is really about honoring the trust of the people that use your software. I've also been fortunate to be the lead developer of a security product. I also tend to keep an eye on the security of other products.

We use a few applications in house that we really like. I decided to poke around at the security of some of these products. I won't name any of them, because they really are good products, some poor security aside. If I find a security bug in a piece of software, I will report it to support or the development team. At that point I feel like I've done all that I can, and I'll leave it to them to fix it.

The one thing that there really is no excuse for, though, is storing a password in clear text. While doing my digging, I found that two products we use stored passwords in clear text. One of them was attempting to hash a login password using String.GetHashCode, which isn't a good idea (it is a 32-bit hash meant for hash tables, not a cryptographic one), but it is still much better than clear text. However, this product also stored some other passwords in clear text. They needed to be two way, so a hash wouldn't work; symmetric encryption would be the better choice. The other system just used clear text for all passwords. This is really just neglecting security. It's not even a bug. It's just not caring.

It's not hard to encrypt data in .NET. There are a lot of tutorials on it, and there are a few user groups around that talk about it as well.
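
To back that up, here is a minimal sketch of two way encryption for a stored secret using the AES class from System.Security.Cryptography. The SecretProtector name is invented for this example, and it assumes you already have a 16, 24, or 32 byte key kept somewhere other than next to the data. A random IV is generated per call and stored in front of the ciphertext. It is a starting point only: it does not authenticate the ciphertext, and key management is the genuinely hard part.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; a sketch, not a hardened implementation.
public static class SecretProtector
{
    // Returns IV + ciphertext. The key must be 16, 24, or 32 bytes long.
    public static byte[] Encrypt(string plainText, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV(); // fresh random IV for every encryption
            using (MemoryStream output = new MemoryStream())
            {
                output.Write(aes.IV, 0, aes.IV.Length);
                using (ICryptoTransform encryptor = aes.CreateEncryptor())
                using (CryptoStream crypto = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
                {
                    byte[] plainBytes = Encoding.UTF8.GetBytes(plainText);
                    crypto.Write(plainBytes, 0, plainBytes.Length);
                }
                // ToArray still works after the CryptoStream has closed the MemoryStream.
                return output.ToArray();
            }
        }
    }

    // Reads the IV from the front of the buffer and decrypts the rest.
    public static string Decrypt(byte[] cipherText, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[aes.BlockSize / 8];
            Array.Copy(cipherText, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plainBytes = decryptor.TransformFinalBlock(
                    cipherText, iv.Length, cipherText.Length - iv.Length);
                return Encoding.UTF8.GetString(plainBytes);
            }
        }
    }
}
```

For login passwords that never need to be read back, a salted, slow hash is still the better fit than encryption.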

Seeing this makes me think a couple of things. The first being, are my standards too high? Honestly, I don't think so. I don't see any reason for storing a password in plain text other than reducing developer effort. The second thing is, how common is this? If two applications that we use have this issue, should I lose trust in all of the applications I use? It's not a comfortable thought, knowing that some software abuses the trust that we give it. The third thing is, I know one of these products is extremely popular. I'm surprised no one has caught this before. Am I really the only one that tinkers around with other software's security?

Off topic, but I am trying to get back into the swing of blogging again. I've set a goal of trying to blog every other day or more often. We'll see how it goes.


MVC Framework

The MVC framework is the new "hot" thing in the ASP.NET world for developers. As such, everyone has at least one blog entry about it. So, I think it's time I jumped on that ship. Though, I wanted to voice a few concerns with the MVC Framework, or at least with how people perceive it.

The MVC design pattern is by no means new. It's been around since about 1979. .NET is certainly not the first framework that supports the MVC pattern, nor is Microsoft's MVC project the first for .NET. Spring is a very popular MVC solution for Java developers, and there is a .NET port of it as well. Though, I'm not here to give a history lesson either.

I often hang out at the ASP.NET Forums as a moderator and contributor. The MVC Framework is a pretty hot forum over there at the moment. Though after reading several posts, I can't help but get the feeling that several people aren't certain as to what the MVC design pattern is trying to solve.

So, MVC is a design pattern that was originally used in Smalltalk-80. The original paper is up for interpretation, but at its core, MVC's original goal was clean separation between the layers of your application.

Yes, MVC gives you cool features like Routing, and a big bonus in Unit Testing. Anyone who has done unit testing before, especially those who also practice Test Driven Development, knows that good separation of your code is important to achieve practical unit tests; but the true goal of MVC (or MVP for that matter) is decoupling logic.

Some people originally made the claim that the current ASP.NET model (before the MVC project) was an MVC model. Well... mmmnnn...ooo. Not quite. It could be argued that it was, but the "not quite" was the event processing in .NET. The initial argument was that the event handlers on controls coupled the code-behind too tightly. A better solution would be to have the event handlers work with a presentation model.

OK, I've been rambling on a little bit. What's my point (if any)? I suppose that Microsoft's MVC is pretty cool. I am a little perplexed as to why everyone got so excited about MVC once Microsoft announced they were providing a framework, when there are, frankly, better and more mature MVC frameworks around, and I'll circle back around to Spring for that one. The .NET port of Spring has been around since .NET 1.1, and since it is a port from Java, its developers have a lot of experience from there, too. I get concerned that some people really aren't using MVC for the purpose of decoupling and improving their code, but rather for some of its specialized features, such as Routing.

Though, in all honesty, I tend to lean towards the MVP pattern myself. It's not quite as popular, but I still prefer it because the view is not as coupled with the model. A good MVP project is Stormwind NMVP.


CaptainHook

This is my first post to the msmvps.com web site. I decided to move my blog from http://community.strongcoders.com/blogs/vcsjones.

So, I thought I would start off by talking about something a little obscure that a co-worker turned me onto.

[Image: Captain Hook. Used under fair use.]

OK, different Captain Hook. We're not talking about Dustin Hoffman, either. No, I am talking about the .NET Subversion hook framework from Phil Haack. Phil and his company released a nice .NET Framework library on SourceForge.

OK, a bit of information, first.

I'm a big fan of Subversion. It's what we use at our company for all of our source control, and we use it for a lot of things besides maintaining our code: documents and so on. It's by far my favorite source control solution, and the details of that are another blog post.

Aside from Subversion, we also use an Agile planning and estimation program called TargetProcess. It's very useful for Agile shops. But for the purpose of this post, think of it as any other tracking system, like Bugzilla or FogBugz (we use FB too for a client).

One of my favorite features of TargetProcess is that it integrates with Subversion cleanly. When I commit with Subversion, all I have to do is put a pound (#) and then the ID of the "story" (think of that like a bug) in the commit message, and then I can track it from TargetProcess, view diffs, and read the commit messages for that given story. For example:

KJ - #1234 - Fixed minor typo in Administration screen.

Now when I look in TargetProcess at #1234, I can see the commit message, and the files that changed.

So, the problem I am trying to solve on the team is forgetting to put a number in the commit message. Fortunately, Subversion allows us to hook pretty easily. Specifically in this case, we want to use a pre-commit hook.

Despite Subversion's ease of hooking, it can be a little tricky in .NET. And unlike in the movies, CaptainHook is the protagonist here. I'm not going to go into the details of how CaptainHook actually works, but I will go into detail about what's great about it.

Subversion allows hooking at the repository level. So, for a given repository, there are several types of hooks. The one I am keen on is the pre-commit. If you look in a repository, there is a directory called "hooks". When a commit is made, Subversion will start a file in that directory named "pre-commit". It can be a BAT file, an executable, a Perl script, what have you. If the exit code does not equal zero, the commit will not go through.

CaptainHook, in a nutshell, is an executable that probes a directory for assemblies with "hooks" in them. It executes the hooks, and if any of the hooks return false, CaptainHook exits with a non-zero code, probably 1.

So, what we want to do is put a regular expression on the commit message... something like "#\d+". What does one of these hooks look like?

public class RequireTargetProcessNumber : PreCommitHook
{
     protected override bool HandleHookEvent(ITransactionInfo commit)
     {
          string commitMessage = commit.LogMessage;
          if (!HasTargetProcessNumber(commitMessage))
          {
               this.Context.Output.WriteError("TargetProcess Number was not specified in commit message.");
               return false;
          }
          else
          {
              return true;
          }

     }

     public bool HasTargetProcessNumber(string commitMessage)
     {
          return Regex.IsMatch(commitMessage, @"#\d+");
     }

}

So there are a few things going on in here... we are overriding HandleHookEvent from our base class, PreCommitHook. This gives us an ITransactionInfo, which provides us with information about the current transaction. We test the commit message against our regular expression, and if it doesn't match, we fail. I extracted the regular expression check into its own method so I could unit test it. If HandleHookEvent were public, I could probably test it that way with a mock as well.
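
As a rough idea of what that unit test exercises, here is the regular expression check pulled out on its own, with no dependency on CaptainHook. The CommitMessageRules class is a stand-in invented for this sketch; only the pattern comes from the hook above.

```csharp
using System.Text.RegularExpressions;

// Stand-in for the hook's check; the class name is an assumption.
public static class CommitMessageRules
{
    // True when the message contains a pound sign immediately followed by digits.
    public static bool HasTargetProcessNumber(string commitMessage)
    {
        return Regex.IsMatch(commitMessage, @"#\d+");
    }
}
```

A message like "KJ - #1234 - Fixed minor typo in Administration screen." passes, while "Fixed typo" and "Issue # 12" (with a space after the pound) do not.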

Anyway, that's all there is to it. I can create as many assemblies of these as I want to modularize them, and drop them in a configurable path that CaptainHook looks for.

I highly recommend this for anyone that uses Subversion. It has a lot of other practical uses, such as emailing when a commit is done, or writing to an RSS feed on a successful commit.
