January 2014 - Posts

The Sixth Level of Code Generation
Fri, Jan 31 2014 10:16

I wrote here about the five levels I see in code generation/meta-programming (pick your favorite overarching word for this fantastically complex space).

I missed one level in my earlier post. There are actually (at least) six levels. I missed the sixth because I was thinking vertically about the application - about the process of getting from an idea about a program all the way to a running program. But as a result I missed a really important level, because it is orthogonal.

Side note: I find it fascinating how our language affects our cognition. I think the primary reason I missed this orthogonal set is my use of the word “level” which implied a breakdown in the single dimension of creating the application.

Not only can we generate our application, we can generate the orthogonal supporting tools. This includes design-time deployment (NuGet, etc), runtime deployment, editor support (IntelliSense, classification, coloration, refactorings, etc.), unit tests and even support for code generation itself – although the last might feel a tad too much like a Mobius strip.

Unit tests are perhaps the most interesting. Code coverage is a good indicator of what you are not testing, absolutely. But code coverage does not indicate what you are testing and it certainly does not indicate that you are testing well. KLOC (lines of code) ratios of test code to real code are another indicator, but still a pretty poor one, and they still fail to use the basic boundary condition understanding we’ve had for what, 50 years? And none of that leverages the information contained in unit tests to write better library code.

Here’s a fully unit tested library method (100% coverage) where I use TDD (I prefer TDD for libraries, and chaos for spiky stuff which I later painfully clean up and unit test):

public static string SubstringAfter(this string input, string delimiter)
{
   var pos = input.IndexOf(delimiter, StringComparison.Ordinal);
   if (pos < 0) return "";
   return input.Substring(pos + 1);
}

There are two bugs in this code.

Imagine for a minute that I had not used today’s TDD, but had instead interacted with, say, a dialog box (for simplicity). And for fun, imagine it also allowed easy entry of XML comments; this is a library, after all.

Now, imagine that the dialog asked about the parameters. Since they are strings – what happens if they are null or empty, is whitespace legal, is there an expected RegEx pattern, and are there any maximum lengths – a few quick checkboxes. The dialog would have then requested some sample input and output values. Maybe it would even give a reminder to consider failure cases (a delimiter that isn’t found in the sample). The dialog then evaluates your sample input and complains about all the boundary conditions you overlooked that weren’t already covered in your constraints. In the case above, it would have complained that the delimiter is not limited to a length of one and I didn’t initially test that.

Once the dialog has gathered the data you’re willing to divulge, it looks for all the tests it thinks you should have, and generates them if they don’t exist. Yep, this means you need to be very precise in naming and structure, but you wanted to do that anyway, right?
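To make this concrete, here’s a sketch of the sort of tests such a dialog might generate for the SubstringAfter method above. I’ve written them as xUnit tests, and the names and expected behaviors are my illustrative choices – no actual tool produced them:

using System;
using Xunit;

public class SubstringAfterGeneratedTests
{
   [Fact]
   public void SubstringAfter_ReturnsEmpty_WhenDelimiterNotFound()
   {
      Assert.Equal("", "abc".SubstringAfter("|"));
   }

   [Fact]
   public void SubstringAfter_ReturnsRemainder_ForMultiCharacterDelimiter()
   {
      Assert.Equal("d", "abc::d".SubstringAfter("::"));
   }

   [Fact]
   public void SubstringAfter_ReturnsNull_WhenInputIsNull()
   {
      string input = null;
      Assert.Null(input.SubstringAfter("::"));
   }
}

Notice that these tests encode decisions – null behavior, multi-character delimiters – that full code coverage alone never forced me to make.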

Not only is this very feasible (I did a spike with my son and a couple of conference talks about eight years ago), but there are also very interesting extensions in creating random sample data – at the least to avoid unexpected exceptions in edge cases. Yes, it’s similar to PEX, and blending the two ideas would be awesome, but the difference is your direct up-front guidance on expectations about input and output.

The code I initially wrote for that simple library function is bad. It’s bad code. Bad coder, no cookies.

The first issue is just a simple, stupid bug that the dialog could have told me about in evaluating missing input/output pairs. The code returns the wrong answer if the length of the delimiter is greater than one, and I’d never restricted the length to one. While my unit tests had full code coverage, I didn’t test a delimiter longer than one character and thus had a bug.

The second issue is common, insidious, and easily caught by generated unit tests. What happens if the input string or delimiter is null? Not only can this be caught by unit tests, but it’s a straightforward refactoring to insert the code you want into the actual library method – assertion, exception, or automatic return (I want null returned for null). And just in case you’re not convinced yet, there’s also a fantastic opportunity for documentation – all that stuff in our imagined dialog belongs in your documentation. Eventually I believe the line between your library code, unit tests and documentation should be blurry and dynamic – so don’t get too stuck on that dialog concept (I hate it).
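For reference, here’s a corrected sketch of the method with both issues addressed – skipping past the full delimiter length and returning null for null, which is the behavior I said I want (treat it as one reasonable choice, not the only one):

public static string SubstringAfter(this string input, string delimiter)
{
   // Null in, null out – my preference; an exception or assert are other valid choices.
   if (input == null || delimiter == null) return null;
   var pos = input.IndexOf(delimiter, StringComparison.Ordinal);
   if (pos < 0) return "";
   // Skip the entire delimiter, not just one character.
   return input.Substring(pos + delimiter.Length);
}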

To straighten out one possible misconception in the vision I’m drawing for you: I am passionately opposed to telling programmers the order in which they should do their tasks. If this dialog is only available before you start writing your method – forget it. Whether you do TDD or spike the library method, whether you make the decisions (filling in the imagined dialog) up front or are retrofitting concepts to legacy code, the same process works.

And that’s where Roslyn comes in. As I said, we abandoned the research on this eight years ago because it increased the surface area of what it takes to write an app and required too much work in a specific order (among other reasons). Roslyn changes the story because we can understand the declaration, XML comments, the library method, the unit test name and attributes, and the actual code in the method and unit test without doing our own parsing. This allows the evaluation to be done at any time.

That’s just one of the reasons I’m excited about Roslyn. My brain is not big enough to imagine all the ways that we’re going to change the meaning of programming in the next five years. Roslyn is a small, and for what it’s worth highly imperfect, cog in that process. But it’s the cog we’ve been waiting for.

Confidentiality and ETW Content
Tue, Jan 28 2014 16:01

I’ve been asked a couple of times about confidentiality and the content of traces, including ETW traces through the .NET 4.5 EventSource (including a comment in this post on using Event attribute parameters).

The problem is that trace information is unlikely to get the same level of security protection as the database or other information sources. If you’re rolling your own tracing, consider whether it’s worth the effort to ensure the trace remains forever confidential. It’s much easier to just keep the confidential information out.

If you’re using ETW, there are some security limitations considered within the Windows design. These are documented to include the capacity to restrict registration of certain provider Guids to authorized users, and to limit who can write to certain provider Guids.

I would love someone to jump up and tell me that I’m wrong and post a link to a clear discussion of providing a secure ETW trace. But as far as I can tell, the designers of ETW considered it, and it’s not currently raised to any level of usability.

Registering an ETW manifest on a machine requires admin privileges. Turning on an ETW trace requires admin privileges. So, it’s not like any yahoo can create traces. Just admin privileged yahoos.

So, if you live in a land where all confidential information is available to your IT staff with admin privileges on your servers, then perhaps you can put confidential information into traces.

But this also means you have to keep track of and destroy all the trace files that might ever be created.

I think that’s too high a bar and you need to keep confidential information out of your traces – regardless of the tools you use to create them.

How are Event Parameters Best Used to Create an Intuitive Custom EventSource Trace?
Fri, Jan 24 2014 16:12

Jason asked a really, really good question on StackOverflow. I’m answering it here because it’s a wordy answer. The good news is that the answer is also in my about-to-be-released Pluralsight video on ETW (everything is in their hands, just not posted yet – hopefully next Thursday!). I’ve also got some additional blog posts coming, but today, let me just answer Jason’s question.

“it is unclear how some of the properties are best used to create an intuitive custom trace”

Jason goes on to categorize Event attributes as “intuitive” and “non-intuitive”. I’m throwing out that distinction and covering all of them. And the most important advice might be the last on Message.

Channel

ETW supports four basic channels and the potential for custom channels. EventSource does not support custom channels (if you have a user story, contact me or the team). The default channel and the only one currently supporting in-line manifests is the Debug channel.

The Channel parameter exists only in the NuGet version and only for the purpose of accessing the additional channels, primarily the admin channel to access EventViewer for messages to admins. I was one of the people that fought for this capability, but it is for a very limited set of cases. Almost all events logically write to your channel – the default channel – the Debug channel.

To write to EventViewer, you need to write to the Admin channel and install a manifest on the target computer. This is documented in the specification, in my video, and I’m sure a couple of blog posts. Anything written to the admin channel is supposed to be actionable by ETW (Windows) guidelines.

Use Operational and Analytic channels only if it is part of your app requirements or you are supporting a specific tool.

In almost all cases, ignore the Channel parameter on the Event attribute and allow trace events to go to the Debug channel.

Level

For the Admin Channel

If you are writing to the admin channel, it should be actionable. Information is rarely actionable. Use warning when you wish to tell them (not you, not a later dev, but ops) that you want them to be concerned. Perhaps that response times are nearing the tolerances of the SLA. Use error to tell them to do something. Perhaps that someone in the organization is trying to do something they aren’t allowed to do. Tell them only what they need to know. Few messages, but relatively verbose and very clear on what’s happening, probably including response suggestions. This is “Danger, danger Will Robinson” time.
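As an illustration – this uses the NuGet version of EventSource, since that’s what exposes channel support, and the event ID, name and message are mine, purely for the example:

[Event(10, Channel = EventChannel.Admin, Level = EventLevel.Warning,
   Message = "Average response time is {0} ms against an SLA limit of {1} ms")]
public void ResponseTimeNearingSlaLimit(int AverageResponseMs, int SlaLimitMs)
{
   WriteEvent(10, AverageResponseMs, SlaLimitMs);
}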

For the Debug Channel

This is your time-traveling mind meld with a future developer or a future version of yourself.

I’m lucky enough to have sat down several times with Vance, Dan and Cosmin, and this is one of the issues they had to almost literally beat into my head. The vast majority of the time, your application can, and probably should, run with the default informational level turned on.

If you’re looking at an event that clearly represents a concern you have as a developer – something you want to scare a later developer because it scares you, like a serious failed assert – use warning. If someone is holding a trace file with ten thousand entries, what are the three things or the ten things you think tell them where the problem is? If they are running at the warning (not informational) level, what do they really, truly need to know?

If it’s an error, use the error level.

If it’s a massively frequent, rarely interesting event, use verbose. Massively frequent is thousands of times a second.

In most cases, use the default informational level for the Level parameter of the Event attribute. Depending on team philosophy, ignore it or record it.

Keywords

If you have verbose events, they need to be turned on and off in an intelligent fashion. Groups of verbose events need keywords to allow you to do this.

Warnings and Error levels do not need keywords. They should be on, and the reader wants all of them.

The danger of missing an event so vastly outweighs the cost of collecting events that informational events should be turned on without concern for keywords. If keywords aren’t going to be used to filter collection, their only value is filtering the trace output. There are so many other ways to filter the trace, keywords are not that helpful.

In most cases, use the Keywords parameter of the Event attribute only for verbose events and use them to group verbose events that are likely to be needed together. Use Keywords to describe the anticipated debugging task where possible. Events can include several Keywords.
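Here’s a minimal sketch of what that looks like in an EventSource class – the source name, keyword names and values are illustrative (keyword values must be distinct powers of two):

using System.Diagnostics.Tracing;

[EventSource(Name = "MyCompany-MyApp")]
public sealed class AppEventSource : EventSource
{
   public static readonly AppEventSource Log = new AppEventSource();

   // A nested class named Keywords is the EventSource convention.
   public class Keywords
   {
      public const EventKeywords Caching = (EventKeywords)1;
      public const EventKeywords DataAccess = (EventKeywords)2;
   }

   // Massively frequent event: verbose, and grouped under a keyword so it can
   // be turned on only when debugging caching behavior.
   [Event(20, Level = EventLevel.Verbose, Keywords = Keywords.Caching)]
   public void CacheLookup(string CacheKey)
   {
      WriteEvent(20, CacheKey);
   }
}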

Task

On the roller coaster of life, we just entered one of the scary tunnels - the murky world of ETW trace event naming. As far as ETW is concerned, your event is identified with a numeric ID. Period.

Consumers of your trace events have a manifest – either because it’s in-line (default for Debug channel, supported by PerfView and gradually being supported by WPR/WPA) or installed on the computer where the trace is consumed. The manifest does not contain an event name that is used by consumers.

Consumers, by convention, make a name from your Task and Opcode.

EventSource exists to hide the weirdness (and elegance) of ETW. So it takes the name of your method and turns it into a task. Unless you specify a task. Then it uses your task as the task and ignores the name of your method. Got it?

In almost all cases, do not specify a Task parameter for the Event attribute, but consider the name of your method to be the Task name (see Opcode for exception).

Opcode

I wish I could stop there, but Jason points out a key problem. The Start and Stop opcodes can be very important to evaluating traces because they allow calculation of elapsed time. When you supply these opcodes, you want to supply the Task to ensure proper naming.

And please consider the humans. They see the name of the method, and they think it’s the name displayed in the consumer. For goodness’ sake, make it so. If you specify a task and opcode, ensure that the method name is the concatenation of the two. Please.

This is messy. I’m working on some IDE generation shortcuts to simplify EventSource creation and this is a key reason. I think it will help, but it will require the next public release of Roslyn.

Almost never use an Opcode parameter other than Start/Stop.

When using Start/Stop Opcodes, also supply a Task and ensure the name of the method is the Task concatenated with the Opcode for the sake of the humans.
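A sketch of that pattern, with illustrative names, as members of an EventSource class like the one above – the point is that each method name is the Task name plus the Opcode:

// A nested class named Tasks is the EventSource convention.
public class Tasks
{
   public const EventTask Request = (EventTask)1;
}

[Event(30, Task = Tasks.Request, Opcode = EventOpcode.Start)]
public void RequestStart(string Url)
{
   WriteEvent(30, Url);
}

[Event(31, Task = Tasks.Request, Opcode = EventOpcode.Stop)]
public void RequestStop(string Url)
{
   WriteEvent(31, Url);
}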

Version

The Version parameter of the Event attribute is available for you and consumers to communicate about whether the right version of the manifest is available. Versioning is not ETW’s strength – events rarely changed before we devs got involved, and now we have in-line manifests (to the Debug channel). You can use it, and the particular consumer you’re using might do smart things with it. And even so, the manifest either is or is not correctly installed on any machine where installed manifests are used.

Overall, I see some pain down this route.

The broad rule for versioning ETW events is: don’t. That is, do not change them except to add additional data at the end (parameters to your method and WriteEvent call). In particular, never rearrange in a way that could give different meaning to values. If you must remove a value, force a default or marker value indicating it’s missing. If you must otherwise alter the trace output, create a new event. And yes, that advice sucks. New events with “2” at the end suck. As much as possible, do up-front planning (including confidentiality concerns) to avoid later changes to payload structure.

Initially ignore the Version parameter of the Event attribute (use default), but increment as you alter the event payload. But only add payload items at the end unless you can be positive that no installed manifests exist (and I don’t think you can).
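For example – an illustrative event shown at two points in time; the commented-out version is what originally shipped, not a second method in the same class:

// As originally shipped:
//   [Event(40)]
//   public void OrderReceived(string OrderId) { WriteEvent(40, OrderId); }

// A later release: the only change is a new payload item added at the end,
// with the Version bumped to signal the changed payload.
[Event(40, Version = 1)]
public void OrderReceived(string OrderId, int ItemCount)
{
   WriteEvent(40, OrderId, ItemCount);
}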

Message

Did you notice that so far I said, rarely use any of the parameters on the Event attribute? Almost never use them.

The Message parameter, on the other hand, is your friend.

The most important aspect of EventSource is documenting what the event needs from the caller of the code. It’s the declaration of the Event method. Each item passed should be as small as possible, non-confidential, and have a blazingly clear parameter name.

The guy writing against your event sees an available log method declaration like “IncomingDataRequest(string Entity, string PrimaryKey).” Exactly how long does it take him to get that line of code in place? “IncomingRequest(string msg)” leaves the dev wondering what the message is or whether it’s even the correct method. I’ve got some stuff in my upcoming video on using generics to make it even more specific.

Not only does special attention to Event method parameters pay off by speeding the writing of code that will call the Event method (removing all decision making from the point of the call), but (most) consumers see this data as individual columns. They will lay this out in a very pretty fashion. Most consumers allow sorting and filtering by any column. Sweet!

This is what Strongly Typed Events are all about.

Parameters to your method like “msg” do not cut it. Period.

In addition to the clarity issues, strings are comparatively enormous to be sticking into event payloads. You want to be able to output boatloads of events – you don’t want big event payloads filling your disks. Performance starts sucking pretty quickly if you also use String.Format to prepare a message that might never be output.

Sometimes the meaning of the parameter is obvious from the name of the event. Often it is not. The contents of the Message parameter are included in the manifest and allow consumers to display a friendly text string that contains your literals and whatever parts of the event payload seem interesting. Sort of like String.Format() – the “Message” parameter is actually better described as a “format” parameter. Since it’s in the manifest, it should contain all the repeatable parts. Let the strongly typed data contain only what’s unique about that particular call to the trace event.

The Message parameter uses curly braces so you feel warm and fuzzy. That’s nice. But the actual string you type in the parameter is passed to the consumer, with the curly braces replaced with ETW friendly percent signs. Do not expect the richness of String.Format() to be recognized by consumers. At least not today’s consumers.

By splitting the data into strongly typed chunks and providing a separate Message parameter, the person evaluating your trace can both sort by columns and read your message. The event payload contains only data, the manifest allows your nice wordy message. Having your beer and drinking it too.

Not sold yet? If you’re writing to a channel that uses installed manifests, you can also localize the message. This can be important if you are writing to the admin channel for use in EventViewer.

Almost always use Message so consumers can provide a human friendly view of your strongly typed event payload.
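Putting that together with the IncomingDataRequest declaration from earlier – the event ID and message wording here are mine:

[Event(50, Message = "Incoming data request for {0} with key {1}")]
public void IncomingDataRequest(string Entity, string PrimaryKey)
{
   WriteEvent(50, Entity, PrimaryKey);
}

The payload stays as two small, strongly typed columns; the repeatable wording lives in the manifest, and consumers substitute the payload values into the {0} and {1} placeholders.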

Summary

There are four basic rules for EventSource usage:

  • Give good Event method names
  • Provide strongly typed payload data – consider confidentiality – and work to get payload contents right the first time (small where possible)
  • Use the Message parameter of the event attribute for a nice human friendly message
  • For every other Event attribute parameter – simplify, simplify, simplify. Use the defaults unless you are trying to do something the defaults don’t allow

Why In-Place Upgrades?
Fri, Jan 24 2014 14:00

In a comment on my last post someone asked me why in-place upgrades are better than side by side upgrades. I thought it worth a post, but it’s an opinion piece more than a technical piece.

Honestly, if I were on the committee that said “side by side or in-place?” I don’t know how I would have voted. This is a really hard problem, and they didn’t invite me to the meeting. But I will summarize where I think we stand nearly two years later. If you want a review, this Scott Hanselman post and the linked Rick Strahl post cover it.

Before I tell you why in-place upgrades are better, let me tell how they are worse. They are scary. Really, really scary. An in-place update means you can get a midnight phone call that your application in production is broken. You can get this because a third party tool throws an error it didn’t use to throw, and only throws that error in an obscure data scenario. And the only thing standing between you and this disaster – is Microsoft’s competence. Yeah, I think that’s very scary.

Scary enough that I spent weeks researching the known breaking changes and summarizing their impact in what I think is a must-watch 45 minutes in my What’s New in .NET 4.5 Pluralsight video. The languages teams are the best at considering, avoiding and communicating about known breaking changes. CLR/Framework ranks a ways down, and some ancillary teams like Workflow suck at it.

So why should Microsoft ask you to take such an astounding risk? Why ask you to stake the future of your company on their (Microsoft’s) competence?

Because the world is moving at an astounding pace. The notion of “cadence” (I so hate this use of that word) is that releases will be frequent and small. Not because some big mucky muck at Microsoft declared releases will be frequent and small – but because that is the world that our industry has created. It’s there in Open Source, it’s there with your phone’s OS upgrade, it’s there with response to new security threats, and it’s there because of a world clamoring for new features.

Let’s say Microsoft updates .NET twice a year. In three years, side by side updates would mean there were six copies of .NET on the user’s machines. It would mean third party tools would have to test against six scenarios (or themselves force an update, which would be really bad). Your application libraries would have to be coordinated across six versions – probably meaning that many devices had multiple copies of libraries used by different versions of your different apps. It would mean your libraries had to load side by side too.

And it might not be a cheap little desktop with all the memory and all the hard disk space and inexpensive power you could ever want. It might be a phone, a tablet, or a cloud system you’re paying to access.

And with rapid updates the line quickly blurs between what’s a real release (4.5 or 4.5.1) and a stability release (like the important Jan 2013 release). You have to keep up with all the change to know when to move forward from a technical perspective.

And if all that wasn’t enough, the concept of the .NET framework has fundamentally shifted. Behind the scenes it’s tied up into many, many little satchels. It’s not a small set of three or six frameworks. It would be impossible for Microsoft to test every combination of the presence of different parts of the framework across a large number of in-the-wild releases. In-place releases mean things are updated as needed into a finite set of tested scenarios.

And finally, there’s the security implications of needing to keep not just one or two versions tidied up with security releases, but a very large number of branched framework versions.

In the end there must have been a weighing of options. The demand that we trust Microsoft competence to avoid changes that break your app lined up against a nightmare web of multiple side-by-side framework versions. And there was a third choice – not to move so fast. And there are people reading this that think it’s blatantly clear that each of these three options was the only logical one.

Not going to the fast cadence would have doomed .NET to a historical footnote as opposed to the best bet for the next decade of development. Side-by-side releases offer no option but chaos – on our hard drives, in our test strategies and in Microsoft’s security and testing scenarios. The path they chose is the one that can have a good outcome. If they prove competent – and the languages and framework teams get passing marks so far – we all get to live in a sane world.

The backward compatibility commitments Microsoft made regarding the in-place upgrades are the most important commitments Microsoft has ever made. We have to hold their feet to the fire and remind them of that every day, every year for the next decade. We best remind them by testing our apps with CTPs and release candidates on test machines.

.NET 4.5.1 and Security Updates
Wed, Jan 22 2014 17:01

I’d just like to add a quick little opinion piece on this very important announcement from the .NET framework team.

On Oct 16, 2013, the team announced that NuGet would be a release mechanism for the .NET framework. On that day, we should have all started holding our breath, getting prescriptions for Valium, or moving to Washington or Colorado (although technically that would not have helped until Jan 1).

A .NET framework release vehicle that is not tied to a security update mechanism? Excuse me? What did you say? Really? No? Are you serious?

Today’s announcement is that security update mechanism.

Obviously, this mechanism has been in the works and it was just a matter of getting everything tied and ready to go, which for various reasons took .NET 4.5.1.

So, there are now several really important points to make about the .NET framework/CLR and its relation to NuGet:

  • The concept of the “.NET Framework” is now a bit fuzzy. Did you notice I no longer capped the word “framework?” It’s been a couple years coming and a lot of background work (like PCL), but you have control over what “.NET framework” means in your application in a way that could never have been imagined by the framers of the Constitution.
  • The NuGet vehicle for delivering pieces of the framework/CLR supports a special Microsoft feed, PCL, and has already been used for really interesting things like TPL Dataflow and TraceEvent. It looks mature and ready from my vantage.
  • Gone are the days when NuGet was only pre-release play-at-your-peril software. Look for real stuff, important stuff, good stuff, stuff you want from the CLR/.NET framework team to be released on NuGet, as well as ongoing releases from other teams like ASP.NET and EF.
  • They’ve made some very important promises in today’s announcement. They must fulfill these promises (no breaking changes, period). We must hold their feet to the fire. In place upgrades are a special kind of hell, the alternative is worse, and even if you disagree on that point, this is the game we are playing.
  • Upgrade to .NET 4.5.1 as soon as your team feels it can. Please.

Thank you to the team for the incredible work of the last few years. Thank you. Thank you.

Explanation of Finding All Unique Combinations in a List
Mon, Jan 6 2014 7:10

I showed an algorithm for finding all unique combinations of items in a list here.

I didn’t know whether the explanation would be interesting, so I simply offered to add it if someone wanted to see it. Someone did, so here goes.

Theory

Imagine first the problem on paper. Make a column for each item in the list – four is a good place to start and then you can generalize it upwards. The list will look something like this:

[Image: ComboAlgorithmExplanation1]

I’ll update this grid to use the number 1 in place of an x, and add a column for no items, because the empty set is important for the problem I’m solving:

[Image: ComboAlgorithmExplanation2]

And then assign each column to a bit. This makes each of the numbers from 0 to 15 a bit mask for the items to select to make that entry in the unique set.

[Image: ComboAlgorithmExplanation3]

Code

The code creates a List of Lists because the problem I was solving was combinations of objects, not strings. I should have used IEnumerable here, as the returned list won’t be changed.

public static List<List<T>> GetAllCombos<T>(this List<T> initialList)
{
   var ret = new List<List<T>>();

I’m not sure about the mathematical proof for this, but the number of items is always 2^N (two to the power of N), or 2^N – 1 if you’re ignoring the empty set. Intuitively, each of the N items is either in or out of a given combination, so the choices multiply out to 2 × 2 × … × 2, N times.

   // The final number of sets will be 2^N (or 2^N - 1 if skipping empty set)
   int setCount = Convert.ToInt32(Math.Pow(2, initialList.Count()));

When I started this post, I realized I’d left the Math.Pow function in this first call, so I’ll explain the difference. Math.Pow takes any value as a double data type and raises it to the power of another double. Doubles are floating points, making this a very powerful function. But in the special case of 2^N, there’s a much faster way to do this – perhaps two orders of magnitude faster. This doesn’t matter if you are only calling it once, but it is sloppy. Instead, I should have done a bit shift.

   var setCount = 1 << initialList.Count();

This bit shift operator takes the value of the left operand (one) and shifts it to the left. Thus if the initial count is zero, no shift is done, and there will be one resulting item, the empty list. If there are two items, the single bit that is set (initially 1) is shifted twice, and the result is 4:

[Image: ComboAlgorithmExplanation4]

Since each number is an entry in my set of masks, I iterate over the list (setCount is 16 for four items):

   // Start at 1 if you do not want the empty set
   for (int mask = 0; mask < setCount; mask++)
   {

For my purposes, I’m creating a list – alternatively, you could build a string or create something custom from the objects in the list (a custom projection).

      var nestedList = new List<T>();

I then iterate over the count of initial items (4 for four items) – this corresponds to iterating over the columns of the grid:

      for (int j = 0; j < initialList.Count(); j++)
      {

For each column, I need the value of that bit position – the value above the letter in the grids above. I can calculate this using the bit shift operator. Since this operation will be performed many times, you definitely want to use bit shift instead of Math.Pow here:

          // Each position in the initial list maps to a bit here
          var pos = 1 << j;

I use a bitwise And operator to determine whether the bit is set, for each item in the list. If the mask has the bit in the position j set, then that entry in the initial list is added to the new nested list.

         if ((mask & pos) == pos) { nestedList.Add(initialList[j]); }
      }

Finally, I add the new nested list to the lists that will be returned, and finish out the loops.

      ret.Add(nestedList);
   }
   return ret;
}
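
A quick usage sketch, assuming the extension method is in scope (the item values are arbitrary):

var items = new List<string> { "A", "B", "C", "D" };
var combos = items.GetAllCombos();

Console.WriteLine(combos.Count);                  // 16 – every subset, including the empty set
Console.WriteLine(string.Join(",", combos[5]));   // A,C – mask 5 is binary 0101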

Questions?
