EventSource: A New Way to Trace
Module 3 Highlights
This is the next installment in my series of high points in .NET 4.5, which parallels my Pluralsight course “What’s New in .NET 4.5”. I think the course offers important insight and details into the release, and if you’ve already watched it, this will give a second explanation of the most important features. If you don’t watch it, you can follow along with the most important features I cover.
First two installments of the course highlights are here and here.
Tracing becomes more and more critical as our applications move onto remote servers and into the cloud. Even with other tools available, desktop applications still need tracing to determine what the user was doing when a problem appears. Debug mode is rarely an option in production.
Microsoft’s answer is high speed unified tracing within the operating system: Event Tracing for Windows. ETW has been around since Vista, so it’s hardly a news flash. What’s new in .NET 4.5 is that ETW goes from being a royal pain to use to being one of the most simple and elegant features of .NET. This change makes ETW support and the new EventSource class is the sleeper feature of .NET 4.5 – the most important feature you never heard of.
So, let me explain, and hopefully inspire you to completely rethink your tracing strategy. Warning, if you’re fantasies get too carried away, I’ll knock a few over when I cover shortcomings near the end of the post.
While I won’t cover it hear, the Semantic Logging Block Patterns and Practices is using unifies ETW with traditional trace output locations and plus a couple of holes. So, if I get your attention, you may also want to explore what they’ve done – same underlying tools and syntax, with some important additions.
ETW is a high performance tracing system built into the operating system. It’s a fully composite system with some extreme extensibility options. There are four moving parts – ETW itself, providers, controllers and consumers. Because it’s a completely interchangeable system, not only can you come up with your own providers, controller and consumers to support ETW, you can replace ETW with something else filling it’s role – allowing alternate tracing options such as Silverlight support. That’s amazing, but the important parts to .NET 4.5 are providers, consumers, and controllers.
Consumers are important because the OS creates the smallest possible binary output to meet its goal of not hurting performance with tracing. There are a number of task specific consumers, including Visual Studio tooling, such as WCF tools and the Concurrency Visualizer. A good general purpose tool to consider is PerfView. And if you want, or need, to write your own tooling, you can use PerfView’s underlying libraries to access the binary stream. Or, if you need to party on raw trace events, you can get the binary stream transformed into XML output.
Controllers turn ETW on and off. Gone are the days of stopping your server, changing a configuration setting, and restarting the server – only to repeat in order to turn tracing off. Controllers communicate to providers whether they are on or off, and can do this to running applications. Tracing can be turned on before your application starts, as part of startup, or as the application runs. Controllers also communicate filtering information such as severity level to providers.
Providers create trace events. The operating system, .NET and other libraries are already fully on board. Right now you can fire up PerfView and get information on your application, including crashes. The purpose of the new EventSource class is to let you add events to this holistic stream.
Providers create trace events, controllers turn tracing on and off, and consumers do something with the resulting trace. Central to the design is a stream of trace events (actually a few streams, called channels, see more below). It doesn’t matter if the trace event is created by your application, the operating system, the .NET framework, WCF, or anything else.
This creates a fundamentally different approach to tracing because you don’t have much work to do. Almost everything you want to know is already in the trace system.
A New Way of Thinking about Tracing
If you’ve done tracing before, you probably worked to create a reasonably complete picture of what was happening in your application - you took responsibility for providing all of the relevant information. That’s changed. Since other providers are already dumping information into ETW you can think of the trace stream as a river with all of the information you can imagine flowing passed.
Your job is simply to complete the picture. The river has no context without your help. The surface is just a series of waves with no relation to what’s actually happening. Throw in key buoys about what you’re doing, and suddenly the part of the river adjacent to your buoy makes sense.
The markers you throw out can contain information like the current record. But remember the earlier caveat – what you put into the ETW stream with EventSource will be visible to whomever asks for it. You can’t put in sensitive information. Let me say that again – do not put any information into the ETW stream that can’t be seen by people with access to your servers.
Strong Typing and Manifests
One of the important ways ETW differs from some other approaches to tracing is that the information you provide is strongly typed. If you record primary keys and tables, you can find a trace vent with a particular primary key from a particular table. You can then watch that thread or process for information on the process that occurs.
To keep tracing blazingly fast, the operating system creates the trace as a binary stream of goo. Turning goo into strongly typed events requires en event specific definition, which is called a manifest in ETW.
Prior to .NET 4.5 you could create your own manifest or skip the strongly typed benefits of ETW. It’s icky work to create your own manifest and you have to install it on the machine that interprets the trace. This is really problematic if you version your trace event structure (and trust me, you will) because the machine may need to interpret traces from many other machines.
.NET 4.5 and the EventSource class creates the manifest for you and sidesteps manifest issues by using an “in-line manifest.” That’s just another event that contains the manifest. As long as the consumer knows how to grab that special manifest event and use the attached manifest (and PerfView does) all your manifest problems are solved.
The ETW goodness in .NET 4.5 is wrapped up in the EventSource class. The second best source of information on ETW (I like my Pluralsight video) is Vance Morrison’s blog posts, such as this.
While Vance’s posts do a great job, and this post is important, this particular post doesn’t mention that you should probably define your event id’s and provider name. Before you add tracing to your own application, be sure to understand those two features.
It’s Not All Joy – Bad Tooling
I said I might knock over your fantasies a bit. The biggest problem with ETW is a lack of good tooling. Please help pressure the Visual Studio team to supply an awesome consumer and at least a decent controller. Interpreting traces is hard and we don’t all need to work out the tricks.
PerfView is the best tool I know of now for the things you’ll need to do. But imagine Microsoft releasing the tool one dev made for their own use and it sort of spiraled into general use by those brainiac softies, and then (thank goodness) they released it to use. You won’t be too far off the mark. But from that you might guess that the important parts are open source, so you can use the underlying library to build your own tool. You’ll find the TraceEvent library on the BCL team’s CodePlex site.
It’s Not Complete – Channels, etc.
The other issue with ETW right now is hard to complain about. Once they got to the edges of what they needed and problems they understood – largely around profiling the .NET framework itself – they slowed down. I don’t think this is laziness, but an honest desire to get it right. I think the most obvious place you’ll see this is the lack of channel support.
ETW supports four channels. While you can define your own, I haven’t yet seen a good use case. The well-known channels are Admin, Operational, Analytic and Debug. EventSource tracing aims at the last – events by devs for devs. Your IT folks (and probably you as well) use the Windows Event Viewer to gauge system healthy. At present, trace events you create with the EventSource class do not show up in Event Viewer because they aren’t admin events.
I was initially appalled by this omission, annoyed that it felt like they ran out of time. But I’ve evolved my thinking. Event Viewer and trace events for the purpose of system health monitoring and warm fuzzies for your IT staff, are inherently different than profiling and debugging information (which overlap). Specifically, you want bells and whistles for filtering and you want to filter extra events at the point of consumption (assuming they aren’t so extreme that they affect performance).
Your IT folks probably don’t see it that way. They probably want your apps to tell them what they need to know when they need to know it. “Just the facts, ma’am.” And maybe strong typing doesn’t matter very much. Let’s start a conversation about the best ways to get information into the admin channel.
Semantic Logging Application Block
I’m just getting into this, but I’m excited about what I’m hearing about the Semantic Logging Application Block (other than its name, and it’s even worse acronym, but I’ll survive). You can get info on the CTP here and here.
I’m really happy that P&P is building on the good work of the BCL team and building on it in ways that make sense to business programmers. What most catches my eye is that you’re EventSource derived classes can be used to write to ETW or more traditional trace listeners. I hope to have time to explore the Semantic Logging Application Block in the coming weeks and post a more thorough discussion.