Windows 7 64bit, Outlook 2010 64bit, Conferencing Addin 64bit, Macbook Pro 64bit

I am a 64-bit freak. I got the 64-bit versions of Windows 7, Outlook 2010 and the Conferencing Add-in all working on a MacBook Pro. If you are thinking about moving to 64-bit but are worried that something will break, GO AHEAD! MacBook Pro hardware and Microsoft’s software are the best combination out there. You will enjoy every moment you spend with your laptop. I have also tried this combination on an HP Tablet PC, a Sony VAIO, a Dell Inspiron and a Dell Vostro. The HP works best; the others struggle with driver issues.

Here is my positive and negative feedback on the apps I have tried so far:

Outlook 2010 64 bit:


Here is my negative feedback. Outlook Product Manager, please read this. I am a hardcore Outlook customer of yours.

  • All my Outlook COM add-ins are dead. Outlook 2010 64-bit does not support them, so backward compatibility is not great.
  • No significant improvement with Exchange 2007. Startup time has improved from about 5 seconds to 2 seconds, but that is not a big saving since I start Outlook once and it keeps running for days, until my PC is so screwed up that I need a restart.
  • Office Communicator 2005 does not work.
  • The beta Office 2010 applications are CPU hungry. I see 30% to 40% CPU usage most of the time.
  • It took over 30 hours before Outlook 2010 started to perform well. All that time it was indexing, indexing and indexing, and burning CPU.
  • There’s nothing groundbreaking or productivity enhancing in Outlook 2010 yet. After using it for a couple of days, I don’t see anything attractive enough to justify the time a busy professional spends upgrading. And it is not really an upgrade at this stage: you have to uninstall all Office 2007 or earlier products, add-ins etc. and then install Outlook 2010.
  • Outlook keyboard shortcuts have changed and I am having a hard time adjusting. My precious Alt+L for Reply to All is gone. Now it’s Ctrl+Shift+R. Come on guys, when do you just Reply and not Reply to All? I barely remember ever using plain Reply. It’s always Reply to All. Can’t you make an easier shortcut for this?
  • Keyboard focus sometimes gets lost in some weird place and my cursor-key navigation breaks. I have to click with the mouse to get back on track.
  • Quick Steps are kind of limited. For example, “Reply & Delete”: who would want to press CTRL+SHIFT+1 to reply and delete? It’s more natural to press Ctrl+R to reply, send it and then hit DEL. I was hoping I would be able to chain multiple commands: open a new message window, select a specific account to send the mail from, select a specific signature and, after the mail is sent, show the move dialog box to move the conversation to a specific folder. Nope, it does not work this way. First of all, the available commands are limited and do not even cover this. Secondly, all the actions are performed instantly one after another, without waiting for the previous action to complete.
  • Quick Steps cannot be added to the Quick Access Toolbar. Go figure!


Now the good things:

  • The overall Outlook experience is smooth. Opening a new mail, typing addresses, searching, moving messages and viewing a folder in conversation view are all significantly faster, even with Exchange. It’s hard to say whether that’s due to the fully 64-bit environment or to the fact that none of my COM add-ins are working.
  • Outlook exits. Finally! None of the previous Outlook versions would terminate the process when I exited Outlook. It remained in memory forever unless I killed it from Task Manager. Now Outlook really closes, or at least kills itself, when I exit. However, whenever I exit Outlook and start it again, I see it doing some data integrity check. This suggests it is not really closing properly, but killing itself. I assume that’s bad and that my data in Outlook is slowly getting messed up.
  • The conversation view is great!
  • The inline appointment viewer is a life saver. When I get an appointment invite, the email preview shows a small view of the calendar around the meeting time. I can see if I am occupied or if there’s an available time before or after the meeting. This saves me a lot of time every day, as rescheduling meetings is a tedious job in my company: it takes around 4 to 7 rescheduling attempts to find a suitable time slot in everyone's diary for every darn meeting.
  • Quick Steps are more or less useful. I am getting used to using CTRL+SHIFT+1 to “reply to all and delete” and CTRL+SHIFT+2 for “reply to all and move to folder”. You just have to configure the Quick Steps to suit you. Previously I used the QuickFile add-in, which was a super useful tool, worth paying $39.95 for.

OneNote 2010 64 bit

The UI is certainly much slicker. It really looks and feels like a notebook now. Sketching performance has improved.

However, there is one big bug. I was sketching and suddenly my pointer switched from pen to selection mode. All pen options were disabled. I tried exiting and coming back. Nope. I can’t go back to pen mode at all. I am using a Genius tablet. It looks like OneNote is only Tablet PC friendly. I hope Apple makes a tablet MacBook Pro soon.

Word 2010 64 bit

Haven’t used it much. The ribbons are as confusing as before. The File menu is even more confusing now. There are no new shape styles that make Word documents stand out from the rest, and no new SmartArt worth mentioning. Overall: disappointing.

The print features are much improved!

PowerPoint 2010 64 bit

Sadly, I did not notice any significant new features in PowerPoint. The ribbon has been made more useful than before. There are “Transitions” and “Animations” ribbon tabs which are very useful and save time when putting animations in slides. But that’s all I could see from my limited trial. This is disappointing. I was expecting a richer collection of shapes that look really cool and make presentations look like Web 2.0 sites, and a lot of new SmartArt, but nothing.


Visio 2010 64 bit

The UML diagram designer is as crappy as ever. Come on Microsoft, look at the other UML designers and learn from them. Currently Visio is my last choice for UML design, and it makes my work life unhappy because my company forces me to use it. I use PlantUML wherever I can.

I don’t see any amazingly cool new diagrams either. I was hoping the Detailed Network Diagram stencil would be much improved, with smooth, round, glossy servers, cool-looking router icons etc. But no luck. The new ribbon interface is as confusing as in the other Office applications.

Conclusion

So far I can see significant improvement in Outlook only. Other apps do not have anything that stands out.

Posted by omar with 3 comment(s)

Unit Testing and Integration Testing in real projects

I am yet to find a proper sample of how to do realistic Test Driven Development (TDD) and how to write proper unit tests for complex business applications, tests that give you enough confidence to stop doing manual testing. Generally the samples show you how to test a Stack or a LinkedList, which is far simpler than testing a typical N-tier application, especially if you are using Entity Framework, Linq to SQL or some other ORM in the data access layer, and doing logging, validation, caching and error handling in the middle tier. There are many articles, blog posts and video tutorials on how to write unit tests, which I believe are all very good starting points. But all these examples show you basic tests, not good enough to let your QA team go. So, let me try to show you some realistic unit and integration test examples which should help you write tests that give you confidence and help you gradually move towards TDD.

I will show you tests done on my open source project Dropthings, which is a Web 2.0 AJAX portal built using jQuery, ASP.NET 3.5, Linq to SQL, Dependency Injection using Unity, caching using Microsoft Enterprise Library, Velocity and so on. Basically all the hot technologies you can grasp in one shot. The project is a typical N-tier application with a web layer, a business layer and a data access layer. Writing unit tests, integration tests and load tests for this project was challenging, and thus interesting to share, so that you can see how to implement unit testing and integration testing in a real project and gradually get into Test Driven Development.


Read this CodeProject article of mine to learn how I did integration tests and unit tests using a Behavior Driven Development approach:

Unit Testing and Integration Testing in business applications

http://www.codeproject.com/KB/testing/realtesting.aspx

If you like it, please vote for me.

Posted by omar with 3 comment(s)

Simple way to cache objects and collections for greater performance and scalability

Caching frequently used data greatly increases the scalability of your application, since you avoid repeated queries to the database, the file system or web services. When objects are cached, they can be retrieved from the cache, which is a lot faster and more scalable than loading them from a database, file or web service. However, implementing caching is tricky and monotonous when you have to do it for many classes. Your data access layer gets a whole lot of code that deals with caching objects and collections, updating the cache when objects change or get deleted, expiring collections when a contained object changes or gets deleted and so on. The more code you write, the more maintenance overhead you add. Here I will show you how you can make caching a lot easier using Linq to SQL and my library AspectF. It’s a library that helps you get rid of thousands of lines of repeated code in a medium sized project and eliminates plumbing code (logging, error handling, retrying etc.) completely.

Here’s an example of how caching significantly improves the performance and scalability of applications. Dropthings, my open source Web 2.0 AJAX portal, can serve only about 11 requests/sec without caching, with 10 concurrent users on a dual core 64-bit PC. Here data is loaded from the database as well as from external sources. Average page response time is 1.44 sec.

Load Test Without Cache

After implementing caching, throughput rose to around 32 requests/sec, almost three times as much. Average page response time dropped as well, from 1.44 sec to only 0.41 sec. During the load test, CPU utilization was around 60%.

Load Test with in memory cache

This clearly shows the difference caching can make to your application. If you are suffering from poor page load performance and high CPU or disk activity on your database and application servers, then caching the five most frequently used objects in your application is likely to relieve that problem right away. It’s a much quicker win than doing complex re-engineering of your application.

Common approaches to caching objects and collections

Sometimes caching is simple, for example when you cache a single object which does not belong to a collection and does not have child collections that are cached separately. In such cases, the logic is simple:

  • Is the object being requested already in cache?
    • Yes, then serve it from cache.
    • No, then load it from database and then cache it.
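
The steps above can be sketched in a few lines of C#. This is only a minimal illustration: SimpleCache and CacheAside are hypothetical names made up for this sketch, not the ICache interface or any helper from Dropthings or AspectF.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory cache, used only for this sketch.
public class SimpleCache
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    // Returns the cached value, or null when the key is not in cache.
    public object Get(string key)
    {
        object value;
        return _items.TryGetValue(key, out value) ? value : null;
    }

    public void Add(string key, object value)
    {
        _items[key] = value;
    }
}

public static class CacheAside
{
    // Serves the object from cache when it is there; otherwise loads it
    // from the source (for example the database) and caches it.
    public static T GetOrLoad<T>(SimpleCache cache, string key, Func<T> loadFromSource)
        where T : class
    {
        var cached = cache.Get(key) as T;
        if (cached != null)
            return cached;              // Yes: serve it from cache.

        T loaded = loadFromSource();    // No: load it from the source...
        if (loaded != null)
            cache.Add(key, loaded);     // ...and cache it for the next request.
        return loaded;
    }
}
```

A call such as CacheAside.GetOrLoad(cache, "User.42", () => LoadUserFromDatabase(42)) would hit the database only on the first request; LoadUserFromDatabase is again just a placeholder name.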

On the other hand, when you are dealing with a cached collection where each item in the collection is also cached separately, the caching logic is not so simple. For example, say you have cached a User collection. But each User object is also cached separately because you need to load individual User objects frequently. Then the caching logic gets more complicated:

  • Is the collection being requested already in cache?
    • Yes. Get the collection. For each object in the collection:
      • Is that object individually available in cache?
        • Yes, get the individual object from cache. Update it in the collection.
        • No, discard the whole collection from cache. Go to next step:
    • No. Load the collection from source (eg database) and cache each item in the collection separately. Then cache the collection.

You might be wondering why we need to read each individual item from the cache, and why we need to cache each item in the collection separately, when the whole collection is already in cache. There are two scenarios you need to address when you cache a collection and the individual items in that collection are also cached separately:

  • An individual item has been updated and the updated item is in cache. But the collection, which contains all those individual items, has not been refreshed. So, if you get the collection from cache and return it as it is, you will get stale individual items inside that collection. This is why each item needs to be retrieved from cache separately.
  • An item in the collection may have been force expired in cache. For example, something changed in the object or the object has been deleted, so you expired it in cache so that on the next retrieval it comes from the database. If you load the collection from cache only, then the collection will contain the stale object.
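
The write side of these two scenarios might look like the sketch below. It assumes a hypothetical PageCache class with made-up key names; it is not how Dropthings or AspectF actually store things. The point is that writers only touch the individual item's entry, which is why a reader of the collection has to check every item by its own key.

```csharp
using System;
using System.Collections.Generic;

public class Page
{
    public int ID { get; set; }
    public string Title { get; set; }
}

// Hypothetical cache wrapper for this sketch only.
public class PageCache
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    // Hypothetical key format for individually cached pages.
    public static string PageKey(int pageId) { return "Page." + pageId; }

    public object Get(string key)
    {
        object value;
        return _items.TryGetValue(key, out value) ? value : null;
    }

    public void Set(string key, object value) { _items[key] = value; }

    // Scenario 1: the item was updated. Only its own cache entry is refreshed;
    // any cached collection still holds the old instance until a reader
    // swaps in the fresh one by looking it up under PageKey.
    public void OnPageUpdated(Page updated)
    {
        Set(PageKey(updated.ID), updated);
    }

    // Scenario 2: the item was changed elsewhere or deleted. Its entry is
    // force expired, so a reader of the collection finds it missing and
    // knows the whole collection must be reloaded from the database.
    public void ExpirePage(int pageId)
    {
        _items.Remove(PageKey(pageId));
    }
}
```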

If you do it the conventional way, you will write a lot of repeated code in your data access layer. For example, say you are loading a Page collection that belongs to a user. If you want to cache the collection of Page objects for a user, and also cache individual Page objects so that each Page can be retrieved from the cache directly, then you need to write code like this:

public List<Page> GetPagesOfUserOldSchool(Guid userGuid)
{
    ICache cache = Services.Get<ICache>();
    string cacheKey = CacheSetup.CacheKeys.PagesOfUser(userGuid);
    var cachedPages = cache.Get(cacheKey) as List<Page>;
    // A collection that is missing from the cache is handled like a stale one:
    // both are reloaded from the database below.
    bool isCacheStale = (cachedPages == null);
    if (cachedPages != null)
    {
        var resultantPages = new List<Page>();
        // If any item in the collection is no longer in cache, invalidate the collection
        // and load again.
        foreach (Page cachedPage in cachedPages)
        {
            var individualPageInCache = cache.Get(CacheSetup.CacheKeys.PageId(cachedPage.ID)) as Page;
            if (null == individualPageInCache)
            {
                // Some item is missing in cache. So, the collection is stale. 
                isCacheStale = true;
            }
            else
            {
                resultantPages.Add(individualPageInCache);
            }
        }

        cachedPages = resultantPages;
    }

    if (isCacheStale)
    {
        // Collection is missing or stale. Need to load collection from database and then cache it.
        var pagesOfUser = _database.GetList<Page, Guid>(...);
        pagesOfUser.Each(page =>
        {
            page.Detach();
            cache.Add(CacheSetup.CacheKeys.PageId(page.ID), page);
        });
        cache.Add(cacheKey, pagesOfUser);
        return pagesOfUser;
    }
    else
    {
        return cachedPages;
    }
}

Imagine writing this kind of code over and over again for each and every entity that you want to cache. This becomes a maintenance nightmare as your project grows.

Here’s how you could do it using AspectF:

public List<Page> GetPagesOfUser(Guid userGuid)
{
    return AspectF.Define
        .CacheList<Page, List<Page>>(Services.Get<ICache>(),
            CacheSetup.CacheKeys.PagesOfUser(userGuid),
            page => CacheSetup.CacheKeys.PageId(page.ID))
        .Return<List<Page>>(() =>
            _database.GetList<Page, Guid>(...).Select(p => p.Detach()).ToList());
}

Instead of more than 40 lines of code, you can do it in a single statement!

Read my article Simple way to cache objects and collections for greater performance and scalability on CodeProject and learn:

  • Caching Linq to SQL entities
  • Handling update and delete scenarios
  • Expiring dependent objects and collections in cache
  • Handling objects that are cached with multiple keys
  • Avoiding database query optimizations when you cache sets of data

Enjoy. Don’t forget to vote for me!

Posted by omar with 8 comment(s)

7 tips for loading JavaScript rich Web 2.0-like sites significantly faster

Introduction

When you create a rich Ajax application, you use external JavaScript frameworks alongside your own homemade code that drives the application. The problem with well-known JavaScript frameworks is that they offer a rich set of features, not all of which you need. You may end up using only 30% of jQuery, but you still download the full jQuery framework. So, you are downloading 70% unnecessary script. Similarly, you might have written your own JavaScript that is not always used. There might be features that are not used when the site loads for the first time, resulting in unnecessary downloads during the initial load. Initial loading time is crucial: it can make or break your website. We did some analysis and found that for every 500ms we added to the initial load, we lost approximately 30% of our traffic: visitors who never waited for the whole page to load and just closed the browser or went away. So, shaving even a couple of hundred milliseconds off the initial loading time is crucial for the survival of a startup, especially if it’s a rich AJAX website.

You may have noticed Microsoft’s new tool Doloto, which helps solve the following problem:

Modern Web 2.0 applications, such as GMail, Live Maps, Facebook and many others, use a combination of Dynamic HTML, JavaScript and other Web browser technologies commonly referred as AJAX to push page generation and content manipulation to the client web browser. This improves the responsiveness of these network-bound applications, but the shift of application execution from a back-end server to the client also often dramatically increases the amount of code that must first be downloaded to the browser. This creates an unfortunate Catch-22: to create responsive distributed Web 2.0 applications developers move code to the client, but for an application to be responsive, the code must first be transferred there, which takes time.

Microsoft Research looked at this problem and published a research paper in 2008, showing how much the initial load can be improved if a JavaScript framework is split into two parts: a primary part that is absolutely essential for the initial rendering of the page, and an auxiliary part that is not essential for the initial load and can be downloaded later, or on demand when the user performs some action. They looked at my earlier startup Pageflakes and reported:

2.2.2 Dynamic Loading: Pageflakes
A contrast to Bunny Hunt is the Pageflakes application, an industrial-strength mashup page providing portal-like functionality. While the download size for Pageflakes is over 1 MB, its initial execution time appears to be quite fast. Examining network activity reveals that Pageflakes downloads only a small stub of code with the initial page, and loads the rest of its code dynamically in the background. As illustrated by Pageflakes, developers today can use dynamic code loading to improve their web application’s performance. However, designing an application architecture that is amenable to dynamic code loading requires careful consideration of JavaScript language issues such as function closures, scoping, etc. Moreover, an optimal decomposition of code into dynamically loaded components often requires developers to set aside the semantic groupings of code and instead primarily consider the execution order of functions. Of course, evolving code and changing user workloads make both of these issues a software maintenance nightmare.

Back in 2007, I was looking at ways to improve the initial load time and reduce user dropout. The number of users who would not wait for the page to load and went away was growing day by day as we introduced new and cool features. It was a surprise. We thought new features would keep more users on our site, but the opposite happened. Analysis concluded that the longer initial loading time drove away more users than the new features retained. So, all our hard work was essentially going down the drain and we had to come up with something groundbreaking to solve the problem. Of course we had already tried all the basic stuff (IIS compression, browser caching, on-demand loading of JavaScript, CSS and HTML when the user does something, deferred JavaScript execution) but nothing helped. The third-party frameworks and our own hand-coded framework were just too large. Then the idea struck me: what if we could load the functions inside a class in two steps? The first step loads the class with only the absolutely essential functions, and the second step injects more functions into the existing class.

I published a CodeProject article which shows you 7 tricks to significantly improve page load time even if you use a large amount of JavaScript on the page.

7 Tips for Loading JavaScript Rich Web 2.0-like Sites Significantly Faster

  1. Use Doloto
  2. Split a Class into Multiple JavaScript Files
  3. Stub the Functions Which Aren't Called During Initial Load
  4. JavaScript Code in Text
  5. Break UI Loading into Multiple Stages
  6. Always Grow Content from Top to Bottom, Never Shrink or Jump
  7. Deliver Browser Specific Script from Server

If you like these tricks, please vote for me!

Posted by omar with 9 comment(s)

Windows 7 64bit works!

Windows 7 64bit finally works! This is the first 64-bit OS I could really use in my daily activities. I tried Vista 64bit and it was unreliable. It would show a blue screen right when I was about to make a presentation to the CEO. Until Microsoft released SP1, Vista 64bit was not usable at all. Then came the Windows 7 beta. I immediately tried the 64-bit version. It was even worse than Vista. It would crash every now and then: waking up from standby, sharing in Live Meeting, switching screens, plugging in external USB drives and what not. So, I patiently waited for the final version to come out before installing it on all my laptops. Happy to say, the final version works perfectly on the HP tx2000 Tablet PC, DELL Vostro 1500 and DELL Inspiron 1520. Once you do a full Windows Update and install some drivers here and there, it all works perfectly. And let me say, Windows 7 is beautiful. I have rediscovered the joy of working on computers!

Working on a 64-bit operating system is challenging. You don’t always find the right printer driver. Your cool external USB speakers won’t work, even if they are made by Microsoft. And above all, there’s that C:\Windows\Winsxs folder which keeps growing forever. By the time I was done with Vista 64bit (after approximately two years of use), my Winsxs folder was a staggering 26 GB, eating up every last bit of my C: partition. I had no choice but to format and start over. It seems this folder keeps a copy of every single DLL version it ever sees. The more Windows updates I install, the larger it gets. Now on a fresh Windows 7 installation, after installing VS 2008, Office applications, Windows Live applications and some handy tools, the Winsxs folder is 5.62 GB. Let’s see how it grows over the year. A useful tip for 64-bit wannabes: make sure your C: partition is at least 60 GB. I installed Windows 7 64bit just 3 days ago and it has already taken 31 GB of space.

image

Since I am doing a totally useless post, let me sprinkle some productivity tips on it before you lose interest reading my blog.

I realized I do a lot of context switching. I get over 200 mails per day, so I pretty much switch focus from Visual Studio or the browser to Outlook once every minute, which is a big concentration killer. So, I tried the above setup on my 25” screen and it works great!

The left half of the screen is Visual Studio and the right half shows Outlook and my todo list. As you can see, I can watch the emails coming into Outlook without ever switching. The Visual Studio window is just wide enough to read code without scrolling horizontally. The bottom right of the screen shows my todo list, so that I am always doing the right task from the list and not wandering around aimlessly. If I browse, I bring up the browser on top of Visual Studio and keep the right half the same, so that while browsing I am not missing important mails and I still have an eye on my next actions.

I have been using Toodledo for a year. I love it! It has a great iPhone app, which is the only reason I use Toodledo and not one of the alternatives. The Ajax interface is slick, especially when you use Google Chrome to make a desktop application out of it. You can turn on keyboard shortcuts, and then Toodledo inside Google Chrome’s application-like view becomes the best web-based todo list application out there. Whenever I file a task, I hit ‘n’, enter the task title, press tab, 1/2 for priority, hit enter and I am done. How convenient! Especially when I read mails and file actionable tasks at least 40 to 60 times per day.

Posted by omar with 8 comment(s)

AspectF fluent way to put Aspects into your code for separation of concern

Aspects are common features that you write every now and then in different parts of your project. It can be some specific way of handling exceptions in your code, or logging method calls, or timing the execution of methods, or retrying some methods and so on. If you are not using an Aspect Oriented Programming framework, then you are repeating a lot of similar code throughout the project, which makes your code hard to maintain. For example, say you have a business layer where methods need to be logged, errors need to be handled in a certain way, execution needs to be timed, database operations need to be retried and so on. So, you write code like this:

public bool InsertCustomer(string firstName, string lastName, int age, 
    Dictionary<string, string> attributes)
{
    if (string.IsNullOrEmpty(firstName)) 
        throw new ApplicationException("first name cannot be empty");

    if (string.IsNullOrEmpty(lastName))
        throw new ApplicationException("last name cannot be empty");

    if (age < 0)
        throw new ApplicationException("Age must not be negative");

    if (null == attributes)
        throw new ApplicationException("Attributes must not be null");

    // Log customer inserts and time the execution
    Logger.Writer.WriteLine("Inserting customer data...");
    DateTime start = DateTime.Now;

    try
    {
        CustomerData data = new CustomerData();
        bool result = data.Insert(firstName, lastName, age, attributes);
        if (result == true)
        {
            Logger.Writer.Write("Successfully inserted customer data in " 
                + (DateTime.Now-start).TotalSeconds + " seconds");
        }
        return result;
    }
    catch (Exception x)
    {
        // Try once more, may be it was a network blip or some temporary downtime
        try
        {
            CustomerData data = new CustomerData();
            bool result = data.Insert(firstName, lastName, age, attributes);
            if (result == true)
            {
                Logger.Writer.Write("Successfully inserted customer data in " 
                    + (DateTime.Now-start).TotalSeconds + " seconds");
            }
            return result;
        }
        catch 
        {
            // Failed on retry, safe to assume permanent failure.

            // Log the exceptions produced
            Exception current = x;
            int indent = 0;
            while (current != null)
            {
                string message = new string(Enumerable.Repeat('\t', indent).ToArray())
                    + current.Message;
                Debug.WriteLine(message);
                Logger.Writer.WriteLine(message);
                current = current.InnerException;
                indent++;
            }
            Debug.WriteLine(x.StackTrace);
            Logger.Writer.WriteLine(x.StackTrace);

            return false;
        }
    }

}

Here you see that the two lines of real code, which insert the Customer by calling a class, are hardly visible among all the concerns (logging, retrying, exception handling, timing) you have to implement in the business layer. There’s validation, error handling, caching, logging, timing, auditing, retrying, dependency resolving and what not in business layers nowadays. The more a project matures, the more concerns get into your codebase. So, you keep copying and pasting boilerplate code and write the tiny amount of real stuff somewhere inside that boilerplate. What’s worse, you have to do this for every business layer method. Say you now want to add an UpdateCustomer method to your business layer. You have to copy all the concerns again and put the two lines of real code somewhere inside that boilerplate.

Think of a scenario where you have to make a project wide change on how errors are handled. You have to go through all the hundreds of business layer functions you wrote and change it one by one. Say you need to change the way you time execution. You have to go through hundreds of functions again and do that.

Aspect Oriented Programming solves these challenges. When you are doing AOP, you do it the cool way:

[EnsureNonNullParameters]
[Log]
[TimeExecution]
[RetryOnceOnFailure]
public void InsertCustomerTheCoolway(string firstName, string lastName, int age,
    Dictionary<string, string> attributes)
{
    CustomerData data = new CustomerData();
    data.Insert(firstName, lastName, age, attributes);
}

Here you have separated the common stuff like logging, timing, retrying and validation, formally called ‘concerns’, completely out of your real code. The method is nice and clean, and to the point. All the concerns are taken out of the body of the function and attached to the function using Attributes. Each Attribute here represents one Aspect. For example, you can add the Logging aspect to any function just by adding the Log attribute. Whatever AOP framework you use, the framework ensures the Aspects are woven into the code either at build time or at runtime.

There are AOP frameworks which weave the Aspects in at compile time using post-build events and IL manipulation (e.g. PostSharp), some do it at runtime using DynamicProxy, and some require your classes to inherit from ContextBoundObject in order to support Aspects using C# built-in features. All of these have some barrier to entry: you have to justify using an external library, do enough performance testing to make sure the library scales and so on. What you need is a dead simple way to achieve “separation of concerns”, which may not be full-blown Aspect Oriented Programming. Remember, the purpose here is to separate concerns and keep the code nice and clean.

So, let me show you a dead simple way of separating concerns: standard C# code, no Attribute or IL manipulation black magic, just simple calls to classes and delegates, which still achieves a nice separation of concerns in a reusable and maintainable way. Best of all, it’s light, just one small class.

public void InsertCustomerTheEasyWay(string firstName, string lastName, int age,
    Dictionary<string, string> attributes)
{
    AspectF.Define
        .Log(Logger.Writer, "Inserting customer the easy way")
        .HowLong(Logger.Writer, "Starting customer insert",
            "Inserted customer in {1} seconds")
        .Retry()
        .Do(() =>
        {
            CustomerData data = new CustomerData();
            data.Insert(firstName, lastName, age, attributes);
        });
}

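To give a rough feel for how a fluent chain like this can be built from plain delegates, here is a minimal sketch. MiniAspect and everything in it are hypothetical names for illustration only; this is not the AspectF source, and the real library may well work differently. Each call stores a wrapper around the work, and Do composes the wrappers so that the first declared aspect runs outermost.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical mini version of a fluent aspect chain (NOT the AspectF source).
public class MiniAspect
{
    // Each aspect takes the work to run and returns the wrapped work.
    private readonly List<Func<Action, Action>> _wrappers = new List<Func<Action, Action>>();

    // Starts a new, empty chain.
    public static MiniAspect Define
    {
        get { return new MiniAspect(); }
    }

    // Writes a message before running the wrapped work.
    public MiniAspect Log(TextWriter writer, string message)
    {
        _wrappers.Add(work => () => { writer.WriteLine(message); work(); });
        return this;
    }

    // Runs the wrapped work and tries once more if it throws.
    public MiniAspect Retry()
    {
        _wrappers.Add(work => () =>
        {
            try { work(); }
            catch (Exception) { work(); } // A second failure propagates to the caller.
        });
        return this;
    }

    // Composes all aspects around the real work and executes it.
    public void Do(Action work)
    {
        Action wrapped = work;
        // Wrap from the last declared aspect inwards, so that the first
        // declared aspect ends up outermost.
        for (int i = _wrappers.Count - 1; i >= 0; i--)
            wrapped = _wrappers[i](wrapped);
        wrapped();
    }
}
```

With this sketch, MiniAspect.Define.Log(Console.Out, "Inserting customer").Retry().Do(() => { /* real work */ }) writes the message once and then runs the work, repeating it a single time if the first attempt throws.
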
If you want to read in detail how it works and how it can save you hundreds of hours of repetitive coding, read on:

AspectF fluent way to add Aspects for cleaner maintainable code

If you like it, please vote for me!

Posted by omar with 15 comment(s)

ASP.NET AJAX testing made easy using Visual Studio 2008 Web Test

Visual Studio 2008 comes with rich Web Testing support, but it is not rich enough to test highly dynamic AJAX websites, where page content is generated from a database and the same page produces different output depending on an external data source such as an RSS feed. You can use the Web Test Record feature to record browser actions in a real browser and then play them back, but if the page changes every time you visit it, the recorded test no longer works as expected. The problem is that a recorded Web Test stores the generated ASP.NET Control IDs and form field names inside the test. If the page stops producing the same Control IDs or form fields, the recorded test breaks.

A very simple example: in a VS Web Test you can say “click the button with ID ctrl00_UpdatePanel003_SubmitButton002”, but you cannot say “click the 2nd Submit button inside the third UpdatePanel”. Another key limitation is that you cannot address controls by their server-side Control ID, like “SubmitButton”. You always have to use the generated Client ID, which is something weird like “ctrl_00_SomeControl001_SubmitButton”. Moreover, recorded tests do not work properly when an AJAX call returns some JSON or updates an UpdatePanel and, based on the server response, you want to make further AJAX calls or post the refreshed UpdatePanel.

You *do* have the option to hand-code tests that handle such scenarios, but that is pretty difficult when you are using UpdatePanels, because you have to keep track of the page viewstate, hidden form variables and so on across async postbacks. So, I have built a library that makes it significantly easier to test dynamic AJAX websites and UpdatePanel-rich web pages. The library provides several ExtractionRule and ValidationRule classes that make it easy to test cookies, response headers and JSON output, discover all the UpdatePanels in a page, and find controls in the response body or inside a particular UpdatePanel.
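To make the “address a control by its server-side ID” idea concrete, here is a minimal sketch. It is not the library’s API: ControlFinder, FindClientIds and the sample HTML are hypothetical names made up for illustration, and the regular expression only assumes that the generated Client ID ends with the server-side ID after an underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the web test library described above.
public static class ControlFinder
{
    // Returns, in document order, every id="..." value in the HTML
    // whose last segment equals the given server-side control ID.
    public static List<string> FindClientIds(string html, string serverId)
    {
        var ids = new List<string>();
        string pattern = @"\sid=""((?:[^""]*[_$])?" + Regex.Escape(serverId) + @")""";
        foreach (Match match in Regex.Matches(html, pattern))
            ids.Add(match.Groups[1].Value);
        return ids;
    }

    public static void Main()
    {
        // Made-up response body with two widgets, each with its own Submit button.
        string html =
            "<div id=\"ctl00_UpdatePanel001\"><input type=\"submit\" id=\"ctl00_Widget12_SubmitButton\" /></div>" +
            "<div id=\"ctl00_UpdatePanel002\"><input type=\"submit\" id=\"ctl00_Widget57_SubmitButton\" /></div>";

        List<string> ids = FindClientIds(html, "SubmitButton");
        Console.WriteLine(ids[1]); // the 2nd Submit button: ctl00_Widget57_SubmitButton
    }
}
```

A test can then post back using whatever Client ID the page generated on this particular visit, instead of one frozen at recording time.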

First, let me give you an example of what can be tested using this library. My open source project Dropthings produces a Web 2.0 Start Page where the page is composed of widgets.

image

Each widget is composed of two UpdatePanels: the header area of the widget is one UpdatePanel and the body area is another. Each widget is rendered from the database using the unique ID of the widget row, which is an INT IDENTITY. Every page has unique widgets, with unique ASP.NET Control IDs. As a result, there’s no way you can record a test and play it back, because none of the ASP.NET Control IDs are ever the same for the same page on different visits. This is where my library comes to the rescue.

See the web test I did:

image

This test simulates an anonymous user visit. When an anonymous user visits Dropthings for the first time, two pages are created with some default widgets. You can also add new widgets to the page, drag & drop widgets, and delete a widget that you don’t like.

This Web Test simulates these behaviors automatically:

  • Visit the homepage
  • Show the widget list which is an UpdatePanel. It checks if the UpdatePanel contains the BBC World widget.
  • Then it clicks on the “Edit” link of the “How to of the day” widget which brings up some options dynamically inside an UpdatePanel. Then it tries to change the Dropdown value inside the UpdatePanel to 10.
  • Adds a new widget from the Widget List. Ensures that the UpdatePanel postback successfully renders the new widget.
  • Deletes the newly added widget and ensures the widget is gone.
  • Logs user out.

If you want to learn details about the project, read my codeproject article:

http://www.codeproject.com/KB/aspnet/aspnetajaxtesting.aspx

Please vote if you find this useful.

Posted by omar with 9 comment(s)

Web 2.0 AJAX Portal using jQuery, ASP.NET 3.5, Silverlight, Linq to SQL, WF and Unity

Dropthings, my open source Web 2.0 Ajax Portal, has gone through a technology overhaul. Previously it was built using ASP.NET AJAX, a little bit of Workflow Foundation and Linq to SQL. Now Dropthings boasts a full jQuery front-end combined with ASP.NET AJAX UpdatePanel, Silverlight widgets, a full Workflow Foundation implementation on the business layer, 100% Linq to SQL Compiled Queries on the data access layer, and Dependency Injection and Inversion of Control (IoC) using Microsoft Enterprise Library 4.1 and Unity. It also has an ASP.NET AJAX Web Test framework that makes it really easy to write Web Tests that simulate real user actions on AJAX web pages. This article will walk you through the challenges in getting these new technologies to work in an ASP.NET website and how performance, scalability, extensibility and maintainability have significantly improved with the new technologies. Dropthings has been licensed for commercial use by prominent companies including BT Business, Intel, Microsoft IS, the Denmark Government portal for Citizens, startups like Limead and many more. So, this is serious stuff! There’s a very cool open source implementation of the Dropthings framework available at the National University of Singapore portal.

Visit: http://dropthings.omaralzabir.com

Dropthings AJAX Portal

I have published a new article on this on CodeProject:

http://www.codeproject.com/KB/ajax/Web20Portal.aspx

Get the source code

Latest source code is hosted at Google code:

http://code.google.com/p/dropthings

There’s a CodePlex site for documentation and issue tracking:

http://www.codeplex.com/dropthings

You will need Visual Studio 2008 Team Suite with Service Pack 1 and Silverlight 2 SDK in order to run all the projects. If you have only Visual Studio 2008 Professional, then you will have to remove the Dropthings.Test project.

New features introduced

Dropthings new release has the following features:

  • Template users – you can define a user whose pages and widgets are used as a template for new users. Whatever you put in that template user’s pages is copied for every new user. This is an easier way to define the default pages and widgets for new users. You can do the same for registered users. The template users are defined in the web.config.
  • Widget-to-Widget communication – Widgets can send message to each other. Widgets can subscribe to an Event Broker and exchange messages using a Pub-Sub pattern.
  • WidgetZone – you can create any number of zones in any shape on the page. You can have widgets laid in horizontal layout, you can have zones on different places on the page and so on. With this zone model, you are no longer limited to the Page-Column model where you could only have N vertical columns.
  • Role based widgets – now widgets are mapped to roles so that you can allow different users to see different widget list using ManageWidgetPersmission.aspx.
  • Role based page setup – you can define page setup for different roles. For example, Managers see different pages and widgets than Employees.
  • Widget maximize – you can maximize a widget to take full screen. Handy for widgets with lots of content.
  • Free form resize – you can freely resize widgets vertically.
  • Silverlight Widgets – You can now make widgets in Silverlight!
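The widget-to-widget communication in the list above follows the Pub-Sub pattern. Here is a minimal sketch of that pattern, just to show the shape of it. It is not the Dropthings implementation: EventBroker, Subscribe, Publish and the “CityChanged” topic are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical broker: widgets subscribe to a topic and receive every message published to it.
public class EventBroker
{
    private readonly Dictionary<string, List<Action<object>>> _subscribers =
        new Dictionary<string, List<Action<object>>>();

    public void Subscribe(string topic, Action<object> handler)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(topic, out handlers))
        {
            handlers = new List<Action<object>>();
            _subscribers[topic] = handlers;
        }
        handlers.Add(handler);
    }

    public void Publish(string topic, object message)
    {
        List<Action<object>> handlers;
        if (_subscribers.TryGetValue(topic, out handlers))
        {
            // Iterate over a copy so a handler may subscribe while we are publishing.
            foreach (Action<object> handler in handlers.ToArray())
                handler(message);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var broker = new EventBroker();

        // A "weather widget" listens for the city picked in some other widget.
        broker.Subscribe("CityChanged", city => Console.WriteLine("Weather widget reloads for " + city));

        // A "city picker widget" publishes without knowing who is listening.
        broker.Publish("CityChanged", "London");
    }
}
```

The point of the pattern is that the publishing widget and the subscribing widget never reference each other; they only share the broker and a topic name.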

Why the technology overhaul

Performance, Scalability, Maintainability and Extensibility are the four key reasons for the overhaul. Each new technology solved one or more of these problems.

First, jQuery was used to replace the large amount of hand-coded Javascript that offered the client side drag & drop and other UI effects. jQuery already has a rich set of libraries for Drag & Drop, Animations, Event handling, cross browser compatibility and so on. So, using jQuery means opening the door to thousands of jQuery plugins on Dropthings. This made Dropthings highly extensible on the client side. Moreover, jQuery is very lean, unlike the jumbo sized AJAX Control Toolkit framework and its heavy control extenders. So, total javascript size decreased significantly, resulting in improved page load time. In total, the jQuery framework, the basic AJAX framework and all my own scripts come to 395KB, sweet! Performance is key; it makes or breaks a product.

Secondly, Linq to SQL queries are replaced with Compiled Queries. Dropthings did not survive a load test when regular lambda expressions were used to query database. I could only reach up to 12 Req/Sec using 20 concurrent users without burning up web server CPU on a Quad Core DELL server.
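For readers who have not seen a Compiled Query, here is a minimal sketch of the idea using CompiledQuery.Compile from System.Data.Linq. The User entity, PortalDataContext and UserQueries are hypothetical stand-ins, not the actual Dropthings classes; only the aspnet_Users table and its UserId and LoweredUserName columns come from the ASP.NET Membership schema. The code needs a reference to System.Data.Linq and a real database to actually execute the query.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical minimal mapping of the aspnet_Users table.
[Table(Name = "dbo.aspnet_Users")]
public class User
{
    [Column(IsPrimaryKey = true)] public Guid UserId;
    [Column] public string LoweredUserName;
}

// Hypothetical DataContext, standing in for a generated one.
public class PortalDataContext : DataContext
{
    public PortalDataContext(string connectionString) : base(connectionString) { }

    public Table<User> Users
    {
        get { return GetTable<User>(); }
    }
}

public static class UserQueries
{
    // The expression tree is translated to SQL once and the result is reused on
    // every call, instead of being re-translated each time the query runs.
    public static readonly Func<PortalDataContext, string, User> ByUserName =
        CompiledQuery.Compile((PortalDataContext db, string userName) =>
            db.Users.SingleOrDefault(u => u.LoweredUserName == userName));
}

// Usage (assumes a valid connection string):
//   using (var db = new PortalDataContext(connectionString))
//   {
//       User user = UserQueries.ByUserName(db, "someuser");
//   }
```

Keeping the compiled delegate in a static readonly field matters: compiling it again on every request would throw away the saving.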

Thirdly, Workflow Foundation is used to build operations that require multiple Data Access Classes to work together in a single transaction. Instead of writing large functions with many if…else conditions and for loops, it’s better to write them as a Workflow, because you can visually see the flow of execution and you can reuse Activities among different Workflows. Best of all, architects can design workflows and developers can fill in the code inside Activities. So, I could design a complex operation in a workflow without writing the real code inside the Activities and then ask someone else to implement each Activity. It is like handing over a design document to developers to implement each unit module, only that here everything is strongly typed and verified by the compiler. If you strictly follow the Single Responsibility Principle for your Activities, which is a smart way of saying one Activity does only one very simple task, you end up with a highly reusable and maintainable business layer and very clean code that’s easily extensible.

Fourthly, Unity Dependency Injection (DI) framework is used to pave the path for unit testing and dependency injection. It offers Inversion of Control (IoC), which enables testing individual classes in isolation. Moreover, it has a handy feature to control lifetime of objects. Instead of creating instance of commonly used classes several times within the same request, you can make instances thread level, which means only one instance is created per thread and subsequent calls reuse the same instance. Are these going over your head? No worries, continue reading, I will explain later on.
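To picture what a thread level lifetime means, here is a small sketch that uses plain .NET ThreadLocal<T> (available from .NET 4.0) rather than Unity itself, so it shows the concept and not Unity’s API. WidgetRepository and PerThread are hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a commonly used class that is expensive to create.
public class WidgetRepository { }

public static class PerThread
{
    // One WidgetRepository is created lazily for each thread that asks for it.
    private static readonly ThreadLocal<WidgetRepository> _repository =
        new ThreadLocal<WidgetRepository>(() => new WidgetRepository());

    public static WidgetRepository Repository
    {
        get { return _repository.Value; }
    }
}

public static class Program
{
    public static void Main()
    {
        WidgetRepository first = PerThread.Repository;
        WidgetRepository second = PerThread.Repository;
        Console.WriteLine(ReferenceEquals(first, second)); // True: same thread, same instance

        WidgetRepository other = null;
        var worker = new Thread(() => other = PerThread.Repository);
        worker.Start();
        worker.Join();
        Console.WriteLine(ReferenceEquals(first, other)); // False: another thread gets its own instance
    }
}
```

A DI container with a per-thread lifetime does the same bookkeeping for you, so the calling code only asks for the type it needs.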

Fifthly, an API for Silverlight widgets allows more interactive widgets to be built using Silverlight. HTML and Javascript still have limitations on smooth graphics and continuous transmission of data from the web server. Silverlight solves all of these problems.

Read the article for details on how all these improvements were done and how all these hot techs play together in a very useful open source project for enterprises.

http://www.codeproject.com/KB/ajax/Web20Portal.aspx

Don’t forget to vote for me if you like it.

Posted by omar with 11 comment(s)

Memory Leak with delegates and workflow foundation

Recently, after load testing my open source project Dropthings, I encountered a lot of memory leaks. I found lots of Workflow Instances and Linq Entities were left in memory and never collected. Profiling the web application with .NET Memory Profiler showed the real picture:

image

The profiler shows that instances of several types are being created but never removed: the “New” column has a positive value, but the “Remove” column is 0. The way you do memory profiling is to take two snapshots. Say you take one snapshot when you first visit your website. Then you do some action on the website that results in allocation of objects, and take another snapshot. When you compare the snapshots, you can see how many instances of each class were created between them and how many were removed. If the numbers are not equal, you may have a leak. In a web application, many objects are created on every page hit, and at the end of the request all of them are supposed to be released. If they are not released, we have a problem. That’s not the case for desktop applications, where objects can legitimately remain in memory until the app is closed. Either way, you should know best from the code which objects were supposed to go out of scope and get released.
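The snapshot comparison itself is simple arithmetic. Here is a minimal sketch of it, assuming each snapshot is just a table of live instance counts per type name. SnapshotDiff and Growth are hypothetical names; a real profiler collects far more than this.

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotDiff
{
    // Returns, per type name, how many more live instances the second
    // snapshot has than the first. Types that did not grow are left out.
    public static Dictionary<string, int> Growth(
        IDictionary<string, int> before, IDictionary<string, int> after)
    {
        var growth = new Dictionary<string, int>();
        foreach (KeyValuePair<string, int> pair in after)
        {
            int previous;
            before.TryGetValue(pair.Key, out previous); // stays 0 when the type was absent
            int delta = pair.Value - previous;
            if (delta > 0)
                growth[pair.Key] = delta;
        }
        return growth;
    }

    public static void Main()
    {
        // Made-up counts for illustration.
        var first = new Dictionary<string, int> { { "WorkflowInstance", 0 }, { "HttpContext", 1 } };
        var second = new Dictionary<string, int> { { "WorkflowInstance", 48 }, { "HttpContext", 1 } };

        foreach (KeyValuePair<string, int> pair in Growth(first, second))
            Console.WriteLine("{0}: +{1}", pair.Key, pair.Value); // WorkflowInstance: +48
    }
}
```

Types that keep growing across repeated snapshots of the same action are the ones worth investigating.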

For beginners, a leak means objects are being allocated but never freed because something is still holding a reference to them. When objects leak, they remain in memory until the process (or app domain) is closed. So, if you have a leaky website, it keeps taking up memory until the web server runs out of memory and the site crashes. A memory leak is bad: it prevents you from running your product for long durations and requires frequent restarts of the app pool.

So, the above screenshot shows that Workflow and Linq related classes are not being removed, and thus leaking. This means workflow instances are not being released somewhere, so all workflow related objects remain in memory. You can see the number is the same, 48, for all workflow related objects. This is a good indication that almost every workflow instance is leaked, because a total of 48 workflows were created and run. Moreover, it indicates the leak is at the top Workflow instance level, not in some specific Activity or somewhere deep in the code.

As the workflows use Linq objects, they hold references to them, and thus the Linq objects leak as well. Sometimes you might be looking for why A is leaking, and you end up finding that B was holding a reference to A and B was leaking, so A was leaking as well. This is sometimes tricky to figure out, and you spend a lot of time looking in the wrong direction.

Now let me show you the buggy code:

ManualWorkflowSchedulerService manualScheduler =
    workflowRuntime.GetService<ManualWorkflowSchedulerService>();

WorkflowInstance instance = workflowRuntime.CreateWorkflow(workflowType, properties);
instance.Start();

EventHandler<WorkflowCompletedEventArgs> completedHandler = null;
completedHandler = delegate(object o, WorkflowCompletedEventArgs e)
{
    if (e.WorkflowInstance.InstanceId == instance.InstanceId) // 1. instance
    {
        workflowRuntime.WorkflowCompleted -= completedHandler; // 2. terminatedhandler

        // copy the output parameters in the specified properties dictionary
        Dictionary<string,object>.Enumerator enumerator =
            e.OutputParameters.GetEnumerator();
        while( enumerator.MoveNext() )
        {
            KeyValuePair<string,object> pair = enumerator.Current;
            if( properties.ContainsKey(pair.Key) )
            {
                properties[pair.Key] = pair.Value;
            }
        }
    }
};

Exception x = null;
EventHandler<WorkflowTerminatedEventArgs> terminatedHandler = null;
terminatedHandler = delegate(object o, WorkflowTerminatedEventArgs e)
{
    if (e.WorkflowInstance.InstanceId == instance.InstanceId) // 3. instance
    {
        workflowRuntime.WorkflowTerminated -= terminatedHandler; // 4. completeHandler
        Debug.WriteLine( e.Exception );
        x = e.Exception;
    }
};

workflowRuntime.WorkflowCompleted += completedHandler;
workflowRuntime.WorkflowTerminated += terminatedHandler;

manualScheduler.RunWorkflow(instance.InstanceId);

Can you spot the code where it leaked?

I have numbered the lines in comments where the leak is happening. Here the delegate is acting like a closure, and those who come from a Javascript background know closures are evil. They leak memory unless very carefully written. Here the delegate keeps a reference to the instance object. So, if the delegate is somehow not released, the instance will remain in memory forever and thus leak. Now can you find a situation where the delegate will not be released?

Say the workflow completed. It will fire the completedHandler. But the completedHandler only unsubscribes itself; it does not release the terminatedHandler. Thus the terminatedHandler stays subscribed to the runtime, remains in memory, and keeps holding a reference to the instance. So, we have a leaky delegate leaking whatever it is holding onto from outside its own scope. Here that is the instance, which it accesses from the parent function.

Since the workflow instance is not released, everything that the workflow and all the activities inside it are holding onto remains in memory. Most of the workflows and activities expose public properties which are Linq Entities. Thus the Linq Entities remain in memory. Linq Entities keep a reference to the DataContext that produced them. Thus the DataContext remains in memory. Moreover, the DataContext keeps references to many internal objects and a metadata cache, so they remain in memory as well.

So, the correct code is:

ManualWorkflowSchedulerService manualScheduler =
    workflowRuntime.GetService<ManualWorkflowSchedulerService>();

WorkflowInstance instance = workflowRuntime.CreateWorkflow(workflowType, properties);
instance.Start();

var instanceId = instance.InstanceId;

EventHandler<WorkflowCompletedEventArgs> completedHandler = null;
completedHandler = delegate(object o, WorkflowCompletedEventArgs e)
{
    if (e.WorkflowInstance.InstanceId == instanceId) // 1. instanceId is a Guid
    {
        // copy the output parameters in the specified properties dictionary
        Dictionary<string,object>.Enumerator enumerator =
            e.OutputParameters.GetEnumerator();
        while( enumerator.MoveNext() )
        {
            KeyValuePair<string,object> pair = enumerator.Current;
            if( properties.ContainsKey(pair.Key) )
            {
                properties[pair.Key] = pair.Value;
            }
        }
    }
};

Exception x = null;
EventHandler<WorkflowTerminatedEventArgs> terminatedHandler = null;
terminatedHandler = delegate(object o, WorkflowTerminatedEventArgs e)
{
    if (e.WorkflowInstance.InstanceId == instanceId) // 2. instanceId is a Guid
    {
        x = e.Exception;
        Debug.WriteLine(e.Exception);
    }
};

workflowRuntime.WorkflowCompleted += completedHandler;
workflowRuntime.WorkflowTerminated += terminatedHandler;

manualScheduler.RunWorkflow(instance.InstanceId);

// 3. Both delegates are now released
workflowRuntime.WorkflowTerminated -= terminatedHandler;
workflowRuntime.WorkflowCompleted -= completedHandler;

There are two changes. First, both delegates now use the instanceId variable instead of the instance. Since instanceId is a Guid, which is a struct and not a class, the delegates capture a copy of the value and no longer hold a reference to the workflow instance. Secondly, both delegates are unsubscribed at the end of the workflow execution, so the runtime no longer holds a reference to either of them.
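The difference between capturing an object and capturing a copy of its Guid can be seen without Workflow Foundation. The following sketch uses made-up Publisher and Payload classes in place of the workflow runtime and the workflow instance, and a WeakReference to check whether the object survived a garbage collection. Exactly when the collector frees an object is up to the runtime, so treat the second result as typical rather than guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;

// Stand-in for the long-lived workflow runtime.
public class Publisher
{
    public event EventHandler Completed;

    public void Raise()
    {
        EventHandler handler = Completed;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Stand-in for the workflow instance.
public class Payload
{
    public Guid Id = Guid.NewGuid();
    public byte[] Data = new byte[1024];
}

public static class ClosureDemo
{
    // The delegate captures the payload object, so the publisher keeps it alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference SubscribeCapturingObject(Publisher publisher)
    {
        var payload = new Payload();
        publisher.Completed += (s, e) => Console.WriteLine(payload.Id);
        return new WeakReference(payload);
    }

    // The delegate captures only a copy of the Guid, not the payload.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference SubscribeCapturingId(Publisher publisher)
    {
        var payload = new Payload();
        Guid id = payload.Id;
        publisher.Completed += (s, e) => Console.WriteLine(id);
        return new WeakReference(payload);
    }

    public static void Main()
    {
        var publisher = new Publisher();
        WeakReference byObject = SubscribeCapturingObject(publisher);
        WeakReference byId = SubscribeCapturingId(publisher);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine("Captured object still alive: " + byObject.IsAlive); // True while publisher lives
        Console.WriteLine("Captured Guid only, payload alive: " + byId.IsAlive); // typically False

        GC.KeepAlive(publisher);
    }
}
```

Note that in both cases the delegate itself stays subscribed. Capturing the Guid keeps the big object out of the closure, but only unsubscribing releases the delegate.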

In Dropthings, I am using the famous CallWorkflow Activity by Jon Flanders, which is widely used to execute one Workflow from another synchronously. There’s a CallWorkflowService class which is responsible for synchronously executing another workflow, and it has a similar memory leak problem. The original code of the service is as follows:

public class CallWorkflowService : WorkflowRuntimeService
{
    #region Methods

    public void StartWorkflow(Type workflowType, Dictionary<string,object> inparms,
        Guid caller, IComparable qn)
    {
        WorkflowRuntime wr = this.Runtime;
        WorkflowInstance wi = wr.CreateWorkflow(workflowType, inparms);
        wi.Start();

        ManualWorkflowSchedulerService ss =
            wr.GetService<ManualWorkflowSchedulerService>();
        if (ss != null)
            ss.RunWorkflow(wi.InstanceId);

        EventHandler<WorkflowCompletedEventArgs> d = null;
        d = delegate(object o, WorkflowCompletedEventArgs e)
        {
            if (e.WorkflowInstance.InstanceId == wi.InstanceId)
            {
                wr.WorkflowCompleted -= d;
                WorkflowInstance c = wr.GetWorkflow(caller);
                c.EnqueueItem(qn, e.OutputParameters, null, null);
            }
        };

        EventHandler<WorkflowTerminatedEventArgs> te = null;
        te = delegate(object o, WorkflowTerminatedEventArgs e)
        {
            if (e.WorkflowInstance.InstanceId == wi.InstanceId)
            {
                wr.WorkflowTerminated -= te;
                WorkflowInstance c = wr.GetWorkflow(caller);
                c.EnqueueItem(qn,
                    new Exception("Called Workflow Terminated", e.Exception),
                    null, null);
            }
        };

        wr.WorkflowCompleted += d;
        wr.WorkflowTerminated += te;
    }

    #endregion Methods
}

As you see, it has the same problem of a delegate holding a reference to the instance object. Moreover, there’s some queue handling in there, which requires the caller and qn parameters passed to the StartWorkflow function. So, it is not a straightforward fix.

I rewrote the whole CallWorkflowService so that it does not need to create two delegates per Workflow. The delegates are now created once for the service and no longer capture anything from StartWorkflow. Thus there’s no chance of a closure holding a reference to unwanted objects. The result looks like this:

public class CallWorkflowService : WorkflowRuntimeService
{
    #region Fields

    private EventHandler<WorkflowCompletedEventArgs> _CompletedHandler = null;
    private EventHandler<WorkflowTerminatedEventArgs> _TerminatedHandler = null;
    private Dictionary<Guid, WorkflowInfo> _WorkflowQueue =
        new Dictionary<Guid, WorkflowInfo>();

    #endregion Fields

    #region Methods

    public void StartWorkflow(Type workflowType, Dictionary<string,object> inparms,
        Guid caller, IComparable qn)
    {
        WorkflowRuntime wr = this.Runtime;
        WorkflowInstance wi = wr.CreateWorkflow(workflowType, inparms);
        wi.Start();

        var instanceId = wi.InstanceId;
        _WorkflowQueue[instanceId] = new WorkflowInfo { Caller = caller, qn = qn };

        ManualWorkflowSchedulerService ss =
            wr.GetService<ManualWorkflowSchedulerService>();
        if (ss != null)
            ss.RunWorkflow(wi.InstanceId);
    }

    protected override void OnStarted()
    {
        base.OnStarted();

        if (null == _CompletedHandler)
        {
            _CompletedHandler = delegate(object o, WorkflowCompletedEventArgs e)
            {
                var instanceId = e.WorkflowInstance.InstanceId;
                if (_WorkflowQueue.ContainsKey(instanceId))
                {
                    WorkflowInfo wf = _WorkflowQueue[instanceId];
                    WorkflowInstance c = this.Runtime.GetWorkflow(wf.Caller);
                    c.EnqueueItem(wf.qn, e.OutputParameters, null, null);
                    _WorkflowQueue.Remove(instanceId);
                }
            };
            this.Runtime.WorkflowCompleted += _CompletedHandler;
        }

        if (null == _TerminatedHandler)
        {
            _TerminatedHandler = delegate(object o, WorkflowTerminatedEventArgs e)
            {
                var instanceId = e.WorkflowInstance.InstanceId;
                if (_WorkflowQueue.ContainsKey(instanceId))
                {
                    WorkflowInfo wf = _WorkflowQueue[instanceId];
                    WorkflowInstance c = this.Runtime.GetWorkflow(wf.Caller);
                    c.EnqueueItem(wf.qn,
                        new Exception("Called Workflow Terminated", e.Exception),
                        null, null);
                    _WorkflowQueue.Remove(instanceId);
                }
            };
            this.Runtime.WorkflowTerminated += _TerminatedHandler;
        }
    }

    protected override void OnStopped()
    {
        _WorkflowQueue.Clear();
        base.OnStopped();
    }

    #endregion Methods

    #region Nested Types

    private struct WorkflowInfo
    {
        #region Fields

        public Guid Caller;
        public IComparable qn;

        #endregion Fields
    }

    #endregion Nested Types
}

After fixing the problem, another Memory Profile result showed the leak is gone:

image

As you see, the numbers vary, which means there’s no consistent leak. Moreover, the types that remain in memory look more like metadata than instances of classes. So, they are basically cached instances of metadata, not instances allocated during workflow execution which are supposed to be freed. So, we solved the memory leak!

Now you know how to write anonymous delegates without leaking memory and how to run workflows without leaking them. The basic principle is: if you reference an outside object from an anonymous delegate, make sure nothing long-lived keeps holding on to the delegate, because whatever the delegate captured stays in memory with it. Also make sure that object does not hold a reference back to the delegate, directly or via some child object of its own, because then you have a circular reference. If possible, do not access objects declared outside the delegate (e.g. instance) from inside it. Capture value types like int, DateTime or Guid instead, which are copied rather than referenced. (string is a reference type, but a string is immutable and holds no references to other objects, so capturing one is harmless.) So, instead of referencing an object, declare local variables (e.g. instanceId) that take the values of the properties you need (e.g. instance.InstanceId), and use those local variables inside the anonymous delegate.

Posted by omar with 1 comment(s)

Optimize ASP.NET Membership Stored Procedures for greater speed and scalability

Last year at Pageflakes, when we were getting millions of hits per day, we were having query timeout due to lock timeout and Transaction Deadlock errors. These locks were produced from aspnet_Users and aspnet_Membership tables. Since both of these tables are very high read (almost every request causes a read on these tables) and high write (every anonymous visit creates a row on aspnet_Users), there were just way too many locks created on these tables per second. SQL Counters showed thousands of locks per second being created. Moreover, we had queries that would select thousands of rows from these tables frequently and thus produced more locks for longer period, forcing other queries to timeout and thus throw errors on the website.

If you have read my last blog post, you know why such locks happen. Basically every table goes through this trouble once it grows to hold millions of records and becomes popular. It’s just a part of the scalability problem that is common to databases. But we rarely take precautions against it in our early design.

The solution is simple: you should either have WITH (NOLOCK) or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED before SELECT queries. Either of these will do. They tell SQL Server not to take shared locks while reading the table and not to wait on locks held by others. If some row is locked while the read is happening, the query reads it anyway, even though the value may not be committed yet. When you are reading a table a thousand times per second without these options, you are issuing locks on many places around the table a thousand times per second. It not only makes reads from the table slower, but so many locks also prevent inserts, updates and deletes from happening in time, and thus queries time out. If you have queries like “show the currently online users from the last one hour based on the LastActivityDate field”, that is going to issue such a wide lock that even other harmless select queries will time out. And did I tell you that there’s no index on LastActivityDate on the aspnet_Users table?

Now don’t blame yourself for not putting either of these options on your every stored proc and every dynamically generated SQL from the very first day. ASP.NET developers made the same mistake. You won’t see either of these used in any of the stored procs used by ASP.NET Membership. For example, the following stored proc gets called whenever you access Profile object:

ALTER PROCEDURE [dbo].[aspnet_Profile_GetProperties]
    @ApplicationName nvarchar(256),
    @UserName nvarchar(256),
    @CurrentTimeUtc datetime
AS
BEGIN

    DECLARE @ApplicationId uniqueidentifier
    SELECT @ApplicationId = NULL
    SELECT @ApplicationId = ApplicationId FROM dbo.aspnet_Applications
    WHERE LOWER(@ApplicationName) = LoweredApplicationName
    IF (@ApplicationId IS NULL)
        RETURN

    DECLARE @UserId uniqueidentifier
    DECLARE @LastActivityDate datetime
    SELECT @UserId = NULL

    SELECT @UserId = UserId, @LastActivityDate = LastActivityDate
    FROM dbo.aspnet_Users
    WHERE ApplicationId = @ApplicationId AND LoweredUserName = LOWER(@UserName)

    IF (@UserId IS NULL)
        RETURN

    SELECT TOP 1 PropertyNames, PropertyValuesString, PropertyValuesBinary
    FROM dbo.aspnet_Profile
    WHERE UserId = @UserId

    IF (@@ROWCOUNT > 0)
    BEGIN
        UPDATE dbo.aspnet_Users
        SET LastActivityDate=@CurrentTimeUtc
        WHERE UserId = @UserId
    END
END

There are two SELECT operations that hold locks on two very high read tables: aspnet_Users and aspnet_Profile. Moreover, there’s a nasty UPDATE statement. It updates the LastActivityDate of a user whenever you access the Profile object for the first time within an http request.

This stored proc alone is enough to bring your site down. It did to us, because we are using the Profile Provider everywhere. This stored proc was called around 300 times/sec. We were having nightmarishly slow performance on the website and many lock timeouts and transaction deadlocks. So, we added the transaction isolation level, and we also modified the UPDATE statement to only perform an update when the LastActivityDate is over an hour old. This means the same user’s LastActivityDate won’t be updated again if the user hits the site within the same hour.

So, after the modifications, the stored proc looked like this:

ALTER PROCEDURE [dbo].[aspnet_Profile_GetProperties]
    @ApplicationName nvarchar(256),
    @UserName nvarchar(256),
    @CurrentTimeUtc datetime
AS
BEGIN
    -- 1. Please no more locks during reads
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    DECLARE @ApplicationId uniqueidentifier
    --SELECT @ApplicationId = NULL
    --SELECT @ApplicationId = ApplicationId FROM dbo.aspnet_Applications
    --WHERE LOWER(@ApplicationName) = LoweredApplicationName
    --IF (@ApplicationId IS NULL)
    -- RETURN

    -- 2. No more call to Application table. We have only one app dude!
    SET @ApplicationId = dbo.udfGetAppId()

    DECLARE @UserId uniqueidentifier
    DECLARE @LastActivityDate datetime
    SELECT @UserId = NULL

    SELECT @UserId = UserId, @LastActivityDate = LastActivityDate
    FROM dbo.aspnet_Users
    WHERE ApplicationId = @ApplicationId AND LoweredUserName = LOWER(@UserName)

    IF (@UserId IS NULL)
        RETURN

    SELECT TOP 1 PropertyNames, PropertyValuesString, PropertyValuesBinary
    FROM dbo.aspnet_Profile
    WHERE UserId = @UserId

    IF (@@ROWCOUNT > 0)
    BEGIN
        -- 3. Do not update the same user within an hour
        IF DateDiff(n, @LastActivityDate, @CurrentTimeUtc) > 60
        BEGIN
            -- 4. Use ROWLOCK to lock only a row since we know this query
            -- is highly selective
            UPDATE dbo.aspnet_Users WITH(ROWLOCK)
            SET LastActivityDate=@CurrentTimeUtc
            WHERE UserId = @UserId
        END
    END
END

The changes I made are numbered and commented. No need for further explanation. The only tricky thing here is that I have eliminated the call to the Application table that was made just to get the ApplicationID from the ApplicationName. Since there’s only one application in a database (ever heard of multiple applications storing their users separately in the same database and the same table?), we don’t need to look up the ApplicationID on every call to every Membership stored proc. We can just get the ID and hard code it in a function.

CREATE FUNCTION dbo.udfGetAppId()
RETURNS uniqueidentifier
WITH EXECUTE AS CALLER
AS
BEGIN
    RETURN CONVERT(uniqueidentifier, 'fd639154-299a-4a9d-b273-69dc28eb6388')
END;

This UDF returns the ApplicationID, which I have hardcoded by copying it from the Application table. Thus it eliminates the need for querying the Application table.

You should make similar changes in all the other stored procedures that belong to the Membership Provider. All the stored procs are missing proper locking hints, issue aggressive locks during updates, and update far more frequently than there is any practical need for. Most of them also try to resolve the ApplicationID from the ApplicationName, which is unnecessary when you have only one web application per database. Make these changes and enjoy lock contention free super performance from the Membership Provider!


Posted by omar with 8 comment(s)

Linq to SQL solve Transaction deadlock and Query timeout problem using uncommitted reads

When your database tables start accumulating thousands of rows and many users start working on the same table concurrently, SELECT queries on the tables start producing lock contention and transaction deadlocks. This is a common problem in any high volume website. As soon as several concurrent users hit your website and cause SELECT queries on some large table like aspnet_users that is also being updated very frequently, you end up with one of these errors:

Transaction (Process ID ##) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Or,

Timeout Expired. The Timeout Period Elapsed Prior To Completion Of The Operation Or The Server Is Not Responding.

The solution to these problems is to use proper indexes on the table and to use transaction isolation level Read Uncommitted or WITH (NOLOCK) in your SELECT queries. So, if you have a query like this:

SELECT * FROM aspnet_users 
WHERE ApplicationID = 'xxx' AND LoweredUserName = 'someuser'

You will end up having one of the above errors under high load. There are two ways to solve this. Either set the isolation level first:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM aspnet_Users 
WHERE ApplicationID = 'xxx' AND LoweredUserName = 'someuser'

Or use the WITH (NOLOCK):

SELECT * FROM aspnet_Users WITH (NOLOCK) 
WHERE ApplicationID = 'xxx' AND LoweredUserName = 'someuser'

The reason for the errors is that aspnet_users is a high read and high write table: during a read the table is partially locked, and during a write it is also locked. So, when the locks from several queries overlap, and especially when there’s a query that’s trying to read a large number of rows and thus locking a large number of rows, some of the queries either time out or produce deadlocks.

Linq to Sql does not produce queries with the WITH (NOLOCK) option nor does it use READ UNCOMMITTED. So, if you are using Linq to SQL queries, you are going to end up with any of these problems on production pretty soon when your site becomes highly popular.

For example, here’s a very simple query:

using (var db = new DropthingsDataContext())
{
    var user = db.aspnet_Users.First();
    var pages = user.Pages.ToList();
}

DropthingsDataContext is a DataContext built from Dropthings database.

When you attach SQL Profiler, you get this:

image

You see none of the queries have READ UNCOMMITTED or WITH (NOLOCK).

The fix is to do this:

using (var db = new DropthingsDataContext2())
{
    db.Connection.Open();
    db.ExecuteCommand("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;");

    var user = db.aspnet_Users.First();
    var pages = user.Pages.ToList();
}

This will result in the following profiler output

image

As you see, both queries execute within the same connection and the isolation level is set before the queries execute. So, both queries enjoy the isolation level.

Now there’s a catch: the connection does not close. It seems that when the DataContext is disposed, it does not dispose the connection it is holding onto, which looks like a bug in the DataContext.

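In order to solve this, I have made a child class of DropthingsDataContext named DropthingsDataContext2, which declares its own Dispose method and closes the connection. Strictly speaking, the method hides the base Dispose with the new keyword rather than overriding it; because the class lists IDisposable again, a using block still calls this version.

    class DropthingsDataContext2 : DropthingsDataContext, IDisposable
    {
        // Hides DataContext.Dispose(); re-listing IDisposable above makes
        // "using" blocks call this method instead of the base one.
        public new void Dispose()
        {
            // Close the manually opened connection if it is still open.
            if (base.Connection != null
                && base.Connection.State != System.Data.ConnectionState.Closed)
            {
                base.Connection.Close();
                base.Connection.Dispose();
            }

            base.Dispose();
        }
    }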

This solved the connection problem.

There you have it, no more transaction deadlock or lock contention from Linq to SQL queries. But remember, this is only to eliminate such problems when your database already has the right indexes. If you do not have the proper index, then you will end up having lock contention and query timeouts anyway.

There’s one more catch: READ UNCOMMITTED will return rows from transactions that have not completed yet. So, you might be reading rows from transactions that will roll back. Since that’s generally an exceptional scenario, you are more or less safe with uncommitted reads, but not in financial applications where transaction rollback is a common scenario. In such cases, go for committed read or repeatable read.

There’s another way you can achieve the same, which seems to work, that is using .NET Transactions. Here’s the code snippet:

using (var transaction = new TransactionScope(
    TransactionScopeOption.RequiresNew,
    new TransactionOptions()
    {
        IsolationLevel = IsolationLevel.ReadUncommitted,
        Timeout = TimeSpan.FromSeconds(30)
    }))
{
    using (var db = new DropthingsDataContext())
    {
        var user = db.aspnet_Users.First();
        var pages = user.Pages.ToList();

        transaction.Complete();
    }
}

Profiler shows a transaction begins and ends:

image

The downside is that it wraps your calls in a transaction. So, you are unnecessarily creating transactions even for SELECT operations. When you do this a hundred times per second on a web application, it’s a significant overhead.

Some really good examples of deadlocks are given in this article:

http://www.code-magazine.com/article.aspx?quickid=0309101&page=2

I highly recommend it.

Posted by omar with 12 comment(s)

Strongly typed workflow input and output arguments

When you run a Workflow using Workflow Foundation, you pass arguments to the workflow in a Dictionary<string, object>. This means you miss the strong typing features of .NET languages. You have to know what arguments the workflow expects by looking at the Workflow’s public properties. Moreover, there’s no way to make arguments required. You pass some parameters and expect it to run; if it throws an exception, you pass more arguments and hope it works now. Similarly, if you are running a workflow synchronously using ManualWorkflowSchedulerService, you expect return arguments from the Workflow immediately, but there again, you have to rely on Dictionary key and value pairs. No strong typing there either.

In order to solve this, so that you can pass Workflow arguments as strongly typed classes, you can establish a convention that every Workflow has only two arguments, named “Request” and “Response”, and no others. Whatever needs to be passed to the Workflow must be passed via Request, and whatever is expected out of it must come back via Response. The type of these arguments can be workflow specific; it can be any class with one or more properties. This way, you could write code like this:

Running workflow with strongly typed argument

The advantages of this strongly typed approach are:

  • Compile time validation of input parameters passed to workflow. No risk of passing unexpected object in Dictionary’s object type value.
  • Enforce required values by creating Request objects with non-default constructor.
  • Establish a fixed contract for Workflow input and output via the strongly typed Request and Response classes or interfaces.
  • Validate input arguments for the Workflow directly from the Request class, without going through the overhead of running a workflow.
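To illustrate the second and fourth points, here is a minimal sketch of a Request class that enforces required values through its constructor and validates them before any workflow runs. The names ISampleVisitRequest, SampleVisitRequest and their properties are hypothetical and are not the actual Dropthings types.

```csharp
using System;

// Hypothetical contract for a workflow's input; not the real Dropthings interface.
public interface ISampleVisitRequest
{
    string UserName { get; }
    bool IsAnonymous { get; }
}

public class SampleVisitRequest : ISampleVisitRequest
{
    public string UserName { get; private set; }
    public bool IsAnonymous { get; private set; }

    // No default constructor: callers cannot build a request
    // without supplying the required values.
    public SampleVisitRequest(string userName, bool isAnonymous)
    {
        // Validation happens here, before any workflow is started.
        if (string.IsNullOrEmpty(userName))
            throw new ArgumentException("userName is required", "userName");

        UserName = userName;
        IsAnonymous = isAnonymous;
    }
}
```

Forgetting a required argument then becomes a compile error, and a bad value fails fast in the constructor instead of somewhere inside the workflow.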

If we follow this approach, we create workflows with only two DependencyProperty fields, one for Request and one for Response. Here is an example from my open source project Dropthings, which uses Workflow for the entire Business Layer. Below you see the Workflow that executes when a new user visits Dropthings.com; it creates a new user and sets up all the pages and widgets for the user. It has only two dependency properties: Request and Response.

image

The Request parameter is of type IUserVisitWorkflowRequest. So, you can pass any class that implements the interface as the Request argument.

image

Here I have used fancy inheritance to create Request object hierarchy. You don’t need to do that. Just remember, you can pass any class. You don’t even need to use interface for Request parameter. It can be a class directly. I use all these interfaces in order to facilitate Dependency Inversion.

Similarly, the Response object is also a class.

image

The Response returns quite some properties. So, it’s kinda handy to wrap them all in one property.

So, there you have it, strongly typed Workflow arguments. You can attach properties of the Request object to any activity directly from the designer:

image

There’s really no compromise to make in this approach. Everything works as before.

In order to make workflow execution simpler, I use a helper method like the following, that takes the Request and Response object and creates the Dictionary for me. This Dictionary always contains one “Request” and one “Response” entry.

image
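Since the helper is only visible as an image, here is a rough sketch of what the dictionary-building part could look like. The class name WorkflowArguments and the method name Build are my assumptions for illustration; only the “Request” and “Response” keys come from the convention described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: wraps a typed request/response pair into the
// Dictionary<string, object> that Workflow Foundation expects.
public static class WorkflowArguments
{
    public static Dictionary<string, object> Build<TRequest, TResponse>(
        TRequest request, TResponse response)
        where TRequest : class
        where TResponse : class
    {
        if (request == null) throw new ArgumentNullException("request");
        if (response == null) throw new ArgumentNullException("response");

        // Always exactly two entries, so the string keys live in one place only.
        return new Dictionary<string, object>
        {
            { "Request", request },
            { "Response", response }
        };
    }
}
```

The resulting dictionary is what you would hand to WorkflowRuntime.CreateWorkflow along with the workflow type, so calling code never has to spell the keys out itself.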

This way, I can run Workflow in strongly typed fashion:

image

Here I can specify the Request, Response and Workflow type using strong typing. This way I get a strongly typed return object and also pass a strongly typed Request object. There’s no dictionary building, no risky string key and object type value passing. You can ignore the ObjectContainer.Resolve() stuff, because that’s just returning me an existing reference to the WorkflowRuntime.

Hope you like this approach.

Posted by omar with 2 comment(s)

99.99% available ASP.NET and SQL Server SaaS Production Architecture

You have a hot ASP.NET+SQL Server product, growing at a thousand users per day, and you have hit the limit of your own garage hosting capability. Now that you have enough VC money in your pocket, you are planning to go out and host at some real hosting facility, maybe a colocation or managed hosting. So, you are thinking, how do you design a physical architecture that will ensure performance, scalability, security and availability of your product? How can you achieve four-nine (99.99%) availability? How do you securely let your development team connect to production servers? How do you choose the right hardware for web and database servers? Should you use a Storage Area Network (SAN) or just local disks on RAID? How do you securely connect your office computers to the production environment?

Here I will answer all these queries. Let me first show you a diagram that I made for Pageflakes where we ensured we get four-nine availability. Since Pageflakes is a Level 3 SaaS, it’s absolutely important that we build a high performance, highly available product that can be used from anywhere in the world 24/7 and end-user gets quick access to their content with complete personalization and customization of content and can share it with others and to the world. So, you can take this production architecture as a very good candidate for Level 3 SaaS:

Hosting_environment

Here’s a CodeProject article that explains all the ideas:

99.99% available ASP.NET and SQL Server SaaS Production Architecture

Hope you like it. Appreciate your vote.


Linq to SQL: Delete an entity using Primary Key only

Linq to Sql does not come with a function like .Delete(ID) which allows you to delete an entity using its primary key. You have to first get the object that you want to delete and then call .DeleteOnSubmit(obj) to queue it for deletion. Then you have to call DataContext.SubmitChanges() to run the delete queries on the database. So, how do you delete an object without getting it from the database, and so avoid a database roundtrip?

Delete an object without getting it - Linq to Sql

You can call this function using DeleteByPK<Employee, int>(10, dataContext);

First type is the entity type and second one is the type of the primary key. If your object’s primary key is a Guid field, specify Guid instead of int.

How it works:

  • It figures out the table name and the primary key field name from the entity
  • Then it uses the table name and primary key field name to build a DELETE query

Figuring out the table name and primary key field name is a bit hard. There’s some reflection involved. The GetTableDef<TSource>() returns the table name and primary key field name for an entity.

Every Linq Entity class is decorated with a Table attribute that has the table name:

Linq entity declaration

Then the primary key field is decorated with a Column attribute with IsPrimaryKey = true.

Primary Key field has Column attribute with IsPrimaryKey = true

So, using reflection we can figure out the table name and the primary key property and the field name.

Here’s the code that does it:

Using reflection find the Table attribute and the Column attribute

Before you scream “Reflection is SLOW!!!!” the definition is cached. So, reflection is used only once per appDomain per entity. Subsequent call is just a dictionary lookup away, which is as fast as it can get.
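Because the original code is shown only as images, here is a minimal sketch of how the cached reflection lookup and the delete by primary key could fit together. It targets .NET Framework with a reference to System.Data.Linq. The TableDef class, the DeleteHelper class name and the sample Employee entity are my assumptions for illustration; the actual downloadable code may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Reflection;

// Hypothetical sample entity, decorated the way the Linq to Sql designer does it.
[Table(Name = "dbo.Employees")]
public class Employee
{
    [Column(Name = "EmployeeID", IsPrimaryKey = true)]
    public int ID { get; set; }

    [Column]
    public string Name { get; set; }
}

// Hypothetical holder for what reflection finds out about an entity.
public class TableDef
{
    public string TableName;
    public string PKFieldName;
}

public static class DeleteHelper
{
    // One entry per entity type, so reflection runs once per AppDomain per entity.
    private static readonly Dictionary<Type, TableDef> Cache = new Dictionary<Type, TableDef>();
    private static readonly object CacheLock = new object();

    public static TableDef GetTableDef<TSource>()
    {
        Type entityType = typeof(TSource);
        lock (CacheLock)
        {
            TableDef def;
            if (Cache.TryGetValue(entityType, out def)) return def;

            var table = (TableAttribute)Attribute.GetCustomAttribute(entityType, typeof(TableAttribute));
            if (table == null)
                throw new InvalidOperationException(entityType.Name + " has no [Table] attribute.");

            foreach (MemberInfo member in entityType.GetMembers(BindingFlags.Public | BindingFlags.Instance))
            {
                var column = (ColumnAttribute)Attribute.GetCustomAttribute(member, typeof(ColumnAttribute));
                if (column == null || !column.IsPrimaryKey) continue;

                // Fall back to the class/member name when the attribute has no Name.
                def = new TableDef
                {
                    TableName = table.Name ?? entityType.Name,
                    PKFieldName = column.Name ?? member.Name
                };
                Cache[entityType] = def;
                return def;
            }
            throw new InvalidOperationException(entityType.Name + " has no primary key column.");
        }
    }

    // Issues a single DELETE without loading the entity first.
    public static int DeleteByPK<TSource, TPK>(TPK pk, DataContext dc) where TSource : class
    {
        TableDef def = GetTableDef<TSource>();
        // {0} is turned into a SQL parameter by ExecuteCommand, not concatenated.
        return dc.ExecuteCommand(
            "DELETE FROM " + def.TableName + " WHERE " + def.PKFieldName + " = {0}", pk);
    }
}
```

This sketch assumes a single-column primary key; an entity with a composite key would need one placeholder per key column.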

You can also delete a collection of objects without ever getting any one of them. Use the following function to delete a whole bunch of objects:

Delete a list of objects using Linq to SQL
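As the function itself is only shown as an image, here is one way the command for deleting many rows could be built. It is a sketch under my own assumptions (the DeleteManyHelper and BuildDeleteManyCommand names are hypothetical), using an IN clause with one placeholder per key.

```csharp
using System;
using System.Linq;

public static class DeleteManyHelper
{
    // Builds "DELETE FROM <table> WHERE <pk> IN ({0}, {1}, ...)" with one
    // numbered placeholder per key, in the format DataContext.ExecuteCommand accepts.
    public static string BuildDeleteManyCommand(string tableName, string pkFieldName, int keyCount)
    {
        if (keyCount <= 0) throw new ArgumentOutOfRangeException("keyCount");

        string placeholders = string.Join(", ",
            Enumerable.Range(0, keyCount).Select(i => "{" + i + "}").ToArray());

        return "DELETE FROM " + tableName + " WHERE " + pkFieldName + " IN (" + placeholders + ")";
    }
}
```

You would then pass the command and the keys, converted to an object array, to DataContext.ExecuteCommand, which sends each key as a SQL parameter. For very large lists you would need to delete in batches, since SQL Server limits the number of parameters per command.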

The code is available here:

http://code.msdn.microsoft.com/DeleteEntitiesLinq


Posted by omar with 18 comment(s)

How to convince developers and management to use automated unit test for AJAX websites

Everyone agrees that unit testing is a good thing, we should all write unit tests. We read articles and blogs to keep us up-to-date on what’s going on in the unit test world so that we can sound cool talking to peers at lunch. But when we really sit down and try to write unit tests ourselves – “Naaah, this is waste of time, let’s ask my QA to test it; that’s much more reliable and guaranteed way to test this. What’s the point testing these functions when there are so many other functions that we should unit test first?” Had such moment yourself or with someone else? Read on.

I had a conversation with our development lead Mike (using a highly generic name since my last post caused some trouble), who runs “the show” in our engineering team. As usual, there were reservations about introducing unit tests into the regular development schedule. Mike also had valid points about the lack of powerful tools for unit testing AJAX websites. He was also unsure about ‘what’ and ‘how’ to unit test in our code so that we aren’t just testing database failures but real user actions that execute both business and rendering logic. So, the discussion has a lot of useful information that will help you make the right decision when you want to sell unit testing to your ASP.NET and/or AJAX development team, and finally to higher management so that you can buy enough time for the effort.

Friday, Jan 2007 – hallway
Omar: Hey Mike, we need to start doing unit testing at least on our web services. We are wasting way too much time on manual QA. Since we are an AJAX shop, unit testing all our web services should give us pretty good coverage.

Mike: Sure, that sounds fun. I will do some feasibility check and see how can we chip this in into our next sprint.

Friday, Feb 2007 – washroom 
Omar: Hey Mike, let’s start doing unit tests. I haven’t seen any tests last month. Can we start from this sprint?

Mike: Sure, we can surely start from this sprint. Let me find out which tool is the right one for us.

Friday, March 2007 – meeting room
Omar: Hey Mike, haven’t seen any unit tests in the solution so far. Let’s seriously start writing unit tests. Did you make any plan how you want to start unit testing the webservices?

Mike: Yeah, I did some digging around and found some tools. But most of them are for non-AJAX sites where you can programmatically hit a URL or programmatically do HTTP POST on a URL. You can also record button clicks and form posts from the browser. There’s Visual Studio’s Web Test, which does pretty good job recording regular ASP.NET site, but poor on AJAX sites. Moreover, you need to buy Team Suite edition to get that Web Test feature. Besides, recording tests and playing them back really does not help us because all those tests contain hard coded data. We can’t repeat a particular step many times with random data, at least not using any off-the-shelf tools. We need to test things carefully and systematically using random data set and sometimes use real data from database. For example, a common scenario is loading 100 random user accounts from database and programmatically log those users into their portal and test whether the portal shows those users’ personalized data. All these need to be done from AJAX, without using any browser redirect or form post, because there’s one page that allows user to login using Ajax call and then dynamically renders the portal on the same page after successful login. The UI is rendered by Javascript, so only a real browser can render it and we have to test the output looking at the browser.

Omar: I see, so you can’t use Visual Studio Web Test to run unit tests on a browser because it does not let you access the HTML that the browser renders. You can only test the HTML that’s returned by the webserver. As we are an AJAX website, most of our stuff is done by Javascript: it calls the webservices and it renders the UI. Hmm, thinking how we can do this using VS. We can at least hit the webservices and see if they are returning the right JSON. This way we can pretty much test the entire webservice, business and data access layer. But it does not really replace the need for manual QA since there’s a lot of rendering logic in Javascript.

Mike: Now there’s a new project called Watin that seems promising. You can write C# code to instruct a browser to do stuffs like click on a button, run some javascript and then you can check what the browser rendered in its DOM and run your tests. But still, it’s in its infancy. So, there’s really no good tool for unit testing AJAX sites. Let’s stick to manual QA, which is proven to be more accurate than anything developers can come up with. We can handover a set of data to QA and ask them to enter and check the result.

Omar: We definitely need to figure out ways to reduce our dependency on manual QA. It simply does not scale. Every sprint, we have to freeze code and then hand over to QA. They run their gigantic test scripts for a whole day. Then next day, we get bug reports to fix. If there’s severe regression bug we have to either cancel sprint or work whole night to fix it and run overnight QA to meet deployment date. For last one year, every sprint we ended up having some bug that made dev and QA work over night. We have to empower our developers with automated unit test tool so that they can run the whole regression test script automatically.

Mike: You are talking about a very long project then. Writing so many unit tests for complete regression test is going to be more than a month long project. We have to find the right set of tool, plan what areas to unit test and how, then engage both dev and QA to work together and prepare the right tests. And then we have to keep the test suite up-to-date after every sprint to catch the new bugs and features.

Omar: Yes, this is certainly a complex project. We have to get to a stage that can empower a developer to run automated unit tests and not ask QA to test every task for regression bugs. In fact, we should have automated build that runs all unit tests and does the regression test for us automatically after every checkin.

Mike: We have automated build and deploy. So, that’s done. We need to add automated unit test to it. Seriously, given our product size, this is absolutely impossible to engage in writing so many unit tests so that we can do the entire regression test automatically. It’s not worth the time and money. Our QA team is doing fine. They can take one day leave after deployment when they do overnight work.

Omar: Actually the QA team is on the edge of quitting. They seem to have an endless workload. After deployment, they have to do manual regression tests on the production site to ensure nothing broke in production. While they are at it, they have to participate in sprint initiation meetings and write test plans. When they are about to complete that, devs check in stuff and ask for regression tests of different modules. Before they can finish that, we reach code freeze and they have to finish all those task level tests as well as the entire regression test. So, they end up working round-the-clock several days every sprint. They simply can’t take it anymore.

Mike: How is it different from our life? After spending a sleepless night on the deployment date, next day we have to attend an 8 hour long sprint planning meeting. Then we have to immediately start working on the tasks from the next day and have to reach code-freeze within a week. Then QA comes up with so many bugs at the last moment. We have to work round-the-clock for the last 3 days of the sprint to get those bugs fixed. Then after a nerve-wracking deployment day, we have to stay up at night to wait for QA to report any critical bug and fix it immediately on production. We are at the brink of destruction as well.

Omar: That’s understood. The whole team is surely getting pushed to their limit. So, that’s why we urgently need automated test so that it addresses the problems of both dev and QA team. Dev will get tests done at a faster rate so that they don’t get bug reports at the very end and then work over-night to fix them. Similarly, we offload QA team’s continuous overwork by letting the system do the bulk of their test.

Mike: This is going to kill the team for sure. We have so many product features and bug fixes to do every sprint. Now, if we ask everyone to start writing unit tests for every task they do, it’s a lot of burden. We can’t do both at the same time.

Omar: Agree. We have to cut down product features or bug fixes. We have to make room in every sprint to write unit tests.

Mike: Good luck with that. Let’s see how you convince product team.

Omar: First let me convince you. Are you convinced that we should do it?

Mike: Not yet. I don’t really see its fruit in near future, even after two months. There’s so many features we have to do and so many customers to ship to, we just can’t do enough unit tests that will really shed off QA load. It’ll just be a distraction and delay in every sprint, heck, in every task.

Omar: Let me show you a graph which I believe is going to make an impact:

image

So, you see, the more automated tests we write, the less time is spent on manual QA. That time can be spent on doing new tests or task level tests, which increases the quality of every new feature shipped and drastically reduces new bugs shipped to production. Thus we get fewer and fewer bugs after every successful sprint.

Mike: Ya, I get it, you don’t need to convince me for this. But I don’t see the benefit from overall gain perspective. Are we shipping better product faster over next two months? We aren’t. We are shipping less features and bug fixes by spending a lot of time on writing unit tests that has no impact on end-user.

Omar: Let me see if your assumption is correct:

image

You see here, the more automated tests kick in, the more time QA can spend on new features or new bugs. I agree that the speed of testing new features/bugs decrease first one or two sprints, but then they gradually get picked up and get even better. In the beginning, there’s a big overhead of getting started with automated test. But as sprints go by, the number of unit tests to write gradually gets stable and soon it becomes proportional to new features/bugs. No more time spent on writing tests for old stuff. So, the number of unit tests you write after four sprints is exactly what needed for the new tasks you did on that sprint.

Mike: Let’s see what if we just don’t do any automated test and keep things manual. How does the graph look like?

image

Omar: The future looks quite gloomy. We will be spending so much time on regression test as we keep adding stuffs to the product that at some point QA will end up doing regression test full time. They will not spend time on new features and we will end up having a lot more new bugs slipped from QA to production due to lack of attention from QA.

Mike: OK, how do we start?

Omar: First step is to get the regression tests done so that we can get rid of that 24 hour long marathon QA period end of every sprint. Moreover, I see too many devs asking QA to do regression test here and there after they commit some tasks. So, QA is always doing regression tests from the beginning to the end of each sprint. They should only test new things for which automated test is not yet written and let the automated test do the existing tests.

Mike: This will be hard to sell to management. We are going to say “Look for next one month, we will be half productive because we want to spend time automating our QA process so that from second month, we can do tests automatically and QA can have more free time.”

Omar: No, we say it like this, “We are going to spend 50% of our time automating QA for the next one month so that QA can spend 50% more time on testing new features. This will prevent 50% of new bugs from occurring every sprint. This will give developers 50% more time to build new features after one month.” We show them this graph:

image

Mike: Seems like this will sell. But for the first couple of sprints, we will be so dead slow that some of us might get fired. Think about it, from management’s point of view, the development team has suddenly become half as productive. They are building only a few new features and bugs are not getting fixed at the same rate. Customers are screaming, investors are asking for their money back. It’s going to get really dirty. Do you want to take this risk?

Omar: I can see that this decision is a very hard decision to take. I know what CEO will say, “We need to be double productive from tomorrow, otherwise we might as well pack our bags and go home. Tell me something that will make us double productive from tomorrow, not half productive.” But you can see what will happen after couple of months. Situation will be so bad that doing this after couple of months will be out of question. We won’t be in a position to even propose this. Now, at least we can argue and they still have the mind to listen to long term ideas. But in future, when our QA team is doing full time regression test, new buggy features going to production, ratio of new bugs increasing after every release, more customers screaming, half baked features running on the production – we might have to shut down the company to save our life.

Mike: We should have started doing automated tests from day one.

Omar: Yes, unfortunately we haven’t and the more we delay, the harder it is going to get. I am sure we will write automated tests from day one in our next project, but we have to rescue this project.

Mike: OK, I am sold. How do we start? We surely need to unit test the business and data access layer. Do we start writing unit test for every function in DAL and Business layer?

Omar: Writing unit test for DAL seems pointless to me. Remember, we have very little time. We will get max two sprints to automate unit tests. After that, we won’t get the luxury to spend half of our time writing unit tests. We will have to go back to our feature and bug fix mode. So, let’s spend the time wisely. How about we only test the business layer function?

Mike: So, we test functions like CreateCustomer, EditCustomer, DeleteCustomer, AddNewOrder in business layer?

Omar: Is that the final layer in business layer? Is there another high level layer that aggregates such CRUD like functions?

Mike: For many areas, it’s like CRUD, a dumb wrapper on DAL with some minor validation and exception handling. But there are places where there are complex functions that do a lot of different DAL call. For example, UpdateCustomerBalance – that calls a lot of DAL classes to figure out customer’s current balance.

Omar: Do webservices call multiple business classes? Do they act like another level that aggregates the business layer?

Mike: Yes, webservices are called mostly from user actions and they generally call multiple business layer classes to get the job done.

Omar: Where’s the caching done?

Mike: Webservice layer.

Omar: That sounds like a good place to start unit testing. We will write small number of unit tests and still test majority of business layer and data access classes and we ensure validation, caching, exception handling code are working fine.

Mike: But there are other tools and services that call the business layer. For example, we have a windows service running that directly calls the business layer.

Omar: Can we refactor it to call webservices instead?

Mike: No, that’ll be like creating 10 more webservices. A lot more development effort.

Omar: OK, let’s write unit tests for those business layer classes separately then. I suppose there will be some overlap. Some webservice call will test those business classes as well. But that’s fine. We *should be* unit testing from business layer. But we don’t have time, so we are starting from one level up. Webservices aren’t really “unit” but you have to do what you have to do. At least testing webservices will give us guarantee that we covered all user actions under unit test.

Mike: Yes, testing webservices will at least ensure user actions are tested. The background windows service is not much of our headache. Now how do we test presentation logic? We have ASP.NET pages and there’s all those Javascript rendering code.

Omar: Let’s use Watin for that.

Mike: How to make that part of a unit test suite?

Omar: Watin integrates nicely with NUnit, mbUnit. mbUnit is pretty good. I used it before. It has more test attributes and Assert functions than NUnit.

Mike: OK, so how do we unit test UI? A test function will click on Login link, fill up the email, password box and click “OK”. Then wait for one sec and then see if Javascript has rendered the UI correctly?

Omar: Something like that. We can discuss later exactly how we test it. But how do you test if UI is rendered correctly?

Mike: We check from browser’s DOM for user’s data like name, email, balance etc are available in browser’s HTML.

Omar: Does that really test presentation logic? What if the data is misplaced? What if due to CSS error, it does not render correctly.

Mike: Well, there’s really no way to figure it out if things are rendered correctly. We can ask the QA guys to keep watching the UI while Watin runs the tests on the browser. You can see on the browser what Watin is doing.

Omar: OK, that’s one way and certainly faster than QA doing the whole step. But can it be done automatically like matching browser’s screen with some screenshot?

Mike: Yeah, we need AI for that.

Omar: Seriously, can we write a simple UI capture and comparison tool? Say we take a screenshot of correct output and then clear up some areas which can vary. Then Watin runs the test, it takes the screenshot of current browser’s view and then matches with some screenshot? Here’s the idea:

image

Say this is a template screenshot that we want to match with the browser. We are testing Google’s search result page to ensure the page always returns a particular result when we provide some predefined query. So, when Watin runs the test and takes browser to Google search result page, it takes a screenshot and ignores whatever is on those gray area. Then it does a pixel by pixel match on the rest of the template. So, no matter what the search query is and no matter what ad Google serves on top of results, as long as the first result is the one we are looking for, test passes.

Mike: As I said, this is AI stuff. Some highly sophisticated being will be matching two screenshots to say, Yah, they more or less match, test pass.

Omar: I think a pretty dumb bitmap matching will work in many cases. Just an idea, think about it. This way we can test if CSS is giving us pixel perfect result. QA takes a screenshot of expected output and then let the automated test to match with browser’s actual output.
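To make the idea concrete, here is a minimal sketch of such a dumb matcher. It is only an illustration under assumptions: screenshots are represented as plain ARGB integer arrays, and the IgnoredRegion and ScreenshotMatcher names are hypothetical. It does an exact pixel-by-pixel comparison and skips the gray areas of the template.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: a rectangle of the template that is allowed to differ
// (the gray areas in the template screenshot).
public class IgnoredRegion
{
    public int X, Y, Width, Height;

    public bool Contains(int x, int y)
    {
        return x >= X && x < X + Width && y >= Y && y < Y + Height;
    }
}

public static class ScreenshotMatcher
{
    // Pixels are ARGB ints indexed as [row, column].
    public static bool Matches(int[,] template, int[,] actual, IList<IgnoredRegion> ignored)
    {
        int height = template.GetLength(0), width = template.GetLength(1);

        // Screenshots of different sizes can never match.
        if (actual.GetLength(0) != height || actual.GetLength(1) != width)
            return false;

        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                if (IsIgnored(ignored, x, y)) continue;
                if (template[y, x] != actual[y, x]) return false;
            }
        return true;
    }

    private static bool IsIgnored(IList<IgnoredRegion> ignored, int x, int y)
    {
        foreach (IgnoredRegion region in ignored)
            if (region.Contains(x, y)) return true;
        return false;
    }
}
```

An exact match like this is brittle against font smoothing or small browser differences, so a real tool would probably need some tolerance per pixel or per region.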

Mike: OK, all good ideas. Let’s see how much we can do. We will be starting from webservice unit testing. Then we will gradually move to Watin based testing. Now it’s time to sell this proposal to product team and then to management team.

Omar: Yep, at least get the webservices tested, that will catch a lot of bugs before QA spends time on testing. The goal is to get as much testing as possible done by developers, really fast, automatically, rather than letting QA spend time on it. Also we can run those webservice unit tests in a load test suite and load test the entire webservice layer. That’ll give us a guarantee that our code is production quality and can survive the high traffic.

Mike: Understood, see ya.

. . .

March 2008, Friday - The Code Freeze Day

Omar: Hey Mike, how are we doing this sprint?

Mike: Pretty good. 3672 unit tests out of 3842 passed. We know why some of them failed. We can get them fixed pretty soon and run the complete regression tests once during lunch and once before we leave. QA has completed testing new features pretty well yesterday and they can check again today. We got some of the new features covered by unit tests as well. Rest we can finish next sprint, no worries.

Omar: Excellent. Enjoy your weekend. See you on Monday.

------------------------------

Posted by omar with 4 comment(s)

Solving common problems with Compiled Queries in Linq to Sql for high demand ASP.NET websites

If you are using Linq to SQL, instead of writing regular Linq Queries, you should be using Compiled Queries. If you are building an ASP.NET web application that’s going to get thousands of hits per hour, the execution overhead of Linq queries is going to consume too much CPU and make your site slow. There’s a runtime cost associated with each and every Linq Query you write. The queries are parsed and converted to a nice SQL Statement on *every* hit. It’s not done at compile time because there’s no way to figure out what you might be sending as the parameters in the queries during runtime. So, if you have common Linq to Sql statements like the following one throughout your growing web application, you are soon going to have scalability nightmares:

var query = from widget in dc.Widgets
            where widget.ID == id && widget.PageID == pageId
            select widget;

var widget = query.SingleOrDefault();

There’s a nice blog post by JD Conley that shows how evil Linq to Sql queries are:

image

You see how many times SqlVisitor.Visit is called to convert a Linq Query to its SQL representation? The runtime cost to convert a Linq query to its SQL Command representation is just way too high.

Rico Mariani has a very informative performance comparison of regular Linq queries vs Compiled Linq queries performance:

image

Compiled Query wins on every case.

So, now you know about the benefits of compiled queries. If you are building ASP.NET web application that is going to get high traffic and you have a lot of Linq to Sql queries throughout your project, you have to go for compiled queries. Compiled Queries are built for this specific scenario.

In this article, I will show you some steps to convert regular Linq to Sql queries to their Compiled representation and how to avoid the dreaded exception “Compiled queries across DataContexts with different LoadOptions not supported.”

Here are step-by-step instructions for converting a Linq to Sql query to its compiled form:

First we need to find out all the external decision factors in a query. It mostly means parameters in the WHERE clause. Say, we are trying to get a user from aspnet_users table using Username and Application ID:

Query to get a user from aspnet_users table

Here, we have two external decision factors: one is the Username and the other is the Application ID. So, first think of it this way: if you were to wrap this query in a function that just returns the query as it is, what would you do? You would create a function that takes the DataContext (named dc here) and two parameters named userName and applicationID, right?

So, be it. We create one function that returns just this query:

Converting a Linq Query to a function

Next step is to replace this function with a Func<> representation that returns the query. This is the hard part. If you haven’t dealt with Func<> and Lambda expression before, then I suggest you read this and this and then continue.

So, here’s the delegate representation of the above function:

Creating a delegate out of Linq Query 

A couple of things to note here. I have declared the delegate as static readonly because a compiled query is declared only once and reused by all threads. If you don’t declare Compiled Queries as static, then you don’t get the performance gain, because compiling a query every time it is needed is even worse than running a regular Linq query.

Then there’s the complex Func<DropthingsDataContext, string, Guid, IQueryable<aspnet_User>> thing. Basically the generic Func<> is declared to have the three parameters of the GetQuery function and a return type of IQueryable<aspnet_User>. The parameter types are specified so that the delegate is strongly typed. In .NET 3.5, Func<> allows up to 4 parameters and 1 return type.

Next comes the real business, compiling the query. Now that we have the query in delegate form, we can pass this to CompiledQuery.Compile function which compiles the delegate and returns a handle to us. Instead of directly assigning the lambda expression to the func, we will pass the expression through the CompiledQuery.Compile function.

Converting a Linq Query to Compiled Query

Here’s where your head starts to spin. This is hard to read and maintain. Bear with me. I just wrapped the lambda expression on the right side inside the CompiledQuery.Compile function. Basically that’s the only change. Also, when you call CompiledQuery.Compile<>, the generic types must match and be in exactly the same order as in the Func<> declaration.

Fortunately, calling a compiled query is as simple as calling a function:

Running Compiled Query

There you have it, a lot faster Linq Query execution. The hard work of converting all your queries into Compiled Query pays off when you see the performance difference.
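The screenshots for these steps are not reproduced in this text, so here is a minimal sketch of how the steps above fit together. Only DropthingsDataContext, aspnet_User, GetQuery and the Func<> signature come from the article; the table mapping, the column names, the UserQueries class and the FindUser helper are assumptions made for illustration.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical, trimmed-down mapping of the aspnet_Users table (assumption).
[Table(Name = "dbo.aspnet_Users")]
public class aspnet_User
{
    [Column(IsPrimaryKey = true)] public Guid UserId;
    [Column] public Guid ApplicationId;
    [Column] public string LoweredUserName;
}

// Hypothetical stand-in for the generated DataContext (assumption).
public class DropthingsDataContext : DataContext
{
    public DropthingsDataContext(string connectionString) : base(connectionString) { }

    public Table<aspnet_User> aspnet_Users
    {
        get { return GetTable<aspnet_User>(); }
    }
}

public static class UserQueries
{
    // Step 1: the query wrapped in a plain function.
    // A regular Linq query like this is translated to SQL on every call.
    public static IQueryable<aspnet_User> GetQuery(
        DropthingsDataContext dc, string userName, Guid applicationId)
    {
        return from u in dc.aspnet_Users
               where u.LoweredUserName == userName && u.ApplicationId == applicationId
               select u;
    }

    // Steps 2 and 3: the same query as a static readonly delegate, with the
    // lambda passed through CompiledQuery.Compile so the translation is reused.
    public static readonly Func<DropthingsDataContext, string, Guid, IQueryable<aspnet_User>>
        GetUserCompiled =
            CompiledQuery.Compile<DropthingsDataContext, string, Guid, IQueryable<aspnet_User>>(
                (dc, userName, applicationId) =>
                    from u in dc.aspnet_Users
                    where u.LoweredUserName == userName && u.ApplicationId == applicationId
                    select u);

    // Calling the compiled query looks like calling any other function.
    public static aspnet_User FindUser(
        DropthingsDataContext dc, string userName, Guid applicationId)
    {
        return GetUserCompiled(dc, userName, applicationId).SingleOrDefault();
    }
}
```

The generic arguments passed to CompiledQuery.Compile<> repeat the Func<> declaration in the same order, which is the matching rule described above.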

Now, there are some challenges to Compiled Queries. Most common one is, what do you do when you have more than 4 parameters to supply to a Compiled Query? You can’t declare a Func<> with more than 4 types. Solution is to use a struct to encapsulate all the parameters. Here’s an example:

Using struct in compiled query as parameter

Calling the query is quite simple:

Calling compiled query with struct parameter
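As the original screenshots are missing here, the following is a rough sketch of the struct approach rather than the exact code from the post. The Widget mapping, its column names, the WidgetFilter struct and the WidgetQueries class are all hypothetical; the point is only that one struct can carry more values than Func<> has room for.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical mapping of a Widget table; the column names are assumptions.
[Table(Name = "dbo.Widget")]
public class Widget
{
    [Column(IsPrimaryKey = true)] public int ID;
    [Column] public int PageID;
    [Column] public int ColumnNo;
    [Column] public bool Expanded;
    [Column] public string Title;
}

// Hypothetical stand-in for the generated DataContext (assumption).
public class DropthingsDataContext : DataContext
{
    public DropthingsDataContext(string connectionString) : base(connectionString) { }

    public Table<Widget> Widgets
    {
        get { return GetTable<Widget>(); }
    }
}

// All query parameters travel together in one struct, so the delegate needs
// only the DataContext and this struct as input types.
public struct WidgetFilter
{
    public int PageID;
    public int ColumnNo;
    public bool Expanded;
    public string Title;
}

public static class WidgetQueries
{
    public static readonly Func<DropthingsDataContext, WidgetFilter, IQueryable<Widget>>
        FindWidgets =
            CompiledQuery.Compile<DropthingsDataContext, WidgetFilter, IQueryable<Widget>>(
                (dc, f) =>
                    from w in dc.Widgets
                    where w.PageID == f.PageID
                       && w.ColumnNo == f.ColumnNo
                       && w.Expanded == f.Expanded
                       && w.Title == f.Title
                    select w);

    // Fill the struct and pass it as the single extra argument.
    public static Widget[] Find(DropthingsDataContext dc, int pageId, int columnNo, string title)
    {
        var filter = new WidgetFilter
        {
            PageID = pageId,
            ColumnNo = columnNo,
            Expanded = true,
            Title = title
        };
        return FindWidgets(dc, filter).ToArray();
    }
}
```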

Now to the dreaded challenge of using DataLoadOptions with a Compiled Query. You will notice that the following code results in an exception:

Using DataLoadOptions with Compiled Query

 

The above DataLoadOptions setup runs perfectly when you use regular Linq queries, but it does not work with compiled queries. When you run this code and the query is hit the second time, it produces an exception:

Compiled queries across DataContexts with different LoadOptions not supported

A compiled query remembers the DataLoadOptions instance once it’s called. It does not allow executing the same compiled query with a different DataLoadOptions instance again. Although you are creating the same DataLoadOptions with the same LoadWith<> calls, it still produces the exception because it remembers the exact instance that was used when the compiled query was called for the first time. Since the next call creates a new instance of DataLoadOptions, it does not match and the exception is thrown. You can read details about the problem in this forum post.

The solution is to use a static DataLoadOptions instance. You cannot create a local DataLoadOptions instance and use it in compiled queries. It needs to be static. Here’s how you can do it:

image

 

Basically the idea is to construct a static instance of DataLoadOptions using a static function. As writing a function for every single DataLoadOptions combination is painful, I created a static delegate here and executed it right on the declaration line. This is an interesting way to declare a variable that requires more than one statement to prepare it.

Using this option is very simple:

image

Now you can use DataLoadOptions with compiled queries.
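Since the two screenshots are not available in this text, here is a minimal sketch of the static DataLoadOptions idea. The Page and WidgetInstance mappings, the association between them, and the PageQueries class are assumptions for illustration, not the post's original code.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical entity with a one-to-many association (assumption).
[Table(Name = "dbo.Page")]
public class Page
{
    private EntitySet<WidgetInstance> _widgetInstances = new EntitySet<WidgetInstance>();

    [Column(IsPrimaryKey = true)] public int ID;

    [Association(Storage = "_widgetInstances", ThisKey = "ID", OtherKey = "PageId")]
    public EntitySet<WidgetInstance> WidgetInstances
    {
        get { return _widgetInstances; }
        set { _widgetInstances.Assign(value); }
    }
}

[Table(Name = "dbo.WidgetInstance")]
public class WidgetInstance
{
    [Column(IsPrimaryKey = true)] public int Id;
    [Column] public int PageId;
}

// Hypothetical stand-in for the generated DataContext (assumption).
public class DropthingsDataContext : DataContext
{
    public DropthingsDataContext(string connectionString) : base(connectionString) { }

    public Table<Page> Pages
    {
        get { return GetTable<Page>(); }
    }
}

public static class PageQueries
{
    // Built exactly once: a delegate that is invoked right on the declaration
    // line, so the multi-statement setup fits into a field initializer.
    public static readonly DataLoadOptions PageWithWidgets =
        (new Func<DataLoadOptions>(() =>
        {
            var options = new DataLoadOptions();
            options.LoadWith<Page>(p => p.WidgetInstances);
            return options;
        }))();

    public static readonly Func<DropthingsDataContext, int, IQueryable<Page>>
        GetPageById =
            CompiledQuery.Compile<DropthingsDataContext, int, IQueryable<Page>>(
                (dc, pageId) => from p in dc.Pages where p.ID == pageId select p);

    public static Page LoadPage(string connectionString, int pageId)
    {
        using (var dc = new DropthingsDataContext(connectionString))
        {
            // Every context gets the same shared instance, assigned before
            // the context runs any query.
            dc.LoadOptions = PageWithWidgets;
            return GetPageById(dc, pageId).SingleOrDefault();
        }
    }
}
```

Because every DataContext is handed the same PageWithWidgets instance, the compiled query always sees the instance it was first called with.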


Posted by omar with 16 comment(s)

Tips and tricks to rescue overdue projects

One of my friends, who runs his own offshore development shop, was having nightmare situation with one of his customers. He's way overdue on a release, the customer is screaming everyday, he's paying his team from his own pocket, customer is sending an ever increasing list of changes and so on. Here's how we discussed some ideas to get out of such a situation and make sure it does not repeat in future:

Kabir: Hey, can you help me? My customer is making us work for free for extra two months to fix bugs from our last delivery. We did what he said. But after he saw the output, he came up with hundred changes, which he somehow presents as bugs or missing features and make them look like they are all our fault and making us work for last two months for free. He is sending new changes every week. We have no idea when we will complete the iteration.

Omar: I see. Did you get a signed list of requirements from customer before you started the development?

Kabir: Of course, I did. He sent us a word document explaining what he wants and we sent him a task breakup with hour estimates and total duration of three months. Now after three months when we showed him the product, he said, it's no where close to what he had expected. Then he sent a gigantic list of things to change.

Omar: All of those are bugs?

Kabir: Of course not. Most of them are new features.

Omar: Then why don't you say those are new features? You have the original word document to prove it. Just ask him to show where in the word document he said X needs to be done.

Kabir: Well..., he's tricky. He somehow makes things look like it is obvious that X needs to be done and he's not going to accept a requirement as done until X is done. For example, he said there must be a complete login form in the homepage. So, we did a typical login form with user name, password and OK, Cancel button. Now he says where's the email verification thing? We said, you did not ask for it. He said, "this is obvious, every login form has a forgot password and email verification; I said *complete* login form, not half-baked login form". So, you see, we can't really argue to keep our image. Then, we did the login form exactly how he said. Now he says, where the client side validations of proper email address, username length, password confirmation? We said, you never asked for it! He says, "come on, every single website nowadays has AJAX enabled client side validation, do I have to tell you every single thing? Aren't you guys smart enough to figure this out? You are already doing this for the third time, can't you do it really well this time?"

Omar: OK, stop. I see what's your problem. Some customer will always try to make you work more for less money. They will try to squeeze out every bit of development they can for their bucks. So, you have to be extra careful on how much you commit to them and make sure they cannot chip in more requirements while development is going on or when you deliver a version. Mockups are one good way to make sure things are crystal clear between you and customer.  Did you not show him mockups of the features that you will be building and make him sign those mockups?

Kabir: Yes, I made some mockups. But they were simple mockups. I did not show the validations or all those side jobs like sending verification emails.

Omar: Did you run those mockups through your engineers? They could have told you about those details.

Kabir: No, I did not because developers don't work on the project until I get a signoff from client. So, I prepare all the mockups myself to save cost.

Omar: So, this is the first problem. The mockups were as ambiguous as the customer's word document. Basically the mockups just reflected the sentences in word document. Mockups did not really show all possible navigations (ok, cancel, forgot, signup), system messages, system actions behind the scene, workflows etc. Are you getting what I am saying?

Kabir: Yes. Come on, I am not a developer. I can't think of every single detail. That's what developers do when they start working on it.

Omar: But you provide estimates based on your mockups right? So, if mockup shows there's only a simple login form and change password link, you charge 5 hours for it. But then when you realize you have to send email for change password, email needs to contain a tokenized URL, that URL needs to show a change password form, where you need to validate using CAPTCHA etc, it becomes 20 hours of work. Right?

Kabir: Well yes. Generally I multiply all estimates by 1.5 just to be safe. But things have gone 3X to 10X off original estimate.

Omar: Yes, I just gave you an example how a login form estimate can go 4X off when the mockup is not run through an engineer and the important issues are not addressed.

Kabir: So, you are saying I have to prepare all mockup with an engineer?

Omar: In general, yes, since you aren't good enough to figure those out yourself; no offence. You will get good enough after you build couple of products and get your a** kicked couple of times, like mine. Mine got kicked about 17 times. After that it became so hard that when I sit on it, I produce really good mockups. After some more kicks, I hope to get 100% perfect in my mockups.

Kabir: Ok, so the process is, I get word doc from customer. I produce mockups from it. Then I run them through engineers to add more details to them. Then after review with customer, I run them through engineers again to estimate. Then I ask customer to sign-off on the mockups and the estimate, correct?

Omar: Well, first let me say, you don't do a three month long iteration since you are far away from your customer. You do short two weeks sprints. Do you know SCRUM?

Kabir: Yes, one of our team does it.

Omar: I assume the team that got their a** kicked don't do it?

Kabir: right, they don't.

Omar: OK, then first you start doing SCRUM. I won't teach you details. You can study about SCRUM online. Now, you collect 'user stories' from customer. If customer does not give you user stories, just vague paras of requirements, you break the requirements into small user stories. Understood?

Kabir: No, give example.

Omar: OK, say customer wants a *complete* login form. You break it into couple of stories like:

  • User clicks on "login" link from homepage so that user can login to the system
    • User enters username (min 5, max 255 chars, only alphanumeric) in the username text field
    • User enters password (min 5, max 50 chars, only alphanumeric) in the password field
    • User clicks on "OK" button after entering username and password.
    • System validates username and password and shows the secure portal if credential is valid and user has permission to login and account is not locked.

Understood what user stories are?

Kabir: Yes, but you are missing all the validations that we also overlooked and now we are working two months for free. This “user stories” do not help at all.

Omar: Hold on, you just saw basic steps of a user story. Now you describe each user story with the following:

  • All possible inputs of user and their valid format
  • All possible system generated messages for invalid input
  • All possible alternate navigation from the main user story. For example, while entering password, user can click on a help icon so that user can see what kind of passwords are allowed.

Got it?

Kabir: Now it's starting to make sense. Then what? Show these user stories to customers?

Omar: No, show them to your lead engineer who has enough experience to identify if you missed something. Your Engr should point out all the alternative system actions at least.

Kabir: What if my Engr can't figure them out? What if he's just as dumb as me?

Omar: Fire him. Get a pay cut for yourself.

Kabir: Seriously, what do I do if that's the case?

Omar: Your engineers will *always* come up with issues with your mockups. You should always use another pair of eyes to verify your mockups and add more details to it. You aren't the only smart guy in the world you know?

Kabir: I thought I was, ok. What's next?

Omar: File those user stories in your issue tracking system in some special category. Say "User Stories" category. What do you use for your issue tracking system?

Kabir: Flyspray

Omar: Good enough. Create a new project in Flyspray named "User Stories". File tasks for user stories. Each story, one task. Attach the mockup to the tasks. Then create one account for your customer so that the customer can login and see the user stories, make comments, suggest changes etc. You will get the conversation with your customer recorded as comments in the task. This comes in handy for engineers and for resolving disputes later on. Moreover, get your customer to prioritize the tasks properly. Understood?

Kabir: I don't think customer will go through that trouble. Customer will ask for some word document that has all the user stories and she will write in the document what the changes are. I will have to reflect them in Flyspray. Is it really necessary to file user stories in Flyspray? Can't I just maintain one word doc with customer?

Omar: Absolutely not. Word documents suffer from a versioning problem. You have one version, your customer has another version, your engineers have another version. It becomes a nightmare to move around with word docs which have many user stories in them and keep them in sync all the time. Moreover, referencing a particular user story also becomes a problem. Say at a later stage of the project, there's a bug which needs to refer to User Story #123. You will have to say User Story #123 in \\centralserver\fileshare\user stories1.doc. Now if \\centralserver dies, or you put it somewhere else, all these references are gone. Don't go for word doc. Keep everything on the web so that you can refer to it using a URL or small number. Another problem is numbering stories in Word Doc. Word won't produce unique ID for you. You will end up with duplicate user story numbers. If you use Flyspray, it will generate unique ID for you.

Kabir: OK, let me see how I convince my customer to use Flyspray.

Omar: Yes, you should. If Flyspray is hard for customers, use some simple issue tracking system that's a no-brainer for non-engineers. Some fancy AJAX based todolist site will be good enough if it has picture attachment feature and auto task number feature.

Kabir: OK, I will find such a website. So, I got the user stories done. Now I show them to customer, review, make changes. Finally I get customer to sign off on User story #X to #Y for a two weeks sprint. Then what happens?

Omar: On your first day of sprint, you do a sprint planning meeting where you present those user stories to your engineers and ask them to break each story into small tasks and estimate each task. Make sure no engineer puts 1 day or 2 days for any task. Break them into even smaller tasks like 4 hours of tasks. This will force your engineers to give enough thought to the stories and identify possible problems upfront. Generally when someone says this is going to take a day or two, s/he has no idea how to do it. S/he has not thought about the steps that need to be done to complete that task. You are getting an estimate that's either overestimated or underestimated. Forcing an engineer to allocate tasks in slots of less than 4 hours makes the engineer think about the steps carefully.

Kabir: If engineers do this level of estimate, they will think about each task for at least an hour. This is going to take days to finish estimating so many tasks. How do you do it in a day?

Omar: We do a 4 hour Planning meeting where Product Owner explains the stories to engineers and then after a 30 mins break, another 4 hour meeting where engineers pick up stories, break them into tasks and estimate on-the-fly. This 4 hour deadline is strictly maintained. If Product team cannot explain the tasks for a sprint in 4 hours, we don't do the tasks in the sprint. If the tasks are so complex or there are so many tasks that they cannot be explained in 4 hours, engineers are unlikely to do them within a one or two week long sprint. Similarly if engineers cannot estimate the tasks in their 4 hour slot, the tasks are just too complex to estimate and thus have high probability of not getting done in the sprint. So, we drop them as well.

Kabir: This is impossible! No one's going to attend 8 hours meeting on a day. Besides, telling them to estimate a task on the spot is super inefficient. They won't produce more than 60% correct estimates. They will give some lump sum estimate and then go away.

Omar: Incorrect, if engineers cannot make estimates of a task in 10 to 20 minutes, they don't have the capability of estimating at all. If your engineers are habituated to take a task from you for estimating and then go to their office, talk to their friends on the phone, drink soda, walk around, gossip with colleagues and end of the day if they have the mood to sit and think about the estimate then open a new mail, write some numbers and email it to you; they better learn to do this on-demand, when requested, within time constraint. It's a discipline that they need to learn and implement in their life. Estimates are something they do from the moment they wake up to the moment they go to sleep. Besides, the planning meeting is the best place to estimate tasks - all engineers are there, product team is there, your architects should be there, QA team is there. It's easy to ask questions, get ideas and helps from others.

Kabir: I have engineers who just can't do well under pressure. They need some undisturbed moment, where they can sit and think about tasks without anyone staring at them.

Omar: Train them to learn how to keep their head cool and do their job in the midst of attention. Anyway, let's stop talking about these auxiliary issues and talk about the most important issues. Where were we?

Kabir: About dropping tasks, I already negotiated with customer that we are going to do story A, B, C in this sprint. Now after the sprint planning meetings, engineers say they can't do B. Problem is I have already committed to deliver A, B, and C to customer within 2 weeks and sent him the invoice. How do I handle this?

Omar: How do you commit when you don't know how long A, B, and C are going to take?

Kabir: Customer tells me to do A, B, C within two weeks. And after doing some preliminary discussion with engineers, I commit to customer and then do the sprint planning meeting. I can’t wait until the sprint meeting is done and developers have given me estimates of all the tasks.

Omar: Wrong. You commit to customer after the sprint planning meeting is done. Before that, you give customer just a list of things that you believe you can try to do in following two weeks. Tell customer that you will be able to confirm after the sprint planning meeting. The time to do a sprint meeting is only 8 hours = a day. So, end of the day, you have some concrete stuff to commit to customer. From your model where you give engineers days to estimate, it won’t work. You have to finish planning within a day and end of the day, commit to customer.

Kabir: What if customer does not agree? What if he says, "I must get A, B and C in two weeks, otherwise I am going somewhere else?"

Omar: This is a hard situation. I am tempted to say that you tell your customer, "Go away!", but in reality you can't. You have to negotiate and come to a mutual agreement. You cannot just obey customer and say "Yes Mi Lord, we will do whatever you say" because you clearly cannot do it. The fact is, end of the sprint, you *will* get only A and C done and B not done. Then customer will Fedex you his shoes so that you can ask someone to kick you with it.

Kabir: Correct, so what do I do?

Omar: There are tricky solutions and non-tricky honest solutions to this. Tricky solution is, say you engaged 5 engineers in the project who can get A and C done in time. But you realize you need another engineer to do B, otherwise there's no way you can finish A, B and C in two weeks. So, you invoice customer with 6 engineers and get A, B and C done. Now customer may not agree with you paying for the 6th engineer. Then you do a clever trick. You engage the 6th engineer free of cost in this sprint. Don't tell customer that there's an extra head working in the project. Or you can tell customer that out of good will, you want to engage another engineer free of cost to make sure customer gets a timely delivery. This boosts your image. Later on, when you get a sprint that's more or less relaxed and 4 engineers can do the job, you secretly engage one engineer to some other project but still charge for 5 engineers to your customer. This way  you cover the cost for the 6th engineer that you secretly engaged earlier sprint. This is dirty. But when you have so hard a** customer who's forcing you "what", "when" and "how" all at the same time and not open to negotiation, you have no choice but to do these dirty tricks. You can also add extra one hour to every task for every engineer in a sprint or add some vague tasks like "Refactor User object to allow robust login". This way you will get quite some amount of extra hours that will compensate for the hidden free engineer that you engage. You get the idea right?

Kabir: Ingenious! And what's the honest and clear way to do these?

Omar: You negotiate with customer. You tell your customer that he or she can only have any two choices from Money-Scope-Time. This is called the project management triangle. Do you know about this?

Kabir: Googling...

Omar: Read this article:

http://office.microsoft.com/en-us/project/HA010211801033.aspx

It shows a triangle like this:

clip_image002

So, your customer can specify any two. If customer specifies Scope and Time ("what" and "when"), then customer must be flexible on Money or "how" you do it within those two constraints. If customer specifies Money and Scope, then you are free to decide on time. You engage lower resource and take more time to get things done. Got the idea?

Kabir: Yes, understood. Nice, I can show this to customer and educate him. Is there any book for the evil tricks that you just gave me?

Omar: No, I might write one soon. I will name it "Customers are evil, so be you".

Raisul: Hey, I have fixed people engaged in a project. I can't change the number of people sprint-to-sprint to compensate for change in money. So, the triangle does not work for me. What do I do here?

Omar: Right. I also made a slightly different version of it. Here's my take:

image

This is for situation where you have fixed resource engaged for a particular customer. In that case, you cannot reduce people on-demand because you cannot reassign them. Such a case requires different strategy. If customer forces you Quality and Time, customer must be willing to sacrifice Quantity. Customer cannot say, produce perfect login form in 2 weeks and add cool ajax effects to it. Customer has to sacrifice cool ajax effects, or sacrifice *perfection* of login form, or sacrifice number of days.

From the above two triangles, which one's more appropriate for you?

Kabir: Second one because customer hires 5 engineers from me. I cannot take one away and engage in a different project. Well, not openly of course.

Omar: OK, sounds fair. What else do you need from me?

Kabir: Let me think about all these. This is definitely worth thinking. I have to figure out whether to play fair or play clever. End of the day, I need to produce great product, so that, I get good recommendation and future projects from customer. So, I need to do whatever it takes. It's hard to run an offshore dev shop where we kinda have to work like slaves and like a bunch of zombies mumble every 10 mins - "Customer is always right". You are very lucky to have your own company.

Omar: I had two offshore dev shops before Pageflakes. I know how it feels. Wish you good luck. I have seen your product, you guys are building a great ASP.NET MVC+jQuery application. Release it. It's worth showcasing.

Raisul: Thank you very much. See ya...

(End of chat)

This is the diagram my friend produced, which shows the steps to do before a sprint is started:

Workflow for Product Manager

Handy for Product Managers. Enlightening for developers.

Posted by omar with 18 comment(s)

An Agile Developer's workflow when SCRUM is used

If you are planning to start SCRUM at your company, you might need to train developers and QA to get into the mindset of an Agile developer. SCRUM is only successful when the developers and QA get into the habit of following the principles of SCRUM by heart. So, sometimes you need to offer training or do trial sprints to give your developers room to get used to the SCRUM way of working. Giving them a handy workflow diagram that shows how they should work helps flatten the steep learning curve for non-superstar developers. I made such a workflow while I was teaching SCRUM at my friend's company. The following diagram was printed and hung over the desk of each and every developer to help them grasp the culture of SCRUM quickly:

image

We use Flyspray for issue tracking, so you will see the mention of it frequently.

You will see the step to "Update Sprint backlog with remaining hours" is missing. This is done kinda verbally and scrum master (sometimes same person who is the product owner) keeps track of it.

Hope you find this useful.


Posted by omar with 6 comment(s)

ASP.NET website Continuous Integration+Deployment using CruiseControl.NET, Subversion, MSBuild and Robocopy

You can set up continuous integration and automated deployment for your web application using CruiseControl.NET, Subversion, MSBuild and Robocopy. I will show you how to have CruiseControl.NET automatically build the entire solution, email the build report to developers and QA, and deploy the latest code to IIS every N minutes.

First get the following:

  • CruiseControl.NET
  • Subversion (install the command line tools and add the Subversion bin path to PATH environment variable)
  • Robocopy (Windows Vista/2008 has it built-in, here's the link for Windows 2003)
  • Install .NET Framework. You need it for MSBuild.

You will learn how I have configured Continuous Integration and Deployment for my open source AJAX Portal project www.Dropthings.com. The code is hosted at CodePlex. When some developer makes a commit, CruiseControl downloads the latest code, builds the entire solution, emails build report and then deploys the latest web site to IIS 6.0.

After installing CruiseControl.NET, go to Programs -> Cruise Control -> CruiseControl.NET Config.

Now keep copying and pasting the following XML blocks and make sure you understand each block and make necessary changes:

<cruisecontrol>
    <project name="Dropthings" queue="DropthingsQueue" queuePriority="1">
        <!-- 
        Path to the trunk folder where the full solution starts from. This is where
        subversion checkout and incremental update is performed 
        -->
        <workingDirectory>d:\cc\dropthings\code\trunk\</workingDirectory>
        <!-- Some path where CCNet writes its logs and stuffs. It can be outside the log folder -->
        <artifactDirectory>d:\cc\dropthings\artifact\</artifactDirectory>
        <category>Dropthings</category>
        <!-- CCNet installs a web dashboard. Enter the URL of that dashboard here -->
        <webURL>http://localhost/ccnet/</webURL>
        <modificationDelaySeconds>60</modificationDelaySeconds>
        <labeller type="defaultlabeller">
            <prefix>0.1.</prefix>
            <incrementOnFailure>true</incrementOnFailure>
            <labelFormat>000</labelFormat>
        </labeller>
        <state type="state" directory="State" />

First change the working directory. It needs to be the path of the folder where you will have the solution downloaded. I generally create folder structure like this:

  • D:\CC - Root for all CC.NET enabled projects
    • \ProjectName - Root project folder
      • \Code - Code folder where code is downloaded from subversion
      • \Artifact - CC.NET generates a lot of stuff. All goes here.

Next comes the Subversion integration block:

<sourcecontrol type="svn">
    <!-- Subversion trunk repository to keep checking for latest code -->
    <trunkUrl>http://localhost:8081/tfs02.codeplex.com/dropthings/trunk</trunkUrl>
    <workingDirectory></workingDirectory>
    <username>***** SUBVERSION USER NAME *****</username>
    <password>***** SUBVERSION PASSWORD *****</password>
</sourcecontrol>

Here, specify the Subversion location from which code is downloaded to the working folder. You should download the entire solution because you will be building the entire solution using MSBuild soon.

I left <workingDirectory> empty. This means whatever is specified earlier in the <workingDirectory> is used. Otherwise you can put some relative folder path here or any absolute folder.

Now we start building the tasks that CC.NET executes - Build, Email, and Deploy.

<tasks>
    <artifactcleanup cleanUpMethod="KeepLastXBuilds" cleanUpValue="5" />
    <modificationWriter>
        <filename>mods.xml</filename>
        <path></path>
    </modificationWriter>

    <!-- MSBuild task to build a .msbuild file that basically builds a .sln file -->
    <msbuild>
        <executable>C:\windows\Microsoft.NET\Framework64\v3.5\MSBuild.exe</executable>
        <workingDirectory></workingDirectory>
        <projectFile>Dropthings.msbuild</projectFile>
        <targets>Build</targets>
        <timeout>300</timeout>
        <logger>C:\Program Files (x86)\CruiseControl.NET\server\ThoughtWorks.CruiseControl.MsBuild.dll</logger>
    </msbuild>

This block first says: keep artifacts for the last 5 builds and remove older ones. Artifacts are things like build reports, logs etc. You can increase the value for a longer history.

Then the most important <msbuild> task. The executable path is to the MSBuild.exe. I am using .NET 3.5 Framework 64bit edition. You might have .NET 2.0 and 32bit version. So, set the right path here for the MSbuild.exe.

<projectFile> maps to an MSBuild file. It's a skeleton MSBuild file which basically says build this Visual Studio solution file. Here's what the msbuild file looks like:

<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build"> 
    <!-- Rebuild entire solution -->
    <MSBuild Projects="Dropthings.sln" Targets="Rebuild" />
  </Target>
</Project>

The Dropthings.msbuild and Dropthings.sln files exist in the same trunk folder. This file says: take Dropthings.sln and do a full rebuild.
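If you want the build server to produce a specific configuration, the MSBuild task accepts a Properties attribute. The variant below is only a sketch: it assumes your solution defines a Release configuration, which the original file does not state.

```xml
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build">
    <!-- Rebuild the entire solution in the Release configuration (assumed to exist) -->
    <MSBuild Projects="Dropthings.sln" Targets="Rebuild" Properties="Configuration=Release" />
  </Target>
</Project>
```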

Now you have the build done. Next is to deploy it. You will use robocopy to copy files from the code folder to a destination folder that is mapped in IIS to a website. Robocopy synchronizes the two directories: it adds new files, overwrites old files, and removes files from the destination folder that no longer exist in the source folder.

Before you can deploy, you need to stop the website or stop IIS. Otherwise some files may be in use, and you will not be able to delete or overwrite them. Here's how to stop IIS using the iisreset command line tool:

<!-- 
Stop IIS before copying over the latest web project files so that there's no write lock and IIS does not
start restarting the site half way through
-->
<exec>
    <executable>iisreset</executable>
    <buildArgs>/stop</buildArgs>
</exec>

If you do not want to stop all of IIS, you can instead stop just one website and recycle its application pool. Use the iisweb.vbs script to stop a website and the iisapp.vbs script to recycle an application pool. Here's an example:

<exec>
    <executable>iisweb</executable>
    <buildArgs>/stop "Dropthings"</buildArgs>
</exec>
 
<exec>
    <executable>iisapp</executable>
    <buildArgs>/a "Dropthings" /r</buildArgs>
</exec>

You first need to register cscript as the default script host. To do this, open a command prompt and run iisweb. It will tell you that it cannot run under wscript and offer to make cscript the default. Let it do so.
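If you would rather not change the machine-wide default, another option is to call cscript explicitly from the <exec> task. This is a sketch, not something I have verified on every setup: it assumes the IIS admin scripts live in C:\Windows\System32, their usual location on Windows Server 2003, so adjust the path if yours differs.

```xml
<!-- Stop the website by running iisweb.vbs through cscript explicitly -->
<exec>
    <executable>cscript.exe</executable>
    <!-- Assumed script location; verify it on your server -->
    <buildArgs>//nologo C:\Windows\System32\iisweb.vbs /stop "Dropthings"</buildArgs>
</exec>
```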

Now it's time to deploy the latest website files. The following task launches robocopy to do the deployment:

<!--
Sync the web project folder with the deployment folder. The deployment folder is where IIS
is mapped to serve the site. The deployment folder is at the buildArgs node. The robocopy 
utility does an exact sync, adding new files, updating old files, deleting files that no longer
exist.
-->
<exec>
    <!--<executable>C:\Program Files (x86)\Windows Resource Kits\Tools\robocopy.exe</executable>-->
    <executable>robocopy.exe</executable>
    <baseDirectory>Dropthings\</baseDirectory>
    <buildArgs>.\ d:\cc\Dropthings\Deploy *.* /E /XA:H /PURGE /XO /XD ".svn" /NDL /NC /NS /NP</buildArgs>
    <buildTimeoutSeconds>60</buildTimeoutSeconds>
    <successExitCodes>1,0</successExitCodes>
</exec>

First, correct the robocopy.exe path. On Windows Vista and Windows 2008, robocopy ships with the OS, so keep it as it is. On Windows 2003, robocopy comes from the Windows Resource Kit Tools, so specify the full path as shown in the commented-out line. Remove the (x86) from that path if you have a 32-bit OS.

Next is the <baseDirectory>. This is relative to the working directory and is the path of the website folder. The Dropthings website folder is located under the Dropthings folder under trunk, so I have specified Dropthings\ as the subfolder where the website files are located. Specify your project's website folder here, relative to the <workingDirectory>.

Next, change the paths in the <buildArgs> node. The first one is the source, ".\", which you keep as it is. It means copy files from the <baseDirectory>. The second is the path to the deployment folder where the website is mapped in IIS. You can use either a relative or an absolute path here. If you use a relative path, keep in mind that robocopy runs from the <workingDirectory>\<baseDirectory> folder.

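After the paths, keep the *.* and the remaining flags intact. The flags mean:

  • Copy all subdirectories, including empty ones /E
  • Exclude hidden files /XA:H
  • Delete files from the destination that no longer exist in the source /PURGE
  • Do not copy files that are older than the copy already at the destination /XO
  • Exclude the .svn directories while copying files /XD ".svn"
  • Keep the log short: no directory list, file classes, file sizes, or progress percentages /NDL, /NC, /NS, /NP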
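A note on <successExitCodes>: robocopy reports its result as a bit-flag exit code. 0 means nothing needed copying, 1 means files were copied, 2 means extra files or directories were found at the destination, and values of 8 and above mean at least one failure. The task above treats only 0 and 1 as success. If a build gets marked as failed even though the copy went through, one thing to check is whether robocopy returned 2 or 3, which can plausibly happen when /PURGE has extra files to deal with. A variant that accepts those codes as well, offered as a sketch rather than a required change:

```xml
<exec>
    <executable>robocopy.exe</executable>
    <baseDirectory>Dropthings\</baseDirectory>
    <buildArgs>.\ d:\cc\Dropthings\Deploy *.* /E /XA:H /PURGE /XO /XD ".svn" /NDL /NC /NS /NP</buildArgs>
    <buildTimeoutSeconds>60</buildTimeoutSeconds>
    <!-- 0 = nothing to do, 1 = files copied, 2 = extra files found, 3 = both -->
    <successExitCodes>0,1,2,3</successExitCodes>
</exec>
```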

After the deployment, you need to turn IIS back on or start the website that you stopped:

<!-- Turn IIS back on -->
<exec>
    <executable>iisreset</executable>
    <buildArgs>/start</buildArgs>
</exec>
 
<!--
<exec>
    <executable>iisweb</executable>
    <buildArgs>/start "Dropthings"</buildArgs>
</exec>            
-->

Now the build and deployment are done. Next is to email a nice report to developers and QA. If the build succeeds, email both developers and QA so that they can check out the latest build. If the build fails, email only the developers.

<publishers>
    <rss/>
    <xmllogger />
    <statistics />
 
    <!-- Email build report to development and QA team -->
    <email from="admin@yourcompany.com" mailhost="localhost" mailport="25" includeDetails="TRUE"
           mailhostUsername="" mailhostPassword="" useSSL="FALSE">
 
        <users>
            <user name="Developer1" group="devs" address="dev1@yourcompany.com"/>
            <user name="Developer2" group="devs" address="dev2@yourcompany.com"/>
            <user name="Developer3" group="devs" address="dev3@yourcompany.com"/>
            
            <user name="QA1" group="qa" address="qa1@yourcompany.com"/>
            <user name="QA2" group="qa" address="qa2@yourcompany.com"/>
            <user name="QA3" group="qa" address="qa3@yourcompany.com"/>                    
            
        </users>
 
        <groups>
            <group name="devs" notification="Always"/>
            <group name="qa" notification="Success"/>
        </groups>
 
        <converters>
            <!--<regexConverter find="$" replace="@dropthings.com" />-->
        </converters>
 
        <modifierNotificationTypes>
            <NotificationType>Always</NotificationType>
        </modifierNotificationTypes>
 
    </email>
    <modificationHistory  onlyLogWhenChangesFound="true" />
</publishers>

First, change the <email> tag, where you specify the from address, the mail server name, and optionally the user account to use for sending emails.

Then edit the <users> node and list your developers and QA.
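You can add more groups in the same way. The fragment below is a hypothetical example: the "leads" group and its address are made up, and the "Change" notification value is an assumption on my part that you should check against the Email Publisher documentation for your CC.NET version.

```xml
<users>
    <!-- Hypothetical extra user -->
    <user name="Lead1" group="leads" address="lead1@yourcompany.com"/>
</users>

<groups>
    <!-- "Change" is assumed to mean: mail only when the build status changes -->
    <group name="leads" notification="Change"/>
</groups>
```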

That's it! The configuration file is done. The next step is to launch CruiseControl.NET from Programs -> CruiseControl.NET -> CruiseControl.NET. It launches a process that executes the tasks according to the configuration. On Windows Vista, you have to run it with administrative privileges.

There's also a Windows Service that gets installed, named CruiseControl.NET. On a server, you can start the service instead and go to sleep. It will do continuous integration and automated deployment for you.

There's also a web-based Dashboard that you can use to force a build, stop a build, or see detailed build reports.

image

You can create multiple projects. You can have one project that only builds the trunk code and does no deployment. You can create another project that builds and deploys a branch that's ready for production, another that builds and deploys to a QA server, and so on.
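In the configuration file, each of these is simply another <project> node under the root. A skeleton, with hypothetical project names and the details left out as comments:

```xml
<cruisecontrol>
    <!-- Builds trunk only, no deployment -->
    <project name="Dropthings-Trunk">
        <!-- sourcecontrol pointing at trunk, tasks with msbuild only, publishers -->
    </project>

    <!-- Builds and deploys a branch that is ready for production -->
    <project name="Dropthings-Release">
        <!-- sourcecontrol pointing at the branch, tasks with msbuild plus the robocopy deployment, publishers -->
    </project>
</cruisecontrol>
```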

Here's the full configuration file that you can use as your baseline.

 


Posted by omar with 18 comment(s)

Using multiple broadband connections without using any special router or software

I have two broadband connections: a cheap one, which I mostly use for browsing and downloading, and a very expensive one that I use for voice chat, remote desktop connections etc. Using both at the same time used to require two computers. But I figured out a way to use both connections at the same time from the same computer. Here's how:

Connect the cheap connection, which is used mostly for non-critical purposes like downloading and browsing, to a wireless router.

Connect the expensive connection, which is used for latency-sensitive work like voice conferences and Remote Desktop, directly via LAN.

When you want to establish a critical connection, such as starting a voice conference app (Skype) or a remote desktop client, momentarily disconnect the wireless. This makes your LAN connection the only available route to the internet, so all new connections are established over the LAN. Now start Skype and initiate a voice conference, or use the Remote Desktop client to connect to a computer. The connection will be established over the LAN.

Now turn the wireless back on. Wireless now becomes Windows' first preference for reaching the internet, so when you start Outlook, a browser etc., they use the wireless connection. Meanwhile, Skype and the Terminal Client are still connected over the LAN. Because they use persistent connections, they keep using the LAN and do not switch to the wireless.

This way you get to use two broadband connections simultaneously.

image 

Here you see data transfer going on through two different connections. The bottom one is the LAN, which is carrying a continuous voice data stream. The upper one is the wireless connection, which consumes bandwidth now and then when I browse.

image

Using Sysinternals TCPView, I can see that some connections go through the LAN and some through the Belkin router. The selected ones, the terminal client and MSN Messenger, are using the LAN, whereas Internet Explorer and Outlook are working over the wireless connection.

Posted by omar with 9 comment(s)