Client side Page Fragment Output cache, reduce page download time significantly

When a page is full of HTML, it is best if the whole page can be cached in the browser. You can do this with HTTP response caching headers, either by injecting them manually or by using the @OutputCache directive on ASPX pages.

<%@ OutputCache Location="Client" Duration="86400" VaryByParam="*" VaryByHeader="*" %>
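
For the manual route, the headers involved are a private Cache-Control value and an Expires date. The following is a minimal sketch, not code from the download: ClientCacheHeaders and Build are hypothetical names, the mapping of Location="Client" to "private" is my assumption, and the method only computes the header values so that it can run outside ASP.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the article's code download.
public static class ClientCacheHeaders
{
    // Builds header name/value pairs that ask the browser to keep a response
    // in its own cache for the given duration.
    public static Dictionary<string, string> Build(TimeSpan duration, DateTime nowUtc)
    {
        var headers = new Dictionary<string, string>();

        // "private" allows the browser to cache the response, but not shared proxies.
        long seconds = (long)duration.TotalSeconds;
        headers["Cache-Control"] = "private, max-age=" + seconds.ToString(CultureInfo.InvariantCulture);

        // The "R" format specifier produces an RFC 1123 date such as
        // "Fri, 11 Aug 2006 00:00:00 GMT"; the caller must pass a UTC time.
        headers["Expires"] = nowUtc.Add(duration).ToString("R", CultureInfo.InvariantCulture);

        return headers;
    }
}
```

In a page, each pair could then be sent with Response.AppendHeader(name, value).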

But suppose part of the page is dynamic and part is static (header, left-side menu, footer, right-side menu, bottom part, and so on), and the static parts make up a significant share of the HTML. If you could cache those parts in the browser, you would save a lot of bytes that otherwise get downloaded on every page load. On most websites the header, navigation menu, footer, bottom part, and advertisements are mostly static and thus easily cacheable. If the whole page is, say, 50KB, with 20KB static and 30KB dynamic, then client-side caching of page fragments (not ASP.NET's server-side page output cache) saves 20KB out of 50KB, roughly 40% of the download, on repeat visits.

ASP.NET offers page fragment caching through @OutputCache, which is good, but that caching happens on the server. It caches the output of user controls and serves them from the server-side cache. It cannot eliminate the download of those costly bytes; it only saves some CPU time on the server, which does little for users.

The only way to cache part of a page in the browser is to let the browser download those parts separately and make them cacheable, just as images, CSS, and JavaScript files get cached. An IFRAME is an easy way to do this, but IFRAMEs make the page heavy and do not follow the CSS of the page, among other reasons why they don't work well here. So we have a better way: we will use JavaScript to render the content, and that JavaScript will be stored in the browser's cache.

So, here's the idea:

  • We will split the whole page into multiple parts.
  • We will generate page content using JavaScript. Each cacheable part is served as JavaScript, and that JavaScript renders the part's HTML.
  • The cacheable parts get cached by the browser and are not downloaded again (until you want them to be). The non-cacheable, highly dynamic parts are never cached by the browser.

So, let's think of a page setup like this:

    | Logo                 | Header                   |
    | Left navigation menu | Dynamic part of the page |
    | Footer (spans both columns)                     |


Here only one part is dynamic and the rest is fully cacheable. So, the Default.aspx which renders this whole page looks like this:

<%@ Page Language="VB" AutoEventWireup="false" %>
<%@ OutputCache NoStore="true" Location="None" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>My Big Fat Page</title>
</head>
<body>

<form id="form1" runat="server">
<table width="100%" border="1">
<tr>
<td>Some logo here</td>
<td><script id="Script1" src="Header.aspx" type="text/javascript"></script></td>
</tr>
<tr>
<td><script id="LeftMenu" src="LeftMenu.aspx" type="text/javascript"></script></td>
<td bgcolor="lightgrey"><div>
This is the dynamic part which gets changed on every load. Check out the time when
it was generated: <%=DateTime.Now %></div></td>
</tr>
<tr>
<td colspan="2"><script id="Footer" src="Footer.aspx" type="text/javascript"></script></td>
</tr>
</table>
</form>

</body>
</html>

The page looks like this:

As you can see, the cached parts are 30 minutes older than the dynamic part. The browser did not download them at all, which saved a significant amount of data transfer. The only part downloaded was the dynamic part.

When you load the page for the first time, all 4 files are downloaded. The last 3 files then get cached and are not downloaded again until the browser's cache expires. So on the second visit only one file is downloaded, which saves a significant amount of data transfer.

Let’s look at one of the files Header.aspx which gets cached. Nothing fancy here, it’s a regular ASPX page:

 

The interesting thing here is the “ContentType”, which I have set to “text/html/javascript”. This is not a built-in type; I introduced it myself.

When you point a script tag at an ASPX page, it does not work as-is, because <script id="Script1" src="Header.aspx" type="text/javascript"> expects JavaScript output, not HTML output. If HTML is returned, browsers simply ignore it. So we need to convert the output of Header.aspx into JavaScript which, when downloaded and executed by the browser, emits the original HTML that ASP.NET generated when it executed the page.

We use an HTTP module to intercept all .aspx calls, and when the page is about to be written to the output, we check whether the content type is “text/html/javascript”. If it is, that is our cue to convert the page output to its JavaScript representation.

If you want to know details about HTTP Module and how to use Response Filter to modify page output, please read this wonderful article:

http://www.aspnetresources.com/articles/HttpFilters.aspx

It explains everything involved. I recommend reading it first and then continuing with the rest of this post.

We have made a response filter named Html2JSPageFilter.cs (available in the code download), which overrides the Write method of Stream and converts the entire HTML of the page to its JavaScript representation:

    public override void Write(byte[] buffer, int offset, int count)
    {
        string strBuffer = System.Text.UTF8Encoding.UTF8.GetString(buffer, offset, count);

        //---------------------------------
        // Wait for the closing </html> tag
        //---------------------------------
        Regex eof = new Regex("</html>", RegexOptions.IgnoreCase);

        if (!eof.IsMatch(strBuffer))
        {
            responseHtml.Append(strBuffer);
        }
        else
        {
            responseHtml.Append(strBuffer);
            string finalHtml = responseHtml.ToString();

            // Extract only the content inside the <form> tag that ASP.NET generates in all .aspx pages
            int formTagStart = finalHtml.IndexOf("<form");
            int formTagStartEnd = finalHtml.IndexOf('>', formTagStart);
            int formTagEnd = finalHtml.LastIndexOf("</form>");

            string pageContentInsideFormTag = finalHtml.Substring(formTagStartEnd + 1, formTagEnd - formTagStartEnd - 1);

First we get the entire page output, then we get only what is inside the <form> tag that ASP.NET generates for all .aspx pages.
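
The extraction step can be tried on its own. This is a minimal sketch under my own assumptions, not the code from the download: FormContent and Extract are hypothetical names, and the guards for a missing form tag and the case-insensitive search are additions that the original listing does not have.

```csharp
using System;

// Hypothetical helper, not part of the article's code download.
public static class FormContent
{
    // Returns the markup between the opening <form ...> tag and the last </form>,
    // or the input unchanged when no complete form element is found.
    public static string Extract(string html)
    {
        int formTagStart = html.IndexOf("<form", StringComparison.OrdinalIgnoreCase);
        if (formTagStart < 0) return html;

        // End of the opening tag, i.e. the first '>' after "<form".
        int formTagStartEnd = html.IndexOf('>', formTagStart);
        int formTagEnd = html.LastIndexOf("</form>", StringComparison.OrdinalIgnoreCase);
        if (formTagStartEnd < 0 || formTagEnd <= formTagStartEnd) return html;

        // Same arithmetic as the listing above: start one past '>', stop just before "</form>".
        return html.Substring(formTagStartEnd + 1, formTagEnd - formTagStartEnd - 1);
    }
}
```

For example, Extract applied to a page whose body is <form id="f"><b>Hi</b></form> yields <b>Hi</b>.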

The next step is to remove the viewstate hidden field, because it would conflict with the viewstate of Default.aspx.

            // Remove the __VIEWSTATE tag because page fragments don't need viewstate.
            // Note this will make all ASP.NET controls in the page fragments go mad
            // if they need viewstate to do their work.
            Regex re = new Regex("(<input.*?__VIEWSTATE.*?/>)", RegexOptions.IgnoreCase);
            pageContentInsideFormTag = re.Replace(pageContentInsideFormTag, string.Empty);
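
One thing to watch with a lazy .*? pattern is that a match can begin at an earlier <input> on the same line and swallow it together with the viewstate field. The sketch below is a variant of the step above, not the author's code: ViewStateStripper is a hypothetical name, and it uses [^>]* so that a match cannot cross a tag boundary.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's code download.
public static class ViewStateStripper
{
    // [^>]* keeps the match inside a single <input ... /> tag.
    private static readonly Regex ViewStateField =
        new Regex("<input[^>]*__VIEWSTATE[^>]*/>", RegexOptions.IgnoreCase);

    // Removes the hidden viewstate field and leaves other inputs untouched.
    public static string Strip(string html)
    {
        return ViewStateField.Replace(html, string.Empty);
    }
}
```

With this pattern, a text input that precedes the hidden field on the same line is left in place.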

Now we convert the entire html output to javascript string format:

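            // Convert the HTML to a JavaScript string representation.
            // Line breaks are dropped and runs of spaces are collapsed to one space;
            // backslashes are escaped before quotes so the added escapes survive.
            string javascript2Html =
                pageContentInsideFormTag.Replace("\r", "")
                .Replace("\n", "")
                .Replace("   ", " ")
                .Replace("  ", " ")
                .Replace("  ", " ")
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");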
 

The final touch is to put that JavaScript string inside a “document.write(‘...’);” call. When you call document.write to emit HTML, the HTML becomes part of the page:

            // Generate the document.write('...') which adds the content to the document
            string pageOutput = "document.write('" + javascript2Html + "');";
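
The conversion and the document.write wrapper can be exercised together outside ASP.NET. This is a minimal sketch with assumptions of my own: Html2Js and ToDocumentWrite are hypothetical names, and whitespace runs are collapsed with a regular expression rather than with chained Replace calls.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's code download.
public static class Html2Js
{
    // Turns an HTML fragment into a single document.write('...') statement.
    public static string ToDocumentWrite(string html)
    {
        // A single-quoted JavaScript string cannot contain raw line breaks.
        string oneLine = html.Replace("\r", "").Replace("\n", "");

        // Collapse runs of whitespace to one space to shrink the output.
        oneLine = Regex.Replace(oneLine, @"\s{2,}", " ");

        // Escape backslashes first, then single quotes; the other order would
        // double the backslashes that the quote escaping adds.
        string escaped = oneLine.Replace("\\", "\\\\").Replace("'", "\\'");

        return "document.write('" + escaped + "');";
    }
}
```

For the input <b class='x'>Hi</b> followed by a line break, the result is document.write('<b class=\'x\'>Hi</b>');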
 

This is basically the trick. Use a Response filter to get the .aspx output, and then convert it to Javascript representation.
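
If you want to experiment with the filter idea outside ASP.NET, the sketch below shows one way to structure it. It is a variant, not the filter from the download: BufferingFilter and the transform delegate are hypothetical names. It decodes with a stateful Decoder and searches the accumulated text, so a multi-byte character or a closing </html> tag that is split across two Write calls is still handled.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical filter: collects the response text, then writes a transformed copy once.
public class BufferingFilter : Stream
{
    private readonly Stream _inner;                    // the original output stream
    private readonly Func<string, string> _transform;  // e.g. the HTML-to-JavaScript conversion
    private readonly Decoder _decoder = Encoding.UTF8.GetDecoder();
    private readonly StringBuilder _html = new StringBuilder();
    private bool _emitted;

    public BufferingFilter(Stream inner, Func<string, string> transform)
    {
        _inner = inner;
        _transform = transform;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // The stateful decoder carries an incomplete multi-byte character over to the next call.
        char[] chars = new char[_decoder.GetCharCount(buffer, offset, count)];
        int charCount = _decoder.GetChars(buffer, offset, count, chars, 0);
        _html.Append(chars, 0, charCount);

        if (_emitted) return;

        // Search the accumulated text so a closing tag split across chunks is still found.
        string soFar = _html.ToString();
        if (soFar.IndexOf("</html>", StringComparison.OrdinalIgnoreCase) < 0) return;

        byte[] output = Encoding.UTF8.GetBytes(_transform(soFar));
        _inner.Write(output, 0, output.Length);
        _emitted = true;
    }

    public override void Flush() { _inner.Flush(); }

    // The remaining members are required by Stream; this write-only filter does not support them.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

Wrapping a MemoryStream with this class and writing a page in several chunks produces no output until the chunk that completes </html> arrives. Re-scanning the whole buffer on every write keeps the sketch short; for large pages you would probably check only the tail.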

For convenience, I have used an HttpModule to hook into the ASP.NET pipeline and watch for .aspx files that try to emit the content type “text/html/javascript”. Again, this content type is nothing special; you could use “text/Omar Al Zabir”.

    void IHttpModule.Init(HttpApplication context)
    {
        context.ReleaseRequestState += new EventHandler(InstallResponseFilter);
    }

    private void InstallResponseFilter(object sender, EventArgs e)
    {
        HttpResponse response = HttpContext.Current.Response;

        if (response.ContentType == "text/html/javascript")
        {
            response.ContentType = "text/javascript";
            response.Filter = new Html2JSPageFilter(response.Filter);
        }
    }

And finally in web.config, we have to register the HttpModule so that it gets called:

        <httpModules>
            <add name="Html2JSModule" type="Html2JavascriptModule" />
        </httpModules>

The entire source code is available in this URL:

Download Source code of: Client side Page Fragment Output cache, reduce page download time significantly

Enjoy. Use this approach in your aspx and html files and save a significant amount of download time on the user's end. It slightly increases the download time of the first visit (200+ms for each script tag), but it makes the second visit a breeze. See the performance difference yourself. First visit www.pageflakes.com. Then close your browser, open it again, and enter www.pageflakes.com. See how fast it loads. If you use an HTTP debugger to monitor how much data is transferred, you will see it's only 200 bytes!

Published Thu, Aug 10 2006 7:34 by omar