Client side Page Fragment Output cache, reduce page download time significantly

When a page contains a lot of HTML, it is best if the whole page can be cached by the browser. You can do this with HTTP response caching headers, either by setting them manually or by using the @OutputCache directive on ASPX pages.

<%@ OutputCache Location="Client" Duration="86400" VaryByParam="*" VaryByHeader="*" %>

But suppose part of the page is dynamic and part is static (header, left-side menu, footer, right-side menu, bottom part, and so on), and the static parts account for a significant share of the HTML. If you could cache those parts in the browser, you would save a lot of bytes that otherwise get downloaded every time the page loads. On most websites the header, navigation menu, footer, bottom part, and advertisements are mostly static and thus easily cacheable. If the whole page is, say, 50KB, at least 20KB is static and 30KB might be dynamic. With client-side caching of page fragments (not ASP.NET's server-side page output cache), you can avoid downloading that 20KB on repeat visits, which is 40% of the page.

ASP.NET offers page fragment caching through @OutputCache, which is good, but that caching happens on the server. It caches the output of user controls and serves them from the server-side cache. It cannot eliminate the download of those costly bytes; it only saves some CPU time on the server, so there is not much in it for users.

The only way to cache part of a page is to let the browser download those parts separately and make them cacheable, just as images, CSS, and JavaScript files get cached. So we need to download page fragments separately and have them stored in the browser's cache. An IFRAME is an easy way to do this, but IFRAMEs make the page heavy and their content does not follow the CSS of the host page. There are many reasons why IFRAME can't work. So we use a better way: JavaScript renders the content of the page, and that JavaScript gets cached in the browser's cache.

So, here's the idea:

  • We will split the whole page into multiple parts
  • We will generate page content using JavaScript. Each cacheable part comes from a JavaScript file, and that JavaScript renders its HTML.
  • The parts that are cacheable get cached by the browser and are never downloaded again (until you want them to be). The parts that are non-cacheable and highly dynamic do not get cached by the browser.

So, let's think of a page setup like this:  

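+----------------------+--------------------------+
| Logo                 | Header                   |
+----------------------+--------------------------+
| Left navigation menu | Dynamic part of the page |
+----------------------+--------------------------+
| Footer                                          |
+-------------------------------------------------+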

Here only one part is dynamic and the rest is fully cacheable. So, the Default.aspx which renders this whole page looks like this:


<%@ Page Language="VB" AutoEventWireup="false" %>
<%@ OutputCache NoStore="true" Location="None" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>My Big Fat Page</title>
</head>
<body>
    <form id="form1" runat="server">
    <table width="100%" border="1">
        <tr>
            <td>Some logo here</td>
            <td><script id="Script1" src="Header.aspx" type="text/javascript"></script></td>
        </tr>
        <tr>
            <td><script id="LeftMenu" src="LeftMenu.aspx" type="text/javascript"></script></td>
            <td bgcolor="lightgrey"><div>
                This is the dynamic part which gets changed on every load. Check out the time when
                it was generated: <%= DateTime.Now %>
            </div></td>
        </tr>
        <tr>
            <td colspan="2"><script id="Footer" src="Footer.aspx" type="text/javascript"></script></td>
        </tr>
    </table>
    </form>
</body>
</html>

The page looks like this:

As you can see, the cached parts are 30 minutes older than the dynamic part. The browser did not download those parts at all, which saved a significant amount of data transfer. The only part downloaded was the dynamic part.

When you load the page for the first time, all 4 files are downloaded. But the last 3 files get cached and are not downloaded again until the browser's cache expires. So on the second visit only one file is downloaded, which saves a significant amount of data transfer.

Let’s look at one of the files Header.aspx which gets cached. Nothing fancy here, it’s a regular ASPX page:

 

The interesting thing here is the “ContentType”, which I have set to “text/html/javascript”. This is not a built-in type; I made it up.

When you put an ASPX page in the src of a script tag, it does not work as is, because <script id="Script1" src="Header.aspx" type="text/javascript"> expects JavaScript output, not HTML output. If HTML is returned, browsers simply ignore it. So we need to convert the output of Header.aspx into JavaScript which, when downloaded and executed by the browser, emits the original HTML that ASP.NET generated when it executed the page.
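To make the target concrete, here is a rough sketch of what a converted response could look like and how the browser turns it back into HTML. The markup and the names `headerScript` and `runFragment` are made up for illustration; the real Header.aspx output will differ. A stand-in `document` object is passed in so the snippet also runs outside a browser.

```javascript
// Hypothetical body of the script response for Header.aspx after conversion.
// The markup is invented for illustration only.
const headerScript = `document.write('<div id="header">Welcome</div>');`;

// Executes a fragment script against a minimal stand-in for the browser's
// document object and returns whatever HTML the script wrote.
function runFragment(script) {
  let written = "";
  const fakeDocument = { write: (html) => { written += html; } };
  // Run the script text with `document` bound to the stand-in.
  new Function("document", script)(fakeDocument);
  return written;
}

const renderedHeader = runFragment(headerScript);
```

In a real page the browser does this step itself: it downloads the script, executes it, and the HTML passed to document.write lands where the script tag sits.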

We use an HTTP module to intercept all .aspx requests, and when the page is about to be written to the output, we check whether the content type is “text/html/javascript”. If it is, that is our cue to convert the page output to its JavaScript representation.

If you want to know details about HTTP Module and how to use Response Filter to modify page output, please read this wonderful article:

http://www.aspnetresources.com/articles/HttpFilters.aspx

It really explains all the things. I would recommend you read this article first and then continue with the rest.

We have made a response filter class named Html2JSPageFilter (available in the code download), which overrides the Write method of Stream and converts the entire HTML of the page to its JavaScript representation:


    
    public override void Write(byte[] buffer, int offset, int count)
    {
        string strBuffer = System.Text.UTF8Encoding.UTF8.GetString(buffer, offset, count);

        // ---------------------------------
        // Wait for the closing </html> tag
        // ---------------------------------
        Regex eof = new Regex("</html>", RegexOptions.IgnoreCase);

        if (!eof.IsMatch(strBuffer))
        {
            responseHtml.Append(strBuffer);
        }
        else
        {
            responseHtml.Append(strBuffer);
            string finalHtml = responseHtml.ToString();

            // Extract only the content inside the form tag ASP.NET generates in all .aspx
            int formTagStart = finalHtml.IndexOf("<form");
            int formTagStartEnd = finalHtml.IndexOf('>', formTagStart);
            int formTagEnd = finalHtml.LastIndexOf("</form>");

            string pageContentInsideFormTag = finalHtml.Substring(formTagStartEnd + 1,
                formTagEnd - formTagStartEnd - 1);

First we get the entire page output, then we get only what is inside the <form> tag that ASP.NET generates for all .aspx pages.

The next step is to remove the viewstate hidden field, because it would conflict with the viewstate on Default.aspx.


            
            // Remove the __VIEWSTATE tag because page fragments don't need viewstate.
            // Note this will break any ASP.NET controls in the page fragments
            // that need viewstate to do their work.
            Regex re = new Regex("(<input.*?__VIEWSTATE.*?/>)", RegexOptions.IgnoreCase);
            pageContentInsideFormTag = re.Replace(pageContentInsideFormTag, string.Empty);

Now we convert the entire html output to javascript string format:


            
            // Convert the HTML to javascript string representation
            string javascript2Html =
                pageContentInsideFormTag.Replace("\r", "")
                .Replace("\n", "")
                .Replace("    ", " ")
                .Replace("  ", " ")
                .Replace("   ", " ")
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");
 

The final touch is to put that JavaScript string inside a “document.write(‘...’);” call. When you call document.write to emit HTML, that HTML becomes part of the page:


            
            // Generate the document.write('...') which adds the content in the document
            string pageOutput = "document.write('" + javascript2Html + "');";
 

This is basically the trick. Use a Response filter to get the .aspx output, and then convert it to Javascript representation.
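If you want to try the escaping rules without setting up the filter, the same steps can be sketched in plain JavaScript. This is only an illustration of the conversion, not the filter itself; `htmlToDocumentWrite` is a hypothetical name that does not appear in the code download, and the whitespace-collapsing steps are left out.

```javascript
// Sketch of the conversion steps, re-expressed in JavaScript.
// htmlToDocumentWrite is a hypothetical helper, not part of the original download.
function htmlToDocumentWrite(html) {
  const escaped = html
    .replace(/\r/g, "")      // drop carriage returns
    .replace(/\n/g, "")      // drop line feeds: a raw newline would break the string literal
    .replace(/\\/g, "\\\\")  // escape backslashes first...
    .replace(/'/g, "\\'");   // ...then single quotes, so the added backslashes stay intact
  return "document.write('" + escaped + "');";
}

// Example: a fragment containing a single quote and a trailing line break.
const script = htmlToDocumentWrite("<b>it's</b>\r\n");
// script now holds the text: document.write('<b>it\'s</b>');
```

The order matters: if single quotes were escaped before backslashes, the backslash added in front of each quote would itself get doubled and the string literal would end early.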

For convenience, I have used an HttpModule to hook into the ASP.NET pipeline and watch for .aspx files that try to emit a content type of “text/html/javascript”. Again, this content type is nothing special; you could use “text/Omar Al Zabir”.


    
    void IHttpModule.Init(HttpApplication context)
    {
        context.ReleaseRequestState += new EventHandler(InstallResponseFilter);
    }

    private void InstallResponseFilter(object sender, EventArgs e)
    {
        HttpResponse response = HttpContext.Current.Response;

        if (response.ContentType == "text/html/javascript")
        {
            response.ContentType = "text/javascript";
            response.Filter = new Html2JSPageFilter(response.Filter);
        }
    }

And finally in web.config, we have to register the HttpModule so that it gets called:



        <httpModules>
            <add name="Html2JSModule" type="Html2JavascriptModule" />
        </httpModules>

The entire source code is available in this URL:

Download Source code of: Client side Page Fragment Output cache, reduce page download time significantly

Enjoy. Use this approach in your aspx and html files and save a significant amount of download time on the user's end. It slightly increases download time on the first visit (200+ms for each script tag), but it makes the second visit a breeze. See the performance difference yourself. First visit www.pageflakes.com. Then close your browser, open it again, and enter www.pageflakes.com. See how fast it loads. If you use an HTTP debugger to monitor how much data is transferred, you will see it's only 200 bytes!

Published Thursday, August 10, 2006 1:34 PM by omar