September 2006 - Posts

My university life

When I was at university, from day one of my first semester, I looked for opportunities to skip classes and homework so that I could spend more time working at my office. Attending classes, taking class tests, collecting lecture notes, paying dues in long queues, doing course registration: all of it seemed like a waste of my valuable time. I could do so many things if I were not doing those things. So, I started thinking about automating all of this so that I could do it over the web from my office or home and save the trip to the university. This resulted in a state-of-the-art web based, smart client powered university automation and collaboration system which now runs on my university's web site, www.aiub.edu.

It created a lot of buzz in the country because it was the first ever complete implementation of such a system, not only in Bangladesh but also in many other countries. You will hardly find such a feature rich automated system in universities in the US, Europe or Australia. We got many proposals from famous universities in the US who wanted to buy this project. The project was also praised in many daily newspapers. A review published recently in a daily suddenly brought back old memories and led to this post:

http://www.thedailystar.net/campus/2006/09/04/acad...

It was also the first ever .NET project completed in my country.

 

Here's what a student sees when logged in:

 

 

Here's details of a course:

Posted by omar with 1 comment(s)
Filed under:

Atlas 7: Caching web service response on browser and save bandwidth significantly

Browsers can cache images, JavaScript and CSS files on the user's hard drive, and they can also cache Xml Http calls if the call is an Http Get. The cache is keyed on the Url. If the same Url is requested again and the response is already cached on the computer, the response is loaded from the cache, not from the server. So, if you make an Xml Http call as Http Get and the server returns headers which tell the browser to cache the response, future calls return immediately from the cache and save both the network roundtrip and the download time.

At Pageflakes, we cache the user's state so that when the user visits again the following day, the user gets a cached page which loads instantly from the browser cache, not from the server. Thus the second load becomes very fast. We also cache several small parts of the page which appear on the user's actions. When the user does the same action again, a cached result is loaded immediately from the local cache, which saves the network roundtrip time. The user gets a fast loading and very responsive site. The perceived speed increases dramatically.

The idea is to make Http Get calls when making Atlas web service calls and to return specific Http response headers which tell the browser to cache the response for a specific duration. If you return an "Expires" header with the response, the browser will cache the Xml Http response. There are 2 headers that you need to return with the response to instruct the browser to cache it:

HTTP/1.1 200 OK
Expires: Fri, 1 Jan 2030
Cache-Control: public

This instructs the browser to cache the response until Jan 2030. As long as you make the same Xml Http call with the same parameters, you get the cached response from the computer and no call goes to the server. There are more advanced ways to get further control over response caching. For example, here is a header which instructs the browser to cache for 60 seconds, and to contact the server for a fresh response after those 60 seconds. It also prevents proxies from returning a cached response once the browser's local cache expires after 60 seconds.

HTTP/1.1 200 OK
Cache-Control: private, must-revalidate, proxy-revalidate, max-age=60
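To make the max-age rule concrete, here is a small sketch in plain JavaScript of the kind of freshness check a cache performs. The names parseMaxAge and isFresh are made up for illustration, and the check looks only at max-age; real browsers apply more rules than this (Expires, validators, heuristics), so treat it as a simplified model and not as how any browser is implemented.

```javascript
// Hypothetical helper: extracts the max-age value (in seconds) from a
// Cache-Control header value, or returns null when there is none.
function parseMaxAge(cacheControl) {
    // Looks for "max-age=<seconds>" among the comma separated directives.
    var match = /(?:^|,)\s*max-age\s*=\s*(\d+)/i.exec(cacheControl);
    return match ? parseInt(match[1], 10) : null;
}

// Hypothetical helper: decides whether a response stored at storedAtMs
// may still be served from the cache at nowMs, based on max-age alone.
function isFresh(cacheControl, storedAtMs, nowMs) {
    var maxAge = parseMaxAge(cacheControl);
    if (maxAge === null) {
        return false; // no max-age: this simplified model does not cache
    }
    var ageSeconds = (nowMs - storedAtMs) / 1000;
    return ageSeconds < maxAge;
}

var header = "private, must-revalidate, proxy-revalidate, max-age=60";
console.log(isFresh(header, 0, 30 * 1000));        // 30 seconds old
console.log(isFresh(header, 0, 61 * 1000));        // 61 seconds old
console.log(isFresh("private, max-age=0", 0, 1));  // max-age=0
```

With max-age=60 the first check reports a usable cached copy and the second does not. With max-age=0 the copy is never considered fresh, which is why a response carrying that value does not get cached.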

Let's try to produce such response headers from ASP.NET web service call:

This will result in the following response headers:

The Expires header is set properly. But the problem is with Cache-Control. It shows "max-age" set to zero, which prevents the browser from doing any kind of caching. That is the header you should emit if you seriously want to prevent caching. So exactly the opposite of what we wanted happened.

There's a bug in ASP.NET 2.0 which prevents you from changing the "max-age" value. As max-age is stuck at zero, ASP.NET 2.0 sets Cache-Control to private because max-age = 0 means no caching is needed. So, there's no way to make ASP.NET 2.0 return proper headers which cache the response.

Time for a hack. After decompiling the code of HttpCachePolicy class (Context.Response.Cache object's class), I found the following code:

Somehow, this._maxAge is getting set to zero and the check: "if (!this._isMaxAgeSet || (delta < this._maxAge))" is preventing it from getting set to a bigger value. Due to this problem, we need to bypass the SetMaxAge function and set the value of the _maxAge field directly using Reflection.

This will return the following headers:

Now max-age is set to 60 and thus browser will cache the response for 60 seconds. If you make the same call again within 60 seconds, it will return the same response. Here's a test output which shows the date time returned from the server:

The client side code is like this:

function cachedHttpGet()
{
    WebService.CachedGet(
    {
        useGetMethod: true,
        onMethodComplete: function (result) { debug.dump(result); }
    } );
}

Here you see, the response is cached for 60 seconds. After that time elapsed, a call was made to the server and a new date was returned. That response was again cached for 60 seconds.

Posted by omar with 50 comment(s)
Filed under:

Atlas 6: When 'this' is not really 'this'

Atlas callbacks are not executed in the same context from which they are called. For example, if you are making a Page method call from a javascript class like this:

function SampleClass()
{
    this.id = 1;
    this.call = function ()
    {
        PageMethods.DoSomething( "Hi", function (result)
        {
            debug.dump( this.id );
        } );
    }
}

What happens when you call the "call" method? Do you get "1" on the debug console? No, you get "null" on the debug console because "this" is no longer the instance of the class. This is a common mistake. As it is not yet documented in the Atlas documentation, I have seen many developers spend time finding out what's wrong.

Here's the reason. We know whenever Javascript events are raised "this" refers to the html element which produced the event. So, if you do this:

 

function SampleClass()
{
    this.id = 1;
    this.call = function ()
    {
        PageMethods.DoSomething( "Hi", function (result)
        {
            debug.dump( this.id );
        } );
    }
}

<input type="button" id="ButtonID" onclick="o.onclick" />

If you click the button, you see "ButtonID" instead of "1". The reason is that the button is making the call. So, the call is made within the button object's context and thus "this" maps to the button object.

Similarly, when Xml Http raises the onreadystatechange event, which Atlas traps in order to fire the callback, the code still executes in the Xml Http object's context, because it is the Xml Http object which raises the event. As a result, "this" refers to the Xml Http object, not to your own class where the callback is declared.

In order to make the callback fire on the context of the instance of the class so that "this" refers to the instance of the class, you need to make the following change:

function SampleClass()
{
    this.id = 1;
    this.call = function ()
    {
        PageMethods.DoSomething( "Hi", Function.createDelegate( this, function (result)
        {
            debug.dump( this.id );
        } ) );
    }
}

Here, the Function.createDelegate is used to create a delegate which calls the given function under the "this" context. Function.createDelegate is defined in AtlasRuntime:

Function.createDelegate = function (instance, method)
{
    return function ()
    {
        return method.apply(instance, arguments);
    }
}
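
The effect can be reproduced without Atlas. The sketch below is plain JavaScript and makes a few assumptions: createDelegate is a local copy with the same body as the Atlas function, and fakeAsyncCall is a made-up stand-in for anything that fires a callback from its own context, the way Xml Http does.

```javascript
// Local copy of the delegate helper, same body as Function.createDelegate.
function createDelegate(instance, method) {
    return function () {
        return method.apply(instance, arguments);
    };
}

// Hypothetical stand-in for an asynchronous call: it invokes the callback
// with "this" set to its own object, the way an event source would.
function fakeAsyncCall(callback) {
    var eventSource = { id: "XmlHttp" };
    callback.call(eventSource, "result");
}

function SampleClass() {
    this.id = 1;
    this.seenWithoutDelegate = null;
    this.seenWithDelegate = null;
    var self = this; // only used to record what the plain callback saw

    this.call = function () {
        // Plain callback: "this" is whatever the caller decided.
        fakeAsyncCall(function (result) {
            self.seenWithoutDelegate = this.id;
        });
        // Delegate: "this" is forced back to the SampleClass instance.
        fakeAsyncCall(createDelegate(this, function (result) {
            this.seenWithDelegate = this.id;
        }));
    };
}

var o = new SampleClass();
o.call();
console.log(o.seenWithoutDelegate); // "XmlHttp"
console.log(o.seenWithDelegate);    // 1
```

The plain callback sees the id of the object that fired it, while the delegated callback sees the id of the class instance.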
Posted by omar with 3 comment(s)
Filed under:

Atlas 4: Only 2 calls at a time and don't expect any order

Browsers make at most 2 concurrent AJAX calls to a domain at a time. If you make 5 AJAX calls, the browser makes 2 calls first, then waits for either of them to complete before making the next call, and so on until the remaining 3 calls are complete. Moreover, you cannot expect the calls to execute in the same order as you make them. Here's why:

Here you see, Call 3's response is quite big and takes longer to download than Call 5's. So, Call 5 actually completes before Call 3.

So, the world of HTTP is unpredictable.
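
To see how the ordering falls out of the 2 connection limit, here is a small simulation in plain JavaScript. It is only a model: the function name completionOrder and the durations are made up, and each call is assumed to take a fixed time once it gets a connection.

```javascript
// Simulates a browser that allows only maxConnections calls at a time.
// durations[i] is how long call i+1 takes once it has started (made-up numbers).
function completionOrder(durations, maxConnections) {
    var active = [];    // calls in flight: { call, finishesAt }
    var finished = [];  // call numbers in the order they complete
    var next = 0;       // index of the next queued call
    var now = 0;

    while (finished.length < durations.length) {
        // Start queued calls while a connection is free.
        while (active.length < maxConnections && next < durations.length) {
            active.push({ call: next + 1, finishesAt: now + durations[next] });
            next++;
        }
        // Advance time to the earliest finishing call and free its connection.
        active.sort(function (a, b) { return a.finishesAt - b.finishesAt; });
        var done = active.shift();
        now = done.finishesAt;
        finished.push(done.call);
    }
    return finished;
}

// Call 3 has a large response, so calls 4 and 5 overtake it.
var order = completionOrder([100, 120, 500, 80, 90], 2);
console.log(order.join(", ")); // 1, 2, 4, 5, 3
```

Call 3 starts third but finishes last, because calls 4 and 5 take turns on the other connection while it is still downloading.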

Posted by omar with 6 comment(s)
Filed under:

Atlas 5: Bad calls make good calls timeout

If 2 http calls somehow get stuck for too long, those two bad calls will also make some good calls expire which got queued in the meantime. Here's a nice example:

function timeoutTest()
{
    PageMethods.Timeout( { timeoutInterval : 3000, onMethodTimeout: function() { debug.dump("Call 1 timed out"); } } );
    PageMethods.Timeout( { timeoutInterval : 3000, onMethodTimeout: function() { debug.dump("Call 2 timed out"); } } );
    PageMethods.DoSomething( 'Call 1', { timeoutInterval : 3000, onMethodTimeout: function() { debug.dump("DoSomething 1 timed out"); } } );
    PageMethods.DoSomething( 'Call 2', { timeoutInterval : 3000, onMethodTimeout: function() { debug.dump("DoSomething 2 timed out"); } } );
    PageMethods.DoSomething( 'Call 3', { timeoutInterval : 3000, onMethodTimeout: function() { debug.dump("DoSomething 3 timed out"); } } );
}

I am calling a method named "Timeout" on the server which does nothing but wait for a long time so that the call times out. After that I am calling a method which does not time out. But guess what the output is:

Only one call succeeded: "DoSomething 1". Try again and you might see this:

Now two calls succeeded. So, if at any moment the browser's two connections get jammed, you can expect the other waiting calls to time out as well.

At Pageflakes, we used to get nearly 400 to 600 timeout error reports from users' browsers. We could never figure out how this could happen. First we suspected slow internet connections, but that cannot be the case for so many users. Then we suspected something was wrong with the hosting provider's network. We did a lot of network analysis to find out whether there was any problem on the network, but we could not detect any. We used SQL Profiler to see whether any long running query was exceeding the ASP.NET request execution timeout. But no luck. We finally discovered that it mostly happened due to some bad calls which got stuck and made the good calls expire too. So, we modified the Atlas Runtime and introduced automatic retry in it, and the problem disappeared completely. However, this auto retry requires a sophisticated open heart bypass surgery on the Atlas Runtime javascript code which you have to perform again and again whenever Microsoft releases a newer version of the Atlas Runtime. You also can no longer use the <atlas:scriptmanager> tag which produces the Atlas runtime references; instead you have to manually put links to the Atlas runtime and compatibility javascript files. So, you are better off doing auto retry yourself in your own code from Day 1. In the onMethodTimeout handler, always make one retry to be on the safe side.
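
A retry in your own code could look roughly like the sketch below. It is plain JavaScript and not what we put inside the Atlas Runtime: callWithRetry is a made-up helper, and it assumes the call accepts an options object with onMethodComplete and onMethodTimeout callbacks like the calls shown above. The flakyCall function only exists to make the sketch runnable without a server.

```javascript
// Hypothetical wrapper: invoke is any function that starts a call and accepts
// { onMethodComplete, onMethodTimeout } callbacks in an options object.
function callWithRetry(invoke, maxRetries, onComplete, onGiveUp) {
    var attempt = 0;

    function tryOnce() {
        invoke({
            onMethodComplete: onComplete,
            onMethodTimeout: function () {
                if (attempt < maxRetries) {
                    attempt++;
                    tryOnce();         // make the same call again
                } else {
                    onGiveUp(attempt); // retries exhausted
                }
            }
        });
    }

    tryOnce();
}

// Fake call for demonstration: times out the first time, succeeds afterwards.
var invocations = 0;
function flakyCall(options) {
    invocations++;
    if (invocations === 1) {
        options.onMethodTimeout();
    } else {
        options.onMethodComplete("ok");
    }
}

var outcome = null;
callWithRetry(flakyCall, 1,
    function (result) { outcome = "completed: " + result; },
    function (retries) { outcome = "gave up after " + retries + " retries"; });
console.log(outcome); // completed: ok
```

With a real page method, invoke would be a small function that passes the options on, for example function (options) { PageMethods.DoSomething('Call 1', options); }, with the timeoutInterval added to the options as needed.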

Posted by omar with 1 comment(s)
Filed under:

Atlas 3: Atlas batch calls are not always faster

Atlas provides a batch call feature which combines multiple web service calls into one call. It works transparently; you won't notice anything, nor do you need to write any special code. Once you turn on the batch feature, all web service calls made within a certain duration get batched into one call. This saves roundtrip time and total response time.

The actual response time might be reduced, but the perceived delay is higher. If 3 web service calls are batched, the 1st call does not finish first. All 3 calls finish at the same time. If you are doing some UI updates upon completion of each web service call, they do not happen one by one. All of the calls complete in one shot and then the UI gets updated in one shot. As a result, you do not see incremental updates on the UI, but a long delay before the UI updates. If any of the calls, say the 3rd call, downloads a lot of data, the user sees nothing happening until all 3 calls complete. So, the duration of the 1st call becomes nearly the sum of the durations of all 3 calls. Although the actual total duration is reduced, the perceived duration is higher. Batch calls are handy when each call transmits a small amount of data. Then 3 small calls get executed in one roundtrip.

Let's work on a scenario where 3 calls are made one by one. Here's how the calls actually get executed.

The second call takes a bit more time to reach the server because the first call is eating up bandwidth. For the same reason, it takes longer to download. Browsers open 2 simultaneous connections to the server, so only 2 calls are made at a time. Once either the first or the second call completes, the third call is made.

When these 3 calls are batched into one:

Here the total download time is reduced (if IIS compression is enabled) and there's only one network latency overhead. All 3 calls get executed on the server in one shot and the combined response is downloaded in one call. But to the user, the perceived speed is slower because all the UI updates happen after the entire batch call completes. The batch call will always take longer to complete than 2 of the calls made individually. Moreover, if you do a lot of UI updates one after another, Internet Explorer freezes for a while, giving the user a bad impression. Sometimes an expensive update on the UI makes the browser screen go blank and white. Firefox and Opera do not have this problem.

Batch calls have some advantages too. The total download time is less than downloading the individual call responses because, if you use gzip compression in IIS, the combined result is compressed as a whole instead of each result being compressed individually. So, generally batch calls are better for small calls. But if a call is going to send a large amount of data or is going to return, say, 20KB of response, then it's better not to use batch. Another problem with batch calls: say 2 calls are very small but the 3rd call is quite big. If these 3 calls get batched, the smaller calls suffer a long delay due to the larger 3rd call.
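
The trade-off can be put into numbers with a rough model. The sketch below is plain JavaScript with made-up figures: it assumes every call costs one network latency plus its own download time, and that individual calls run one after another on a single connection to keep the arithmetic simple (a real browser would overlap two of them). It ignores compression entirely.

```javascript
// Individual calls: each result is usable as soon as its own call finishes.
function individualFinishTimes(latency, downloads) {
    var now = 0;
    return downloads.map(function (d) {
        now += latency + d;
        return now;
    });
}

// Batch call: one latency, but no result is usable until the combined
// response has been downloaded completely.
function batchFinishTimes(latency, downloads) {
    var total = latency + downloads.reduce(function (sum, d) { return sum + d; }, 0);
    return downloads.map(function () { return total; });
}

var latency = 200;              // ms, assumed
var downloads = [50, 50, 1000]; // ms, the third call returns a lot of data

console.log(individualFinishTimes(latency, downloads)); // [ 250, 500, 1700 ]
console.log(batchFinishTimes(latency, downloads));      // [ 1300, 1300, 1300 ]
```

In this model the batch finishes sooner overall (1300 ms instead of 1700 ms), but the first result shows up after 1300 ms instead of 250 ms, which is the perceived slowness described above.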

Posted by omar with 6 comment(s)
Filed under:

Atlas 2: HTTP POST is slower and it's default in Atlas

Atlas by default makes an HTTP POST for all AJAX calls. Http POST is more expensive than Http GET. It transmits more bytes over the wire, thus taking precious network time, and it also makes ASP.NET do extra processing on the server end. So, you should use Http Get as much as possible. However, Http Get does not allow you to pass objects as parameters. You can pass numeric, string and date values only. When you make an Http Get call, Atlas builds an encoded url and makes a hit to that url. So, you must not pass so much content that the url becomes larger than 2048 chars. As far as I know, that's the max length of any url.
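
If you want to guard against long urls in your own code, a check along these lines can help. This is a plain JavaScript sketch, not how Atlas builds its urls: buildGetUrl, the service paths and the parameter names are all made up for illustration, and the 2048 limit is simply the figure mentioned above.

```javascript
// Hypothetical helper: builds a GET url from simple parameters and reports
// whether it stays within the 2048 character limit mentioned above.
var MAX_URL_LENGTH = 2048;

function buildGetUrl(baseUrl, params) {
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            pairs.push(encodeURIComponent(name) + "=" +
                       encodeURIComponent(String(params[name])));
        }
    }
    var url = pairs.length ? baseUrl + "?" + pairs.join("&") : baseUrl;
    return { url: url, fitsInGet: url.length <= MAX_URL_LENGTH };
}

var small = buildGetUrl("/WebService.asmx/HelloWorld", { name: "omar", count: 3 });
console.log(small.url);       // /WebService.asmx/HelloWorld?name=omar&count=3
console.log(small.fitsInGet); // true

// Roughly 3000 characters of content do not fit into a GET url.
var big = buildGetUrl("/WebService.asmx/Save", { content: new Array(3000).join("x") });
console.log(big.fitsInGet);   // false
```

When fitsInGet is false, the call is a candidate for Http Post instead.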

Another evil thing about Http Post is that it's actually 2 calls. First the browser sends the Http Post headers and the server replies with "HTTP 100 Continue". When the browser receives this, it sends the actual body. Here's what the headers look like:

Request:

POST /Atlas1/Default.aspx HTTP/1.1

__serviceMethodName=Timeout&__serviceMethodParams={}&__VIEWSTATE=/wEPDwUJMTY3NTY1MjM2ZGSlWDXIdYj44hhTUd0z8yyp1q%2bUtw%3d%3d

Response:

HTTP/1.1 100 Continue
Server: ASP.NET Development Server/8.0.0.0
Date: Mon, 11 Sep 2006 15:04:13 GMT
Content-Length: 0

After getting this clearance from the server, the actual body is sent. This is done in order to avoid sending long posts which are not going to succeed anyway if the server cannot accept them. But it comes at the cost of network latency.

So, Http Get should be at least twice as fast as Http Post.

Here's how you make Http GET calls in Atlas:

Step 1: Decorate web method with attribute

First you need to put a special attribute on the web method:

[WebMethod]
[WebOperation(true)]
public string HelloWorld()
{
    return "Hello World";
}

Step 2: Set useGetMethod=true while making the call

Here's how you call the web method using Http Get:

WebService.HelloWorld(
{
    useGetMethod: true,
    onMethodComplete: function (result) { debug.dump(result); }
} );

Note: It needs to be a web service (.asmx). It will not work if you use it on page methods.

Please see this page in Atlas Quickstart for further info.

Posted by omar with 102 comment(s)
Filed under:

Atlas 1: Try not to use page methods

One of the easiest things in Atlas is to use the Page Method feature. If you use Atlas on your web page, say Default.aspx, you can directly call public methods on Default.aspx from javascript. Just put a [WebMethod] attribute on a public method in Default.aspx and then you can call it from Javascript using PageMethods.MethodName().

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
    }

    [WebMethod]
    public string DoSomething(string param)
    {
        return param;
    }
}

On the client side you can call this method like this:

PageMethods.DoSomething( 'Hi', function (result) { alert(result); } );

Here's the catch: Page method calls are always HTTP POST calls. You can never make an HTTP GET call to a Page method, but you can make HTTP GET calls to web service methods. So, at a later stage of your project, when you need Http response caching in order to save roundtrips, you will have to refactor all page methods into web service methods, which means moving all public methods and related code from Default.aspx to some web service. So, try not to use Page methods from Day 1.

Posted by omar with 8 comment(s)
Filed under:

Beginning Atlas series: Why Atlas?

This is the first question everyone asks me when they see Pageflakes. Why not Protopage or the Dojo library? Microsoft Atlas is a very promising AJAX library. Microsoft is putting a lot of effort into Atlas, making lots of reusable components that can really save you a lot of time and give your web application a complete face lift with reasonably little effort. It integrates with ASP.NET very well and it is compatible with the ASP.NET Membership and Profile providers.

When we first started developing Pageflakes, Atlas was in its infancy. We were only able to use the Page Method and Webservice Method call features of Atlas. We had to make our own drag & drop, component architecture, popups, collapse/expand features etc. But now you can have all these from Atlas and thus save a lot of development time. The web service proxy feature of Atlas is a marvel. You can point a <script> tag to a .asmx file and you get a javascript class generated right out of the web service definition. The Javascript class contains the exact methods that you have on the web service class. This makes it really easy to add or remove web services and their methods without making any proxy changes on the client side. Atlas also offers a lot of control over the AJAX calls and provides rich exception trapping features in javascript. Server side exceptions are nicely thrown to the client side javascript code, where you can trap them and show nicely formatted error messages to the user. Atlas works really well with ASP.NET 2.0, eliminating the integration problem completely. You need not worry about authentication and authorization on page methods and web service methods. So, you save a lot of code on the client side (of course the Atlas Runtime is huge for this reason) and you can concentrate more on your own code than on building all this framework related code.

Recent versions of Atlas work nicely with the ASP.NET Membership and Profile services, giving you login/logout features from Javascript without requiring page postbacks, and you can read/write the Profile object directly from Javascript. This comes in very handy when you heavily use the ASP.NET membership and profile providers in your web application, which we do at Pageflakes.

In earlier versions of Atlas, there was no way to make HTTP GET calls. All calls were HTTP POST and thus quite expensive. Now you can say which calls should be HTTP GET. Once you have HTTP GET, you can utilize the Http response caching features which I will explain soon.

I will be writing about lots of Atlas tips and tricks. I am assuming you are familiar with Atlas and you have already tried some quick start tutorials and you know the concepts of Page Method, Web service Proxy, Script Manager etc.

Posted by omar with 2 comment(s)
Filed under:

Do you have problems with users who cannot use Forgot Password option?

Here's a scenario. We use Email address as user name in ASP.NET 2.0 Membership provider. There were several places where we used to create user accounts using this:

Membership.CreateUser( email, password );

We did not notice what it was doing. After some days, users started complaining. This is what users whose accounts were automatically created by the above code said:

"Hi,

I got the email invitation. I went to your site. I tried login, it said user name or password is wrong. So, I tried Signup. Signup said user name already taken. Then I went to forgot password to retrieve the password. It shows something is wrong and password email cannot be sent.

I am stuck. Please help!"

Here's the problem. When we use the above code, it creates a row in the aspnet_users table using the email address as the user name. Fine, no problem. But the row it creates in the aspnet_membership table has Email set to NULL. So, the user cannot use the "Forgot Password" option to request the password because the email address is null. Our database contained 908 such unfortunate users, so we had to run the following SQL to fix it:

update aspnet_membership
set email = (select username from aspnet_users
             where applicationID = '...' and userID = aspnet_membership.userID),
    loweredemail = (select loweredusername from aspnet_users
                    where applicationid = '...' and userid = aspnet_membership.userID)
where loweredemail is null and applicationID = '...'

The applicationID is something which you need to specify for your own application. You can find the ID in the aspnet_applications table.

Then we changed the code to create user accounts to this:

Membership.CreateUser( email, password, email );

The 3rd parameter is the email address. We did not notice this.

Posted by omar with 5 comment(s)
Filed under:

Large log file can bring SQL Server down when transaction log shipping runs

We were getting very poor performance after we turned on transaction log shipping on our SQL Server. We are using SQL Server 2005. The transaction log file was around 30 GB because the database was in Full Recovery mode. Every 15 minutes, when the log shipping ran, the server became very slow and sometimes nonresponsive. The event log was filling up with SqlTimeout exceptions generated by the web site. The web site started to show the asp.net error page very frequently. We could not even log in to SQL Server using SQL Server Management Studio to do something about it.

Here's how the connection time was reported from an external monitoring site:

The peaks are at 30 seconds, which means the connections timed out.

So, here's what we did:

  1. Turned off Log shipping
  2. Restarted SQL Server.
  3. Switched Database to Simple recovery model. Shrunk the log file. This made the log file come down to couple of megabytes.
  4. Ran for some days. All looked ok.
  5. Then switched DB to Full Recovery model and configured log shipping again.

So far running fine. But we go down for an hour every Saturday when we run INDEX DEFRAG on the indexes. The log ships show around 5 or 6 log backups which are each 1 or 2 GB in size when the index defrag happens.

Posted by omar with no comments
Filed under:

How to setup SQL Server 2005 Transaction Log Ship on large database that really works

I have tried a lot of combinations in order to find an effective method for implementing Transaction Log Shipping between servers which are in a workgroup, not under a domain. I realized that the things you learn from articles and books are for small and medium sized databases. When your database becomes 10 GB or bigger, things become a lot harder than they look. Additionally, many things changed in SQL Server 2005. So, it's even more difficult to configure log shipping properly nowadays.

Here are the steps that I finally found to work. Let's assume there are 2 servers with SQL Server 2005. Make sure both servers have the latest SP. Service Pack 1 has been released already.

1. Create a new user Account named "SyncAccount" on both computers. Use the exact same user name and password.

2. Make sure File Sharing is enabled on the local area connection between the servers. Also enable file sharing in the Firewall.

3. Make sure the local network connection is not a regular LAN. It must be a gigabit card with near zero data corruption. Both the cable and the switch need to be perfect. If possible, connect both servers using a fibre optic cable directly between the NICs in order to avoid a separate switch.

4. Now create a folder named "TranLogs" on both servers. Let's assume the folder is on E:\Tranlogs.

5. On Primary Database server, share the folder "Tranlogs" and allow SyncAccount "Full Access" to it. Then allow SyncAccount FullAccess on TranLogs folder. So you are setting the same permission from both "Sharing" tab and from "Security" tab.

6. On Secondary database server, allow SyncAccount "Full Access" right on TranLogs folder. No need to share it.

7. Test whether SyncAccount can really connect between the servers. On Secondary Server, go to Command Prompt and do this:

8.

9. Now you have a command prompt which is running with SyncAccount privileges. Let's confirm the account can read and write on the "TranLogs" folders on both servers.

10.

11. This is exactly what SQL Agent will be doing during log shipping. It will copy log files from the primary server's network share to its own log file folder. So, the SyncAccount needs to be able to both read files from the primary server's network share and write to its own tranlogs folder. The above test verifies this.

12. This is something new in SQL Server 2005: Add SyncAccount in SQLServer Agent group "SqlServer2005SqlAgentUser$....". You will find this Windows User Group after installing SQL Server 2005.

13. Now go to Control Panel->Administrative Tools->Services and find the SQL Server Agent service. Go to its properties and set SyncAccount as the account on the Logon tab. Restart the service. Do this on both servers.

14.

15. I use sa account to configure the log shipping. So, do this on both servers:

a. Enable "sa" account. By default, sa is disabled in SQL Server 2005.

b. On "sa" account turn off Password Expiration Policy. This prevents sa password from expiring automatically.

16. On Secondary server, you need to allow remote connections. By default, SQL Server 2005 disables TCP/IP connection. As a result, you cannot login to the server from another server. Launch the Surface Area Configuration tool from Start->Programs->MS SQL Server 2005 and go to "Remote Connection" section. Choose the 3rd option which allows both TCP/IP based remote connection and local named pipe based connections.

17. On Secondary Server firewall, open port 1433 so that primary server can connect to it.

18. Restart SQL Server. Yes, you need to restart SQL Server.

18. On Primary server, go to Database properties->Options and set Recovery Model to "Full". If it was already set to full before, it will be wise to first set it to Simple, then shrink the transaction log file and then make it "Full" again. This will truncate the transaction log file for sure.

19. Now take a Full Backup of the database. During backup, make sure you put the backup file on a physically separate hard drive than the drive where MDF is located. Remember, not different logical drives, different physical drives. So, you should have at least 2 hard drives on the server. During backup, SQL Server reads from MDF and writes on the backup file. So, if both MDF and the backup is done on the same hard drive, it's going to take more than double the time to backup the database. It will also keep the Disk fully occupied and server will become very slow.

20. After the backup is done, RAR the backup file. This ensures that there is no data corruption when you copy the database to the other server. If you fail to unRAR the file on the secondary server, you know for sure that there's some problem on the network and you must replace the network infrastructure. The RAR should also be written to a separate hard drive from the one where the backup file is located. Same reason: read is on one drive and write is on another drive. Better still if you can RAR directly to the destination server using a network share. It has two benefits:

a. Your server's IO is saved. There's no write, only read.

b. Both RAR and network copy is done in one step.

21.

22. By the time you are done with the backup, RAR, copy over network, restore on the other server, the Transaction Log file (LDF) on the primary database server might become very big. For us, it becomes around 2 to 3 GB. So, we have to manually take a transaction log backup and ship to the secondary server before we configure Transaction Log Shipping.

23.

24. When you are done with copying the transaction log backup to the second server, first restore the Full Backup on the secondary server:

25.

26. But before restoring, go to Options tab and choose RESTORE WITH STANDBY:

27.

28. When the full backup is restored, restore the transaction log backup.

29. REMEMBER: go to options tab and set the Recovery State to "RESTORE WITH STANDBY" before you hit the OK button.

30. This generally takes a long time. Too long in fact. Every time I do the manual full backup, rar, copy, unrar, restore, the Transaction Log (LDF) file becomes 2 to 3 GB. As a result, it takes a long time to do a transaction log backup, copy and restore and it takes more than an hour to restore it. So, within this time, the log file on the primary server again becomes large. As a result, when log shipping starts, the first log ship is huge. So, you need to plan this carefully and do it only when you have least amount of traffic.

31. I usually have to do this manual Transaction Log backup twice. First one is around 3 GB. Second one is around 500 MB.

32. Now you have a database on the secondary server ready to be configured for Log shipping.

33. Go to Primary Server, select the Database, right click "Tasks" -> "Shrink". Shrink the Log File.

34. Go to Primary server, bring on Database options, go to Transaction Log option and enable log shipping.

35.

36. Now configure the backup settings like this:

37.

38. Remember, the first path is the network path that we tested from command prompt on the secondary server. The second path is the local hard drive folder on the primary server which is shared and accessible from the network path.

39. Add a secondary server. This is the server where you have restored the database backup

40.

41. Choose "No, the secondary database is initialized" because we have already restored the database.

42. Go to the second tab, "Copy Files", and enter the path on the secondary server where log files will be copied to. Note: The secondary server will fetch the log files from the primary server's network share to its local folder. So, the path you specify is on the secondary server. Do not get confused by the picture below, where it's the same path as on the primary server. I just have the same folder configuration on all servers. It can be D:\tranlogs if you have the tranlogs folder on the D: drive on the secondary server.

43.

44. On third tab, "Restore Transaction Log" configure it as following:

45.

46. It is very important to choose "Disconnect users in database…". If you don't do this and by any chance Management Studio is open on the database on secondary server, log shipping will keep on failing. So, force disconnect of all users when database backup is being restored.

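47. Set up a Monitor Server. Note that in log shipping the monitor server only records the backup, copy and restore status and raises alerts when they fall behind. It does not automatically make the secondary server the primary server when your primary server crashes; that switch remains a manual step.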

48.

49. In the end, the transaction log shipping configuration window should look like this:

50.

51. When you press OK, you will see this:

52. Do not be happy just because everything shows "Success". Even if you got all the paths and settings wrong, you will still see it as successful. Log in to the secondary server, go to SQL Server Agent -> Jobs and find the log shipping restore job. If the job is not there, your configuration was wrong. If it's there, right-click it and select "View History". Wait 15 minutes for one log ship to complete, then refresh and check the list. If every entry is OK, then it really is OK. If not, there are two possibilities:

a. See whether the Log Ship Copy job failed. If it did, you entered an incorrect path. The problem is one of the following:

  1. The network location on the primary server is wrong.
  2. The local folder was specified incorrectly.
  3. You did not set SyncAccount as the account which runs SQL Agent, or you did but forgot to restart the service.

b. If the restore fails, the problem is one of the following:

i. SyncAccount is not a valid login in SQL Server. From SQL Server Management Studio, add SyncAccount as a login.

ii. You forgot to restore the database on the secondary server as Standby.

iii. You probably took a manual transaction log backup on the primary server in the meantime. As a result, the backup that log shipping took was not the next one in the sequence.

53. If everything's ok, you will see this:
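For reference, the manual parts of the steps above (the manual log backup and standby restore, the log shrink, and the job history check) can also be scripted. This is only a rough T-SQL sketch: the database name MyDatabase, the logical log file name MyDatabase_log, the 500 MB shrink target and all the file paths are placeholders assumed for illustration, not values from this setup. The job name filter assumes the default "LS..." names that SQL Server gives to log shipping jobs.

```sql
-- On the PRIMARY server: take the manual transaction log backup.
-- MyDatabase and the D:\tranlogs paths are placeholders (assumptions).
BACKUP LOG MyDatabase
TO DISK = N'D:\tranlogs\MyDatabase_manual.trn';

-- On the SECONDARY server, after copying the file over:
-- restore the log and keep the database in standby so that
-- further log backups can still be applied to it.
RESTORE LOG MyDatabase
FROM DISK = N'D:\tranlogs\MyDatabase_manual.trn'
WITH STANDBY = N'D:\tranlogs\MyDatabase_undo.dat';

-- On the PRIMARY server: shrink the log file before enabling log shipping.
-- MyDatabase_log is an assumed logical file name; 500 is the target size in MB.
USE MyDatabase;
DBCC SHRINKFILE (N'MyDatabase_log', 500);

-- On the SECONDARY server: look at the history of the log shipping jobs.
-- run_status: 0 = failed, 1 = succeeded, 2 = retry, 3 = canceled.
SELECT j.name, h.step_name, h.run_date, h.run_time, h.run_status, h.message
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobhistory AS h ON h.job_id = j.job_id
WHERE j.name LIKE N'LS%'
ORDER BY h.instance_id DESC;
```

The last query shows the same information as "View History" on the jobs, which is handy when you want to check the copy and restore jobs without clicking through Management Studio.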

Posted by omar with 23 comment(s)

Careful when querying on aspnet_users, aspnet_membership and aspnet_profile tables used by ASP.NET 2.0 Membership and Profile provider

Queries like this will happily run in your development environment:

Select * from aspnet_users where UserName = 'blabla'

Or you can get some user's profile without any problem using:

Select * from aspnet_profile where userID = '…...'

You can even update a user's email in the aspnet_membership table like this:

Update aspnet_membership SET Email = 'newemailaddress@somewhere.com' Where Email = ' '

But when you have a giant database on your production server, running queries like these will bring your server down. The reason is that, although they look like very obvious queries that you will be using frequently, the fields they filter on, such as UserName and Email, are not part of any index. So, such queries result in a "Table Scan" (the worst case for any query) over millions of rows in the respective tables.

Here's what happened to us. We used fields like UserName, Email, UserID, IsAnonymous etc. in lots of marketing reports at Pageflakes. These are reports which only the marketing team uses, no one else. Now, the site runs fine, but several times a day the marketing team and users used to call us and scream "Site is slow", "Users are reporting extremely slow performance", "Some pages are timing out" etc. Usually when they call us, we tell them "Hold on, checking right now" and we check the site thoroughly. We use SQL Profiler to see what's going wrong. But we cannot find any problem anywhere. Profiler shows queries running fine. CPU load is within parameters. The site runs nice and smooth. We tell them on the phone, "We can't see any problem, what's wrong?"

So, why can't we see any slowness when we try to investigate the problem but the site becomes really slow several times throughout the day when we are not investigating?

The marketing team sometimes runs those reports several times per day. Whenever they run any of those queries, as the fields are not part of any index, server IO goes super high and CPU also goes super high, something like this:

We have 15000 RPM SCSI drives, very expensive, very fast. The CPU is a dual-core, dual Xeon 64-bit. Both are very powerful hardware of their kind. Still, these queries bring us down due to the huge database size.

But this never happens when the marketing team calls us and we keep them on the phone and try to find out what's wrong. That's because while they are calling us and talking to us, they are not running any of the reports which bring the servers down. They are working somewhere else on the site, mostly trying to do the same things the complaining users are doing.

Let's look at the indexes:

Table: aspnet_users
Clustered Index = ApplicationID, LoweredUserName
NonClustered Index = ApplicationID, LastActivityDate
Primary Key = UserID

Table: aspnet_membership
Clustered Index = ApplicationID, LoweredEmail
NonClustered = UserID

Table: aspnet_Profile
Clustered Index = UserID

Most of the indexes have ApplicationID in them. Unless you put ApplicationID='…' in the WHERE clause, the query is not going to use any of those indexes. As a result, all such queries will suffer from a Table Scan. Just put ApplicationID in the WHERE clause (find your ApplicationID in the aspnet_Applications table) and the queries will become blazingly fast.

DO NOT use the Email or UserName fields in the WHERE clause. They are not part of any index. Instead, the LoweredUserName and LoweredEmail fields are indexed, in conjunction with the ApplicationID field. All queries must have ApplicationID in the WHERE clause.
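As a rough sketch of what index-friendly versions of the earlier queries could look like, assuming the standard ASP.NET 2.0 membership schema: the application name '/' and the old email address below are made-up placeholders, so substitute your own values.

```sql
-- Look up the ApplicationId once. The application name '/' is an assumption;
-- use whatever name your site is registered under.
DECLARE @ApplicationID uniqueidentifier;

SELECT @ApplicationID = ApplicationId
FROM dbo.aspnet_Applications
WHERE LoweredApplicationName = LOWER(N'/');

-- Find a user: matches the clustered index (ApplicationID, LoweredUserName).
SELECT UserId, UserName, LastActivityDate
FROM dbo.aspnet_Users
WHERE ApplicationId = @ApplicationID
  AND LoweredUserName = LOWER(N'blabla');

-- Update an email: matches the clustered index (ApplicationID, LoweredEmail).
-- 'oldemailaddress@somewhere.com' is a hypothetical value for illustration.
UPDATE dbo.aspnet_Membership
SET Email = N'newemailaddress@somewhere.com',
    LoweredEmail = LOWER(N'newemailaddress@somewhere.com')
WHERE ApplicationId = @ApplicationID
  AND LoweredEmail = LOWER(N'oldemailaddress@somewhere.com');
```

Note that the update sets LoweredEmail together with Email, so the two columns do not drift apart.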

Our Admin site contains several such reports, and each contains lots of such queries on the aspnet_users, aspnet_membership and aspnet_Profile tables. As a result, whenever the marketing team tried to generate reports, the reports took all the power of the CPU and HDD, and the rest of the site became very slow and sometimes non-responsive.

Make sure you always cross-check the WHERE and JOIN clauses of all your queries against your index configuration. Otherwise you are doomed for sure when you go live.
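One possible way to do this cross-check from a query window is sketched below. It lists the indexes on each table and then asks SQL Server for the estimated plan of a query without actually running it. GO is the batch separator used by Management Studio and sqlcmd; the sample query is the unindexed one from the top of this post.

```sql
-- List the indexes and their key columns for each table.
EXEC sp_helpindex N'dbo.aspnet_Users';
EXEC sp_helpindex N'dbo.aspnet_Membership';
EXEC sp_helpindex N'dbo.aspnet_Profile';
GO

-- Show the estimated plan instead of executing the statements that follow.
SET SHOWPLAN_TEXT ON;
GO

-- In the plan text, a "Scan" operator means every row is read;
-- a "Seek" operator means an index is being used to find the rows.
SELECT * FROM dbo.aspnet_Users WHERE UserName = 'blabla';
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Running the same check on every report query before it reaches production is much cheaper than finding the scans with real traffic on the server.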

Posted by omar with 7 comment(s)

Calculate code block execution time using "using"

Here's an interesting way to calculate the execution time of a code block:

    private void SomeFunction()
    {
        using (new TimedLog(Profile.UserName, "Some Function"))
        {
            ...
            ...
        }
    }

You get an output like this:

6/14/2006         10:58:26 AM         4b1f6098-8c9d-44a5-93d8-e37394b6ef18       SomeFunction         9.578125

You can measure execution time of not only a function, but also smaller blocks of code. Whatever is inside the "using" block, gets logged.

Here's how the TimedLog class does the work:

    using System;

    public class TimedLog : IDisposable
    {
        private string _Message;
        private long _StartTicks;

        public TimedLog(string userName, string message)
        {
            // Tab-separate the fields so the log can be analyzed in Excel
            this._Message = userName + '\t' + message;
            this._StartTicks = DateTime.Now.Ticks;
        }

        #region IDisposable Members

        // Called when the "using" block ends; logs the elapsed seconds
        void IDisposable.Dispose()
        {
            EntLibHelper.PerformanceLog(this._Message + '\t' +
                TimeSpan.FromTicks(DateTime.Now.Ticks - this._StartTicks).TotalSeconds.ToString());
        }

        #endregion
    }

We are using Enterprise Library to do the logging. You can use anything you like in the Dispose method.

The benefit of such a log is that we get a tab-delimited file which we can analyze in many ways using MS Excel. For example, we can generate graphs to see how the performance goes up and down during peak and off-peak hours. We can also see whether there are high response times and what the pattern is. All of this gives us valuable indications of where the bottleneck is. You can also find out which calls take the most time by sorting on the duration column.

Posted by omar with 2 comment(s)