WCF+Windows Forms = lots of problems with some services
In these last days I've been really busy finishing an application which consists of a Windows Forms app that gets its data from serveral WCF services (which, btw, use basic HTTP binding + SSL). Now, most of the services work really smoothly, but...there are 2 which are really making me go crazy: they are SLOW when run in Windows XP Pro machines! Now, I'm talking about really simple services which should really execute under 300 ms!
As you might expect, the first thing I did was use fiddler to see what was going on. According to it, the request was taking 15 seconds, which seemed to confirm my initial impression that there was something wrong with the service. The next logical step was enabling tracing on the WCF service. I couldn't really believe in what my eyes were seeing: instead of seeing several seconds on processing time, the traced shown <300 ms. Could it be something wrong with IIS (which, btw, is IIS 6 on Windows 2003 Server machine)? To answer this, I enabled IIS logging. Guess what: IIS logging confirms the WCF tracing entries: requests were served in <300 ms!
Maybe there something wrong with the network? Pinging revealed nothing wrong...So I went ahead and installed netmon on my machine. After filtering the traffic, I was able to see that request/response from client to server were consistent with the logging info I was getting from WCF and IIS. So, I've just went ahead and tried using ngen with my client windows forms app. It didn't made any difference though...it was time to try one more thing: logging on the client side.
The SvcTraceViewer tool has a really great feature: you can open several trace files at the same time. When I did this, I got the following results (just copy/paste it here: on each line you'll see a description, the number of traces, the start and end time of the action):
000000000000 8 13s 9:21:20 9:21:34
Construct ChannelFactory. Contract type: 'SRA.Mercados.Servicos.IServicoUtilizador'. 4 328ms 9:21:20 9:21:21
Open ChannelFactory. Contract type: 'SRA.Mercados.Servicos.IServicoUtilizador'. 4 31ms 9:21:21 9:21:21
Process action 'http://sra.pt/Mercados/IServicoUtilizador/ObtemUtilizador'. 24 14s 9:21:21 9:21:35
Receive bytes on connection 'http://bksrv.sranet.local:8080/ServicosMercados/UtilizadorImpl.svc'. 17 437ms 9:21:34 9:21:35
Close ChannelFactory. Contract type: 'SRA.Mercados.Servicos.IServicoUtilizador'. 4 0ms 9:21:34 9:21:34
Construct ServiceHost 'SRA.Mercados.ServicosImpl.ServicoUtilizador'. 10 109ms 9:21:34 9:21:35
Open ServiceHost 'SRA.Mercados.ServicosImpl.ServicoUtilizador'. 64 93ms 9:21:35 9:21:35
Listen at 'http://bksrv.sranet.local:8080/ServicosMercados/UtilizadorImpl.svc'. 6 62ms 9:21:35 9:21:35
Listen at 'https://bksrv.sranet.local/ServicosMercados/UtilizadorImpl.svc/mex'. 4 15ms 9:21:35 9:21:35
Processing message 1. 5 15ms 9:21:35 9:21:35
Execute 'SRA.Mercados.Servicos.IServicoUtilizador.ObtemUtilizador'. 4 265ms 9:21:35 9:21:35
The bold lines are where the problems are happening. Now, I really don't understand why it takes 13 seconds for the process action on the client to hit the server (notice that server will only start its processing at 9:21:34 and that the client opens the channel at 9:21:21!). Now, I'm not really an expert on WCF, but it does seem like there's something wrong going on here. What's bothering me is that I don't know what it is :)
It was getting late, so I only had time for testing one more thing: using perfmon and see if I can get any info from it. I've only tried seeing one thing: the % of JIT time spent on a process. Guess what: the counter goes through the roof when I call a method of one of those services that take a long time! I'm still trying to understand what's going on with those XP machines (specially because on my Windows 2008 server everything is working out smoothly), but it seems like there's something that is making the Jitter do a lot of work when one of those methods gets called. I'm not sure on what it is since, as I've said, it's really a simple method and I've already run ngen on my code.
If anyone has any idea on what's going on here please do share it with me! I'm running out of them and I still don't understand what's going on...thanks.