While I'm on the subject of using the Control.Invoke method, I'd like to mention a pet peeve of mine: the technique that is proposed everywhere on MSDN for dealing with that method.

In particular, according to Microsoft, the preferred technique is to write one method that does two completely different things, depending on the value returned by the Control.InvokeRequired property. If the property is true, then Invoke is called using the same method as the target, and if it's false, then whatever code was really desired to execute is in fact executed. It looks like this:

void DoSomethingMethod()
{
    if (InvokeRequired)
    {
        Invoke(new MethodInvoker(DoSomethingMethod));
    }
    else
    {
        // actually do the "something"
    }
}

Now, I'm not a big fan of methods that do more than one thing in any case. It's my opinion that a method should do one thing and do it well. There are exceptions to every rule of course, but they are few and far between. More importantly, the above pattern does not rise to the requirements of being such an exception.

To understand why, consider what happens if you call Invoke when InvokeRequired returns false. The designers of .NET could have implemented it to throw an exception if you try to call Invoke when it's not necessary. But there's no obvious reason that they should have, nor did they. In fact, the Invoke method will "do the right thing" in that case, and simply invoke the target delegate directly rather than trying to marshal it onto the thread that owns the Control instance.

Note that the Invoke method can't just always do the marshaling. It has to know whether doing so is necessary, because if it tried to marshal the invocation when it wasn't necessary, it would wind up stuck waiting on the marshaled invocation to happen, which it never would, because the wait is happening on the very thread that's needed to perform the invocation.

So Invoke simply invokes the target directly, and to decide to do this it checks the same state your own code would be checking when it looks at the InvokeRequired property.

In other words, by using the MSDN-prescribed pattern, you're duplicating the exact same effort that already is made by the .NET Framework.

My opinion is that it's better to not have the redundant code, and to take advantage of the fact that .NET is already doing what MSDN proposes you do: just always call Invoke and let .NET sort it out. Now, you may be thinking "but if I always call Invoke using my own method as the target, won't that cause an infinite recursion?" Yes, it would, so don't do that. :)

Instead, take advantage of C#'s anonymous methods, wrapping all of your invoked logic inside one and invoke that:

void DoSomethingMethod()
{
    Invoke((MethodInvoker)delegate
    {
        // actually do the "something"
    });
}

Note: I prefer anonymous methods for this situation, but some may prefer a lambda expression instead; especially if what you're invoking actually returns a value, a lambda might even be more expressive. A lambda expression is a fine alternative to an anonymous method, and winds up compiled to essentially the same thing.
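For instance, a value-returning invocation is compact with a lambda (this is a hypothetical sketch: GetStatusText and _labelStatus are illustrative names, and the lambda syntax requires C# 3.0 or later):

```csharp
using System;
using System.Windows.Forms;

class StatusForm : Form
{
    Label _labelStatus = new Label();

    // Callable from any thread; Invoke marshals the lambda onto the
    // GUI thread and returns its result (boxed as object).
    public string GetStatusText()
    {
        return (string)Invoke((Func<string>)(() => _labelStatus.Text));
    }
}
```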

One final thought: don't judge Microsoft too harshly for promoting their inefficient approach so broadly. It's unfortunate that new examples continue to be created, and that the newer versions of the documentation haven't been changed to replace the older examples. But prior to C#/.NET 2.0, using an anonymous method in the invocation just wasn't an option.

I wasn't using .NET in those early days, and even then I would not have used the "one method, two behaviors" technique. Instead, I would have broken the functionality into two different methods, one that invokes the other. But the MSDN-promoted technique is less inappropriate in that context; in fact, it was as close as they could come to the benefit that anonymous methods offer of keeping all the functionality in a single method. Since I value that feature of anonymous methods so highly, I can hardly fault them for striving for that goal even when they didn't have anonymous methods to use. :)

Posted by Peter | 7 comment(s)

In my previous post, I mentioned that it turns out that there is a way for a form to be closed without the ...Closing/...Closed methods/events being called. Any code (including the synchronized flag technique I described earlier) that relies on this notification will fail under that scenario.

So, what to do? Well, as I mentioned before, one solution is simply to not depend on those methods or events. For example, in the "invoking/closing race condition" scenario where I started this whole thing, it's reasonable to just allow an exception to occur if the worker thread loses the race, catching and ignoring it. But sometimes, that's not an option.

In those situations, it's useful to know how to take things into your own hands. The problem is that when you pass a form instance to Application.Run, when that form is closed, the Application class interrupts the Run method's message loop and closes all the windows owned by the thread. Without the message loop running, when the windows associated with other forms are closed, those forms never get to process their close-related methods and events.

To fix this, we simply need to do basically the same work, but change the order of operations so that when the other forms are closed, they still get to do their close-related processing. The basic technique looks like this:

while (Application.OpenForms.Count > 0)
{
    Application.OpenForms[0].Close();
}

Application.ExitThread();
The call to Application.ExitThread is necessary to cause the Application.Run method to exit.

The logic can be put in one of two places: either in the main form's own OnFormClosed method, or in the Program class as an event handler. Either way is fine, but doing it as an event handler in the Program class keeps the main form from having to know about its relationship to the rest of the UI. This would be especially useful in scenarios where more than one form class might be used as the "main" form, but it's a nice abstraction in any case.

Using the latter approach as our example, if we assume that the above code is already in an event handler named _CloseHandler, then in the Program.Main method instead of something like this (the default from the IDE):

Application.Run(new Form1());

You'd have something like this:

Form formShow = new Form1();

formShow.FormClosed += _CloseHandler;

Application.Run(formShow);
And that's all there is to it. :)

Posted by Peter | with no comments

Okay, I know last post I said next I would be writing about more network stuff. But I've been away from the blog, and in returning to it I had to revisit an issue that came up when I was doing the GUI stuff. The issue is a race condition between a thread that may need to invoke code on the GUI thread, and the GUI thread itself.

Some brief background: in Windows, GUI objects are tied to the thread on which they were created. There are a variety of ways to deal with this, but in the .NET Forms API, it's addressed using the Control.Invoke or Control.BeginInvoke method. These methods take a delegate and ensure that it's executed on the same thread that created the Control instance used to call the method. Invoke is synchronous, not returning until the delegate has been executed, while BeginInvoke returns right away, with the delegate being executed at some later time.

This allows some thread other than the GUI thread to execute code that is only permitted to be executed on the GUI thread. Most commonly, this would be used to allow some worker thread to update the user interface, but another use is to implement an easy form of synchronization.
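As a sketch of that most common use, here's a hypothetical worker loop updating a progress bar (the Form and ProgressBar parameters are illustrative; a real application would also coordinate with the form being closed):

```csharp
using System.Threading;
using System.Windows.Forms;

class ProgressWorker
{
    // Runs on a worker thread; the ProgressBar belongs to the GUI thread.
    public static void WorkerLoop(Form form, ProgressBar progressBar)
    {
        for (int i = 0; i <= 100; i += 10)
        {
            int percent = i;  // capture a copy for the anonymous method

            // BeginInvoke queues the update and returns immediately;
            // Invoke would block here until the GUI thread ran the delegate.
            form.BeginInvoke((MethodInvoker)delegate
            {
                progressBar.Value = percent;
            });

            Thread.Sleep(100);  // simulate a chunk of work
        }
    }
}
```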

If you're reading this, you probably already know all that. But if not, you'll want to make sure you understand the above, or review the related concepts on MSDN.

So, what's the race condition I'm talking about? Consider a form that starts a worker thread, where the form can be closed by the user before the worker thread is done. A natural thing for such a form to do when being closed is to cause the worker thread to interrupt its work and quit. Now consider such a situation where, in addition to all the above, the worker thread notifies the form upon completion via an invoked delegate.

Invoking a delegate requires a valid window handle, but a form has a valid window handle only after it's been displayed and before it's been closed. If the thread is terminating because the form is being closed, that's simple enough to deal with: have the form set a flag (checked before invoking the delegate) before telling the thread to terminate, or simply check the Control.IsDisposed property before trying to invoke the delegate, making sure the thread isn't instructed to terminate until the form is safely disposed.

But, what if the thread was already terminating on its own, at the same instant that the form is being closed by the user? Racers, start your engines! :)

When that happens, there's a possibility that the thread that is in the process of closing the form will dispose it, but do so just after the worker thread has checked the flag (special-purpose or IsDisposed property) and just before the worker thread actually tries to call Invoke.

If you've dealt with race conditions before, you already know what's coming. One way to address this is to synchronize access to the flag, so that it's assured to not be changed by one thread while it's being used by another. It would be nice if we could do this with the IsDisposed property, but we don't have control over the code that modifies that, so to use this approach would require a new flag. We can set it in the OnFormClosed method and check it before invoking the delegate of interest:

bool _fClosed = false;
object _objLock = new object();

protected override void OnFormClosed(FormClosedEventArgs e)
{
    lock (_objLock)
    {
        _fClosed = true;
    }

    base.OnFormClosed(e);
}

void SomeMethod()
{
    lock (_objLock)
    {
        if (!_fClosed)
        {
            BeginInvoke((MethodInvoker)delegate { /* notify the form */ });
        }
    }
}

One thing you might notice is that the above code uses BeginInvoke instead of Invoke. With this technique, using Invoke would be a major no-no. Why? Because Invoke introduces a circular lock dependency, which could lead to a deadlock condition. That is, any code executing on the GUI thread has essentially locked the GUI thread, taking the lock that any code trying to call Invoke will want. At the same time, we've introduced an explicit lock (_objLock).

If the worker thread gets the explicit lock while at the same time that the GUI thread starts to execute the OnFormClosed method, then we'll wind up with two different threads, each holding the lock that the other thread needs in order to continue.

Using BeginInvoke gets around this problem by simply queuing the delegate for execution, avoiding the need for the worker thread to ever need to get the lock that's implicit in the behavior of the GUI thread.

But wait! There's still a problem. It turns out that there's a way for a form to be closed without the OnFormClosing or OnFormClosed methods ever being called. This seems like a design flaw to me, but that's the way .NET works and we have to live with it. It means that we could conceivably wind up in the method trying to call Invoke (or BeginInvoke) with a disposed form that never set the special-purpose _fClosed flag.

How can this be? Well, it turns out that by default, a .NET Forms application is set up such that there's one main form, and when that form closes, it terminates the message pump and closes all of the other windows owned by that thread. Because of the way it does that, the other forms never get a chance to execute their ...Closing/...Closed methods and events.

One way to deal with this, as well as the original race condition issue, is to just give up and let .NET do it. Wrap the call to Invoke with a try/catch block and let it fail if the form has been disposed before the code trying to call Invoke gets a chance to.

This is actually a fine way to deal with the race condition, I think. The fact is, the overhead of the exception isn't going to matter, and it's actually about the simplest way to implement a fix.
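A sketch of that approach (NotifyCompletion is a hypothetical name; the exact exception can vary with timing, so the sketch catches and ignores both ObjectDisposedException and InvalidOperationException):

```csharp
using System;
using System.Windows.Forms;

static class CompletionNotifier
{
    // Called from the worker thread when its work is done.
    public static void NotifyCompletion(Form form)
    {
        try
        {
            form.BeginInvoke((MethodInvoker)delegate { /* update the UI */ });
        }
        catch (ObjectDisposedException)
        {
            // The form was disposed before we got here; the notification
            // is moot, so ignore it.
        }
        catch (InvalidOperationException)
        {
            // The form's window handle is already gone; same story.
        }
    }
}
```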

But, there's another technique that when used in conjunction with the explicit synchronized flag will ensure that the synchronized flag technique is reliable. It's kind of interesting in its own right, and so in my next post I'll describe that approach.

Posted by Peter | with no comments

For this post, I'll be showing network code that is about as simple as it can get. Frankly, when it comes to the i/o itself, it's my opinion that the code never gets all that complicated. The complexities tend to be with respect to managing all the other stuff that's hooked to the i/o code. But, this network code is really simple.

So, what does the most basic server look like? Well, first the server needs to make itself available for connections:

Socket sockListen = new Socket(AddressFamily.InterNetwork,
    SocketType.Stream, ProtocolType.Tcp);

sockListen.Bind(epListen);
sockListen.Listen(1);

Socket sockConnect = sockListen.Accept();

All TCP sockets start out the same, allocated as above. AddressFamily.InterNetwork refers to TCP/IP, while SocketType.Stream and ProtocolType.Tcp go hand in hand (all TCP sockets are also stream sockets). These parameters are passed to the System.Net.Sockets.Socket constructor to instantiate a socket (SocketType.Dgram and ProtocolType.Udp would be used for a UDP socket; other address families have other combinations of socket and protocol types that are valid).

The Socket.Bind method assigns a specific address to the socket. The epListen variable references an instance of IPEndPoint that's been initialized to the desired server address.

It's very important that whatever address the server uses, the client knows how to get it. Usually the server will set the IPAddress component of the address to IPAddress.Any (meaning any IP address on the local computer can be used), and a port appropriate to the server. Depending on the configuration of the network, the client may find the IP address using a name service (e.g. DNS) that maps a textual name to an IP address, may simply depend on the user to specify the IP address, or may have it hard-coded. In this code sample, it's hard-coded in the client (see below) as the IPAddress.Loopback value, which simply means that the server is on the same computer as the client. Similarly, the port must either be a constant that the client simply always uses, or be configurable by the user for both the server and the client (so that they match).
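Here's a short standalone sketch of constructing those addresses (the port number is illustrative; it just has to match between server and client):

```csharp
using System;
using System.Net;

class EndPointDemo
{
    static void Main()
    {
        // Server: accept on any local interface; the port is illustrative.
        IPEndPoint epListen = new IPEndPoint(IPAddress.Any, 5005);

        // Client: hard-coded loopback means "a server on this same machine".
        IPEndPoint epConnect = new IPEndPoint(IPAddress.Loopback, 5005);

        // A client could instead resolve a name via DNS:
        // IPAddress[] addresses = Dns.GetHostAddresses("example.com");

        Console.WriteLine(epListen);   // 0.0.0.0:5005
        Console.WriteLine(epConnect);  // 127.0.0.1:5005
    }
}
```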

The call to Socket.Listen makes the socket actually available for connection requests. Until this is called, a client trying to connect will simply get a "connection refused" error. Once Listen is called, the network driver will allow a number of clients, up to the number specified in the call, to have pending connection requests. Clients that attempt to connect once this backlog queue has been filled will receive immediate rejections.

The queue is emptied as the server actually accepts connection requests, which it does by calling Socket.Accept. The Accept method returns a Socket instance that represents the actual connection to a client. Obviously, it can't do that until some client tries to connect, so this method will block until that happens.

Once Accept does return, we can start receiving data from the client. In most typical scenarios, network i/o is a back-and-forth affair, but for this sample each end will do all of its sending and all of its receiving each as a single operation. The server will receive everything and then send it all back, while the client will send everything and then receive the response from the server. For the server, that looks like this:

while ((cb = sockConnect.Receive(rgb)) > 0)
    _ReceivedBytes(rgb, cb);

foreach (string strMessage in _lstrMessages)
    sockConnect.Send(_code.GetBytes(strMessage + '\0'));

We simply pass a byte[] (rgb) to the Socket.Receive method, which will return to the caller once there's any amount of data available on the socket, having copied the data into the byte[] parameter. If more data is available than the length of the byte[], as much as will fit will be copied; by calling Receive repeatedly, the caller can eventually get all of the bytes that are available. Once all of the available data has been received, as long as the sender hasn't initiated a shutdown of the connection, the Receive method will block until data becomes available again.

There are a couple of things in there not specific to the Socket class, including a helper method called _ReceivedBytes that deals with processing the stream of bytes.

As I mentioned before, TCP is strictly a stream of bytes. There is no inherent delimiting of the bytes, and one can't count on receiving bytes grouped the same way in which they were sent. The bytes will be in the same order, but any given call to Receive can return any number of bytes, from 1 up to the number of bytes that have been sent but not yet received.

In all of my examples, we will be using null-terminated strings to deal with this. .NET allows null characters in an actual String instance, so rather than scanning the bytes as they come in for bytes of value 0, we'll go ahead and convert the bytes first, and then look for the nulls.

Speaking of converting the bytes, that's the other thing in there. The _code variable references an instance of the Encoding class. It's been initialized from the Encoding.UTF8 property. Generally speaking, the Encoding class is used for converting bytes to and from some specific character encoding. See the MSDN documentation for more details. The important things here are:

  • I've chosen UTF-8 as my character encoding for my application protocol.
  • Strings need to be converted to bytes before sending and from bytes after receiving.
  • Because UTF-8 is a character encoding that uses more than one byte to represent some characters, and because TCP may deliver the parts of a byte sequence that represents one of these characters in multiple receives, we need to maintain state between calls to Receive so that one of these broken characters can be reassembled once all the bytes have been received.

The Encoding.GetString method by itself is stateless, so it can't reassemble a character whose bytes were split across two receives. What maintains that state is a Decoder, obtained from the Encoding via the Encoding.GetDecoder method. If we use the same Decoder instance for each receive, it will not only address the basic character encoding issues, it will also maintain the state we need in case a multi-byte character gets broken apart during transmission. So, I create a single Decoder instance and reuse it for all decoding operations (the Encoding instance itself is still used to encode outgoing strings). Reusing a single instance would be useful for performance anyway, but it's critical for ensuring correct handling of the byte stream.

The _ReceivedBytes method takes care of both of these needs. It uses our single Decoder instance to convert the bytes to text and then scans for nulls to break apart the received text into individual String instances:

private Decoder _decoder = Encoding.UTF8.GetDecoder();

private void _ReceivedBytes(byte[] rgb, int cb)
{
    char[] rgch = new char[_decoder.GetCharCount(rgb, 0, cb)];
    int ichNull;

    _decoder.GetChars(rgb, 0, cb, rgch, 0);
    _strMessage += new string(rgch);

    while ((ichNull = _strMessage.IndexOf('\0')) >= 0)
    {
        _lstrMessages.Add(_strMessage.Substring(0, ichNull));
        _strMessage = _strMessage.Substring(ichNull + 1);
    }
}

The method accumulates the text that's received into a single string (_strMessage), and then extracts individual null-terminated strings, adding them to a list of strings (_lstrMessages). Those are then sent back to the client (the second loop in the previous code block), terminating each one with a null character and converting back to bytes so that the Socket.Send method, which deals only with bytes, can actually use the data.
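Here's a standalone sketch of the state-keeping in isolation: a single Decoder carries the first half of a multi-byte UTF-8 character from one call to the next, so the character comes out intact even though its bytes arrived in two separate "receives" (the byte values are just the UTF-8 encoding of 'é'):

```csharp
using System;
using System.Text;

class SplitCharDemo
{
    static void Main()
    {
        // "é" is two bytes in UTF-8 (0xC3 0xA9); pretend TCP delivered
        // them in two separate receives.
        byte[] rgbFirst = { 0xC3 };
        byte[] rgbSecond = { 0xA9 };

        // One Decoder instance, reused across calls, holds the partial
        // character from the first call until the second completes it.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        char[] rgch = new char[4];

        int cch = decoder.GetChars(rgbFirst, 0, rgbFirst.Length, rgch, 0);   // 0 chars so far
        cch += decoder.GetChars(rgbSecond, 0, rgbSecond.Length, rgch, cch);  // now 1 char

        Console.WriteLine(new string(rgch, 0, cch));  // é
    }
}
```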

Finally, once we have finished sending all the strings back to the client, we clean things up:

sockConnect.Shutdown(SocketShutdown.Both);
sockConnect.Close();

sockListen.Close();
The call to Socket.Shutdown indicates to the network driver that we're done with the connection. Different network protocols use this differently, and some not at all (e.g. UDP). At a minimum, the Socket class will disable sending, receiving, or both on the instance based on the SocketShutdown parameter (an exception will be thrown if a disabled operation is attempted). At the network driver level, for a TCP connection the call to Shutdown results in negotiation between the endpoints to indicate the end of the stream in each direction. Each endpoint that shuts down with SocketShutdown.Send or SocketShutdown.Both will be seen by the other endpoint as the end of the stream. That is, once all the bytes that have been sent by the endpoint calling Shutdown are received, a call to the Receive method by the other endpoint will return 0, indicating the end of the stream.
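Here's a standalone loopback sketch (everything in it is illustrative, separate from the echo sample) showing that end-of-stream behavior: once the client shuts down its sending side, the server's Receive returns 0 after the last byte has been read:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class ShutdownDemo
{
    static void Main()
    {
        Socket sockListen = new Socket(AddressFamily.InterNetwork,
            SocketType.Stream, ProtocolType.Tcp);

        sockListen.Bind(new IPEndPoint(IPAddress.Loopback, 0));  // port 0: let the OS pick
        sockListen.Listen(1);

        EndPoint epServer = sockListen.LocalEndPoint;

        Thread threadClient = new Thread(delegate()
        {
            Socket sockClient = new Socket(AddressFamily.InterNetwork,
                SocketType.Stream, ProtocolType.Tcp);

            sockClient.Connect(epServer);
            sockClient.Send(new byte[] { 1, 2, 3 });
            sockClient.Shutdown(SocketShutdown.Send);  // marks end-of-stream
            sockClient.Close();
        });

        threadClient.Start();

        Socket sockConnect = sockListen.Accept();
        byte[] rgb = new byte[16];
        int cbTotal = 0, cb;

        // Receive returns 0 only after all sent bytes have been read
        // and the remote end has shut down its sending side.
        while ((cb = sockConnect.Receive(rgb)) > 0)
        {
            cbTotal += cb;
        }

        Console.WriteLine(cbTotal);  // 3

        sockConnect.Close();
        sockListen.Close();
        threadClient.Join();
    }
}
```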

That's the server. What about the other end? Almost the same! The client code is nearly identical:

Socket sockConnect = new Socket(AddressFamily.InterNetwork,
    SocketType.Stream, ProtocolType.Tcp);

sockConnect.Connect(epConnect);

foreach (string strMessage in rgstrMessages)
    sockConnect.Send(_code.GetBytes(strMessage + '\0'));

sockConnect.Shutdown(SocketShutdown.Send);

while ((cb = sockConnect.Receive(rgb)) > 0)
    _ReceivedBytes(rgb, cb);

sockConnect.Close();
The Socket instance is created the same way, but instead of binding, listening, and then accepting, the code just calls Socket.Connect. The _ReceivedBytes method is even identical in this case. It decodes the received bytes in exactly the same way, adding each delimited string to a list of strings for later processing. (Obviously a more sophisticated network application would have significantly different handling for the received data between the server and client).

You can also see that the loops are swapped. The sending loop is first for the client, the receiving loop second. The client indicates that it's done sending by calling Shutdown with just the SocketShutdown.Send value, because it still needs to receive data from the server. Which it does, following the call to Shutdown. Of course, as with the server, the Socket is closed once we're done.

In this sample, all of the above code is wrapped up in a couple of classes, one for the server and one for the client, called ServerBasicEcho and ClientBasicEcho, respectively. For future samples, I'll be adding a System.Windows.Forms GUI to allow for testing of the classes. But this sample is so simple, a console application suffices nicely. Here's the Main method:

static void Main(string[] args)
{
    try
    {
        ServerBasicEcho server = new ServerBasicEcho();
        ClientBasicEcho client = new ClientBasicEcho();
        AutoResetEvent areServerStarted = new AutoResetEvent(false);
        IPEndPoint ep = new IPEndPoint(IPAddress.Any, 5005);

        Thread threadServer = new Thread(delegate() { server.Start(ep, areServerStarted); });

        threadServer.IsBackground = true;
        threadServer.Start();

        // don't start the client until the server is actually listening
        areServerStarted.WaitOne();

        client.Start(new IPEndPoint(IPAddress.Loopback, ep.Port), _krgstrTestData);

        threadServer.Join();

        Console.WriteLine("client-server test succeeded!");
    }
    catch (Exception exc)
    {
        Console.WriteLine("client-server test failed. error: \"" + exc.Message + "\"");
    }
}

Unlike the snippets above, this really is the entire Main method. The basic steps implemented here are:

  • Create the server and client instances
  • Initialize the desired server IPEndPoint address
  • Start the server on a separate thread
  • Start the client on the current thread
  • Run until the client and server both exit

There's some additional logic in there to ensure that the client isn't started until we're sure the server is, as well as a little bit of output to reassure the user that everything went according to plan.

And that's it! As you can see, in spite of the length of this post, there are really only about a half-dozen lines of code in each of the server and client that are directly related to doing the network i/o. The rest of the code is all just logic to deal with the actual bytes that were received, and to manage the server and client implementations themselves.

As we'll see in future posts, things get iteratively more involved as we want to add features. An interactive network connection is more complicated, as is dealing with more than one client. Oddly enough though, as we get nearer the conceptually most complicated implementation, the code actually starts to get simpler again, at least with respect to how many lines of code there actually are. It will be a good demonstration of how even though multi-threaded code can be harder to reason about, in some ways it can actually simplify the design of the program.

But that's for another day. For now, you can find the complete console application for this particular sample by clicking here.

In the next sample, I'll continue the theme of having a single client and a single server, but add some interactivity, and show some techniques for connecting the network objects to a GUI.

Posted by Peter | with no comments

Since you're reading this, I'll assume the previous post didn't scare you off. :)  I know even that abridged version of the details may seem daunting, but really, with some practice writing network code, it's all stuff that will come naturally. With that in mind, let's get to know what the basic structure of a network application looks like, and then practice!

In every network connection, there is an endpoint that is waiting ("listening") for someone to contact it, and an endpoint that does the contacting ("initiates the connection"). Once contact has been established, the roles of "server" and "client" can be less well-defined, but generally speaking the endpoint that's waiting is considered the "server" and the endpoint doing the contacting is considered the "client".

With that in mind, here's the usual sequence of events:

  • Server creates a socket for the purpose of waiting for contacts

    (time passes)
  • Client creates a socket for the purpose of contacting a server
  • Client contacts the server
  • Server responds
  • Client and server exchange information by sending and receiving bytes

There are some subtle differences between UDP and TCP in the above. Using UDP, the "contact" is simply the transmission of a datagram to the server. Data exchange is done in exactly the same way as the initial contact, and there may be no well-defined termination of communications. In fact, each endpoint can use a single socket to communicate with an arbitrarily large number of remote endpoints.
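That "contact is just a datagram" idea can be sketched in a few lines (all names and values here are illustrative): a single unconnected UDP socket both receives the data and learns the sender's address from the same call.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class UdpSketch
{
    static void Main()
    {
        // "Server": a datagram socket bound to an address (port 0: OS picks).
        Socket sockServer = new Socket(AddressFamily.InterNetwork,
            SocketType.Dgram, ProtocolType.Udp);
        sockServer.Bind(new IPEndPoint(IPAddress.Loopback, 0));

        // "Client": no connect; contact is simply a datagram sent to the
        // server's address.
        Socket sockClient = new Socket(AddressFamily.InterNetwork,
            SocketType.Dgram, ProtocolType.Udp);
        sockClient.SendTo(new byte[] { 42 }, sockServer.LocalEndPoint);

        // ReceiveFrom fills in the sender's endpoint, so the same socket
        // could reply to (or hear from) any number of remote endpoints.
        byte[] rgb = new byte[64];
        EndPoint epFrom = new IPEndPoint(IPAddress.Any, 0);
        int cb = sockServer.ReceiveFrom(rgb, ref epFrom);

        Console.WriteLine(cb);      // 1
        Console.WriteLine(rgb[0]);  // 42

        sockClient.Close();
        sockServer.Close();
    }
}
```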

With TCP, the initial contact is more explicit, in the form of a "connect" operation. The server's socket is a special type of socket that is never actually used for exchanging data. Instead, it acts as a kind of "connection factory", creating a new socket for each connection. For each client that connects to the server, the listening socket generates a socket that's actually used for exchanging data.

At each end, the socket associated with the actual connection can be used only for communicating with the remote endpoint of that connection. And TCP provides for a well-defined termination of communications, known as "graceful closure" or "graceful shutdown".

Of course, as mentioned previously, UDP is connectionless, it isn't reliable, and it always delivers data in the same unit (a datagram) as it was sent. But the basic techniques for moving data and managing the i/o generally are otherwise very similar to those used in TCP. For the sake of simplicity, I will stick mainly to TCP for code examples. Once I've gotten through the main samples, I'll wrap up with some that illustrate techniques specific to UDP: broadcast, and multicast.

For the next post, I'll describe the most basic code that can implement all of the above.

Posted by Peter | with no comments

As promised, here begins my attempt to help shine some light on how to write network code for .NET in C#. Before we begin though, I would like to share a couple of links to resources that, while not specifically about .NET, are still in my opinion "must-read" material for anyone writing socket-based code on Windows. Those resources are:

In particular, the .NET stuff is mostly built on top of Winsock and so a lot of what's described in those references is pertinent to .NET too. In addition, a lot of what's important to know in Winsock is actually related to TCP/IP, the main network protocol used these days, and many issues surrounding the use of TCP/IP and which are not actually unique to Winsock are nevertheless discussed in the above references.

Now, that said, even among those issues there are a handful that seem to come up on a regular basis. They are described more fully in those references, but I'd like to start by touching on them here before getting to any actual coding. The rest of this post will be devoted to that...

What does "protocol" mean? For better or worse, the word "protocol" is used in a number of ways. Generally, it always means "a defined standard for the interchange of data". But in the context of network programming, there are many levels at which this can be applied. TCP/IP is a protocol. But then, so too are TCP and UDP, which are implemented on top of TCP/IP. And then there's what I call the "application protocol" — that is, the application-defined format of the data, such as FTP, HTTP, or some custom protocol unique to the application — which is itself implemented using TCP, UDP, or perhaps even some other protocol.

It's my hope that in context, the word "protocol" will always have a clear meaning. Please feel free to point out if it doesn't.

TCP or UDP? One of the first questions that will come up when writing a new network application is which protocol to use. Now, as it happens Winsock supports a much broader range of network protocols than just TCP/IP, and so there are actually more options than just "TCP or UDP". But for most people, and I feel even especially for beginners, TCP/IP is likely to be the main, if not the only, network protocol being used and the choice really is for practical purposes limited to "TCP or UDP".

So how do you choose? Well, it depends mainly on the needs of the application, and on how much work the programmer wants to do. UDP ("User Datagram Protocol") is "unreliable". That is, it provides no guarantees other than that if you receive a datagram (a single self-contained message), it's a datagram that was sent by the remote endpoint. In particular, two important guarantees it does not make are the order of the datagrams and the uniqueness of the datagrams. That's right: not only may datagrams not be received at all, or be received in a different order than that in which they were sent, you might actually receive a given datagram more than once!

Some applications can tolerate this sort of issue very well, sometimes without even doing much, if any, extra work. For those kinds of applications, UDP works very well. But for others, TCP is often a better option. You could in fact write a reliable protocol on top of UDP, but why reinvent the wheel?

The one thing TCP does not guarantee is that the data is received in the same grouping as that used when sending it (more on that in a moment). However, it does guarantee that data will be received uniquely, in the same order, and without gaps. That is, you can be sure that you won't receive byte N until you've already received bytes 0 through N-1, and that any bytes received will be exactly the bytes that were sent.

If you cannot afford to have any of your data go missing, then TCP is usually the way to go.

What's a connection? How do I know if it's broken/reset/lost? In addition to the above, there's another crucial difference between UDP and TCP: UDP is "connectionless" while TCP is "connection-oriented". That is, with UDP you just send a datagram to a given address and hope it gets there. Each datagram is treated independently of any other datagram. With TCP, you establish a logical connection with the remote endpoint and this connection is used to preserve state with respect to the communication between endpoints (for example, to handle all the lower-level packet reordering, confirmation, and verification needed to make TCP reliable).

One implication of the above is that once a TCP connection has been established, data can only be sent and received on that connection by one of the two endpoints involved in creating the connection. That is, for each endpoint, the socket associated with the connection will only ever be used to communicate with the other endpoint to which the connection was made. With UDP, a single socket can send to any endpoint, and can receive data from any endpoint.

Since UDP has no connection, obviously the question of the connection being reset is irrelevant. The OS will in fact generate an error if it can tell right away that a specific datagram is undeliverable (e.g. by virtue of there simply not being a route to the recipient). But otherwise, errors that might occur during delivery go unreported to the sender. But with TCP, the network driver is doing some work to manage the connection, and can report back if some unexpected failure occurs.

The one thing that surprises beginners a bit, though, is that this error detection only happens if you try to send some data. The physical connection between endpoints can come and go without any problem being noted, as long as neither end tries to actually use the connection during an interruption. It's only if one end tries to send data and fails that a connection error will be noted. This is actually a good thing (it means connections are more robust), but those just learning network programming are often surprised when they don't get the errors they thought they would.

It's very unusual to need to change this behavior, but if one decides that's a requirement, the solution is simple: send data periodically. This can either be part of the application protocol, or enabled specifically for the TCP socket. In either case, the technique is known as "keep alive", which is ironic because the main thing it does is kill your connection in situations when it otherwise would have been fine. Smile
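For the socket-level flavor, .NET exposes this as a single option. Here's a minimal sketch (tuning how often the probes are sent requires platform-specific calls not shown here):

```csharp
using System.Net.Sockets;

static class KeepAlive
{
    // Ask the OS to periodically probe an otherwise idle TCP connection.
    // If the probes go unanswered, the connection is reported as broken,
    // even though no application data was ever sent.
    public static void Enable(Socket socket)
    {
        socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
    }
}
```

The application-protocol flavor is just the same idea done by hand: each end periodically sends a small "ping" message that the other end ignores, so that a dead connection is discovered by the failed send.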

Why are my messages sent over TCP getting all squashed together? Why are my messages sent over TCP getting broken apart? These two questions are really part of the same behavior: TCP guarantees the order of the bytes you send, but not the grouping. If you send several logical "messages" in quick succession, the network driver may coalesce them into a single transmission or, if they are large, at least into groups that fit the underlying network protocol instead of whatever grouping you sent them in. Likewise, even a single send of some block of data can be broken apart and received in smaller pieces, especially if the block of data is large and there are delays in transmitting some of the pieces of the data.

Note: while the two above questions are the precise manifestations of this behavior, at first glance when this is going on it often looks to the programmer as though some of the data is simply not being sent at all (usually because the code has received the data, but then ignored it because it wasn't designed to deal with multiple sends being received together). So, if you're using TCP and you think that you're losing data, there's a good chance this is the mistake you made.

When using TCP, if some sort of message-based communication is desired, it's up to the application to implement that. The simplest mechanism is to send one block of data with each connection, closing the connection when the block has been completed. This is inefficient, but if the blocks are large and the number of them is small, it can work fine.

The other options are either to preface each transmission with a description of the number of bytes to follow (which length field would itself need to be well-defined, either by being a fixed-size value or a terminated string), or to delimit the data in some way (for example, sending null-terminated strings).

Note that if delimiting is used, this often means including some way to quote the delimiter. If the data being sent is textual, a null-terminator may be sufficient and not need quoting because it will never show up in the data actually being sent. But most other situations involve sending data without any restrictions, and so some way to distinguish a true delimiter from just some data that happens to look like one is required.
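Here's a sketch of the length-prefix approach (the class and method names are my own): each message goes on the wire as a fixed-size 4-byte big-endian length followed by the message bytes, and the reader loops because a single Read may return fewer bytes than requested, which is exactly the grouping behavior described above.

```csharp
using System;
using System.IO;

static class Framing
{
    // Write one length-prefixed message: 4-byte big-endian length,
    // then the message bytes themselves.
    public static void WriteMessage(Stream stream, byte[] message)
    {
        byte[] rgbLength = BitConverter.GetBytes(message.Length);
        if (BitConverter.IsLittleEndian)
            Array.Reverse(rgbLength);

        stream.Write(rgbLength, 0, rgbLength.Length);
        stream.Write(message, 0, message.Length);
    }

    // Read back exactly one message, regardless of how the bytes
    // happened to be grouped in transit.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] rgbLength = ReadExactly(stream, 4);
        if (BitConverter.IsLittleEndian)
            Array.Reverse(rgbLength);

        return ReadExactly(stream, BitConverter.ToInt32(rgbLength, 0));
    }

    private static byte[] ReadExactly(Stream stream, int cb)
    {
        byte[] rgb = new byte[cb];
        int ib = 0;

        // Loop: Read is allowed to return fewer bytes than we asked for
        while (ib < cb)
        {
            int cbRead = stream.Read(rgb, ib, cb - ib);
            if (cbRead == 0)
                throw new EndOfStreamException();
            ib += cbRead;
        }

        return rgb;
    }
}
```

The same code works over a NetworkStream from a TCP connection; a MemoryStream works just as well for trying it out, since the framing logic only cares about the Stream contract.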

Why is my data getting corrupted? It's important to keep in mind that the code executing at each end of a networked system is not only running on a different computer, it might not have even been compiled with the same compiler, written in the same language, etc. It's unusual to run into issues when you're writing both ends yourself using the same tools and especially unusual if you're running both ends on the same physical computer. But even that's not impossible. The important thing to keep in mind is that you can't take anything for granted. Structure layout, data type sizes, character encoding, etc. are all very much language-, compiler-, and environment-dependent.

The solution is to decide ahead of time on a precise definition of how the data will be formatted, and then make sure that each end of the networked code is written to translate (if necessary) between that precise definition and whatever the "natural" format for the data is in that environment.

For example, when sending text data you might be using an environment in which either ASCII or Unicode are permissible formats. Failing to standardize your application protocol on one or the other can lead to each end sending data in a different format than that expected by the other end. In C#, this is less of a problem because there's no practical way to get directly at the bytes in the string data type; you have to go through some form of text encoding/decoding anyway, and so it's simple enough to just declare ahead of time what character encoding will be used. But do make sure you make that decision and stick with it.
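A sketch of the "declare it once" approach (the helper names are my own, for illustration): the application protocol says UTF-8, and every string crosses the wire through these two methods.

```csharp
using System.Text;

static class WireText
{
    // The application protocol standardizes on UTF-8; both ends of the
    // connection must funnel all text through this single encoding.
    private static readonly Encoding s_encoding = Encoding.UTF8;

    public static byte[] Encode(string str)
    {
        return s_encoding.GetBytes(str);
    }

    public static string Decode(byte[] rgb)
    {
        return s_encoding.GetString(rgb);
    }
}
```

The same string produces different bytes under different encodings ("café" is 5 bytes in UTF-8 but 8 bytes in UTF-16), which is exactly the mismatch that corrupts text when the two ends fail to agree.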

How do I make sure data I send is sent from a specific IP address? You don't. It's the network driver's job to decide what the best way to send data is.

Well, what can I control then? You can control the address that others must use in order to send data to you. Every network protocol has a standard way of specifying addresses. For TCP/IP, this is an IP address and a port number. Generally speaking, the IP address describes an actual network adapter and the port number describes some specific application using that adapter.

When creating a network object (e.g. a socket), both of these need to be specified, either explicitly or implicitly. Most common would be for an application expecting to receive connections (often described as the "server") to decide on a port (these are well-defined for protocols like HTTP, FTP, POP3, SMTP, etc. which use ports 80, 21, 110, and 25, respectively) and then use the special "any" IP address to indicate that it wants to receive traffic sent to the specified port on any of the network adapters present. An application initiating a connection (often described as the "client") would specify not only "any" for the IP address, but also 0 for the port number. This allows the network driver to select an available port for the client to use.

As a general rule, if you create a socket and use it before you've bound it to a particular endpoint address, the platform (.NET, Winsock, etc.) will attempt to bind the socket implicitly to the "any:0" address the first time the socket is used in a way that requires a bound address. Note that for servers, it generally is easier if you use a consistently defined port. Numbers between 5000 and 49151 are best. See the Winsock FAQ for more details.
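A sketch of the two roles side by side (port 28901 is just an arbitrary pick from that suggested range):

```csharp
using System.Net;
using System.Net.Sockets;

static class BindSketch
{
    public static void Run()
    {
        // Server: an explicit, agreed-upon port, on every adapter
        // (the special "any" address).
        TcpListener listener = new TcpListener(IPAddress.Any, 28901);
        listener.Start();

        // Client: no Bind call at all. Connect binds implicitly to
        // "any:0", letting the network driver pick an available
        // local port for us.
        using (TcpClient client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, 28901);
        }

        listener.Stop();
    }
}
```

The client never needed to know what local port it wound up with; only the server's port had to be agreed upon in advance.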


The above doesn't even come close to covering all the potential "gotchas" to be found when writing network code. But I hope that it does a sufficient job of describing the most common and/or most important ones. At the very least, it should help emphasize that while the essentials of network I/O are actually reasonably simple, there are lots of little details that are important to get right. Otherwise, things simply don't work.

For the rest of this series, I'll be posting code and explanations for a variety of different kinds of network applications. For simplicity, in all cases, the server will do nothing but just send back to the client whatever it received (sometimes called an "echo server"). We'll start with a simple peer-to-peer, single-connection implementation and move up from there. See you next time!

Posted by Peter | with no comments

I hope this post isn't too much of a disappointment. After promising to demonstrate how to implement a custom Forms control, we're only just now, in the third post of the series, getting to drawing the control contents on the screen. And as you'll see, there's so little to it you may wind up wondering "what's all the fuss?" Smile

Since we just talked about the text buffer management, an appropriate way to lead to the drawing part would be the glue that connects the two together. So, let's start there.

Whenever the buffer itself changes, we need to recompute the size of the virtual display area of the control. There are a few pieces of information that influence this: the font used to display the text, the longest measured width of all the lines of text, and the number of lines of text.

One of those pieces of information has two ways to become invalid. Specifically, the longest measured width of all the lines is dependent both on the font being used, and the actual lines of text. So, we have a helper method to deal with changes to either of those pieces of data:

private void _ComputeVirtualSize(bool fMeasureText)
{
    // If the font changed, we need to recalculate the widths
    // of every single line
    _dxLineMax = fMeasureText ? _DxLineMaxMeasured() : _DxLineMax();

    // Reflect the new virtual size in the scroll bars, then
    // force the control to redraw with the new measurements
    _UpdateScrollBars();
    Invalidate();
}
Logically, this method is called any time something happens to invalidate the measurements based on the buffer data itself. These changes affect the "virtual size" of the control; that is, the whole size of the displayed data, as opposed to the bit you can see at any given time as determined by the scroll bars when the virtual size exceeds the on-screen size. Once the method is done with calculations, it calls Control.Invalidate to ensure that whatever changes prompted this recalculation are shown by forcing the control to redraw itself.

Of course, the real work is done in the helper methods. Here's one of them:

private int _DxLineMaxMeasured()
{
    int dxMax = 0;

    using (Graphics gfx = this.CreateGraphics())
    {
        for (int iline = 0; iline < _cdlCur; iline++)
        {
            int idl = (_idlTail + iline) % _rgdlBuffer.Length;
            string str = _rgdlBuffer[idl].str;
            int dx = (int)gfx.MeasureString(str, this.Font).Width;

            // Store the freshly measured width back with its text
            _rgdlBuffer[idl] = new DisplayLine(str, dx);

            if (dx > dxMax)
                dxMax = dx;
        }
    }

    return dxMax;
}

The _DxLineMax method is similar, except that it doesn't actually remeasure each line of text. It just looks at the current computed length for each line.

The _UpdateScrollBars method handles the logic for connecting the text measurements to the user-interface:

private void _UpdateScrollBars()
{
    // Is the last line of text currently scrolled into view?
    bool fScrollToEnd = -AutoScrollPosition.Y + ClientSize.Height >= AutoScrollMinSize.Height;

    AutoScrollMinSize = new Size(_dxLineMax, this.Font.Height * _cdlCur);

    if (fScrollToEnd)
        AutoScrollPosition = new Point(-AutoScrollPosition.X, AutoScrollMinSize.Height - ClientSize.Height);
}

This method first checks to see if the last line of text is visible, then updates the virtual size for the control (the AutoScrollMinSize property). Finally, if the last line of text was visible, it resets the scroll bars to the end of their range again. 

That last bit may be the trickiest part of this whole control. The AutoScrollPosition is an odd property indeed. When assigned, you must pass positive values to it. But when it's retrieved, it returns negative values.

This behavior sort of makes sense. That is, the AutoScrollPosition's return value is tailor-made to be passed straight to a translation transformation to be used when drawing, so the negative values seem reasonable. Also, it would be odd to think of moving the scroll bars forward through the document by decrementing their position, so assignment of the position uses positive, increasing numbers. But putting both of those ideas together in the same property seems odd to me. I sort of wish Microsoft had just made two different properties, each with their own specific, consistent behavior.

But they didn't, and this is how it is. So it's important to be aware of this little idiosyncrasy.
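As a sketch of coping with the quirk, here's a hypothetical helper (the name and scenario are mine) that scrolls a container down by some number of pixels, negating the value on the way in and on the way out:

```csharp
using System.Drawing;
using System.Windows.Forms;

static class ScrollHelper
{
    // AutoScrollPosition returns negative offsets when read, but must be
    // assigned positive ones, so we negate in both directions.
    public static void ScrollDownBy(ScrollableControl container, int dy)
    {
        Point ptCur = container.AutoScrollPosition;   // e.g. (0, -100) when scrolled down 100

        container.AutoScrollPosition = new Point(-ptCur.X, -ptCur.Y + dy);
    }
}
```

Forgetting either negation is a classic source of scroll bars that jump back to the top, so it's worth centralizing the sign handling in one place like this.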

The font itself is managed by the base Control class. But we can watch for changes by overriding the OnFontChanged method, and so we do:

protected override void OnFontChanged(EventArgs e)
{
    base.OnFontChanged(e);

    // The font affects every measurement, so remeasure all the lines
    _ComputeVirtualSize(true);
}
Any time the font changes, that invalidates any measurement we made of the text, so we have to pass "true" to the _ComputeVirtualSize method so that it knows to remeasure every line of text rather than just searching through the lines for the current maximum.

Now that we've gotten all the glue out, what was it again that we're gluing? That's right: the text buffer management, which we've already seen, and the control drawing, which we haven't.

In a custom control, there's exactly one place where you actually draw to the screen. That's the Control.OnPaint method, which you override in your own Control sub-class. In Windows, drawing to the screen is done on an "on-demand" basis. That is, the control must always be prepared to draw its current state, and Windows will ask it to draw that state any time something has caused the current on-screen representation of the control to be incorrect.

Note that one way this can happen is if we say so explicitly, by calling the Control.Invalidate method, as described earlier. Other ways this can happen is if the window containing the control is moved partially on- or off-screen, or is rearranged relative to other windows such that the areas that are visible change.

(By the way, this design is not unique to Windows. Many GUI systems employ the same paradigm, including the Mac OS and the standard GUI frameworks in Java).

(Also by the way, a window — that is, a Form sub-class when using .NET Forms — is just a special case of a Control, so all the same redraw rules apply).

So, what does our OnPaint method look like? Here it is:

protected override void OnPaint(PaintEventArgs e)
{
    Graphics gfx = e.Graphics;
    int ilineDraw = (int)(-AutoScrollPosition.Y / this.Font.Height);
    int yDrawCur = ilineDraw * this.Font.Height,
        yDrawMax = -AutoScrollPosition.Y + this.Height;

    gfx.TranslateTransform(AutoScrollPosition.X, AutoScrollPosition.Y);

    using (Brush brush = new SolidBrush(this.ForeColor))
    {
        while (yDrawCur < yDrawMax && ilineDraw < _cdlCur)
        {
            int idlCur = (_idlTail + ilineDraw++) % _rgdlBuffer.Length;

            gfx.DrawString(_rgdlBuffer[idlCur].str, this.Font, brush, 0, yDrawCur);

            yDrawCur += this.Font.Height;
        }
    }
}
Simple, isn't it? Maybe even surprisingly so. When the OnPaint method is called, .NET has already created a System.Drawing.Graphics instance and configured it for the current drawing situation (including setting up the clipping so that only the parts of your control that are visible are drawn). This Graphics instance is passed to the OnPaint method via the PaintEventArgs parameter.

Our control is actually reasonably simple, being just a single list of lines of text. So, in our custom OnPaint method we just need to draw each line in the right spot. In this particular case, I've also included a bit of logic to compute the first and last lines that are actually visible, so that we don't waste time drawing lines that can't be seen. In truth, graphics that are drawn outside the clipped area are handled much faster than graphics that actually wind up on-screen. But it's an easy optimization to make, so we might as well.

So what are the key elements of this OnPaint method?

It starts out by calculating the range of lines, as I mentioned. It actually does this by calculating the line index of the first line that would be visible in the control, and then by calculating the vertical pixel coordinate of the last line that would be visible. Since we're updating the current pixel coordinates as we draw the lines anyway, I find it simpler to just do the straight comparison rather than the full calculation.

Then, we adjust the transformation of the Graphics instance to account for the scroll bars. This shifts all of the drawing so that the part that winds up drawing into the visible area is correct according to the position of the scroll bars.

Finally, we actually draw. We create a new brush based on the ForeColor property of the control (note the "using" statement — it's very important to ensure that disposable objects like a Brush are disposed of when you're done with them!). Then we just loop while we have strings to draw and room to draw them in, drawing each one as we go and updating our drawing position according to the font height.

As is often the case, the code is a much more concise way of describing the process. Smile But hopefully the narrative helps elaborate on some of the non-obvious aspects of, and implicit requirements hidden in, the code.

And so ends my little "warm-up" here, in which I try to describe an example of a custom Forms control, providing enough detail to help anyone else trying to learn the basics of writing a custom control. The key thing is to maintain the state of the control such that it's always ready to draw, and then always draw in the OnPaint method. The full code for the SimpleConsole class can be found here. If you look at it, you'll note that I included support for attaching this console to anything that uses a TraceListener, by exposing a custom TraceListener object that writes to the SimpleConsole instance. Isn't that fun? Smile

Posted by Peter | with no comments

For our control, there are two basic pieces of functionality we need, plus some glue. Those two pieces are: managing the actual text buffer, and drawing to the screen. The glue includes such things as watching for property changes that affect how the control will draw and managing the information used for scrolling.

Scrolling? Yes! It's actually not necessary to implement scrolling. We could just create a control that resizes itself dynamically based on its content, and then the (programmer) user of the control would place it in a scrollable container (like Panel) with AutoScroll set to true. Then, the container would automatically show scroll bars as needed to allow the (end) user to see all of the child control. But, sometimes it's useful for the control itself to handle the scrolling behavior; for example, to allow the control to automatically scroll to keep a particular position in view, without having to make assumptions about the parent control.
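That "let the container do it" alternative looks something like this (a sketch, assuming a hypothetical child control that sizes itself to fit its content):

```csharp
using System.Windows.Forms;

static class HostingSketch
{
    // Wrap a self-sizing control in a Panel that shows scroll bars
    // whenever the child grows beyond the panel's client area.
    public static Panel CreateScrollingHost(Control childThatSizesItself)
    {
        Panel panel = new Panel();
        panel.AutoScroll = true;
        panel.Controls.Add(childThatSizesItself);
        return panel;
    }
}
```

It works, but the scrolling policy then lives in the parent, which is exactly the assumption about the container we're trying to avoid.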

So as part of this SimpleConsole class, we'll add scrolling support by inheriting ScrollableControl (which does most of the work) and including the necessary glue to make that part work the way we want.

Let's start with the text buffer. One of the issues we want to address as compared to other possible solutions is efficiency and ease-of-use. The nature of a text console like this is that lines of text get added to the bottom and fall off the top once the maximum capacity has been reached. So, we need a data structure that efficiently allows us to add things at one end and remove them from the other, and we need some public members to provide access to that data structure.

You've probably already noticed that the description sounds a lot like a queue. And so it does! So, are we going to use .NET's Queue&lt;T&gt; class? That would be nice and easy, right? Unfortunately, the one thing that class is missing is random access. The only way to get at elements from anywhere other than the dequeue end is to enumerate the whole thing. Now, with .NET 3.5 and the LINQ extensions Skip and Take, the code to extract some subset of the queue would be very easy. But it would still have to scan the whole queue.
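For the record, the Skip/Take version really is a one-liner (the helper name is mine, for illustration); it just can't avoid walking the queue from the front on every call:

```csharp
using System.Collections.Generic;
using System.Linq;

static class QueueSlice
{
    // Grab cLines lines starting at index ilineStart. Concise, but Skip
    // still enumerates every element in front of ilineStart each call.
    public static List<string> GetRange(Queue<string> queue, int ilineStart, int cLines)
    {
        return queue.Skip(ilineStart).Take(cLines).ToList();
    }
}
```

For a control that repaints the visible slice of the buffer on every scroll, that per-call scan is the cost we're choosing to avoid with the circular buffer below.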

In reality, this is unlikely to be a genuine performance issue. The naïve approach would work fine in the vast majority of cases. But remember, one of the reasons we're doing all this work is to try to address some of the efficiency issues that the TextBox (not being inherently line-oriented) might have with very large buffer sizes. And besides, it's a fun exercise, so why not?

So, what's in this queue? Well, at a minimum, we have to have each line of text. We'll store those as strings of course. We could even leave it at that. But I'd like our control to only scroll in either axis when necessary, and I don't really want to put an arbitrary limit on line length (though there still is a practical limit based on the .NET graphical coordinate system). Which means we need to keep track of the actual width of each line. Rather than recomputing that every time, we'll compute it once for each line when it's added and store that in the queue with the text for the line.

That makes our data structure look like this...

First, a very simple struct to store in the queue:

struct DisplayLine
{
    public readonly string str;
    public readonly int dx;

    public DisplayLine(string str, int dx)
    {
        this.str = str;
        this.dx = dx;
    }
}
Then, the queue itself (done as raw code in our control class...OOP-i-fying this as its own class is left as an exercise for the reader Smile):

private DisplayLine[] _rgdlBuffer;
private int _cdlCur = 0;
private int _idlTail = 0;

We've got the basics here. An array to store the data itself, a counter telling us how much data we have, and an index indicating where the data starts.

For our SimpleConsole, we'll keep operations simple. We'll be able to add text to the queue, change its length, and clear it. Exposing the length of the buffer as a property on our control class, we have:

public int BufferLines
{
    get { return _rgdlBuffer.Length; }
    set
    {
        if (value != _rgdlBuffer.Length)
        {
            _rgdlBuffer = new DisplayLine[value];

            // Changing the buffer size discards the existing contents
            Clear();
        }
    }
}

Note that for simplicity, I've decided that any change to the length of the buffer will clear all of the existing contents. It would not be too difficult to change the size without losing the existing contents, but in the interest of keeping the property simple in this sample, I've left that as an exercise for those who might prefer that behavior.

And what about that Clear() method? All it does is reset our buffer state variables. We could have included a call to Array.Clear() to free up any data that had been in the queue before, but it's not strictly needed for the functionality we want and so again for simplicity I've just left that out.

public void Clear()
{
    _idlTail = 0;
    _cdlCur = 0;
    _fNewLine = true;

    _ComputeVirtualSize(false);
}


And what's that last method call? Remember the glue I mentioned? That's all that is. When the buffer's cleared, we've got a bit of housekeeping to do so that the other parts of the control class work right. We'll see that code shortly. But for now, there's one last interesting part about the buffer management. That's the code that actually does all the heavy lifting when a new line of text is added:

private void _Append(string strText, bool fNewLine)
{
    string[] rgstr = strText.Split(new string[] { "\r\n", "\r", "\n" },
        StringSplitOptions.None);

    // If we're not starting a new line, remove the most
    // recent line from our buffer, and prepend it to
    // the text we're adding.
    // NOTE: _cdlCur will always be > 0 because _fNewLine
    // always starts out true for an empty buffer.
    if (!_fNewLine)
    {
        int idlPrepend = (_idlTail + --_cdlCur) % _rgdlBuffer.Length;

        rgstr[0] = _rgdlBuffer[idlPrepend].str + rgstr[0];
    }

    // Add each line, measuring the line width as we go
    // TODO: we could be smarter about adding lines, in case
    // the passed in text has a number of lines that exceeds
    // the size of our buffer, by only adding the last N
    // lines of the passed in text where N is the size of
    // our buffer.
    using (Graphics gfx = this.CreateGraphics())
    {
        for (int istrAdd = 0; istrAdd < rgstr.Length; istrAdd++)
        {
            int idlT = (_idlTail + _cdlCur++) % _rgdlBuffer.Length;
            string strAdd = rgstr[istrAdd];

            _rgdlBuffer[idlT] = new DisplayLine(strAdd, (int)gfx.MeasureString(strAdd, this.Font).Width);
        }
    }

    // Adjust the tail to account for any excess lines that
    // were added
    _idlTail = (_idlTail + Math.Max(0, _cdlCur - _rgdlBuffer.Length)) % _rgdlBuffer.Length;
    _cdlCur = Math.Min(_cdlCur, _rgdlBuffer.Length);

    _fNewLine = fNewLine;
}

The comments in the code, I hope, sufficiently describe what each part of that method does. The basic idea is that it's a traditional circular buffer, plus some code to deal with breaking an input string into individual lines and measuring each line. Of course, there's that glue at the end again. Smile

You may note that this method is private, and contains a parameter indicating whether the added text will be followed by a new line or not. The control class exposes this functionality as two separate methods, to make the interface simpler:

public void Append(string strText)
{
    _Append(strText, false);
}

public void AppendLine(string strLine)
{
    _Append(strLine, true);
}

And that's it for buffer management. Next time, actually drawing the data on the screen.

[Edit: this blogging stuff is harder than it looks. Smile So, one of the best ways to review code is to try to explain it to someone else. In reviewing the code for my next post, I realized I had some problems with the code that I wanted to fix, but I'd already posted some of it here. I have shamelessly revised history and replaced any of the changed code that appeared here with the new versions. I'm guessing that this is so early in the history of this blog that no one's even seen the previous version, but either way I hope I didn't inconvenience anyone too much.]

Posted by Peter | with no comments

One of the reasons I got motivated to start writing this blog was a recent spate of questions in the C# newsgroup about networking using .NET. These come up now and then, and I don't know of any reasonably comprehensive introductory resource specific to networking APIs in .NET.

There are some good resources in the context of the unmanaged Winsock API in Windows, and those would in fact be useful even to someone trying to understand the .NET API. But .NET is subtly different in some ways, and provides easy access to some of the more complicated parts of Winsock in other ways. I think there are books on the topic, but at least the basics of the networking API aren't really all that complicated. So why not attempt to write up a .NET-specific series of posts that provide the introduction that should have been in the MSDN documentation? (Ironically, MSDN does in fact have over a dozen .NET networking samples, but all are to demonstrate specific technologies or protocols, rather than being suitable as an introduction).

So, what's that got to do with "a Forms control for receiving text output"? Well, networking applications often send data back and forth, and it's often nice to have a way to watch the data as it goes by. For this purpose, one could use a ListBox or TextBox and just add new lines of text as they show up. But that brings up another question that I've seen at least a couple of times, which is how to get those controls to work more like a typical text output console?

Those controls mostly would work fine, just to get the job done. But what if you want a console with a limited, but large number of lines? These other controls can be a bit unwieldy as they're not really designed for that purpose. The ListBox only supports selection of entire lines at once, while the TextBox isn't as good at removing text on a line-by-line basis. What if we could have the best of both worlds?

At the same time, it's probably helpful to me to not just dive in to the networking stuff, but rather to get a feel for this whole "blog about code" thing with something less ambitious. So, while existing Forms controls would in fact be perfectly sufficient for our needs, a nice simple console-like control gives me an opportunity to "warm up" while at the same time providing a code sample that would be useful not only in any networking samples, but also for some of the other .NET questions I've seen having to do with writing custom controls.

For next time: the SimpleConsole class.

Posted by Peter | with no comments

Every blog has to start somewhere. This blog starts here.

If you want to know what this blog is about and why I'm writing it, look here.

In addition to the general thoughts on the "About" page, I encourage feedback on the blog-specific issues that may exist. I'm still figuring out what I think works best in terms of formatting, code conventions, even RSS syndication. My current thoughts, such as they are:


  • Code will be mainly in the form of snippets. If I find a suitable way to post whole classes, then I'll do that and will try to make the code more "usable" (e.g. comments, XML and otherwise, fully-functional, etc.)
  • I use the Hungarian coding convention. I hope it won't get in the way too much, and I've found that it's more useful for variables than for naming properties and methods, so you'll still see .NET conventions too.
  • The RSS is currently sent as summaries, not full articles. Personally, for programming-related blogs, I tend to want to see the whole thing formatted, on the actual web site, so I've opted to not have the entire blog post sent over RSS. But that's a configurable option, and I'll definitely rethink my choice if I find people complaining that they just want entire blog posts in their RSS.


I look forward to having a place where I can put code samples and ideas that are more fully-fleshed out and more easily referenced than the newsgroup environment. I really prefer Usenet for day-to-day discussions, but sometimes having a more expressive forum is called for. I'm hoping this blog will serve that need.

Posted by Peter | with no comments