Breaking Liskov
Very recently, Barbara Liskov won the Turing Award, which makes it a highly appropriate time to ponder when it's reasonable to ignore her most famous piece of work, the Liskov Substitution (or Substitutability) Principle. This is not idle speculation: I've had a feature request for MiscUtil. The request makes sense, simplifies the code, and is good all round - but it breaks substitutability and documented APIs.
The substitutability principle is in some ways just common sense. It says (in paraphrase) that if your code works for some base type T, it should also work with any subtype S of T. If it doesn't, S is breaking substitutability. This principle is at the heart of inheritance and polymorphism - I should be able to use a Stream without knowing the details of its underlying storage, for example.
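As a minimal sketch of the Stream example, here is a method written purely against the base type. The names StreamDemo and CountBytes are made up for illustration; only Stream, MemoryStream and Stream.Read come from the framework.

```csharp
using System;
using System.IO;

static class StreamDemo
{
    // Written against the base type only: it never asks where the bytes live.
    public static long CountBytes(Stream input)
    {
        long total = 0;
        byte[] buffer = new byte[4096];
        int read;
        // Read returns 0 once the end of the stream has been reached.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    static void Main()
    {
        // MemoryStream is just one subtype; any other well-behaved Stream
        // should be usable here without changing CountBytes.
        using (Stream stream = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 }))
        {
            Console.WriteLine(CountBytes(stream)); // prints 5
        }
    }
}
```

If a subtype of Stream made CountBytes misbehave while still claiming to be a Stream, that subtype would be the one breaking substitutability.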
Liskov's formulation is:
Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T.
So, that's the rule. Sounds like a good idea, right?
Breaking BinaryReader's contract
My case in point is EndianBinaryReader (and EndianBinaryWriter, but the arguments will all be the same - it's better to focus on a single type). This is simply an equivalent to System.IO.BinaryReader, but it lets you specify the endianness to use when converting values.
Currently, EndianBinaryReader is a completely separate class to BinaryReader. They have no inheritance relationship. However, as it happens, BinaryReader isn't sealed, and all of the appropriate methods are virtual. So, can we make EndianBinaryReader derive from BinaryReader and use it as a drop-in replacement? Well... that's where the trouble starts.
There's no difficulty technically in doing it. The implementation is fairly straightforward - indeed, it means we can drop a bunch of methods from EndianBinaryReader and let BinaryReader handle it instead. (This is particularly handy for text, which is fiddly to get right.) I currently have the code in another branch, and it works fine.
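To give a rough idea of what such a subclass might look like, here is a cut-down sketch. It is not the MiscUtil code: BigEndianReaderSketch is a hypothetical name, it is hard-wired to big-endian rather than taking the endianness as a parameter, and it only overrides ReadInt32.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for illustration; not the MiscUtil implementation.
class BigEndianReaderSketch : BinaryReader
{
    public BigEndianReaderSketch(Stream input) : base(input) { }

    // Only the integer conversion is overridden; text handling and the
    // rest of the API are inherited from BinaryReader unchanged.
    public override int ReadInt32()
    {
        byte[] bytes = ReadBytes(4);
        if (bytes.Length < 4)
        {
            // ReadBytes returns a shorter array when the stream runs out.
            throw new EndOfStreamException();
        }
        // Most significant byte first.
        return (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
    }
}

static class SubclassDemo
{
    static void Main()
    {
        byte[] data = { 0x00, 0x00, 0x00, 0x01 };
        // Declared as the base type, so the calling code sees only a BinaryReader.
        using (BinaryReader reader = new BigEndianReaderSketch(new MemoryStream(data)))
        {
            Console.WriteLine(reader.ReadInt32()); // prints 1
        }
    }
}
```

A fuller version would presumably override the other multi-byte reads in the same way.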
And I would have gotten away with it if it weren't for that pesky inheritance...
The problem is whether or not it's the right thing to do. To start with, it breaks Liskov's substitutability principle, if the "property" we consider is "the result of calling ReadInt32 when the next four bytes of the underlying stream are 00, 00, 00, 01" for example. Not having read Liskov's paper for myself (I really should, some time), I'm not sure whether this is the intended kind of use or not. More on that later.
The second problem is that it contradicts the documentation for BinaryReader. For example, the docs for ReadInt32 state: "BinaryReader reads this data type in little-endian format." That's a tricky bit of documentation to understand precisely - it's correct for BinaryReader itself, but does that mean it should be true for all subclasses too?
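To make the difference concrete, this small sketch reads the bytes 00, 00, 00, 01 with a plain BinaryReader and then interprets the same bytes as big-endian by hand. EndiannessDemo and its two helper methods are names invented for this example.

```csharp
using System;
using System.IO;

static class EndiannessDemo
{
    // The documented behaviour: BinaryReader treats the first byte as least significant.
    public static int ReadWithBinaryReader(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return reader.ReadInt32();
        }
    }

    // The big-endian interpretation: the first byte is most significant.
    public static int InterpretBigEndian(byte[] data)
    {
        return (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3];
    }

    static void Main()
    {
        byte[] data = { 0x00, 0x00, 0x00, 0x01 };
        Console.WriteLine(ReadWithBinaryReader(data)); // prints 16777216 (0x01000000)
        Console.WriteLine(InterpretBigEndian(data));   // prints 1
    }
}
```

Code that holds a BinaryReader and relies on the first result would get the second from a big-endian subclass, which is exactly the property under discussion.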
When I've written in various places about the problems of inheritance, and why designing a class to be unsealed means doing more design work, this is the kind of thing I've been talking about. How much detail does it make sense to specify here? How much leeway is there for classes overriding ReadInt32? Could a different implementation read a "compressed" Int32 instead of always reading four bytes, for example? Should the client care, if they make sure they've obtained an appropriate BinaryReader for their data source in the first place? This is basically the same as asking how strictly we should apply Liskov's substitutability principle. If two types are the same in every property, surely we can't distinguish between them at all.
I wonder whether most design questions of inheritance basically boil down to defining which properties should obey Liskov's substitutability principle and which needn't, for the type you're designing. Of course, it's not just black and white - there will always be exceptions and awkward points. Programming is often about nuance, even if we might wish that not to be the case.
Blow it, let's do it anyway...
Coming back to BinaryReader, I think (unless I can be persuaded otherwise) that the benefits of going against the documentation (and strict substitutability) outweigh the downsides. In particular, BinaryReaders don't tend to be passed around in my experience - the code which creates one is usually the code which uses it too, or is at least closely related. The risk of breaking code by passing it a BinaryReader using an unexpected endianness is therefore quite low, even though it's theoretically possible.
So, am I miles off track? This is for a class library, after all - should I be more punctilious about playing by the rules? Or is pragmatism the more important principle here?