Corner cases in Java and C#

Every language has a few interesting corner cases - bits of surprising behaviour which can catch you out if you're unlucky. I'm not talking about the kind of thing that all developers should really be aware of - the inefficiencies of repeatedly concatenating strings, etc. I'm talking about things which you would never suspect until you bump into them. Both C#/.NET and Java have some oddities in this respect, and as most are understandable even to a developer who is used to the other, I thought I'd lump them together.

Interned boxing - Java 1.5

Java 1.5 introduced autoboxing of primitive types - something .NET has had from the start. In Java, however, there's a slight difference - the boxed types have been available for a long time, and are proper named reference types, just like any class you'd write yourself. In this example, we'll look at int boxing to java.lang.Integer. What would you expect the result of the following code to be?

Object x = 5;
Object y = 5;
boolean equality = (x==y);

Personally, I'd expect the answer to be false. We're testing for reference equality here, after all - and when you box two values, they'll end up in different boxes, even if the values are the same, right? Wrong. Java 1.5 has a sort of cache of interned boxes: the language specification (section 5.1.7) requires that boxing an int between -128 and 127 inclusive always gives the same reference. Outside that range an implementation may or may not cache, so programmers shouldn't rely on two boxed values of the same original value being different (or being the same, of course). Goodness only knows whether or not this actually yields performance improvements in real life, but it can certainly cause confusion. I only ran into it when I had a unit test which incorrectly asserted reference equality rather than value equality between two boxed values. The tests worked for ages, until I added something which took the value I needed to test against above 127.
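Here's a minimal sketch of the difference between the two kinds of comparison. The class and method names are made up for illustration. The value 5 sits inside the cached range, so the reference comparison is true; what it gives for 1000 depends on the JVM and how it's configured, which is exactly why equals is the comparison to use.

```java
// Illustrative sketch only: BoxingDemo and its methods are hypothetical names.
final class BoxingDemo {
    // Boxes the same int twice and compares the two boxes by reference.
    static boolean sameReference(int value) {
        Object x = value; // boxing conversion to java.lang.Integer
        Object y = value;
        return x == y;
    }

    // Boxes the same int twice and compares the two boxes by value.
    static boolean sameValue(int value) {
        Object x = value;
        Object y = value;
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(sameReference(5));    // true: 5 is within -128..127
        System.out.println(sameValue(1000));     // true: value equality always holds
        System.out.println(sameReference(1000)); // implementation-dependent, don't rely on it
    }
}
```

Had my unit test used equals instead of ==, it would have kept passing when the value went above 127.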

Lazy initialisation and the static constructor - C#

One of the things which is sometimes important about the pattern I usually use when implementing a singleton is that it's only initialised when it's first used - or is it? After a newsgroup question asked why the supposedly lazy pattern wasn't working, I investigated a little, finding out that there's a big difference between using an initialiser directly on the static field declaration, and creating a static constructor which assigns the value. Without a static constructor, the C# compiler marks the type beforefieldinit, which lets the runtime run the initialiser at any point before the first static field access rather than exactly at first use. Full details on my beforefieldinit page.

The old new object - .NET

I always believed that using new with a reference type would give me a reference to a brand new object. Not quite so - the overload for the String constructor which takes a char[] as its single parameter will return String.Empty if you pass it an empty array. Strange but true.

When is == not reflexive? - .NET

Floating point numbers have been the cause of many headaches over the years. It's relatively well known that "not a number" is not equal to itself (i.e. if x=double.NaN, then x==x is false).
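NaN behaves the same way in Java, since both platforms follow IEEE 754 here. A small sketch follows; NaNDemo and equalsItself are names made up for illustration.

```java
// Illustrative sketch only: NaNDemo and equalsItself are hypothetical names.
final class NaNDemo {
    // Compares a double with itself using ==.
    static boolean equalsItself(double x) {
        return x == x;
    }

    public static void main(String[] args) {
        double x = Double.NaN;
        System.out.println(equalsItself(x));   // false: NaN is never == to anything
        System.out.println(Double.isNaN(x));   // true: the reliable way to test for NaN
        // The boxed equals method is deliberately reflexive, unlike ==.
        System.out.println(Double.valueOf(x).equals(Double.valueOf(x))); // true
    }
}
```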

It's slightly more surprising when two values which look like they really, really should be equal just aren't. Here are a couple of sample programs:

using System;
public class Oddity1
{
    public static void Main()
    {
        double two = double.Parse("2");
        double a = double.Epsilon/two;
        double b = 0;
        Console.WriteLine(a==b);
        Console.WriteLine(Math.Abs(b-a) < double.Epsilon);
    }
}

On my computer, the above (compiled and run from the command line) prints out True twice. If you comment out the last statement of Main (the second Console.WriteLine call), however, the remaining comparison prints False - but only under .NET 1.1. Here's another:

using System;

class Oddity2
{
    static float member;

    static void Main()
    {
        member = Calc();
        float local = Calc();
        Console.WriteLine(local==member);
        member = local;
    }

    static float Calc()
    {
        float f1 = 2.82323f;
        float f2 = 2.3f;
        return f1*f2;
    }
}

This time it prints out True until you comment out the last statement of Main (member = local;), which changes the result to False. This occurs on both .NET 1.1 and 2.0.

The reason for these problems is really the same - it's a case of when the JIT decides to truncate the result down to the right number of bits. The x87 floating point unit in x86 CPUs works on 80-bit floating point values natively, and provides ways of converting to and from 32 and 64 bit values. Now, if you compare a value which has been calculated in 80 bits without truncation with a value which has been calculated in 80 bits, truncated to 32 or 64, and then expanded to 80 again, you can run into problems. The act of commenting or uncommenting the extra lines in the above changes what the JIT is allowed to do at what point, hence the change in behaviour. Hopefully this will persuade you that comparing floating point values directly isn't a good idea, even in cases which look safe.
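One common alternative is to compare with a tolerance. The sketch below is in Java and doesn't reproduce the JIT behaviour itself: it imitates the effect by computing the same product once rounded to float and once kept at double precision. FloatCompare, nearlyEqual and the 1e-6 tolerance are assumptions for illustration; the right tolerance depends on your calculation.

```java
// Illustrative sketch only: FloatCompare and nearlyEqual are hypothetical names.
final class FloatCompare {
    // True when a and b are within `tolerance`, scaled by the larger magnitude
    // for values above 1 and used as an absolute bound for smaller ones.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        if (a == b) {
            return true; // identical values, including infinities of the same sign
        }
        if (Double.isInfinite(a) || Double.isInfinite(b)) {
            return false; // an infinity is only "near" the same infinity
        }
        double diff = Math.abs(a - b); // NaN if either input is NaN
        double scale = Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
        return diff <= tolerance * scale; // any comparison with NaN is false
    }

    public static void main(String[] args) {
        float f1 = 2.82323f;
        float f2 = 2.3f;
        float narrow = f1 * f2;          // product rounded to 32 bits
        double wide = (double) f1 * f2;  // same product kept at 64 bits
        System.out.println(narrow == wide);                   // direct comparison
        System.out.println(nearlyEqual(narrow, wide, 1e-6));  // tolerant comparison
    }
}
```

The tolerant comparison doesn't care at which point the value was narrowed, which is the property the direct == comparisons above lack.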

That's all I can think of for the moment, but I'll blog some more examples as and when I see/remember them. If you enjoy this kind of thing, you'd probably like Java Puzzlers - whether or not you use Java itself. (A lot of the puzzles there map directly to C#, and even those which don't are worth looking at just for getting into the mindset which spots that kind of thing.)

Published Sunday, October 02, 2005 8:54 PM by skeet

Comments

# C#: some corner cases

I was just reading the blog entry corner cases in C# and Java when I remembered two such corner cases...

Wednesday, December 07, 2005 6:12 AM by TrackBack
