Monday, May 26, 2014

Don't Be Afraid of SingleOrDefault

The following is based on an issue I've seen many times in a code base at work, one that only recently caused the problem described below.

Consider the following code:
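
Here's a minimal sketch of the kind of code in question, assuming an in-memory list stands in for the "database". The JobSourceStore class, the Location property, and the Ftp enum member are hypothetical names used only for illustration; JobSource, Source, JobId, Id, and UpdateOrAdd come from the discussion in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum SourceType { File, Ftp } // Ftp is a made-up second member

public class JobSource
{
    public int Id { get; set; }
    public int JobId { get; set; }
    public SourceType Source { get; set; }
    public string Location { get; set; } // hypothetical field that gets updated
}

public class JobSourceStore
{
    // Stand-in for the "database"; nothing enforces uniqueness of (JobId, Source).
    public List<JobSource> Items { get; } = new List<JobSource>();

    public void UpdateOrAdd(int jobId, SourceType source, string location)
    {
        // Any() only answers "is there at least one match?"
        if (Items.Any(s => s.JobId == jobId && s.Source == source))
        {
            // First() quietly picks whichever match comes first in the list.
            var existing = Items.First(s => s.JobId == jobId && s.Source == source);
            existing.Location = location;
        }
        else
        {
            Items.Add(new JobSource
            {
                Id = Items.Count + 1,
                JobId = jobId,
                Source = source,
                Location = location
            });
        }
    }
}
```

With two JobSources sharing JobId 2 and SourceType.File, this version updates the first and leaves the second untouched, without any complaint.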


Assume that for good reasons we need to find JobSources by their Source and JobId values. The calling code doesn't have the Id. The "database" doesn't enforce uniqueness of these two values as a key; we just assume that every other system that feeds us data treats them as unique, too. Well, hope is a better word, as we'll see.

The first time UpdateOrAdd is called, it will find the one and only matching JobSource and update it. Then someone adds a second JobSource for job 2. We again try to update the JobSource with JobId 2 and SourceType File. Success! No exceptions.

Except, we're using Any to see if there are 1 or more items and then updating the first one. The code tolerates the condition of there being more than one item that matches our "key". We have potentially updated the wrong data. Corruption!

Now let's change UpdateOrAdd to use SingleOrDefault:
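
A minimal sketch of that change, again assuming an in-memory list as the "database". JobSourceStore, Location, and the Ftp enum member are hypothetical names for illustration. The sketch filters with Where and then calls the parameterless SingleOrDefault; passing the predicate straight to SingleOrDefault behaves the same way apart from the wording of the exception message.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum SourceType { File, Ftp } // Ftp is a made-up second member

public class JobSource
{
    public int Id { get; set; }
    public int JobId { get; set; }
    public SourceType Source { get; set; }
    public string Location { get; set; } // hypothetical field that gets updated
}

public class JobSourceStore
{
    // Stand-in for the "database"; nothing enforces uniqueness of (JobId, Source).
    public List<JobSource> Items { get; } = new List<JobSource>();

    public void UpdateOrAdd(int jobId, SourceType source, string location)
    {
        // Zero matches: null. One match: that item. More than one: throws.
        var existing = Items
            .Where(s => s.JobId == jobId && s.Source == source)
            .SingleOrDefault();

        if (existing != null)
        {
            existing.Location = location;
        }
        else
        {
            Items.Add(new JobSource
            {
                Id = Items.Count + 1,
                JobId = jobId,
                Source = source,
                Location = location
            });
        }
    }
}
```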



Not only is the expected shape of the data clear, but it's shorter too. SingleOrDefault returns the 1 item that matches or the default value for the type. For any class type, the default is null. If more than 1 item matches, it throws an exception.
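
To make the three outcomes concrete, here's a small sketch on a plain string array. The SingleOrDefaultDemo class and its Describe method are hypothetical, written just for this illustration.

```csharp
using System;
using System.Linq;

public static class SingleOrDefaultDemo
{
    // Reports which of the three SingleOrDefault outcomes occurred.
    public static string Describe(string[] items, string prefix)
    {
        try
        {
            var match = items.SingleOrDefault(
                s => s.StartsWith(prefix, StringComparison.Ordinal));

            // string is a class type, so "no match" comes back as null.
            return match ?? "(none)";
        }
        catch (InvalidOperationException)
        {
            // More than one item matched.
            return "(threw)";
        }
    }
}
```

Describe(new[] { "apple", "banana" }, "a") returns "apple", the same array with prefix "c" returns "(none)", and Describe(new[] { "apple", "avocado" }, "a") returns "(threw)".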

Now on the second call to UpdateOrAdd, we get the following exception:

System.InvalidOperationException: Sequence contains more than one element.

Oh no! A noisy exception, instead of silent data corruption!

Why would someone write code like the earlier Any/First stuff? Because some coders are scared of triggering exceptions, more scared of that than of corrupting data. Any time I see this pattern in code at work I change it to use SingleOrDefault, or just Single if the item is definitely expected to be present. I've had to convince people that it's better to report a problem than update the wrong item.
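
For the "definitely expected" case, a sketch might look like the following. RequiredLookup and GetRequired are hypothetical names; the only real API used is LINQ's Where and Single. Single throws InvalidOperationException both when nothing matches and when more than one item matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RequiredLookup
{
    // Returns the one matching item; throws if there are zero or several.
    public static T GetRequired<T>(IEnumerable<T> items, Func<T, bool> isMatch)
    {
        return items.Where(isMatch).Single();
    }
}
```

RequiredLookup.GetRequired(new[] { 1, 2, 3 }, n => n == 2) returns 2, while a predicate of n => n == 9 or n => n > 1 throws instead of handing back a guess.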

Yes, there is a chance that the exception won't happen during testing because the data in the lab might be very clean. Perhaps no one has yet created the conditions there that break your assumption about the shape of the data. It just might be found by a real customer. That customer, though, will report the issue and, more importantly, not lose data.
