The specifics about generics

Let’s say that we’re making an API response superclass, and it has a method for fetching some key:value pair from the response data. Let’s assume that behind the scenes, all the response data is just kept in a map like so:
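Something along these lines, as a minimal sketch. The `ApiResponse` class, its `data` field and the `getValue` method are names assumed here for illustration, not anything from a real library:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass; the class, field, and method names are assumptions.
class ApiResponse {
    // All response data lives in one untyped map, keyed by field name.
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) {
        data.putAll(parsed);
    }

    // Fetches the raw value for a key, or null if the key is absent.
    public Object getValue(String key) {
        return data.get(key);
    }
}
```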

So, if you’re implementing a new API response parser, somewhere in your code you’ll probably be doing a couple of calls like this:
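A sketch of what those calls might look like, assuming a hypothetical `LoginResponse` parser on top of a map-backed `ApiResponse` (the `user_name` key and all class and method names are made up for the example; only `logged_in_ok` comes from this post):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass and subclass; all names here are assumptions.
class ApiResponse {
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) { data.putAll(parsed); }

    public Object getValue(String key) { return data.get(key); }
}

class LoginResponse extends ApiResponse {
    final String userName;
    final boolean loggedIn;

    LoginResponse(Map<String, Object> parsed) {
        super(parsed);
        // Every read needs an explicit cast, because getValue only promises an Object.
        userName = (String) getValue("user_name");
        // Cast to Boolean, then auto-unboxed into the primitive field.
        loggedIn = (Boolean) getValue("logged_in_ok");
    }
}
```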

And that’s all fine and dandy, but frankly it’s just not doing it for me. I have to type out a lot of casts, and I have a personal vendetta against specifying my types more than I feel I ought to. So let’s gussy this up a little bit by adding this method to the superclass:
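Roughly like this, as a sketch under the assumption that the superclass is a hypothetical `ApiResponse` backed by a `Map<String, Object>` (the names are mine for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass; names are assumptions for illustration.
class ApiResponse {
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) { data.putAll(parsed); }

    // T is inferred from how the caller uses the result, so the caller
    // no longer writes the cast. The cast below is unchecked: nothing
    // in this method verifies that the value really is a T.
    @SuppressWarnings("unchecked")
    public <T> T get(String key) {
        return (T) data.get(key);
    }
}
```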

Perfect! We can now simplify that response a bit:
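For instance, a hypothetical `LoginResponse` parser could read its fields like this (a sketch only; every name except the `logged_in_ok` key is assumed):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classes; every name except the logged_in_ok key is an assumption.
class ApiResponse {
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) { data.putAll(parsed); }

    @SuppressWarnings("unchecked")
    public <T> T get(String key) { return (T) data.get(key); }
}

class LoginResponse extends ApiResponse {
    final String userName;
    final boolean loggedIn;

    LoginResponse(Map<String, Object> parsed) {
        super(parsed);
        // No casts: T is inferred as String and Boolean from the assignment targets.
        userName = get("user_name");
        loggedIn = get("logged_in_ok");
    }
}
```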

Now some of you might be saying “Hey, that’s barely shorter than it was with the cast”, and to you I say it’s all about style. However, I’m sure a few of you have noticed the more insidious thing here (it was present before too, but now it’s a little less obvious). What am I talking about? Well, casting to a boolean (or other primitive) is a bit of a risky case thanks to Java wanting to have its Object types but eat primitives too.

So what happens if you try to get logged in but the server is throwing a bit of a fit and forgets to send you the logged_in_ok variable, or changes the first letter of each word to an interrobang‽ Well, you return null, and null unboxed into a boolean results in a rather irritating NullPointerException. What’s interesting, however, if you take a look at your inevitable stacktrace, is that the site of the exception is not in your generic code, but back in your parse response method. And what does the disassembler say about our method?

Interesting! In the case of a simple cast in a generic method, the compiler actually treats the generic as an implicit cast in the caller, not at the end of the method itself. This is a pretty intriguing gotcha in my opinion, so let’s dig a little deeper into how generics actually work under the hood.
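You can watch this happen without a disassembler. Here is a small sketch (the class and method names are assumptions for the demo) that triggers the failure and reports which method sits at the top of the stack trace:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; all names are assumptions for illustration.
class ErasureCastDemo {
    static final Map<String, Object> DATA = new HashMap<>();

    @SuppressWarnings("unchecked")
    static <T> T get(String key) {
        // After erasure this cast does nothing: the method just returns an Object.
        return (T) DATA.get(key);
    }

    static boolean parseResponse() {
        // The compiler puts the cast to Boolean and the unboxing call here,
        // in the caller. DATA is empty, so unboxing null throws.
        boolean loggedIn = get("logged_in_ok");
        return loggedIn;
    }

    // Returns the name of the method at the top of the stack trace.
    static String failingMethod() {
        try {
            parseResponse();
            return "none";
        } catch (NullPointerException e) {
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(failingMethod()); // the caller, not get
    }
}
```

The top frame is `parseResponse`, because `get` has already returned happily by the time anything goes wrong.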

There are two ways that one might reasonably implement a generic system for functions in a language: duplication or sharing. For a duplicative system, the compiler would infer the types of all the calls to a generic function and generate bytecode for each version – i.e., if you had a compare() method and called it on lists of strings and ints, two bytecode methods would be created, one working on strings and the other integers.

This isn’t particularly efficient sizewise, so instead the Java compiler uses the second method: code sharing. Instead of making many bytecode methods, it makes just one method and throws all of the arguments at it. But things still need to have types, so Java replaces each type parameter with its first (leftmost) bound, or with Object if it has no bounds. This process is called type erasure. In practice, this means that if you were to have the following:
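As a stand-in example, here are two hypothetical methods (names assumed for illustration): one with an unbounded type parameter, and one that takes a parameterized `List`:

```java
import java.util.List;

// Hypothetical examples; the class and method names are assumptions.
class Generics {
    // An unbounded type parameter: T has no "extends" clause.
    static <T> T firstOf(T a, T b) {
        return a;
    }

    // A parameterized type in the signature.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }
}
```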

After type erasure you would end up with (as bytecode, but all the same):
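Written back out as source, and assuming the generic originals were a hypothetical `<T> T firstOf(T a, T b)` and `int totalLength(List<String> words)`, the erased versions would look something like this:

```java
import java.util.List;

// Hand-erased sketch of two hypothetical generic methods.
class Erased {
    // Was: static <T> T firstOf(T a, T b)
    static Object firstOf(Object a, Object b) {
        return a;
    }

    // Was: static int totalLength(List<String> words)
    @SuppressWarnings("rawtypes")
    static int totalLength(List words) {
        int total = 0;
        for (Object word : words) {
            // The compiler adds this cast wherever an element is used as a String.
            total += ((String) word).length();
        }
        return total;
    }
}
```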

The unbound type parameter T in the first method is dumped and replaced with Object, and in the second method the parameterized List type is replaced with the raw List type. Now, looking at the code above you can see that we might get some amount of type issues if we just treat everything as an Object, but of course the compiler still knows the ‘real’ types and can be sure to not emit dangerous bytecode. However, at runtime the program has zero knowledge of what those type arguments actually were. Keeping full type information around at runtime is called reification, and Java only does that for a handful of things (e.g. primitives, non-generic classes and raw types), not for generic type parameters.

There are a handful of other interesting quirks around generics based on the effects of type erasure. For example, take a look at the following interface & implementation:
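Here is one way such a pair might look. Only the names `NumberValue` and `greaterThan` come from this post; the `Value<T>` interface name and the method bodies are assumptions:

```java
// Hypothetical reconstruction; the interface name and bodies are assumptions.
interface Value<T> {
    boolean greaterThan(T other);
}

class NumberValue implements Value<Number> {
    private final Number value;

    NumberValue(Number value) { this.value = value; }

    @Override
    public boolean greaterThan(Number other) {
        // Compare as doubles so any Number subtype can be mixed.
        return value.doubleValue() > other.doubleValue();
    }
}
```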

Looks good, everything checks out typewise. But after the erasure step, we end up with a bit of an issue:
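To make the mismatch concrete, here is a hand-erased sketch, assuming a hypothetical `Value<T>` interface with a single `greaterThan(T)` method. The class is marked abstract purely so the example compiles; without that, the compiler would complain that `greaterThan(Object)` is not implemented:

```java
// Hand-erased, hypothetical versions of Value<T> and NumberValue.
interface ErasedValue {
    boolean greaterThan(Object other); // T has been erased to Object
}

// Marked abstract only so that this sketch compiles: greaterThan(Number) is a
// separate overload, so greaterThan(Object) is still unimplemented.
abstract class ErasedNumberValue implements ErasedValue {
    private final Number value;

    ErasedNumberValue(Number value) { this.value = value; }

    public boolean greaterThan(Number other) {
        return value.doubleValue() > other.doubleValue();
    }
}
```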

NumberValue no longer has a greaterThan method that matches the interface it is trying to implement! To smooth over this sort of mismatch, the compiler generates bridge methods, and you’ll end up with the following bytecode for NumberValue:
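You can spot the bridge with reflection instead of reading bytecode. This is a sketch assuming a hypothetical `Value<T>` interface for `NumberValue` to implement; `Method.isBridge()` is the standard reflection call that flags compiler-generated bridges:

```java
import java.lang.reflect.Method;

// Hypothetical interface and implementation; names other than
// NumberValue and greaterThan are assumptions.
interface Value<T> {
    boolean greaterThan(T other);
}

class NumberValue implements Value<Number> {
    private final Number value;

    NumberValue(Number value) { this.value = value; }

    @Override
    public boolean greaterThan(Number other) {
        return value.doubleValue() > other.doubleValue();
    }
    // The generated bridge behaves like:
    //   public boolean greaterThan(Object other) { return greaterThan((Number) other); }
}

class BridgeDemo {
    // Lists each greaterThan method on NumberValue and whether it is a bridge.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (Method m : NumberValue.class.getDeclaredMethods()) {
            if (m.getName().equals("greaterThan")) {
                sb.append(m.getParameterTypes()[0].getSimpleName())
                  .append(m.isBridge() ? " (bridge)" : " (declared)")
                  .append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The class ends up with two `greaterThan` methods: the one you wrote taking a `Number`, and a synthetic one taking an `Object` that casts and delegates.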

Well, that was fun (for me at least) but it doesn’t really help us sort out this mess we’ve made for ourselves. A nice easy solution would be to just have a fallback value we can return instead, and trust that nobody asks for a boolean to default to null:
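A sketch of that fallback version, again on a hypothetical map-backed `ApiResponse` (names assumed). It only falls back when the key is missing entirely:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass; names are assumptions for illustration.
class ApiResponse {
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) { data.putAll(parsed); }

    // Returns the stored value, or defaultValue when the key is absent.
    // A key that is present but mapped to null still comes back as null.
    @SuppressWarnings("unchecked")
    public <T> T get(String key, T defaultValue) {
        if (data.containsKey(key)) {
            return (T) data.get(key);
        }
        return defaultValue;
    }
}
```

With this, `boolean ok = response.get("logged_in_ok", false);` survives a missing key.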

Great! Sort of. We still haven’t solved the case where the server does send the key, but explicitly sets it to null (or to a value of the wrong type). Hrm. As we discovered above, there isn’t a way to intuit the real class of a generic parameter at runtime, so in order to force the cast to occur in this method, where we can catch and handle it, we need to also pass in the class that we are expecting, like so:
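One way this could look, as a sketch on a hypothetical map-backed `ApiResponse` (names assumed). `Class.cast` performs a real, checked cast inside the method, so a bad value fails here rather than in the caller:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass; names are assumptions for illustration.
class ApiResponse {
    protected final Map<String, Object> data = new HashMap<>();

    ApiResponse(Map<String, Object> parsed) { data.putAll(parsed); }

    // Returns the value for key as a T, or defaultValue if the key is
    // absent, mapped to null, or mapped to something that is not a T.
    public <T> T get(String key, Class<T> type, T defaultValue) {
        try {
            // cast() checks the runtime class; it returns null for null input.
            T value = type.cast(data.get(key));
            return value != null ? value : defaultValue;
        } catch (ClassCastException e) {
            return defaultValue;
        }
    }
}
```

A call then reads `boolean ok = response.get("logged_in_ok", Boolean.class, false);`. Note that it has to be the wrapper class: passing `boolean.class` would never match a boxed value.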

Lovely! Type safe (in that it returns a type you want, or the default value), cast-free getter methods. Hope that was interesting; I certainly learned a lot digging into my ClassCastException. If you’re dying to know more, there’s a beautiful document here that contains more than you ever dreamed you wanted to know about how Java compilers handle this sort of thing.

Something interesting I missed? Spotted an egregious mistake? Let it be known in the comments below!

