Thursday, August 26, 2010

A couple of comments on Defender Methods

Brian Goetz has posted version 3 of his proposal "Defender Methods", which is a way of adding methods to existing interfaces without breaking binary compatibility. Generally speaking, I think the idea is sound but I think there are some problems with the proposal in its current form. I would normally post my comments on the proposal to the lambda-dev mailing list, which ensures that any IP embedded in my comments are formally submitted to Oracle's ownership. However, Oracle's recent lawsuit against Google has made it clear that, even though I am a contributor to openjdk7, I do not have a license to Oracle's patents that are necessarily infringed by the use of the openjdk7 source base. This is a very confusing position for the organizer of an open-source effort to take. Rather than continuing to contribute IP directly to the project, I'll post my comments here and contribute them to Oracle once it is clear that I've been granted a license to the patents necessary to use openjdk7.

The latest version of the proposal has added a section on "Binary compatibility" (section 11), but that section fails to address the one binary compatibility issue raised previously: what are the binary compatibility implications of changing the default for a defender method? Since the defender method's default is an implementation detail, such a change should be binary (and source) compatible, but the resolution mechanism (section 3) makes it binary incompatible. The proposal doesn't provide compile-time semantics for invoking an extension method or compile-time rules for when a given class declaration is legal. At the very least there should be rules to ensure that sources that compile cleanly together do not trigger a VM error at runtime due to an ambiguous method invocation. The most natural change of the compile-time rules to match the given runtime resolution rules would make changing the defender method's default a source incompatible change as well (because it could introduce an ambiguity). I believe this undermines the justification for having defender methods out-of-line (section 2). Moreover, the separation forces API designers to create vestigial public API elements (the static extension methods).

These issues can be addressed by (a) having defender method bodies written inline, and (b) rejecting non-abstract class declarations where some defender method does not have a unique default.

I suspect the proposed implementation mechanisms (sections 7 and 8) will be far too expensive compared to existing virtual method calls. For the Interface.super calls (section 7), my suggestion is to extend the VM specification of invokespecial.

Saturday, February 13, 2010

A Syntax Option for Project Lambda (Closures for Java)

It was noted recently on the Project Lambda mailing list that allowing arrays of function type would undermine the type system. The reason for this is a combination of Java's covariant arrays, the natural subtypes among function types (they are covariant on return type and contravariant on argument types), exception checking, and the erasure implementation of generics. We certainly can't remove covariant arrays or checked exceptions, and removing the subtype relationship among function types really reduces their utility. Unfortunately, it is almost certainly too late to reify generics too. Although you can't have an array of function type, there is no problem having, for example, an ArrayList of a function type. So while we might prefer not to have this restriction, it won't be much of a problem in practice.

There are many ways in which arrays of function type would undermine the type system, but to give you some flavor of the problem, the following is a method written assuming that arrays of function type are allowed. This method must fail in some way - either fail to compile, or throw a ClassCastException, or an ArrayStoreException, or something. In all of the implementations being considered, this method would throw IOException, even though that isn't declared in the throws clause of the method. We have undermined Java's exception checking without even a cast!

public void main(String[] args) {
    #void()[] arrayOfFunction = new #void()[1];
    Object[] objectArray = arrayOfFunction;
    objectArray[0] = #(){ throw new IOException(); };
    arrayOfFunction[0].(); // invoke the function out of the array
}

The realization that supporting arrays of function type would introduce a loophole in the type system makes it possible to consider some very nice syntax options that were previously eliminated because there was no syntactic way to write an array of function type. But if we disallow arrays of function type, those options can be considered.

My favorite of the alternatives is based on this grammar

FunctionType:
( TypeListopt Throwsopt ) -> ResultType
Expression:
LambdaExpression
LambdaExpression:
( FormalParameterListopt ) -> Expression

A function type is written with the argument types between parentheses, then a right arrow (dash greater-than), and then the result type.

The most important distinction between this proposal and the existing draft specification is that this proposal places the result type after the argument types, instead of before. The benefits of its simplicity compared to the current specification are most obvious when you combine function types and lambdas with other features.

Here is a simple example to show how the two proposals look. The example is currying. The following method takes as a parameter a function of two arguments, and returns a function of one argument that returns a function of one argument. Applying the resulting function to one value, and then the result of that to another value, should have the same effect as applying the original function to both values. To make it interesting, we make the argument and result types generic, and we allow the function to throw an exception.

Using the currently proposed syntax for Project Lambda, the code would look something like this:

static <T,U,V,X extends Throwable>
##V(U)(throws X)(T) curry(#V(T,U)(throws X) function) {
   return #(T t)(#(U u)(function.(t,u)));
}

On the other hand, with the proposal described above it looks something like this:

static <T,U,V,X extends Throwable>
(T)->(U throws X)->V curry((T,U throws X)->V function) {
   return (T t)->(U u)->function.(t,u);
}

I've intentionally selected an example that uses the new features heavily, so either may take a few moments of studying to understand. But I claim the latter is much easier to follow than the former, even once you've become familiar with the syntax.

You can easily write a function that yields an array as its result

()->int[] arrayLambda = ()->new int[]{1, 2, 3};

With this proposed syntax there is no way to write an array of functions. But that is exactly what we concluded could not be made to work within the type system anyway.