Thursday, December 13, 2007

What flavor of closures?

I just attended Josh Bloch's presentation at JavaPolis, where he asks the community whether they want Java to support function types, or if they'd prefer that people write these things the way they do today. His examples are carefully selected from the most twisted of the test suite. Compiler test suites are a good place to find the most twisted but unrealistic uses of any given language feature. I thought it would be interesting to look at the question in the context of a real API. You probably know my opinion, but just to be clear, here is an excerpt from Doug Lea's fork-join framework:

/**
 * An object with a function accepting pairs of objects, one of
 * type T and one of type U, returning those of type V
 */
interface Combiner<T,U,V> {
  V combine(T t, U u);
}
class ParallelArray<T> {
  /**
   * Returns a ParallelArray containing results of applying
   * combine(thisElement, otherElement) for each element.
   */
  <U,V> ParallelArray<V> combine(
    ParallelArray<U> other,
    Combiner<? super T, ? super U, ? extends V> combiner) { ... }
}

And the equivalent code ported to use the features of the closures spec:

class ParallelArray<T> {
  /**
   * Returns a ParallelArray containing results of applying
   * combine(thisElement, otherElement) for each element.
   */
  <U,V> ParallelArray<V> combine(
    ParallelArray<U> other,
    { T, U => V } combiner) { ... }
}

The question Josh asks is this: which version of this API would you prefer to see?
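To make the comparison concrete from the caller's side, here is a minimal sketch of a call site for the interface-based version, written with an anonymous inner class. It assumes a simplified, sequential stand-in for ParallelArray.combine that works on java.util.List. The class name CombinerSketch, the labelPrices method, and the sample data are hypothetical and are not part of the fork-join framework.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CombinerSketch {
    /** Same shape as the Combiner interface in the excerpt. */
    interface Combiner<T, U, V> {
        V combine(T t, U u);
    }

    /**
     * Hypothetical sequential stand-in for ParallelArray.combine:
     * pairs up elements by index and applies the combiner to each pair.
     */
    static <T, U, V> List<V> combine(
            List<? extends T> left,
            List<? extends U> right,
            Combiner<? super T, ? super U, ? extends V> combiner) {
        int n = Math.min(left.size(), right.size());
        List<V> result = new ArrayList<V>(n);
        for (int i = 0; i < n; i++) {
            result.add(combiner.combine(left.get(i), right.get(i)));
        }
        return result;
    }

    /** The call site as it is written without closures. */
    static List<String> labelPrices() {
        List<String> names = Arrays.asList("tea", "milk");
        List<Integer> cents = Arrays.asList(250, 99);
        return combine(names, cents, new Combiner<String, Integer, String>() {
            public String combine(String name, Integer price) {
                return name + "=" + price;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(labelPrices()); // prints [tea=250, milk=99]
    }
}
```

Under the closures spec, the anonymous class would presumably shrink to a closure literal passed directly as the last argument; that form is not shown here because it does not compile with a standard javac.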

The point he makes is that function types enable (he says "encourage") an "exotic" style of programming - functional programming - which should be discouraged, otherwise the entire platform will become infected with unreadable code. Although functional programming is just as possible with or without function types - they are just shorthand for interface types, after all - Josh prefers the language provide syntactic vinegar for these techniques.

Part of his talk was about the problems of being able to use nonlocal return by default in a closure. See my previous blog post for a description of how this theoretical problem won't exist in the next version of the spec, and doesn't exist in the prototype today.

Finally, Josh showed that if you want to use something like eachEntry to loop over a map, and you want to be able to use primitive types for the loop variables, autoboxing doesn't work and you'd have to define 81 different versions of the eachEntry method (one for each possible primitive type in each position). That's true, just as it's true that you'd have to define 81 different versions of the Map API if you want to be able to handle primitives in them. If it turns out to be a good idea to make autoboxing work for the incoming arguments to a closure, that is a small tweak to the closure conversion. These kinds of issues can be addressed in a JSR.
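The shape of the problem can be sketched with one generic method and one primitive specialization. The count of 81 presumably comes from nine choices per position (the eight primitive types plus the reference case). All names below (EachEntrySketch, EntryBlock, IntIntEntryBlock, eachIntIntEntry) are hypothetical, and the callbacks are written as anonymous inner classes because closure syntax is not available in a standard compiler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EachEntrySketch {
    /** Generic callback: both loop variables must be reference types. */
    interface EntryBlock<K, V> {
        void invoke(K key, V value);
    }

    /** One primitive specialization out of the many that would be needed. */
    interface IntIntEntryBlock {
        void invoke(int key, int value);
    }

    /** Hypothetical generic eachEntry: works for any map, with boxed loop variables. */
    static <K, V> void eachEntry(Map<K, V> map, EntryBlock<? super K, ? super V> block) {
        for (Map.Entry<K, V> e : map.entrySet()) {
            block.invoke(e.getKey(), e.getValue());
        }
    }

    /** Hypothetical (int, int) variant: the library unboxes each entry for the caller. */
    static void eachIntIntEntry(Map<Integer, Integer> map, IntIntEntryBlock block) {
        for (Map.Entry<Integer, Integer> e : map.entrySet()) {
            block.invoke(e.getKey(), e.getValue()); // auto-unboxing to int
        }
    }

    /** Caller using the primitive variant: loop variables are plain ints. */
    static int sumOfProducts(Map<Integer, Integer> map) {
        final int[] total = {0}; // array cell: inner classes cannot assign to captured locals
        eachIntIntEntry(map, new IntIntEntryBlock() {
            public void invoke(int key, int value) {
                total[0] += key * value;
            }
        });
        return total[0];
    }

    /** Caller using the generic variant: loop variables must be declared as Integer. */
    static int sumOfProductsBoxed(Map<Integer, Integer> map) {
        final int[] total = {0};
        eachEntry(map, new EntryBlock<Integer, Integer>() {
            public void invoke(Integer key, Integer value) {
                total[0] += key * value;
            }
        });
        return total[0];
    }

    public static void main(String[] args) {
        Map<Integer, Integer> m = new LinkedHashMap<Integer, Integer>();
        m.put(1, 10);
        m.put(2, 20);
        System.out.println(sumOfProducts(m));      // prints 50
        System.out.println(sumOfProductsBoxed(m)); // prints 50
    }
}
```

Every additional combination of primitive types would need another interface and another method like eachIntIntEntry, which is where the multiplication comes from.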

Josh's vision for an alternative is Concise Instance Creation Expressions along with adding a moderate number of new statement forms.

Monday, December 03, 2007

Restricted Closures

Note: this discusses a feature of the Closures specification that was published back in February, but which is likely to change in an upcoming revision.

The Closures for Java specification, version 0.5, contains a special marker interface java.lang.RestrictedFunction. When a closure is converted to an interface that extends RestrictedFunction, this prevents the closure from doing certain operations. Specifically, it prevents accessing mutated local variables from an enclosing scope, or using a break, continue, or return to a target outside the closure. The idea is that APIs that are intended to be used in a concurrent setting would want to receive restricted rather than unrestricted closures to prevent programmers from shooting themselves in the foot.

Two weeks ago Mark Mahieu contacted me regarding his experience with the closures version of the fork-join framework. Because I had ported that API before I had implemented any of the operations that would be restricted, and before RestrictedFunction itself, I had simply not provided any restrictions at all. Mark was wondering how to do it:

I hadn't looked at the jsr166y javadoc before you linked to it on your blog, so I had the chance to compare the two versions on equal terms, and I can honestly say that I found the closures version of the API to be much more approachable at first blush. I also suspect that the majority of the Java programmers I work with would feel the same way, once comfortable with function type syntax.

One thing I did wonder was whether a method like ParallelArray.combine() could be declared as:

public <U,V,C extends {T,U=>V} & RestrictedFunction> ParallelArray<V> combine(ParallelArray<U> other, C combiner) { ... }

but my reading of the specification suggests that the type C won't be a valid target for closure conversion. Maybe I'm being greedy, but in certain cases (jsr166y being a good example) I'd ideally want both the clarity provided by using function types in place of a multitude of interfaces, and the compile-time checking afforded by RestrictedFunction. Having said that, I think the additional type parameter above negates those gains in clarity somewhat, even if it were an option.

I responded, describing what I had been planning to do in the next minor update of the spec:

I expect to make that work. However, I hope it won't be necessary. I expect to support function types like

{T,U=>V}&RestrictedFunction

directly. For example

public <U,V> ParallelArray<V> combine(ParallelArray<U> other, {T,U=>V}&RestrictedFunction combiner) { ... }

You will be allowed to intersect a function type with non-generic marker interfaces such as RestrictedFunction, Serializable, etc. Unfortunately, I will have to rev the spec to support this.
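As a rough analogy only, the type-parameter form from the exchange above can be approximated in Java as it later shipped (version 8 onward), where a lambda may be cast to an intersection of a functional interface and marker interfaces. This is a sketch under assumptions: RestrictedFunction here is a hypothetical empty interface declared locally, Combiner stands in for the function type {T,U=>V}, lists stand in for ParallelArray, and none of the compile-time restrictions described in the spec are enforced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntersectionSketch {
    /** Hypothetical stand-in for java.lang.RestrictedFunction; a pure marker. */
    interface RestrictedFunction {}

    /** Stand-in for the function type {T,U=>V}. */
    interface Combiner<T, U, V> {
        V combine(T t, U u);
    }

    /**
     * The type-parameter form: C must implement both the function
     * interface and the marker interface.
     */
    static <T, U, V, C extends Combiner<T, U, V> & RestrictedFunction>
            List<V> combine(List<T> left, List<U> right, C combiner) {
        int n = Math.min(left.size(), right.size());
        List<V> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            result.add(combiner.combine(left.get(i), right.get(i)));
        }
        return result;
    }

    static List<Integer> sums() {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<Integer> b = Arrays.asList(10, 20, 30);
        // The cast supplies the intersection type as the lambda's target type.
        return combine(a, b,
            (Combiner<Integer, Integer, Integer> & RestrictedFunction) (x, y) -> x + y);
    }

    public static void main(String[] args) {
        System.out.println(sums()); // prints [11, 22, 33]
    }
}
```

The extra type parameter C and the cast at the call site illustrate the loss of clarity Mark describes; being able to write the intersection directly in the parameter type would remove the type parameter from the declaration.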

Since that time I've been discussing this issue with a number of people. Some, who believe that the concurrent use cases are primary, or who believe that "Mort" programmers will blithely copy-and-paste code from anonymous inner classes (which have different semantics) into closures, suggest that the default is backwards: closures and function types should be restricted unless specific action is taken to make them otherwise. Reversing the sense of the marker interface doesn't work (it violates subtype substitutability), but there may be other ways to accomplish it. On the other hand, there are others who believe the synchronous use cases, such as control APIs, are primary (even when used in a concurrent setting), and prefer not to see the language cluttered with support for the restrictions at all. Instead, they would prefer that any such restrictions take the form of warnings (which the programmer might suppress or ask javac to escalate to errors). I have sympathy for both camps.

Another possibility would be to produce a warning whenever you use a nonlocal transfer at all and do away with RestrictedFunction. The way to suppress the warning would be with a @SuppressWarnings("nonlocal-transfer") annotation. Could we make it an error instead of a warning? This may make the interface easier to read, but it doesn't give the API designer any way to express a preference. It may make control APIs painful to use.

Finally, it would be possible to use a different syntax for restricted and unrestricted function types and closures. For example, one using the => token would be restricted, not allowing nonlocal transfers. One using a different token such as ==> or #> would be unrestricted, allowing nonlocal transfers. The idea is that if you want an unrestricted closure, you'd have to use the slightly more awkward syntax, and the receiving type must also be of the unrestricted variety. The control invocation syntax would be defined in terms of the unrestricted form. This enables API designers to express a preference for whether or not clients would be allowed to write unrestricted closures (and therefore, whether or not they would be allowed to use the control invocation syntax).

This can be made to work using only concepts already in the spec. The unrestricted form of a function type would be defined as an interface type as in the current spec. The restricted form would be the same but with RestrictedFunction mixed in. With this approach there is no need for the explicit "&" conjunction-type syntax for function types.
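The subtyping relationship this relies on can be sketched with ordinary interfaces. The names below (RestrictedFormSketch, Function2, RestrictedFunction2, applyUnrestricted, applyRestricted) are hypothetical stand-ins for the interfaces the spec would generate behind each function type, and RestrictedFunction is declared locally as an empty marker. The sketch shows only how the types relate; it does not enforce any restriction on what the function body does.

```java
class RestrictedFormSketch {
    /** Hypothetical stand-in for the marker interface described in the spec. */
    interface RestrictedFunction {}

    /** Stand-in for the interface behind an unrestricted function type. */
    interface Function2<T, U, V> {
        V invoke(T t, U u);
    }

    /** Restricted form: the same interface with the marker mixed in. */
    interface RestrictedFunction2<T, U, V>
            extends Function2<T, U, V>, RestrictedFunction {}

    /** A control-style API: accepts either form. */
    static int applyUnrestricted(Function2<Integer, Integer, Integer> f) {
        return f.invoke(2, 3);
    }

    /** A concurrency-style API: accepts only the restricted form. */
    static int applyRestricted(RestrictedFunction2<Integer, Integer, Integer> f) {
        return f.invoke(2, 3);
    }

    static int[] demo() {
        RestrictedFunction2<Integer, Integer, Integer> add =
            new RestrictedFunction2<Integer, Integer, Integer>() {
                public Integer invoke(Integer a, Integer b) { return a + b; }
            };
        Function2<Integer, Integer, Integer> mul =
            new Function2<Integer, Integer, Integer>() {
                public Integer invoke(Integer a, Integer b) { return a * b; }
            };
        int viaRestricted = applyRestricted(add);     // 2 + 3
        int viaUnrestricted = applyUnrestricted(add); // restricted is substitutable here
        int product = applyUnrestricted(mul);         // 2 * 3
        // applyRestricted(mul); // would not compile: mul's type lacks the marker
        return new int[] { viaRestricted, viaUnrestricted, product };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 5 5 6
    }
}
```

Because the restricted interface is a subtype of the unrestricted one, a restricted function can be passed wherever an unrestricted one is expected but not the other way around, which is the direction subtype substitutability requires.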