A place to be (re)educated in Newspeak

Saturday, February 24, 2007

Tuples

Tuples are a handy construct found in many programming languages. Oddly enough, they are lacking in mainstream languages like Java, C#, C++ etc.

Java was actually designed to have tuples from the start, but they never quite got in. At one point, I tried to add a tuple-like construct as part of JSR-65. We wanted to extend array initializers to full blown expressions. Eventually, that effort got quashed; I didn’t really resist, since the extension was painful, as such things always are in Java. We should have just done straight tuples as a separate construct - but at the time, that was frowned upon too.

Tuples are useful for a variety of reasons. If you need to package several objects together, but don’t want to define a class just for that purpose,you can use a tuple. This subsumes features like “multiple-value returns” which people have been seeking in Java for many years (without success). It also pretty much covers the need for variable-arity methods. Java went with a special sugar for these instead, a decision I initially opposed; I was eventually convinced to cave in on this, which I really regret (the responsible parties know who they are).

Anyway, that’s all water (well, coffee actually) under the bridge. Back to the main point.

Once you have tuples, handy uses crop up frequently, and you wonder how you ever got along without them.
Smalltalk doesn’t quite have tuples. Instead it has array literals, which are compile time constants and so rather limiting. Squeak does have tuples, though they are not very often used. It would be best to forego array literals entirely and replace them with tuples.

There are subtleties. Literal tuples are best defined as read only.
One reason for this is that readonly tuples are more polymorphic. Long tuples are subtypes of short ones:

{S. T. U. V } <= {S. T. U} <= {S. T} <= {S}

Read only tuples are covariant:

T1 <= T2, S1 <= S2 ==> {S1. T1} <= {S2. T2}

And a read-only literal tuple can be viewed as a list:

S <= U, T <= U, s: S , t : T ==> {s. t} : List[U]

where List[E] is a generic (note I'm using square brackets for type parameters) readonly type for lists (such as SeqCltn[E] in Strongtalk). It should be clear that it is very important to relate tuples to the general collection hierarchy.
Note that it is unsound to assume that

S <= U, T <= U ==> {S. T} <= List[U]

since

{S. T. V} <= {S.T} for all V, but if V <= U does not hold, {S. T. V} <= List[U] does not hold either.

Now if you want writable tuples, you can use an idiom like
{e1. e2. e3} asWritableTuple, which should create a copy of the literal that is writable (don’t worry about the cost of the copy; let the system figure it out).

So, to summarize: tuples are great. Every language should have them.

21 comments:

Fatih Coşkun said...

Nice to read you every now and then. I so totally agree with you on tuples. Every language should have them.

elizarov said...

This is a pretty obvious point, but nonetheless a point worth spelling out at least every year ;). You did not mention an other obvious benefit of tuples -- they let you swap two variable in _a_right_way_, that is:

x, y = y, x;

IMHO, the problem with Java is that it really should have supported tuples from the very start and it should really be supported inside JVM, since otherwise tuples implementation on top of existing JVM is quite costly and would be frowned upon. On the other hand, modification of JVM spec to support tuples is now extremely costly, too, since many other specs (JNI, JVMxI, etc) have been build upon JVM and all they heavily rely on absence of tuples in JVM.

Unknown said...

I totally agree with the need for tuples. The problem of "multiple return values" is indeed a pain.

Two comments:

First, to emphasize your point, in the context of tuples, there are three important properties:
- Static type safety
- Covariant subtyping
- Mutability

The catch is that the three cannot be supported simultaneously: one of these properties needs to be dropped. It seems that giving up mutability is, indeed, the best compromise.

Second, there is the issue of tuple field access. The two obvious options are: (1) by-name, (2) by-position. Elizarov also suggests pattern matching for this purpose.

elizarov said...

Mutability and explicit tuple field access are both not needed and can be even considered harmful. In OO languages there are classes which have both features. Tuples should not try to subsume other features, but are a nice light-weight addition to virtually any language. They do not really add power, but let you solve many real-life problems in a nice and convenient way.

Gilad mentioned that his tuples proposal was supposed to subsume varargs. I think that I saw this proposal somewhere on SDN, but, AFAIR, it was not as convenient as "true" varargs. You don't have to be using extra braces for true varargs support which Java now enjoys.

However, varargs can be considered as a feature that subsumes "multiple argument" functions (i.e. more than one argument), like it is commonly done in functional languages.

Unknown said...

I don't think that mutability is harmful per-se. The problem with mutability (in this context) is that it compromises type safety/covariant subtyping (and also thread safety). I don't understand why tuple field access may be harmful.

On a more general note, most features of a language do not add expressive power (you probably cannot be more powerful than a Turing machine). Thus, the claim that a feature only lets you "solve many real-life problems in a nice and convenient way" is true in almost any language design discussion.

elizarov said...

You are right about type-safety and concurrency problems that mutability causes. But I also look at it in a different and very important way -- language features shall strive to be orthogonal even though it is never perfectly possible in a real programming language that also has to be convenient.

If we have both mutability and tuple field access, then people would ask -- why don't you simply write a class for that? How do I choose between a class and a tuple when I write my code?

Both strict immutability and field access by pattern matching only make tuples so clearly distinct from classes, that those kinds of questions will rarely arise in practice.

Dk said...

Just felt compelled to point you towards the D programming language which grew basic support for tuples several versions ago. Sadly, we don't have multi-value returns yet, although they are in the works. The D community can always use more smart people to help steer the language :)

drumheller said...

And I'd like to point out that Boost effectively provides C++ with good tuple support, albeit as a template library not a core language element.

drumheller said...

Oops, the link to "good tuple support" above is broken; it should have been to here.

Ogi Pishev said...

Tuples are doing a nice job in Eiffel.

Hamlet D'Arcy said...

This post gave me an 'a-ha' moment because I've been wanting tuples for years and never knew they existed. Thanks.

But I have a question: In a weakly typed language, what is the difference between a tuple and an array? I don't see one.

Gilad Bracha said...

Indeed, there is no difference between array literals and tuples in a dynamically typed setting - provided the array literals are unrestricted.

For example, in Smalltalk, they are required to be compile time constants. Some dialects support tuples though, and the thingto do is chukj the old array literals and only use the tuples.

hlovatt said...

You can write an API that goes a long way to providing Tuples in Java:

public class Tuples {
__public static class Tuple1< E1 > {
____public final E1 e1;
____Tuple1( final E1 e1 ) { this.e1 = e1; }
____// etc.
__}
__public static < E1 > Tuple1< E1 > t( final E1 e1 ) {
____return new Tuple1( e1 );
__}

__public static class Tuple2< E1, E2 > extends Tuple1< E1 > {
____public final E2 e2;
____Tuple1( final E1 e1, final E2 e2 ) {
______super( e1 );
______this.e2 = e2;
____}
____// etc.
__}
__public static < E1, E2 > Tuple2< E1, E2 > t( final E1 e1, final E2 e2 ) {
____return new Tuple2( e1, e2 );
__}

__// etc.
}

If Tuples is imported statically then you can have the construct t( ... ) to create tuples.

Mikhail Franco said...

Must also mention Erlang with first class tuples.

Come to think of it, Erlang also has pure message-passing share-nothing actor formalism, and hotswappable code, so it ticks most of your boxes - shame about the Unicode thing :(

fundou said...

Isn't a Set (post-generics) a poor man's tuple?

Mobus said...

how about...

class Tuple {
private list = null;
Tuple(Object... list) {
this.list = list;
}
public Object get(int idx) {
return this.list[idx];
}
}

Anonymous said...
This comment has been removed by a blog administrator.
Zimbabwe vgtjbkjkij said...

I have rolled out my own Tuple mini-lib and I use it all the time.

http://paste.pocoo.org/show/191712/

Nadav C said...
This comment has been removed by the author.
Nadav C said...
This comment has been removed by the author.
Nadav C said...

I also miss having tuples in java. As others have noted you can obviously make you own little implementation of them. Functional libraries such as vavr and jool also provide a version of their own. However I wanted to add my 5 cents that I had an 'aha' moment the other day when I realized lambdas are a perfectly valid replacement for tuples out of the box. for example if you just want to return multiple values from a function without creating a special class for that you can

public static <T1,T2> Consumer<BiConsumer<String,T2>> getDefaultValueWithMsgSillyExample(Map<T1,T2> map, T1 key, T2 defaultValue){
return map.containsKey(key)? func -> func.accept("I found the key you asked for!", map.get(key)):
func -> func.accept("The key you asked for wasn't in the map, here is a default value", defaultValue);
}


Since this still looks a bit clunky you can define an interface with a static factory


@FunctionalInterface
public interface Tuple2<T1,T2> extends Consumer<BiConsumer<T1, T2>> {
static <T1,T2> Tuple2<T1, T2> tuple(T1 t1, T2 t2){
return func -> func.accept(t1,t2);
}

static <U1,U2> U1 getFirst(Tuple2<U1,U2> tuple){
Box<U1> box = new Box<>();
tuple.accept((x,y)->box.setItem(x));
return box.getItem();
}

static <U1,U2> U2 getSecond(Tuple2<U1,U2> tuple){
Box<U2> box = new Box<>();
tuple.accept((x,y)->box.setItem(y));
return box.getItem();
}

}

and then the above example shrinks to

public static <T1,T2> Tuple2<String,T2> getDefaultValueWithMsgSillyExample(Map<T1,T2> map, T1 key, T2 defaultValue){
return map.containsKey(key)? func -> func.accept("I found the key you asked for!", map.get(key)):
func -> func.accept("The key you asked for wasn't in the map, here is a default value", defaultValue);
}


and then you can just use it:

public static void main(String[] args) {
Tuple2<String, Integer> tuple = getDefaultValueWithMsgSillyExample(new HashMap<>(), 10, 100);
String msg = getFirst(tuple);
Integer value = getSecond(tuple);
}

and since lambda variables are automatically unpacked you almost get an equivalent of python's tuple unpacking feature:

public static void main(String[] args) {
Tuple2<String, Integer> tuple = getDefaultValueWithMsgSillyExample(new HashMap<>(), 10, 100);

tuple.accept((message, value) -> {
// use the unpacked tuple as you wish while you're in this scope..
System.out.println(message + ":" + value);
});
}



you can define similar overloads for 3,4,etc' type variables if you wish

The only problem with tuples for me is that the fields are not named so if you have several fields with the same type you might mix them up at the call site. if only java would have supported typedefs, this could have been taken care of too..