Duplications must have an opposite
Decomposing Duplication
We have figured out how to represent many operations as composition of constant-time atomic instructions. For example, a bubble sort is just a set of swaps. A counter is just a series of increments. The sum of a list is just a series of pairwise additions.
This is good, because it allows us to parallelize these operations. All parallel algorithms represent an operation as a series of smaller operations.
In many programming languages, cloning is an atomic operation: it happens in a single indivisible step, so two threads cannot cooperate to clone one value together. However, cloning is not a constant-time operation. For example, cloning a very large tree will take longer than cloning a small tree.
This is bad: it makes cloning inherently unparallelizable.
In addition, sometimes we don't need to clone the whole tree. For example, consider the following Rust code (the Tree type and the exact bodies of foo and bar are illustrative stand-ins):
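```rust
#[derive(Clone)]
enum Tree {
    Leaf(u32),
    Node(Box<Tree>, Box<Tree>),
}

// Clone first, then operate: the clone copies the entire tree.
fn foo(tree: &Tree) -> Tree {
    let cloned = tree.clone(); // expensive: copies every node
    match cloned {
        Tree::Node(left, _) => *left, // cheap: just takes the left branch
        leaf => leaf,
    }
}

// Operate first, then clone: the clone copies only the left branch.
fn bar(tree: &Tree) -> Tree {
    match tree {
        Tree::Node(left, _) => left.as_ref().clone(), // cheap: clones a subtree
        leaf => leaf.clone(),
    }
}
```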
Both functions compute the same thing. foo clones, and then carries out an operation. bar carries out an operation, and then clones. Most likely, LLVM would optimize both functions to the same machine code. But semantically, foo carries out an expensive clone followed by a cheap operation, while bar carries out a cheap operation, after which the clone itself is cheap. It would be nice if both formulations were completely equivalent.
This happens because cloning whole trees is atomic. We'll see later how decomposing duplication helps solve this problem too.
Incremental cloning
We can imagine an incremental cloning operation, which clones only a single term and defers the cloning of its subterms for later. Let's call this operation dup, which stands for duplication. This is what it would look like:
dup a b = (1, (2, 3));
a and b would both contain copies of (1, (2, 3)) after this expression is reduced. The dup expression binds a and b. The reduction would look something like this:
dup a b = (1, (2, 3));
--------------
dup c0 c1 = 1;
dup d0 d1 = (2, 3);
a = (c0, d0);
b = (c1, d1);
--------------
dup d0 d1 = (2, 3);
a = (1, d0);
b = (1, d1);
--------------
dup e0 e1 = 2;
dup f0 f1 = 3;
a = (1, (e0, f0));
b = (1, (e1, f1));
--------------
dup f0 f1 = 3;
a = (1, (2, f0));
b = (1, (2, f1));
--------------
a = (1, (2, 3));
b = (1, (2, 3));
We decomposed a single duplication operation into 5 atomic, independent duplication operations.
The cool thing about this algorithm is that it works on a layer-by-layer basis. This means that if all tree branching points are constant-size (for example, in cons lists), we can implement each step in constant time.
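As a sketch of how this could look in practice, here is a toy Rust implementation (the Term and Node types and the worklist strategy are my own choices, not a prescribed design). Each loop iteration is one atomic dup step: it copies a single constructor into both copies and pushes the subterms onto a queue, deferring them for later.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

// A node is a slot that may not have been filled in yet.
type Node = Rc<RefCell<Option<Term>>>;

#[derive(Debug)]
enum Term {
    Leaf(u32),
    Pair(Node, Node),
}

fn new_node(t: Option<Term>) -> Node {
    Rc::new(RefCell::new(t))
}

// Duplicates `source` as a series of constant-time steps. Each iteration
// copies exactly one constructor into both copies and defers the subterms.
fn dup(source: Node) -> (Node, Node) {
    let a = new_node(None);
    let b = new_node(None);
    let mut pending = VecDeque::new();
    pending.push_back((source, a.clone(), b.clone()));
    while let Some((src, ca, cb)) = pending.pop_front() {
        match &*src.borrow() {
            Some(Term::Leaf(n)) => {
                *ca.borrow_mut() = Some(Term::Leaf(*n));
                *cb.borrow_mut() = Some(Term::Leaf(*n));
            }
            Some(Term::Pair(l, r)) => {
                let (la, ra) = (new_node(None), new_node(None));
                let (lb, rb) = (new_node(None), new_node(None));
                *ca.borrow_mut() = Some(Term::Pair(la.clone(), ra.clone()));
                *cb.borrow_mut() = Some(Term::Pair(lb.clone(), rb.clone()));
                pending.push_back((l.clone(), la, lb));
                pending.push_back((r.clone(), ra, rb));
            }
            None => unreachable!("source nodes are always filled"),
        }
    }
    (a, b)
}

fn main() {
    // dup a b = (1, (2, 3));
    let tree = new_node(Some(Term::Pair(
        new_node(Some(Term::Leaf(1))),
        new_node(Some(Term::Pair(
            new_node(Some(Term::Leaf(2))),
            new_node(Some(Term::Leaf(3))),
        ))),
    )));
    let (a, b) = dup(tree);
    println!("a = {:?}\nb = {:?}", a, b);
}
```

In a real runtime, the pending steps would be interleaved with other work, or picked up by other threads, rather than drained in one loop; the point is just that each individual step takes constant time.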
Duplicating lambdas
What if we want to duplicate lambdas?
Consider λxλy(x y). Let's try to apply our algorithm naively.
dup a b = λxλy(x y);
--------------------
a = λx0(a0)
b = λx1(a1)
dup x0 x1 = x;
dup a0 a1 = λy(x y)
--------------------
a = λx0λy0(b0)
b = λx1λy1(b1)
dup x0 x1 = x;
dup y0 y1 = y;
dup b0 b1 = (x y)
--------------------
a = λx0λy0(c0 d0)
b = λx1λy1(c1 d1)
dup x0 x1 = x;
dup y0 y1 = y;
dup c0 c1 = x;
dup d0 d1 = y;
-------------------- now, we'll duplicate all variables.
a = λxλy(x y)
b = λxλy(x y)
This is very nice! It reduces correctly. However, there is a semantic mistake in what I'm doing here. Take a closer look at this step:
a = λx0(a0)
b = λx1(a1)
dup x0 x1 = x;
dup a0 a1 = λy(x y)
What are we doing with x0 and x1 here? Didn't we decide earlier that dup binds x0 and x1? It turns out we are binding x0 and x1 twice here, and we are never defining x (and using it twice!). Fortunately, this mistake "cancels out" at the end, and the end result is correct. However, this intermediate step is meaningless. There has to be a better way to do this.
It turns out that there is a term type that expresses this idea in a better way. This is the superposition, which is the opposite of a duplication. Instead of writing dup x0 x1 = x, we would write x = {x0 x1}. This is okay because x is being bound here, while x0 and x1 are being used. This matches up with our expectations of what this "anti-dup" operation should do.
Let's rewrite the previous step using a superposition.
a = λx0(a0)
b = λx1(a1)
x = {x0 x1}
dup a0 a1 = λy(x y)
We can substitute x here, and we'll end up with:
a = λx0(a0)
b = λx1(a1)
dup a0 a1 = λy({x0 x1} y)
Now I'll write the previous reduction again, but using superpositions instead.
dup a b = λxλy(x y);
--------------------
a = λx0(a0)
b = λx1(a1)
dup a0 a1 = λy({x0 x1} y)
--------------------
a = λx0λy0(b0)
b = λx1λy1(b1)
dup b0 b1 = ({x0 x1} {y0 y1})
--------------------
a = λx0λy0(c0 d0)
b = λx1λy1(c1 d1)
dup c0 c1 = {x0 x1};
dup d0 d1 = {y0 y1};
--------------------
a = λx0λy0(x0 y0)
b = λx1λy1(x1 y1)
This is the correct reduction, and it doesn't suffer from the double-binding issue we had before. I've used another reduction rule here: dup x0 x1 = {y0 y1} reduces to x0 = y0; x1 = y1.
There's something really strange happening here that I ignored. In this reduction step:
a = λx0(a0)
b = λx1(a1)
dup a0 a1 = λy({x0 x1} y)
x0 and x1 are variables that are being used outside of the lambdas that bind them! We will later see how this is not a weakness, but a strength. I'll write more about these "scopeless lambdas" later.
The Symmetric Interaction Calculus
The end result of this is the Symmetric Interaction Calculus, which has the following syntax and four reduction rules.
Syntax:
term
= λx. term -- function
| (term term) -- application
| x -- variable
| {term term} -- superposition
| dup x0 x1 = term; term -- duplication
Reduction rules:
Lambda application:
((λx. body) arg)
------------------------
x <~ arg; body
Superposition application:
({x0 x1} b)
------------------------
dup b0 b1 = b; {(x0 b0) (x1 b1)}
Lambda duplication:
dup a0 a1 = λx. b
-------------------------
x <~ {x0 x1};
a0 <~ λx0. b0;
a1 <~ λx1. b1;
dup b0 b1 = b;
Superposed duplication:
dup a0 a1 = {b0 b1}
-------------------------
a0 <~ b0;
a1 <~ b1;
a <~ b means a is replaced by b
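To see the four rules in action, here is a toy tree-walking evaluator, written as a sketch under a simplifying assumption: every bound variable name is globally unique, so the naive substitution below never captures. The Term, subst, and step names are mine, not an established API.

```rust
#[derive(Clone, Debug)]
enum Term {
    Var(String),
    Lam(String, Box<Term>),                    // λx. body
    App(Box<Term>, Box<Term>),                 // (f a)
    Sup(Box<Term>, Box<Term>),                 // {a b}
    Dup(String, String, Box<Term>, Box<Term>), // dup x0 x1 = val; body
}

// Generates a globally fresh variable name.
fn fresh(n: &mut u32) -> String {
    *n += 1;
    format!("v{}", *n)
}

// Naive substitution (a <~ b); capture-safe because all names are unique.
fn subst(t: Term, name: &str, val: &Term) -> Term {
    use Term::*;
    match t {
        Var(x) => if x == name { val.clone() } else { Var(x) },
        Lam(x, b) => Lam(x, Box::new(subst(*b, name, val))),
        App(f, a) => App(Box::new(subst(*f, name, val)), Box::new(subst(*a, name, val))),
        Sup(a, b) => Sup(Box::new(subst(*a, name, val)), Box::new(subst(*b, name, val))),
        Dup(x0, x1, v, k) => Dup(x0, x1, Box::new(subst(*v, name, val)), Box::new(subst(*k, name, val))),
    }
}

// Tries one reduction step anywhere in the term; returns (term, changed?).
fn step(t: Term, n: &mut u32) -> (Term, bool) {
    use Term::*;
    match t {
        App(f, a) => match *f {
            // Lambda application: ((λx. body) arg) ~> body[x := arg]
            Lam(x, body) => (subst(*body, &x, &a), true),
            // Superposition application: ({x0 x1} b) ~> dup b0 b1 = b; {(x0 b0) (x1 b1)}
            Sup(l, r) => {
                let (b0, b1) = (fresh(n), fresh(n));
                let sup = Sup(Box::new(App(l, Box::new(Var(b0.clone())))),
                              Box::new(App(r, Box::new(Var(b1.clone())))));
                (Dup(b0, b1, a, Box::new(sup)), true)
            }
            f => {
                let (f, ch) = step(f, n);
                if ch { return (App(Box::new(f), a), true); }
                let (a, ch) = step(*a, n);
                (App(Box::new(f), Box::new(a)), ch)
            }
        },
        Dup(x0, x1, v, k) => match *v {
            // Lambda duplication: the bound variable becomes a superposition,
            // the copies get fresh binders, and a dup of the body is deferred.
            Lam(x, b) => {
                let (y0, y1, b0, b1) = (fresh(n), fresh(n), fresh(n), fresh(n));
                let b = subst(*b, &x, &Sup(Box::new(Var(y0.clone())), Box::new(Var(y1.clone()))));
                let k = subst(*k, &x0, &Lam(y0, Box::new(Var(b0.clone()))));
                let k = subst(k, &x1, &Lam(y1, Box::new(Var(b1.clone()))));
                (Dup(b0, b1, Box::new(b), Box::new(k)), true)
            }
            // Superposed duplication: dup a0 a1 = {b0 b1} annihilates.
            Sup(l, r) => (subst(subst(*k, &x0, &l), &x1, &r), true),
            v => {
                let (v, ch) = step(v, n);
                if ch { return (Dup(x0, x1, Box::new(v), k), true); }
                let (k, ch) = step(*k, n);
                (Dup(x0, x1, Box::new(v), Box::new(k)), ch)
            }
        },
        Lam(x, b) => { let (b, ch) = step(*b, n); (Lam(x, Box::new(b)), ch) }
        Sup(l, r) => {
            let (l, ch) = step(*l, n);
            if ch { return (Sup(Box::new(l), r), true); }
            let (r, ch) = step(*r, n);
            (Sup(Box::new(l), Box::new(r)), ch)
        }
        Var(x) => (Var(x), false),
    }
}

fn main() {
    use Term::*;
    let v = |s: &str| Box::new(Var(s.to_string()));
    // dup a b = λx. λy. (x y); {a b}
    let mut t = Dup("a".into(), "b".into(),
        Box::new(Lam("x".into(), Box::new(Lam("y".into(), Box::new(App(v("x"), v("y"))))))),
        Box::new(Sup(v("a"), v("b"))));
    let mut n = 0;
    loop {
        let (t2, changed) = step(t, &mut n);
        t = t2;
        if !changed { break; }
    }
    // Prints a superposition of two independent copies of λx. λy. (x y).
    println!("{:?}", t);
}
```

Running it on the example from earlier reproduces the reduction traced above, step by step.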
The Symmetric Interaction Calculus is isomorphic to a subset of Symmetric Interaction Combinators called directed (or polarized) Interaction Combinators. This subset is enough to implement Turing machines, which means that this language is Turing-complete.
This calculus is a really good basis for parallel evaluation, because each of its reduction rules is an atomic, constant-time operation.