Given a list or a sequence of items, a permutation of the list is a certain re-ordering of the items within the list. Hence, if you have a list like [1,2,3,4,5], its permutations would be: [2,1,3,4,5], [1,2,3,5,4], [1,2,5,3,4] and so on. In fact, you can permute anything, as exemplified by the pictures below:
Now, given a list with N elements, how many possible permutations are there? Convince yourself that it is N! (that is, N factorial). For example, if N=3, then you have 3! = 3x2 = 6 possible permutations. If you have N=100, then the number of permutations is 9.332621544 E+157, which is of course an absurdly large number.
What I want to focus on in this post is the question of how do we generate all possible permutations of a list? Consider the problem in the context of my 3 pens. How do I permute my 3 pens? I can start with 1 pen; it can be any one of the 3 pens:
Say I start with the green pen at position 1.
The black pen can go to position 2.
Or it can be the white pen occupying position 2. In other words, in this position, I can consider any pen except the green pen itself.
Suppose I choose the black pen to be in position 2. Then in position 3, I can consider every pen, except for the first and second pen. And of course, since I have only 3 pens, that can only be the white pen.
So the steps are as follows, for a 3 pen permutation:
The logic above should work. Only problem is that it's hardcoded for N=3 list. In the event of different N, we will need to change the code, the instructions. In fact, anything hardcoded is undesirable in a code, especially in our context here. We need something better. How do we go about doing this, doing without hardcoding?
Permutation through Recursion
Let's consider what logically happen in the course of a permutation. What can be the
logical structure of the process? Suppose I label my pens as
a,
b and
c. I can visualize permutation to be a tree structure, as follows:
Think of the tree as comprising of nodes connected by edges. At the root in the picture above, we have "a b c", the list of pens that we are to permute. 3 possibilities for the starting position, as denoted in the second layer of the tree: "a b c", "b a c" and "c b a". "b a c" is derived from "a b c" by swapping a and b, and "c b a" by swapping a and c. I need to explicit about it all, be clear about all these details since everything later will need to be transferred to code.
Now for each node in the second layer, freeze the first position (it's underlined) and consider the sublist beyond the first position. For example, for node "a b c" in the second layer, we consider only the sublist "b c"; a is frozen in its place That node has 2 child in the 3rd layer - corresponding to the "b c" and "c b", where "c b" is derived from "b c" by swapping b and c. Similar processing applies for other nodes in the second layer: freeze the first position, recursively process the remaining sublist.
Hence,
permutation can be done recursively. How will the recursive tree look like when we have 4 pens (or items)? The tree will look bigger of course.
In fact, it will grow factorially with increasing list size.
In pseudocode, the permutation task can be laid out as follows:
permute(item_list, left, right):
if left == right:
display item_list
else
for i from left to right:
temp_list = item_list
swap items in temp_list at index i and index left
permute(temp_list, i, right)
Ok so far?
I have another example to illustrate the concept above. Consider a 4-a-side football team:
We have 4 positions - striker, midfielder, defender and goalkeeper - and we have 4 players - blue guy, black guy, red hunk and a guy with blue head and black body (mixed blood). We need to make sure that everyone is playing at the optimal positions. The only way (computationally) is to have everyone try every possible position - in other words, try out every possible permutation of the players.
So we do it in this way:
4 possibilities for the striker position: red guy be striker, red guy swaps position with black guy, red guy swaps with blue guy and red guy swaps with mixed-blood guy, as shown above.
Next, for each of these possibilities, we consider only the midfield, defender and goalie positions. Apply the same reasoning to the midfield position, as we did to the striker position. Consider the case below: there are 3 possibilities for the midfield positions - black guy at midfield, black guy swaps position with blue, black guy swap positions with mixed-blood. Then for each of those 3 possibilities, we continue recursively the same logic.
Heap Algorithm
A different permutation algorithm, one that is usually more efficient implementation-wise, is that by Heap (1963). I will illustrate the main idea with the footballing example:
Let's fix the goalie position. Form all permutation of the other 3 positions - striker, midfield and defender.
After generating all possible permutations of that first 3 positions, say we have as follows (actually we do have exactly as follows, as shall be seen later):
Now swap the goalie position with the striker - goalie (mixed-blood guy) becomes striker, and the red guy becomes goalie.
Now again, form all possible permutations of the first three positions - striker, midfield and defender.
Repeat the process above, a number of times, as codified in the following algorithm, the so-called Heap algorithm:
In the pseudocode above, permute(L,N) forms a permutation of a list L of size N.
Now, on to a more technical explanation. To permute a list L = <L_1, L_2, ..., L_N> of size N, we iteratively do the following: recursively permute a sublist <L_1, L_2, ... L_(N-1)>. After the permutation of the sublist, we swap L_N with one of the elements in the sublist.
The tricky part, somewhat non-intuitive part in the algorithm is that of choosing the element in the sublist to swap with. We need to make sure that someone who had already tried the goalie position before do not get selected to be the goalie again. Heap (1963) provided the following rules:
- If N is odd, replace L_N with L_1
- If N is even, replace L_N with L_i where i is the iteration number.
To see these rules in action, consider the following permutations, where the list,
L = <1,2,3>, is of size
N=3.
The numbers in blue on the right hand side are the iteration indices. At the first iteration, iteration 1, we permute the sublist <1, 2>. Since only 2 elements involved then, its just a matter of swapping the two. So, the list becomes <2,1,3>. We then swap the last element in L, 3, with the first, 2, and we get L=<3,1,2>. Next, in the second iteration, we permute the sublist <3,1> and L becomes <1,3,2>. Swap 2 with 1 and we get L=<2,3,1>. In the last iteration, do a permutation of <2,3> and we get L = <3,2,1>. You can see that the number of permutations is 6, which is the factorial of 3.
Now, consider the case where N=4.
One more iteration is missing in the picture above. But what is to b noted, in this case,
N an even number, the swap at each iteration must be between
L_N and
L_i, where
i is the current iteration number.
What makes the Heap's swapping rules tricky is that it is unclear why it work. The rules must be such that the swap will put an element which has never been last before in the last position. Or the question is:
How in the world do you make sure that an ex-goalie does not become a goalie again??? Let's visualize the movement of the elements for different values of
N.
Consider the case of N=3.
after iteration 1 : 2 1 3
after iteration 2 : 1 3 2
after iteration 3 : 3 2 1
Numbers in the last position are in blue, and numbers in red are those that have been in the last position before. Note that the first element, after each iteration except for the last one, is always 'free', that is, it has not been a goalie before.
In the case of N=4, we have the following:
after iteration 1 : 3 2 1 4
after iteration 2 : 1 2 4 3
after iteration 3 : 4 3 1 2
after iteration 4 : 2 3 4 1
Notice how at every i-th iteration, the i-th is free - occupied by a black digit, one that had never been a goalie before.
Now, lets go further, to N=5:
after iteration 1 : 2 3 4 1 5
after iteration 2 : 3 4 1 5 2
after iteration 3 : 4 1 5 2 3
after iteration 4 : 1 5 2 3 4
after iteration 5 : 5 2 3 4 1
Again, as we so earlier for the case of N=3, after every iteration, except for the last one, the 0-th position is free.
Similar observations (as per even case) follow for the case of N=6 and N=8, as shown below:
N = 6
after iteration 1 : 5 2 3 4 1 6
after iteration 2 : 1 2 3 4 6 5
after iteration 3 : 6 5 3 4 1 2
after iteration 4 : 1 5 2 4 6 3
after iteration 5 : 6 5 2 3 1 4
after iteration 6 : 4 5 2 3 6 1
N = 8
after iteration 1 : 7 2 3 4 5 6 1 8
after iteration 2 : 1 2 3 4 5 6 8 7
after iteration 3 : 8 7 3 4 5 6 1 2
after iteration 4 : 1 7 2 4 5 6 8 3
after iteration 5 : 8 7 2 3 5 6 1 4
after iteration 6 : 1 7 2 3 4 6 8 5
after iteration 7 : 8 7 2 3 4 5 1 6
after iteration 8 : 6 7 2 3 4 5 8 1
Further Reading on Heap Algorithm