Matrix Chain Multiplication with C++ code – PART 2: Implementation

In the last post I explained how we got the general solution:

B(i,j) = \begin{cases} 0 \text{ if } i = j \\ \displaystyle\min_{k=i}^{j-1} \begin{cases} B(i, k) + B(k+1, j) + r_i \cdot c_k \cdot c_j \end{cases} \end{cases}

In this post we plug the formula in and watch it run! I will go through the recursive solution (O(2^n)), and then the more optimal dynamic programming solution (O(n^3)).

This is part of a 3 part series:

  1. Analysis and Design – analyze the problem and deduce a solution.
  2. Implementation – implement the algorithm to find the minimum number of operations.
  3. Extracting the Sequence – extract the sequence in which the chain is multiplied.

Input

Our input (input.txt), therefore, is the rows and columns of all n matrices:

7
30 35 15 5 10 20 25

The above input meant that there are 7 integers that correspond to the dimensions of 6 matrices:

matrices-inputs

The input only contains the rows from matrix A_1 until A_6, followed by the column of matrix A_6 . This is because a matrix A and only be multiplied with a matrix B if they are compatible (if r_A = c_B). So for brevity sake we do not need to put rows and columns of all matrices (otherwise the sequence will look like 30 35 35 15 15 5 5 10 10 20 20 25).

Notice, that if we place the sequence of integers in an array rcthe row of a matrix A_i is rc[i] and the column is rc[i+1].

Recursive Implementation

As surprising though it may be, the amount of code is surprisingly little:

I have written the code to follow the formula as close as possible, as such I start the index with 1 instead of 0. You can always change it to save that bit of space, but the code just serves to show that it works:

./mat-chain-mult-recursive < input.txt
15125

Now, if you print the inputs of the function B each time it is being called, you will see a lot of the same inputs are being called again and again. In geeksforgeeks.org’s article on matrix chain multiplication, they took the time to draw out the recursion tree.

Dynamic Programming: Ordering

So now we have broken the main problem to small recurring subproblems (Overlapping Subproblems), which we can piece together to solve the main problem (Optimal Substructure). With this 2 properties, we can use dynamic programming to drastically reduce the running time of an exponential solution.

However, this problem is tricky to solve since it involves 2 variables i and j. We need a 2D array to keep track of this. In a single dimensional array ordering is trivial; we simply go from left to right, building the solution from the bottom up, but with 2 dimensions ordering becomes tricky. Which cell do we solve first?

Well, we know that:

  • the base case is when i = j, then B(i, j) = 0.
  • i cannot exceed j, so those areas will need to be grayed out.
  • B(1, 6) is the solution.

So with that, let us fill up an array DP based on the inputs we have above:

DP-array-1

So we need to find B(1, 6) yes? Well, what does B(1, 6) need? Well, if we unfold the general solution plugging in i as 1 and j as 6 we have 5 selections in which we choose the smallest:

\begin{cases}k=1: B(1, 1) + B(2, 6) + r_1 \cdot c_1 \cdot c_6 \\k=2: B(1, 2) + B(3, 6) + r_1 \cdot c_2 \cdot c_6 \\k=3: B(1, 3) + B(4, 6) + r_1 \cdot c_3 \cdot c_6 \\k=4: B(1, 4) + B(5, 6) + r_1 \cdot c_4 \cdot c_6 \\k=5: B(1, 5) + B(6, 6) + r_1 \cdot c_5 \cdot c_6 \end{cases}

To make it easier to visualize what B(1, 6) requires, I distinguish the selection by colour:

DP-array-2

What you can see, is that B(1, 6) requires cells to the left and to the bottom of it. Well, we have the cells that represent the base cases solved. Perhaps the next logical step is to solve the next diagonal. This is possible, because all the values of all cells where j i = 1 depends on all cells where ji = 0. Consequently, all values of all cells where ji = 2 depends on all cells where ji = 1 and ji = 0; you can validate this by plugging in the numbers into the formula. Because all cells aside the base case requires cells below and to its left up until it hits the base case, we can continue filling up the cells of the array DP diagonally until we hit the top right cell.

From another perspective, we are finding all 5 pairs of the matrices in the chain and find their minimum cost (choose from 1 each), then we find all 4 triplets of matrices in the chain and find their minimum cost (choose from 2), then we find all 3 quadruplets of matrices in the chain and find their minimum cost (choose from 3), and this repeats until we hit the top right cell; each subsequent pass depending on the pass before it.

So we solve in diagonals in order ji = 0, 1, 2 … n – 1. Once you can sequentially iterate through the array DP in a diagonal fashion, you have everything you need to piece together the solution:

dp-array-3

It helps that you calculate the cells by hand to grasp the essence of this algorithm.

Dynamic Programming: Implementation

So what I did, was design a nested for loop to print the indices in the order I want. Once I do that I simply plug the formula inside, and I have my final solution:

Note that again, the index starts at 1.

Although code count is near indistinguishable, the performance is drastically improved at a negligible expense of space. We generalize that the improved implementation is O(n^3) by counting the number of nested for loops.

Conclusion

In the 3rd and final part, I will guide you through extracting the actual order in which the chain is multiplied. Check it out: Matrix Chain Multiplication with C++ code – PART 3: Extracting the Sequence.

References

 

Advertisements

One thought on “Matrix Chain Multiplication with C++ code – PART 2: Implementation

  1. Hi , I am not able to figure out what is wrong with my recursive implementation in c++ , I think the logic is the same

    using namespace std;

    #include
    #include
    #include

    // this is the most basic recursive version implementation

    int minimum =999999999 ;

    int param1 ;
    int param2 ;
    int param3;

    int matrix_chain(int p[] , int i , int j) {

    // minimum =p[0]*p[1];

    if (i == j) {

    cout << "m[" << i << "," << j << "] = " << 0 << "\n";
    return 0;

    }

    for (int k=i; k<j; k++) {

    param1= matrix_chain(p,i,k);
    param2= matrix_chain(p,k+1,j);
    param3 = p[i-1] * p[k] * p[j];

    cout << (param1+param2+param3) << "\n";

    //cout << "param3 : " << p[i-1] * p[j-1] << "\n";
    //cout << "p[k] : " << p[k] << "\n";

    if ((param1+param2+param3) < minimum) {

    minimum = (param1+param2+param3);

    cout << "minimum : " << minimum;

    }

    cout << "m[" << i << "," << j << "] = " << minimum << "\n";

    return minimum;
    }

    }

    int main() {

    int p[]={30,35,15,5,10,20,25};

    cout << "answer : " << matrix_chain(p,1,4);

    // cout << cut_rod(arr,18) ;

    // cout << memoized_cut_rod(arr,10);

    // cout << bottom_cut_rod(arr,9);

    // print_cut_rod_solution(arr,7);

    return 0;

    }

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s