1-0 Knapsack Problem – A Hands-on Guide (C++)

The 1-0 knapsack problem; an optimization puzzle famously solved with dynamic programming (dp). This post is merely my take on the problem, which I hope to provide a more hands-on approach.

To get started, try and attempt The Knapsack Problem (KNAPSACK) from SPOJ. Study the problem closely as I will referring to it throughout this guide. I will then explain how the general solution is derived and how dp is applied.

I assume you have known a bit of dp as a prerequisite, though if you haven’t you can check out my beginner friendly hands-on intro: 445A – Boredom – CodeForces Tutorial.

The General Solution

A dp solution is usually derived from a recursive solution. So let’s start with that.

We define 2 containers: v and c, that contains the all the values of each item and the capacity they consume respectively, starting from index 1. So for example, v[2] and c[2] returns the value and size of the 2nd item.

What are we trying to optimize here? We are trying to maximize the value; the combination of items that yields the highest value. So let us define a function B(i, w) that returns the maximum value given a scope of items (i) and the capacity of the knapsack (w). The solution to KNAPSACK will therefore be B(N, S).

In solving it recursively, imagine we have all N items with us and our knapsack completely empty (current capacity is S), and we consider the items one by one from the last one (the N-th item).

What are the base cases of this problem? When there are no items to put in (i = 0), or when the knapsack cannot contain any items (w = 0).

Before we consider some i-th item, we first need to make sure that it can fit into the knapsack given its capacity, in other words, an i-th item should not be considered if c[i] > w. If this is so, you will consider the maximum value of the the scope of items excluding the i-th item, or B(i-1, w).

So what happens when you can put the item in the knapsack? You have 2 choices: To put it in (take), or not put it in (keep).

  1. Keep: you exclude it from the scope of items in which you consider the maximum value, which is again B(i-1, w).
  2. Take: you will get the value of the i-th item you select (v[i]), BUT, we should also consider the remaining space after adding the i-th item inside (w-c[i]). When considering items to add here, we need to exclude the item we already added, so the scope of items we consider is i-1. With this remaining space and this scope of items, we also want to get the maximum value. We can do this recursively via B(i-1, w-c[i]).

Choosing between keep and take is as simple as taking the maximum of the 2.

If you piece all this together, you will get the general solution:

B(i,j) = \begin{cases} 0 \text{ if } i = 0\text{ or }w = 0\\ B(i-1, w) \text{ if } c[i] > w\\ max\Big( B(i-1, w), v[i] + B(i-1, w-c[i]) \Big) \end{cases}

Recursive Solution

Short and simple:

Plug it in the judge and you will get a TLE (Time Limit Exceeded).

Dynamic Programming Solution

We use a 2D array, DP, containing N+1 rows and S+1 columns. The rows map to the scope of i items we consider (the top considers 0 items and the bottom considers all N items). The columns map to capacity (an individual column is denoted as the w-th column) of knapsack left to right from 0 to S. A cell DP[i][w] means “this is the maximum value given i items and capacity of w“. The maximum value for N items and capacity of S is therefore DP[N][S].

We first fill DP with base cases: where i = 0 and where w = 0. This means that the first row and first column are all 0. The order in which we solve this is simple: starting where i = 1 and w = 1, for each row from top to bottom, we fill the cells from left to right.

I will continue on with an example using inputs from Tushar Roy’s YouTube video (you should check it out), but I jumbled the order to prove that ordering of items is not important. Here is the DP array:


Cells shaded in pale blue is when item does not fit in capacity w, or where c[i] > w. In pale green cells we have a choice: keep or take. Notice that with every row, we are adding one more item to consideration, and with every column we increase the capacity by 1.

Here is the code:

Getting the Selected Items

Some people use an auxiliary Boolean array (often called Keep) that keeps track of whether an item is selected or not. This seems to be an unnecessary occupation of space to me, since you can deduce the selected items from DP itself.

You start from DP[N][S], and from there:

  • An item is selected, if the value of the cell directly above it is not equal to the current cell. When this happens, the capacity of the knapsack reduces by the weight of the selected item. With that new capacity you select the next item.
  • An item is not selected, if the value of the cell directly above it is equal to the current cell. So we consider the next item by moving up one row; capacity remains unchanged.

This process continues until either there are no more items remaining, or the knapsack is full.

The table below shows the trail of the algorithm as it selects items (item 1 and 4) from the DP array we constructed before:


Below is the recursive function:

void pick(int i, int w)
    if (i <= 0 || w <= 0) return;

    int k = DP[i][w];
    if (k != DP[i - 1][w]) {
        cout << i << " "; // select!
        pick(i - 1, w - c[i]); // capacity decreases
    } else {
        // move on to next item; capacity no change
        pick(i - 1, w);

See the full implementation of this function in this gist.


Enough spoon feeding! It is time for you to try out some puzzles on your own. Conveniently UVa grouped a series of 3 questions that are slight variations of the 1-0 knapsack problem. I sort them here in order of difficulty:

  1. 10130 – SuperSale (my solution)
  2. 990 – Diving for Gold (my solution)
    You will need to list down the items you select in this one.
    – Beware of the tricky formatting! There shouldn’t be a blank line at the end of your output.
  3. 562 – Dividing coins (my solution)
    In my solution, there is a tip (short comment block before the main function) you can check out if you just want some pointers to get started.


Matrix Chain Multiplication with C++ code – PART 3: Extracting the Sequence

In my previous post, I wrote about finding the minimum cost of multiplying a matrix chain using dynamic programming. In this post I build on top of what we have covered to extract the sequence of multiplying the chain itself.

This is part of a 3 part series:

  1. Analysis and Design – analyze the problem and deduce a solution.
  2. Implementation – implement the algorithm to find the minimum number of operations.
  3. Extracting the Sequence – extract the sequence in which the chain is multiplied. 

Extract the Sequence

To do this we keep track of the point at which we split up the chain as prefix and suffix: the point (we define this from the previous post). We do this by storing it in another 2D array of the same size as DP, which we call splits:

int ops = DP[i][k] + DP[k + 1][j] + rc[i] * rc[k + 1] * rc[j + 1];
if (ops < DP[i][j]) {
	DP[i][j] = ops;
	splits[i][j] = k;

Now let us print out the elements of splits in dark blue. I will be reusing the example input from the previous post:


How do we interpret this? Let us start from where we got the main solution; where i = 1 and j = 6. This cell splits[1][6] signifies the last operation of multiplying the matrix chain – in the above table this maps to 3. This means in the last operation, a prefix from a chain A_1 \ldots A_3 multiplies with a suffix from a chain A_{4} \ldots A_6.

Let us now consider just the prefix. How then is prefix A_1 \ldots A_3 multiplied? We find that from splits[1][3], which in this case is 1. Therefore the it is formed by multiplying a prefix A_1 \ldots A_1 and a suffix A_2 \ldots A_3. In this prefix A_1 \ldots A_1 we hit a base case where i = j. Should this happen we return the matrix A_1. The suffix A_2 \ldots A_3 is split from from splits[2][3], which is 2. This in turns hits 2 base cases and returns the matrix A_2 and A_3.

If you repeat this recursive process with the top level suffix, you parenthesize the chain as such:


We can generalize a function mult that returns the resultant matrix that has been multiplied in the most optimal way from the matrix chain:

mult(i, j)=\begin{cases}A_i \text{ if } i = j\\mult(i, k) \cdot mult(k+1, j)\text{ where } k = splits[i][j]\end{cases}

The main solution is therefore mult(1, 6).


Below is an implementation of this:

Because we don’t have the contents of the matrices, I have the program output the parenthesis instead, along with the order in which it parenthesize them:

./mat-chain-mult-dp-trace < input.txt
multiply 2 and 3
multiply 1 and (2*3)
multiply 4 and 5
multiply (4*5) and 6
multiply (1*(2*3)) and ((4*5)*6)

Note that the numbers here are not integers but indices that map to its respective matrix in the matrix chain.

It is not that hard to convert the above code to multiply matrices. I will leave it as an exercise, should you be interested.


Why is this matrix chain multiplication problem so important that most computer science undergraduate programs must include it in their syllabus? The same reason applies for most algorithms you learn: some problems are just slight variations of another problem. Understanding one solution to a problem in the marrow of its bones may help you solve another similar problem.

Therefore, I encourage you to challenge yourself on these 2 problems to help solidify your understanding:

You can find my solutions to these problems in my github SPOJ repository.


Matrix Chain Multiplication with C++ code – PART 2: Implementation

In the last post I explained how we got the general solution:

B(i,j) = \begin{cases} 0 \text{ if } i = j \\ \displaystyle\min_{k=i}^{j-1} \begin{cases} B(i, k) + B(k+1, j) + r_i \cdot c_k \cdot c_j \end{cases} \end{cases}

In this post we plug the formula in and watch it run! I will go through the recursive solution (O(2^n)), and then the more optimal dynamic programming solution (O(n^3)).

This is part of a 3 part series:

  1. Analysis and Design – analyze the problem and deduce a solution.
  2. Implementation – implement the algorithm to find the minimum number of operations.
  3. Extracting the Sequence – extract the sequence in which the chain is multiplied.


Our input (input.txt), therefore, is the rows and columns of all n matrices:

30 35 15 5 10 20 25

The above input meant that there are 7 integers that correspond to the dimensions of 6 matrices:


The input only contains the rows from matrix A_1 until A_6, followed by the column of matrix A_6 . This is because a matrix A and only be multiplied with a matrix B if they are compatible (if r_A = c_B). So for brevity sake we do not need to put rows and columns of all matrices (otherwise the sequence will look like 30 35 35 15 15 5 5 10 10 20 20 25).

Notice, that if we place the sequence of integers in an array rcthe row of a matrix A_i is rc[i] and the column is rc[i+1].

Recursive Implementation

As surprising though it may be, the amount of code is surprisingly little:

I have written the code to follow the formula as close as possible, as such I start the index with 1 instead of 0. You can always change it to save that bit of space, but the code just serves to show that it works:

./mat-chain-mult-recursive < input.txt

Now, if you print the inputs of the function B each time it is being called, you will see a lot of the same inputs are being called again and again. In geeksforgeeks.org’s article on matrix chain multiplication, they took the time to draw out the recursion tree.

Dynamic Programming: Ordering

So now we have broken the main problem to small recurring subproblems (Overlapping Subproblems), which we can piece together to solve the main problem (Optimal Substructure). With this 2 properties, we can use dynamic programming to drastically reduce the running time of an exponential solution.

However, this problem is tricky to solve since it involves 2 variables i and j. We need a 2D array to keep track of this. In a single dimensional array ordering is trivial; we simply go from left to right, building the solution from the bottom up, but with 2 dimensions ordering becomes tricky. Which cell do we solve first?

Well, we know that:

  • the base case is when i = j, then B(i, j) = 0.
  • i cannot exceed j, so those areas will need to be grayed out.
  • B(1, 6) is the solution.

So with that, let us fill up an array DP based on the inputs we have above:


So we need to find B(1, 6) yes? Well, what does B(1, 6) need? Well, if we unfold the general solution plugging in i as 1 and j as 6 we have 5 selections in which we choose the smallest:

\begin{cases}k=1: B(1, 1) + B(2, 6) + r_1 \cdot c_1 \cdot c_6 \\k=2: B(1, 2) + B(3, 6) + r_1 \cdot c_2 \cdot c_6 \\k=3: B(1, 3) + B(4, 6) + r_1 \cdot c_3 \cdot c_6 \\k=4: B(1, 4) + B(5, 6) + r_1 \cdot c_4 \cdot c_6 \\k=5: B(1, 5) + B(6, 6) + r_1 \cdot c_5 \cdot c_6 \end{cases}

To make it easier to visualize what B(1, 6) requires, I distinguish the selection by colour:


What you can see, is that B(1, 6) requires cells to the left and to the bottom of it. Well, we have the cells that represent the base cases solved. Perhaps the next logical step is to solve the next diagonal. This is possible, because all the values of all cells where j i = 1 depends on all cells where ji = 0. Consequently, all values of all cells where ji = 2 depends on all cells where ji = 1 and ji = 0; you can validate this by plugging in the numbers into the formula. Because all cells aside the base case requires cells below and to its left up until it hits the base case, we can continue filling up the cells of the array DP diagonally until we hit the top right cell.

From another perspective, we are finding all 5 pairs of the matrices in the chain and find their minimum cost (choose from 1 each), then we find all 4 triplets of matrices in the chain and find their minimum cost (choose from 2), then we find all 3 quadruplets of matrices in the chain and find their minimum cost (choose from 3), and this repeats until we hit the top right cell; each subsequent pass depending on the pass before it.

So we solve in diagonals in order ji = 0, 1, 2 … n – 1. Once you can sequentially iterate through the array DP in a diagonal fashion, you have everything you need to piece together the solution:


It helps that you calculate the cells by hand to grasp the essence of this algorithm.

Dynamic Programming: Implementation

So what I did, was design a nested for loop to print the indices in the order I want. Once I do that I simply plug the formula inside, and I have my final solution:

Note that again, the index starts at 1.

Although code count is near indistinguishable, the performance is drastically improved at a negligible expense of space. We generalize that the improved implementation is O(n^3) by counting the number of nested for loops.


In the 3rd and final part, I will guide you through extracting the actual order in which the chain is multiplied. Check it out: Matrix Chain Multiplication with C++ code – PART 3: Extracting the Sequence.



Matrix Chain Multiplication with C++ Code – PART 1: Analysis and Design

Ah. The matrix chain multiplication problem. A classic dynamic programming example in the undergraduate computer science world. I will be doing a series consisting of 3 parts:

  1. Analysis and Design – analyze the problem and deduce a solution.
  2. Implementation – implement the algorithm to find the minimum number of operations.
  3. Extracting the Sequence – extract the sequence in which the chain is multiplied.

There’s practically a ton of articles out there that already writes about this problem. I intend to distinguish myself by expressing it in the simplest terms possible with examples, and season it with working C++11 source code. In other words, if you had problem understanding everywhere else, my hopes is that your search ends with this series.

In this first part, there will be no code. At all. Not even pseudocode. I promise some C++11 code in the coming 2 parts, but understanding this section will be important for implementation later.


I will assume you should have a bit of background in matrix math. It helps that you do a bit of refreshing if you have forgotten.

I will also assume you have known a bit of what dynamic programming is, as matrix chain multiplication is a slightly more complex example; you keep track of a 2D array instead of 1D array. I’ve written a beginner level hands on introduction to dynamic programming in a form of solving a problem which you can check out: 445A – Boredom – CodeForces Tutorial.

Understanding the Problem

Given a chain of matrices, we want to multiply them. We cannot change the order of the matrices as that would change the result or make it incompatible, therefore we say matrix multiplication is not commutative. In addition, regardless of how you parenthesize (e.g. whether the sequence as multiplied as A\cdot \left ( B\cdot C \right ) or \left ( A\cdot B \right )\cdot C ) the chain it will give you the same answer, thus we say it is associative.

How we parenthesize the matrices also determine how many operations will be performed. For a multiplication chain of integers, parenthesizing them does not affect the number of operations performed, but for matrices the change is significant. For example, given the matrices with the following dimensions:


If we define a row of a matrix as r_A and the column as c_A, then the total number of multiplications required to multiply A and B is r_A \cdot c_A \cdot c_{B}.

  • A\cdot \left ( B\cdot C \right ) -> 2*20*1 + 2*1*10 = 60 multiplications
  • \left ( A\cdot B \right )\cdot C -> 20*1*10 + 2*20*10 = 600 multiplications

The 1st case is 10 times faster!

NOTE: There is a neat coding exercise you can try out on this total multiplications thing on UVa.

Therefore, we want to figure out how to parenthesize the matrix chain such that we minimize the required number of operations.

However, to keep things simple, we will focus on the minimum number of operations. We will eventually find the sequence itself by building on top on this.


To put in formal terms, given a sequence of n matrices A_1, A_2, A_3 \ldots A_n we wish to parenthesize the product of this sequence so as to minimize the number of operations required.

Finding the Minimum Number of Operations

Suppose we have a function B(i, j) that computes the minimum number of required operations for multiplying a chain of matrices from matrix i to matrix j. The solution to the main problem would be B(1, n).

A standard technique to in solving a dynamic programming problem is usually to ask, “What is the last operation?”. Here, we ask what is the minimum total cost of the last operation? Regardless how the chain is parenthesized, there will always be 2 matrices remaining. The cost, is the minimum cost of some prefix + minimum cost of some suffix + the number of operations to multiply them. The cost of prefix and suffix in turn, is derived from the minimum number of operations that deduces them.

Because each matrix can only multiply with its adjacent matrix, a prefix can only start from A_1 to some matrix A_k, and a suffix can only start from A_{k+1} to A_n, split at some index k. The cost in the last operation for the prefix is then B(1, k) and the cost of the suffix is B(k+1, n)k therefore ranges from 1 until n-1 in the last operation, and within a range i to j, k ranges from i to j-1. We choose k by selecting the minimum cost of all the possible costs within a range i to j.

The resultant dimensions from multiplying 2 matrices are important to finding the cost. In a sequence of matrices A_i \ldots A_j If we split at a point k, the resultant dimensions of the prefix is r_i \times c_k and the suffix is r_{k+1} \times c_j. The cost of multiplying these 2 matrices are therefore r_i \cdot c_k \cdot c_j.


Running Through Possibilities

Let us consider the base case: When there is only 1 matrix. The prefix = suffix, and there are no operations performed, so the cost = 0.

Now what if there are only 2 matrices? There are no other possible combinations to compare with, so the minimum number of operations is the number of operations to multiply them. There is no cost to the prefix or suffix, because the cost for 1 matrix alone is 0.

3 matrices onwards is gets a little more challenging. When the chain has only 3 matrices, you have to choose between \left ( A_1\cdot A_2 \right )\cdot A_3 and A_1\cdot \left ( A_2\cdot A_3\right ). How we choose between the 2 is determined by cost of either multiplying A_1 and A_2 first, then take the resulting matrix (prefix) and multiply by A_3 (suffix), OR, multiplying A_2 and A_3 first, then multiplying that (suffix) by A_1 (prefix). Where we decide to make that split, either at A_1 or A_2, we determined by their cost. The matrix in which we choose we call it A_k, at a point k.

The minimum cost with 3 matrices, B(1, 3) is therefore B(1, k) + B(k+1, 3) + r_1 \cdot c_k \cdot c_3, and we choose between B(1, 1) + B(2, 3) + r_1 \cdot c_1 \cdot c_3 and B(1, 2) + B(3, 3) + r_1 \cdot c_2 \cdot c_3. You can continue doing this for 4 or 5 or 6 matrices until you start to see a pattern. Notice you can apply the same procedure to find B(1, 3) to find B(1, 2) – there is only 1 possible value for k for B(1, 2).

The General Solution

So in a range i to j, we select from j – i possibilities, from i until j – 1. Those possibilities in turn call B recursively until it hits the base case where i = j. The general solution is therefore:

B(i,j) = \begin{cases} 0 \text{ if } i = j \\ \displaystyle\min_{k=i}^{j-1} \begin{cases} B(i, k) + B(k+1, j) + r_i \cdot c_k \cdot c_j \end{cases} \end{cases}

And the solution to the main problem would be B(1, n).


This is all for the first part. Next post I will cover implementation: Matrix Chain Multiplication with C++ code – PART 2: Implementation.


445A – Boredom – CodeForces Tutorial

Inasmuch as this is a CodeForces tutorial, I’d also want to use Boredom as an introduction to dynamic programming (dp). I will also compare the dp solution to its alternate recursive implementation. This tutorial also caters to people who can’t seem to wrap their head around the solution or tutorial from CodeForces.

I won’t be explaining the problem itself. Should be pretty straightforward.


The maximum points can exceed the limit for the 32-bit data type int. In some of the test cases that value can be as high as 9999899998 (the maximum capacity for 32-bit int is 2147483647). The maximum value for the n integers may be up to 10^5, but they can total up to a lot higher than that.

That is why you will see C/C++ solutions out there the data type long long int:

typedef long long int bigInt;

Solution Overview

You can try to understand the succinct tutorial from CodeForces, if you haven’t already looked into it.

The sequence of n numbers needed to be grouped together and kept count, because when selecting a single number, it makes sense to repeatedly select it (thereby adding it to your points) until it doesn’t exist in the sequence. We will keep track of this count in array c, where c[i] contains the number of occurrences of i.

Let f be a function that calculates the maximum number of points to a value up to i. Now there will be 2 base cases:

  1. If i = 0, then f(0) = 0
  2. If i = 1, then f(0) = c[1]

I hope the 2nd point is clear. Up to 1 the maximum number of points you can get by selecting 1 is the number of occurrences of 1 in the array c.

When i >= 2, you have 2 choices in deciding which will be f(i):

  1. Select f(– 2), in which you can also add with i * c[i].
  2. Select f(i – 1), but in doing so you can’t choose to add i * c[i] because of the condition that all elements equal to ak + 1 and ak - 1 also must be deleted from the sequence.

Choosing between 1 and 2 is as simple as choosing which is largest.

Now let m be the largest number in the sequence. The solution is therefore f(m). The CodeForces tutorial writes f(n), but I doubt n here implies the number of elements in the sequence. To avoid confusion, I used m instead.

Most implementations I see online simply put f(100000).

It helps that you trace the algorithm step by step on paper, should it not be apparent to you how it works. For example for input:

10 5 8 9 5 6 8 7 2 8

You can track the values of i, c[i], and f(i):

i    c[i]   i*c[i]   f(i)
0    0      0        0
1    0      0        0
2    1      2        2
3    0      0        2
4    0      0        2
5    2      10       12
6    1      6        12
7    1      7        19
8    3      24       36
9    1      9        36
10   1      10       46

Do not undermine this method; learning how to trace algorithms in a systematic way can be very helpful to help you understand it.

Recursive Implementation

Based on the solution as detailed above, here the recursive implementation.

#include <algorithm>
#include <iostream>
#include <vector>

typedef long long int bigInt;

using namespace std;

vector<bigInt>  c(100001, 0);

bigInt f(const int i)
	if (i == 0) return 0;
	if (i == 1) return c[1];

	return max(f(i - 1), f(i - 2) + i*c[i]);

int main() 
	int n, x, m = 0;
	cin >> n;

	while(n--) {
		cin >> x;
		m = max(m, x);

	cout << f(m);

Don’t submit this solution; it will fail at test 9 (time limit exceeded).

That’s it! In 30 lines of code you can solve this problem! The issue? Well, try and increase the values in the sequence (not the number of elements in the sequence. This doesn’t affect the growth of the function). Even a value as small as 30 and I had to wait a few seconds for it to compute.

Analysis of Recursive Implementation

We can define the performance of this recursive implementation by the number of recursive calls.

Let us trace the recursive calls of the function f in a recursion tree. I used paler colors to signify the depth of the recursion tree; the deeper it gets, the paler it gets:

Recursion tree of function f

That is a total of 9 recursive calls. How about f(5)?

recursion tree- f(5)

5+9+1 (including itself) = 15 function calls.

If you observe for awhile, you will be able to deduce a pattern. Let q be a function that finds the number of function calls for f given i. Then we write q as: q(i)=q(i-1) + q(i-2) + 1, for all i >= 2; q(0)=1; q(1)=1. If you plot this function as a graph, you will get this:

exponenetial graph

So time complexity is O(2^n); Implementation scales exponentially.

By the time i reaches 18, q(18) = 8361. When i = 30, the number of function calls becomes 2692537! This would explain that bit of lag earlier. Just imagine i going as big as 10000. It is no surprise then, that when this happens, the program crashes due to stack overflow:

stack overflow!

Dynamic Programming Implementation

Dynamic programming, in essence, is an optimization technique that dramatically cuts down computational time but taking out a bit of space to store calculated answers to overlapping subproblems. The overlapping subproblems here are the calls to f, which for example in the recursion tree f(4) we can see that f(1) is called 3 times and f(2) and f(0) is called 2 times.

Furthermore, we can take the solutions the subproblems to compute the solution of the main problem. For example f(5) can be determined by f(4) and f(3), and f(4) can be determined by f(3) and f(2), and so on. This property is called Optimal Substructure.

To recap, the 2 properties that must exist for a problem to be made solvable by dp:

Let’s put theory to practice.

We can reduce the time complexity from O(2^n) to O(n) by keeping an array DP of the same size as cDP will store the calculated values of f for all i. The solution, therefore, is DP[m], since it contains f(m). Below is the dynamic programming solution:

#include <algorithm>
#include <iostream>
#include <vector>

typedef long long int bigInt;

using namespace std;

int main() 
	const int MAX_N = 100001;

	int n, x;
	vector<bigInt> dp(MAX_N, 0);
	vector<bigInt>  c(MAX_N, 0);

	cin >> n;

	while(n--) {
		cin >> x;

	dp[0] = 0;
	dp[1] = c[1];

	for (int i = 2; i < MAX_N; i++) {
		dp[i] = max(dp[i - 1], dp[i - 2] + i*c[i]);

	cout << dp[MAX_N - 1];

In the dp solution, f(i) is computed from 0 to 10000 (bottom to top), whereas in the recursive solution f(i) is computed from m to 0 (top to bottom).

Notice I didn’t bother keep track the largest value in the sequence (variable m). You can have that bit of optimization if you use it, but it isn’t necessary for this problem. In essence we have f (the block of code inside that for loop) called a constant 10000 times each time this program is called, and this works granted the largest value in the sequence doesn’t exceed 10000.

What is fascinating about this implementation is that it solves the problem with the same amount of code! Ladies and gentlemen, the power of algorithms!


103 – Stacking Boxes (Dynamic Programming Method) – UVa Tutorial

In my previous post I wrote about solving the Stacking Boxes problem using graphs. In this post I will write about a simpler method that utilizes dynamic programming that solves the same problem with half the amount of code.

I will assume you have read my previous post on using graphs, though you haven’t you can check it out here: 103 – Stacking Boxes (Graph Method) – UVa Tutorial. I will refer longest increasing subsequence as LIS and dynamic programming as dp.

On the O(n^2) LIS Dynamic Programming Solution

In using dynamic programming there are 2 variants of the algorithm: one solves in O(n^2) and another in O(n*log n), the later of which replaces the inner loop in the former algorithm with a binary search using an algorithm called Patience Sorting. I will be using the simpler, quadratic time variant. I won’t attempt to explain the algorithm in detail, but there are plenty of resources out there to help you wrap your head round it:

  1. Tushar Roy’s step by step of the algorithm (youtube video)
  2. Geeksforgeeks article
  3. Bo Qian’s youtube video (with C++ code)
  4. Article on Algorithmist (succinct, though can be tough to understand).

The first 2 resource will only find the length of the LIS, though that process is the heart of the algorithm. Once you can find that, getting the sequence itself is not that hard.

A Generic LIS Implementation

The code here is based off a stackoverflow answer.

template<typename T>
deque<T> find_lis(const vector<T> &a,
	function<bool(const T &, const T &)> comp = [&](const T &a, const T &b)->bool { return a < b; })
	int maxLength = 1, bestEnd = 0;

	vector<int> DP(a.size(), 1);
	vector<int> prev(a.size(), -1);

	for (int i = 1; i < a.size(); i++) { for (int j = i - 1; j >= 0; j--) {
			if (DP[j] + 1 > DP[i] && comp(a[j], a[i])) {
				DP[i] = DP[j] + 1;
				prev[i] = j;

		if (DP[i] > maxLength) {
			bestEnd = i;
			maxLength = DP[i];

	deque<T> lis;
	for (int i = bestEnd; i != -1; i = prev[i]) {

	return lis;

Using the function itself is pretty straightforward. You pass a vector that contains your sequence and it spits out the LIS as a deque. Here’s a github gist that test drives this function. You also have an option to modify the comparator, which in the gist is used to change the LIS algorithm to find the longest decreasing subsequence.

Solution Explanation

Once you understood what the LIS algorithm is doing, the rest is pretty much a no brainer. Once you sort the boxes and its dimensions, you can find the LIS is about 5 lines of code:

auto comp = [&](const Box *a, const Box *b)->bool { return canFit(a, b); };
auto lis = find_lis<Box*>(boxes, comp);

cout << lis.size() << endl;
for (auto b : lis) cout << b->id << " ";
cout << endl;

The only thing you need to customize here is the comparator; you will get the LIS based on whether the boxes will fit into each other. I handle this will a simple lambda function comp which I input inside find_lis. The canFit function is the exact same function in the graph method solution.

103 – Stacking Boxes (Graph Method) – UVa Tutorial

I am excited. Stacking Boxes has had me mystified over almost a year (my last failed submission was 5 Dec 2014). I have Googled and Googled but could not solve it… until now.

Perhaps you noticed that I placed in the title “Graph Method”. That is because there are 2 ways to solve this problem: either by using dynamic programming (or dp for short) or graphs. I will be writing about the graph method, which is, according to algorithmist, the more complex solution. I have not studied dp in depth, so I decided I will try graphs first. Perhaps in a future post I will look into dp and decide if it really is the simpler solution.

UPDATE: The dynamic programming variant is the solution is up, and it solves in half the amount of code!

NOTE: If you have no knowledge about graph theory, check out Algorithms by Robert Sedgewick and Kevin Wayne. In addition to the book, they have a website and a Coursera online course (Graphs are in part II). The algorithms I will talk about references from this book. I refer it here as the Princeton Algorithms book.

Let’s begin.

Explaining the Problem

Given a bunch of boxes of n dimensions, find a longest sequence such that you are able to fit a box inside a box for all boxes in that sequence. The boxes can be flipped, turned, contorted in order to fit inside another box, so you don’t have to follow the order of the dimensions they provide.

Gotcha’s and Observations

Perhaps a possible step you have considered is sorting the boxes and find the longest increasing subsequence (or LIS for short) from there by comparing from left to right; the most left being the smallest and the most right being the largest. Indeed sorting the boxes and its corresponding dimensions is the first step, but the start of the sequence is not necessarily the first box sorted in ascending order. Here is an example input:

5 2
3 70
8 10
4 9
9 31
10 10

5 boxes of 2 dimensions. And now the sorted order (the first number that is followed by a colon is the original ordering):

1: 3 70
3: 4 9
2: 8 10
4: 9 31
5: 10 10

In such a case the sequence cannot start from 1. The same applies that the last element of the sorted boxes isn’t necessarily the last element of the LIS (refer to the 5th box in the input above).

Notice that 5 can never preceded 4, and 4 can never precede 2 and so on. This is true for all sequences of sorted boxes: no box can precede a box before it. This way we only add the edges of the graph from top to bottom.

So imagine trying to solve this problem without graphs or dp. I actually tried listing every possible permutation and selecting the longest sequence a year ago (It was pretty bad).

Explanation of Solution

So to put it succinctly, if you make connections in a digraph based on whether a box fits in another box, you will get yourself a directed acyclic graph (DAG for short). The longest path of this graph is the LIS of boxes that can be stacked. You find the longest path by finding the shortest path of a graph with negative edge weights.

There are no variable weights to any of the edges here, so we just set all edges connecting from one box to another as -1.

To further explain the solution I will use an example input:

7 2
3 70
8 10
5 9
5 7
4 5
9 31
10 10

If you sort the boxes the input would like this:

1: 3 70
5: 4 5
4: 5 7
3: 5 9
2: 8 10
6: 9 31
7: 10 10

Now when building the digraph, note of the property that the start and end of the sequence is not necessarily the start and end of the sorted order of boxes. Therefore in addition to the vertices representing the boxes we have 2 more vertices, source and end (this is why the graph is sized k+2 in the solution). Source connects to all boxes, and all boxes connect to end. All connections to and from these 2 vertices are of 0 weight:

// add edges to source (start) and sink (end) vertex
for (int i = 1; i < k+1; i++) {
    g.addEdge(new Edge(0, i, 0));
    g.addEdge(new Edge(i, k+1, 0));

The line:

g.print(); cout << endl;

Prints the digraph by listing all the vertices and its respective edges and edge weights as such:

0 : 1(0) 2(0) 3(0) 4(0) 5(0) 6(0) 7(0)
1 : 8(0)
2 : 6(-1) 8(0)
3 : 2(-1) 6(-1) 7(-1) 8(0)
4 : 2(-1) 6(-1) 7(-1) 8(0)
5 : 4(-1) 3(-1) 2(-1) 6(-1) 7(-1) 8(0)
6 : 8(0)
7 : 8(0)
8 :

I’ve went ahead to illustrate the digraph for the above input. Note that I didn’t illustrate connections from and to source and end vertices, because it would look really messy.


I then find the longest path by using the topological sort algorithm, which relaxes (you can read about edge relaxation here) all edges in topological order (reverse post order of dfs) to find the shortest path tree.

Here is the gist of the algorithm in code:

distTo [ source ] = 0;

Topological topo(g);

for (const auto &v : topo.order) {
    for (const auto ed : g.adj(v)) relax(ed);

Here is a possible tree:


Note that there are 2 possible sequences for this example (5 4 2 6 and 5 3 2 6); the algorithm will only find one, but either one is a valid answer.

Topological Sort vs Bellman-Ford

I have looked up a solution by saicheems that solves Stacking Boxes by also using graphs, and he uses the Bellman-Ford algorithm. This is a sensible choice granted that there are negative edges, but this is a DAG; you can get away with the topological sort algorithm, which works also for digraphs with negative edges, as long as it is acyclic (Bellman-Ford takes care of cycles).

Why topological sort algorithm? Well, it is faster (and IMHO is also simpler than Bellman-Ford). Bellman-Ford solves the problem in O(E*V) whereas topological sort uses O(E+V).

Well, at least that is what is said on paper. Saicheems solution also ran in time 0.000 (I suspected something is wrong with the judge), which is no different my solution.