445A – Boredom – CodeForces Tutorial

Inasmuch as this is a CodeForces tutorial, I’d also want to use Boredom as an introduction to dynamic programming (dp). I will also compare the dp solution to its alternate recursive implementation. This tutorial also caters to people who can’t seem to wrap their head around the solution or tutorial from CodeForces.

I won’t be explaining the problem itself. Should be pretty straightforward.

Gotchas

The maximum points can exceed the limit for the 32-bit data type int. In some of the test cases that value can be as high as 9999899998 (the maximum capacity for 32-bit int is 2147483647). The maximum value for the n integers may be up to 10^5, but they can total up to a lot higher than that.

That is why you will see C/C++ solutions out there the data type long long int:

typedef long long int bigInt;

Solution Overview

You can try to understand the succinct tutorial from CodeForces, if you haven’t already looked into it.

The sequence of n numbers needed to be grouped together and kept count, because when selecting a single number, it makes sense to repeatedly select it (thereby adding it to your points) until it doesn’t exist in the sequence. We will keep track of this count in array c, where c[i] contains the number of occurrences of i.

Let f be a function that calculates the maximum number of points to a value up to i. Now there will be 2 base cases:

  1. If i = 0, then f(0) = 0
  2. If i = 1, then f(0) = c[1]

I hope the 2nd point is clear. Up to 1 the maximum number of points you can get by selecting 1 is the number of occurrences of 1 in the array c.

When i >= 2, you have 2 choices in deciding which will be f(i):

  1. Select f(– 2), in which you can also add with i * c[i].
  2. Select f(i – 1), but in doing so you can’t choose to add i * c[i] because of the condition that all elements equal to ak + 1 and ak - 1 also must be deleted from the sequence.

Choosing between 1 and 2 is as simple as choosing which is largest.

Now let m be the largest number in the sequence. The solution is therefore f(m). The CodeForces tutorial writes f(n), but I doubt n here implies the number of elements in the sequence. To avoid confusion, I used m instead.

Most implementations I see online simply put f(100000).

It helps that you trace the algorithm step by step on paper, should it not be apparent to you how it works. For example for input:

10
10 5 8 9 5 6 8 7 2 8

You can track the values of i, c[i], and f(i):

i    c[i]   i*c[i]   f(i)
0    0      0        0
1    0      0        0
2    1      2        2
3    0      0        2
4    0      0        2
5    2      10       12
6    1      6        12
7    1      7        19
8    3      24       36
9    1      9        36
10   1      10       46

Do not undermine this method; learning how to trace algorithms in a systematic way can be very helpful to help you understand it.

Recursive Implementation

Based on the solution as detailed above, here the recursive implementation.

#include <algorithm>
#include <iostream>
#include <vector>

typedef long long int bigInt;

using namespace std;

vector<bigInt>  c(100001, 0);

bigInt f(const int i)
{
	if (i == 0) return 0;
	if (i == 1) return c[1];

	return max(f(i - 1), f(i - 2) + i*c[i]);
}

int main() 
{
	int n, x, m = 0;
	cin >> n;

	while(n--) {
		cin >> x;
		c[x]++;
		m = max(m, x);
	}

	cout << f(m);
}

Don’t submit this solution; it will fail at test 9 (time limit exceeded).

That’s it! In 30 lines of code you can solve this problem! The issue? Well, try and increase the values in the sequence (not the number of elements in the sequence. This doesn’t affect the growth of the function). Even a value as small as 30 and I had to wait a few seconds for it to compute.

Analysis of Recursive Implementation

We can define the performance of this recursive implementation by the number of recursive calls.

Let us trace the recursive calls of the function f in a recursion tree. I used paler colors to signify the depth of the recursion tree; the deeper it gets, the paler it gets:

Recursion tree of function f

That is a total of 9 recursive calls. How about f(5)?

recursion tree- f(5)

5+9+1 (including itself) = 15 function calls.

If you observe for awhile, you will be able to deduce a pattern. Let q be a function that finds the number of function calls for f given i. Then we write q as: q(i)=q(i-1) + q(i-2) + 1, for all i >= 2; q(0)=1; q(1)=1. If you plot this function as a graph, you will get this:

exponenetial graph

So time complexity is O(2^n); Implementation scales exponentially.

By the time i reaches 18, q(18) = 8361. When i = 30, the number of function calls becomes 2692537! This would explain that bit of lag earlier. Just imagine i going as big as 10000. It is no surprise then, that when this happens, the program crashes due to stack overflow:

stack overflow!

Dynamic Programming Implementation

Dynamic programming, in essence, is an optimization technique that dramatically cuts down computational time but taking out a bit of space to store calculated answers to overlapping subproblems. The overlapping subproblems here are the calls to f, which for example in the recursion tree f(4) we can see that f(1) is called 3 times and f(2) and f(0) is called 2 times.

Furthermore, we can take the solutions the subproblems to compute the solution of the main problem. For example f(5) can be determined by f(4) and f(3), and f(4) can be determined by f(3) and f(2), and so on. This property is called Optimal Substructure.

To recap, the 2 properties that must exist for a problem to be made solvable by dp:

Let’s put theory to practice.

We can reduce the time complexity from O(2^n) to O(n) by keeping an array DP of the same size as cDP will store the calculated values of f for all i. The solution, therefore, is DP[m], since it contains f(m). Below is the dynamic programming solution:

#include <algorithm>
#include <iostream>
#include <vector>

typedef long long int bigInt;

using namespace std;

int main() 
{
	const int MAX_N = 100001;

	int n, x;
	vector<bigInt> dp(MAX_N, 0);
	vector<bigInt>  c(MAX_N, 0);

	cin >> n;

	while(n--) {
		cin >> x;
		c[x]++;
	}

	dp[0] = 0;
	dp[1] = c[1];

	for (int i = 2; i < MAX_N; i++) {
		dp[i] = max(dp[i - 1], dp[i - 2] + i*c[i]);
	}

	cout << dp[MAX_N - 1];
}

In the dp solution, f(i) is computed from 0 to 10000 (bottom to top), whereas in the recursive solution f(i) is computed from m to 0 (top to bottom).

Notice I didn’t bother keep track the largest value in the sequence (variable m). You can have that bit of optimization if you use it, but it isn’t necessary for this problem. In essence we have f (the block of code inside that for loop) called a constant 10000 times each time this program is called, and this works granted the largest value in the sequence doesn’t exceed 10000.

What is fascinating about this implementation is that it solves the problem with the same amount of code! Ladies and gentlemen, the power of algorithms!

References

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s