Synchronizing BibTeX in Overleaf (BibLaTeX) and Texmaker (MiKTeX, Apacite)

In this post I detail how to get BibTeX working on both Overleaf (previously known as WriteLaTeX) and Texmaker (Windows 10 64-bit, MiKTeX). Note that the citation format I’m using is APA, as specified by my university.

Overleaf has the advantage of collaborative editing with (almost) live previewing, but I hit a lot of problems getting the documents with BibTeX I wrote there to compile in Texmaker. They simply don’t compile. Conversely, copy-pasting working BibTeX code from Texmaker into Overleaf throws compile errors.

So the best workflow I can come up with at the moment is this: create a LaTeX document from my template (get it here: Overleaf to Texmaker Bibtex Template. There should be 2 files: main.tex and ref.bib. If the template link is not working, you can get it from this GitHub gist instead), edit the document collaboratively in Overleaf, and then when you need it to compile in Texmaker, download the project as a zip and change some code.

Fortunately, it’s only 2 blocks of code, annotated as “SETUP DOCUMENT” and “END DOCUMENT”. You’ll find them at the start and end of the LaTeX document respectively. These code blocks (provided in the template; the Texmaker version is commented out) need to be changed when moving your code back to Texmaker.

Overleaf

SETUP DOCUMENT:

% BEGIN -- SETUP DOCUMENT --
\documentclass[a4paper,12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[british]{babel}
\usepackage{csquotes}
\usepackage[backend=biber,style=apa]{biblatex}
\DeclareLanguageMapping{british}{british-apa}
\usepackage[pdftex]{graphicx}
\addbibresource{ref.bib}
\let\cite\parencite
\begin{document}
% END -- SETUP DOCUMENT --

END DOCUMENT:

% BEGIN -- END DOCUMENT --
\printbibliography
\end{document}
% END -- END DOCUMENT --

TexMaker

SETUP DOCUMENT:

% BEGIN -- SETUP DOCUMENT --
\documentclass[a4paper,12pt]{article}
\usepackage{hyperref}
\usepackage{apacite}
\begin{document}
\bibliographystyle{apacite}
% END -- SETUP DOCUMENT --

END DOCUMENT:

% BEGIN -- END DOCUMENT --
\bibliography{ref}{}
\end{document}
% END -- END DOCUMENT --

If you receive warning messages in Texmaker that go something like

Citation ‘blablabla2007’ undefined

when you press F1 (quick build), you will need to enable BibTeX in your build. To do this, go to “Options > Configure Texmaker” and, under the “Quick Build” tab, select the quick-build command “PdfLaTeX + Bib(la)tex + PdfLaTeX (x2) + View Pdf”.


Tip: You can generate BibTeX entries easily with bibme.org.

There is another variation of this: separate your content into another *.tex file, so that you have 2 master documents – one for Overleaf (main.tex) and another for Texmaker (main-texmaker.tex or whatever name you want) – which both include the same content file. In Texmaker, you’ll need to set one document as the master document to work with multiple files.
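
Here is a minimal sketch of that layout; the shared file name content.tex is just an example, and the preambles are the same SETUP/END blocks from the template above:

% content.tex: the shared body goes here (no \documentclass or \begin{document})

% main.tex (Overleaf master)
\documentclass[a4paper,12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[british]{babel}
\usepackage{csquotes}
\usepackage[backend=biber,style=apa]{biblatex}
\DeclareLanguageMapping{british}{british-apa}
\usepackage[pdftex]{graphicx}
\addbibresource{ref.bib}
\let\cite\parencite
\begin{document}
\input{content}
\printbibliography
\end{document}

% main-texmaker.tex (Texmaker master)
\documentclass[a4paper,12pt]{article}
\usepackage{hyperref}
\usepackage{apacite}
\begin{document}
\bibliographystyle{apacite}
\input{content}
\bibliography{ref}{}
\end{document}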

Have fun with Overleaf!


CS231n – Assignment 1 Tutorial – Q2: Training a Support Vector Machine

This is part of a series of tutorials I’m writing for CS231n: Convolutional Neural Networks for Visual Recognition. Go to this page to see the full listing.

To conserve space, I won’t be placing my full solutions in this post. Refer to my github repository for full source code.

The loss function is fairly straightforward, so I will be skipping that. After you have implemented it successfully, it should give a value between 8 and 10. If you replace W with a zero matrix instead, the loss will be exactly 9.

Though not in the lecture video, you can find the formula for the gradient in the lecture notes. We define margin as a matrix of margins where:

\text{margin}_{i, j} = w_j \cdot x_i - w_{y_i}\cdot x_i + \Delta

For each sample i, we look through its n margins (n being the number of classes we want to classify).

If that class is the correct class (denoted y_i), it contributes the sample row i itself, weighted by the negation of the number of incorrect classes whose margin is > 0 (here 1 is the indicator function, which is one if the condition inside is true and zero otherwise), given n classes:

\nabla_{w_{y_i}} L_i = - \left( \sum\limits_{j=1, j\neq y_i}^n 1(\text{margin}_{i,j} > 0) \right) x_i

If that class is incorrect, it contributes the sample row i depending on whether the margin is >0:

\nabla_{w_j} L_i = 1 \left( \text{margin}_{i,j} > 0 \right) x_i
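
For concreteness, here is a rough sketch of how these two rules might be accumulated inside the naive double loop; the variable names (num_train, num_classes, dW of the same shape as W) are illustrative and delta is taken as 1, so this is not the exact assignment skeleton:

# sketch of the naive gradient accumulation
for i in xrange(num_train):
  scores = X[i].dot(W)
  correct_class_score = scores[y[i]]
  for j in xrange(num_classes):
    if j == y[i]:
      continue
    margin = scores[j] - correct_class_score + 1  # delta = 1
    if margin > 0:
      dW[:, j] += X[i]       # incorrect class: add the sample row
      dW[:, y[i]] -= X[i]    # correct class: subtract it once per violated margin
dW /= num_train  # average over the m samples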

After you implement the naive version of SVM gradient, it should match up fairly closely with the numerically computed gradient:

numerical: 0.449971 analytic: 0.449971, relative error: 5.636661e-10
numerical: -16.007355 analytic: -16.007355, relative error: 6.481167e-12
numerical: 9.938192 analytic: 9.938192, relative error: 8.279972e-12
numerical: 6.561015 analytic: 6.540654, relative error: 1.554020e-03
numerical: 0.151951 analytic: 0.151951, relative error: 1.675199e-09
numerical: 3.936123 analytic: 3.936123, relative error: 7.290840e-11
numerical: -28.234382 analytic: -28.234382, relative error: 4.143686e-12
numerical: -0.452718 analytic: -0.452718, relative error: 4.690168e-10
numerical: -22.248917 analytic: -22.248917, relative error: 1.991237e-11
numerical: 20.414164 analytic: 20.381418, relative error: 8.027022e-04
numerical: 10.663350 analytic: 10.663350, relative error: 8.140355e-12
numerical: 2.612748 analytic: 2.612748, relative error: 6.536924e-11
numerical: -4.644796 analytic: -4.644796, relative error: 5.570529e-12
numerical: 15.987607 analytic: 15.987607, relative error: 2.419968e-11
numerical: -35.476517 analytic: -35.476517, relative error: 9.777531e-12
numerical: -25.275914 analytic: -25.275914, relative error: 5.744716e-12
numerical: -3.928484 analytic: -3.928484, relative error: 1.384318e-11
numerical: -16.666177 analytic: -16.666177, relative error: 1.360765e-12
numerical: 17.289968 analytic: 17.289968, relative error: 4.612259e-11
numerical: 4.519505 analytic: 4.519505, relative error: 1.483479e-11

If the relative error is on the order of e-8 instead, check that you have implemented the regularization gradient. This is not explicitly given in the notes, but it is hinted at in the assignment:

you didn’t forget the regularization gradient did you?

You can find the regularization gradient by taking the derivative of the regularization term of the loss function. Given that this term is \frac{1}{2}\lambda \sum W^2, where \lambda is the regularization parameter, its derivative with respect to W is \lambda W.

To vectorize the implementation we need to fuse operations together. So now we try to find the gradient of a given class k (out of all n classes). The result is a single row whose number of columns is the number of input features (1 x 3073). We define the total number of samples as m. The gradient is then the mean of the contributions of the correct (\nabla_{w_{y}} L) and incorrect (\nabla_{w_j} L) cases for class k:

\nabla_{w_k} L = \frac{1}{m}\left[ \nabla_{w_j} L + \nabla_{w_{y}} L \right]

\nabla_{w_{y}}L= \sum\limits_{i=1}^{m}\left[- \left(  \sum\limits_{j=1, j \neq y_i}^{n} 1 \left( \text{margin}_{i,j} > 0 \right) \right) x_{i}\right]

\nabla_{w_{j}}L= \sum\limits_{i=1}^{m} 1 \left( \text{margin}_{i,j} > 0 \right) x_i

This can be implemented in a few lines of python code:

# Semi-vectorized version... It's not any faster than the loop version.
num_classes = W.shape[1]
incorrect_counts = np.sum(margins > 0, axis=1)
for k in xrange(num_classes):
  # use the indices of the margin array as a mask.
  wj = np.sum(X[margins[:, k] > 0], axis=0)
  wy = np.sum(-incorrect_counts[y == k][:, np.newaxis] * X[y == k], axis=0)
  dW[:, k] = wj + wy
dW /= num_train # average out weights

incorrect_counts counts, for each sample, the number of margins where margin > 0. The result is a single column vector with m rows. With this, for each class k, we select the samples for which the correct class is k. Because incorrect_counts is a vector, we need to change it to a matrix of size m x 1 by adding [:, np.newaxis], so that each element of the vector is broadcast against the corresponding row of X for scalar multiplication.

There is a faster way, using a single dot product, though it may be tough to figure out. The intuition is that, aside from seeing it as a mapping from one dimension to another, an interesting way to look at matrix multiplication is as a grouping of sums of specific combinations of rows: each row in B is a sum of some combination of the rows of X, and you control that combination in the matrix A.

Let’s put this idea in a simpler context. Suppose we have a 4×3 matrix X, which maps to 4 samples (rows) and 3 features (columns). If we set k to 2, then A is a 2×4 matrix: k rows and m columns, one per sample. AX will then return a matrix B with dimensions 2×3.

The following code snippet:

import numpy as np
X = np.arange(12).reshape((4, 3))
A = np.zeros((2, 4))
B = A.dot(X)
print A.shape, X.shape, B.shape

Will produce the output:

(2, 4) (4, 3) (2, 3)

Each row in B maps to a class k and is a sum of specific rows of X. In A, each row maps to a class k, and each column maps to a specific sample. If we simplify A to be a boolean array of 1s and 0s, then a 1 selects a row and a 0 leaves it out. For example:

\begin{bmatrix}0 & 1 & 0 & 1 \\1 & 0 & 1 & 0\end{bmatrix} \times \begin{bmatrix}0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \\ 9 & 10 & 11 \end{bmatrix} = \begin{bmatrix}12 & 14 & 16 \\ 6 & 8 & 10 \end{bmatrix}

Row 1 in B is the sum of rows 2 and 4 of X, and row 2 in B is the sum of rows 1 and 3 of X. Matrix A chooses rows by multiplication: a value of 0 zeroes a row out, while multiplying by 1 returns the row itself. If you instead change a 1 to -2 or 0.5, the row is weighted by that value instead. Do play around with this in your Python console to get the intuition before proceeding.

With this we can attain a succinct formulation:

Q_{i, k} = \begin{cases} - \left(  \sum\limits_{j=1,  j \neq y_i}^{n} 1  \left(\text{margin}_{i,j} > 0 \right) \right)  & \text{if } k = y_i \\ 1 \left( \text{margin}_{i,k} > 0 \right) & \text{otherwise}\end{cases}

\nabla_{w} L = \frac{1}{m} \left( Q^{T}X \right)^T =\frac{1}{m} X^TQ

Now we can take this idea and apply it in our code:

X_mask = np.zeros(margins.shape)
X_mask[margins > 0] = 1

Notice that this will factor only the incorrect classes because we previously set all the cells in the margin array that map to the correct class to 0:

margins[np.arange(num_train), y] = 0 # done this before

Of course, our X_mask matrix is incomplete because we didn’t factor in the correct classes. This is not too complicated to do. We know that for every class k, there are rows in X where class k is the correct class. Each of these rows must be multiplied by the negation of the number of incorrect classes whose margin is > 0. We don’t need to recalculate this; we can already find this value for each of these rows from X_mask itself, since each of its rows is 1 where the margin is > 0 and 0 otherwise. Finding the number of incorrect classes whose margin is > 0 is as simple as summing the columns of each row:

incorrect_counts = np.sum(X_mask, axis=1)

With this we can find the gradient matrix. The full implementation is below:

# Fully vectorized version. Roughly 10x faster.
X_mask = np.zeros(margins.shape)
X_mask[margins > 0] = 1
# for each sample, find the total number of classes where margin > 0
incorrect_counts = np.sum(X_mask, axis=1)
X_mask[np.arange(num_train), y] = -incorrect_counts
dW = X.T.dot(X_mask)

Lastly you should remember to regularize the gradient. The gradient with regularization is therefore:

\nabla_{w} L = \frac{1}{m} X^TQ + \lambda W
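
In code, assuming reg holds \lambda and num_train holds m (names I use in my solution, not necessarily the skeleton’s), this just means applying the averaging and regularization to the result of the snippet above:

dW = X.T.dot(X_mask)   # as in the snippet above
dW /= num_train        # the 1/m factor
dW += reg * W          # regularization gradient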

The loss function is visualized:

loss vs iteration number

If the graph is drastically fuzzier or doesn’t seem to be decreasing, chances are you are doing something wrong somewhere in your code. Here’s what happened when I forgot to regularize the gradient:

loss vs iteration number forgot regularization

The loss is still decreasing, but not as fast as it should be.

The training and validation accuracy are visualized:

accuracy plot

In the last section, code is given to help you visualize the weights. With proper hyperparameters, more iterations will give a smoother result:

visualize weights

CS231n – Assignment 1 Tutorial – Q3: Implement a Softmax classifier

This is part of a series of tutorials I’m writing for CS231n: Convolutional Neural Networks for Visual Recognition. Go to this page to see the full listing.

To conserve space, I won’t be placing my full solutions in this post. Refer to my github repository for full source code.

The loss function is covered in the lecture and course notes. However, I find myself a little lost when looking for the formula for the gradient. What I find on the internet usually looks something like this:

\text{if } i = j :\; \frac{\partial y_i}{\partial z_i} = y_i (1 - y_i)
\text{if } i \neq j :\; \frac{\partial y_i}{\partial z_j} = -y_i y_j

And I had a lot of trouble figuring out how to implement it. Fortunately, there are notes on this, though they are not explicitly linked from the course site itself. Here is the link. I wasn’t able to see how these 2 formulas are also the derivative of the softmax loss function, so I’d be really grateful to anyone who can explain that.

For each sample, we introduce a variable p, a vector of normalized probabilities (we shift by the maximum score to prevent numerical instability; refer to the course notes, “Practical issues: Numeric stability”):

f_i = W x_i

p_k = \frac{e^{f_k-\max_j f_j}}{ \sum_j e^{f_j-\max_j f_j} }

In my naive solution, this is implemented as a lambda function:

f_i = X[i].dot(W)
f_i -= np.max(f_i)
sum_j = np.sum(np.exp(f_i))
p = lambda k: np.exp(f_i[k]) / sum_j

The cross-entropy loss function L for a sample i is therefore:

L_i = -\log\left(p_{y_i}\right)

Total loss for all m samples with regularization is then:

L = \frac{1}{m}\sum_i \left[-\log\left(p_{y_i}\right)\right] + \frac{1}{2}\lambda \sum W^2

The gradient for a class k for a given sample i is therefore:

\nabla_{w_k}L_i = (p_k -1\{y_i = k\})x_i

(Recall that 1 is the indicator function that is one if the condition inside is true or zero otherwise)
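
A rough sketch of how this can be accumulated inside the naive per-sample loop, reusing the p lambda defined above (the surrounding loop over samples i and the variable names are illustrative):

# sketch: gradient accumulation for sample i, p as defined above
for k in xrange(num_classes):
  dW[:, k] += (p(k) - (k == y[i])) * X[i]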

When converting to a vectorized implementation, as with SVM, you need to somehow reduce the work to a single dot product operation. Fortunately this is simpler than the SVM case. We define a matrix Q whose rows (one per sample i) correspond to the samples and whose columns (one per class k) correspond to the classes:

Q_{i, k} = p_k -1\{y_i = k\}

\nabla_{w} L = \frac{1}{m}  X^TQ + \lambda W

Here is a snippet of the implementation:

num_train = X.shape[0]
f = X.dot(W)
f -= np.max(f, axis=1, keepdims=True) # max of every sample
sum_f = np.sum(np.exp(f), axis=1, keepdims=True)
p = np.exp(f) / sum_f
ind = np.zeros_like(p)
ind[np.arange(num_train), y] = 1
Q = p - ind
dW = X.T.dot(Q)

Though not in the Python notebook, I added the loss plot function (with some slight alterations) from the SVM notebook to visualize the loss as the number of iterations increases. You can refer to my Python notebook for the code; I will only show you the plot result, which looks remarkably similar to SVM:

softmax_loss_plot

You should be able to get a validation accuracy pretty close to 0.4 (I got about 0.39).

When visualizing the weights, you should get a somewhat similar result to the SVM classifier:

softmax_visualize_weights

Converting a Java Project to Use JPA

In this post I walk through some of the gotchas when converting a Java application that works with raw SQL strings to one using ORM (Object Relational Mapping) via JPA (Java Persistence API). I will be converting a simple application from “Generic Java GUI Netbeans Project with Embedded Database” that only has 2 entities: Post and Comment.

The finished product is in a git branch called JPA. If you don’t use git you can download this sample project as a zip from MediaFire.

You can view all the changes to convert to JPA in this Github diff.

Netbeans makes the initial setup very simple by generating persistence.xml for you (you find the persistence unit name here), as well as the entities from your database.

SQL needs to be rewritten to Java Persistence Query Language

This isn’t much of an issue really; in the long run it does you a favour, since JPQL is database-vendor independent.

Change from:

ResultSet rs = db.executeQuery("SELECT * FROM post ORDER BY date_created DESC");

To:

List<Post> rs = db.createQuery("SELECT p FROM Post p ORDER BY p.dateCreated DESC").getResultList();

Default Values are Lost

I noticed something strange when adding a Post entity: the date_created attribute shows up as null when I convert to use JPA. My DDL (Data Definition Language) looks like this (Derby DB SQL):

CREATE TABLE post ( 
    id INT PRIMARY KEY GENERATED ALWAYS AS IDENTITY(START WITH 1, INCREMENT BY 1),
    name VARCHAR(250) NOT NULL,
    content VARCHAR(500),
    date_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

So each time I create a Post, I’m expecting the date_created attribute to show the current date, but it doesn’t. So all the SQL code where I have DEFAULT is basically replaced with null when I use JPA.

Great…

The workaround is to code the default values into the attribute field of the entity classes. Here is the dateCreated attribute inside Post:

@Column(name = "DATE_CREATED")
@Temporal(TemporalType.TIMESTAMP)
private Date dateCreated = new Date(); // new Date() returns the current timestamp

Exceptions in JPA Only Happen in Runtime

So when converting the code, I realized that in the places where SQLException would appear, Netbeans puts up an error saying that SQLException is never thrown there:

Sqlexception is never thrown

That’s OK, I think. But what’s weird is that it offered to remove the try-catch block as a solution! Wow wow wow, stop. Aren’t there exceptions? Well, it turns out there’s PersistenceException.

The problem? It’s a subclass of RuntimeException. I don’t know exactly why this exception only surfaces at runtime, but without the try-catch block, the procedure (here, the Post entity cannot have a null value for the Name attribute) is going to fail silently.

Now for a before and after. Before:

String sql = "INSERT INTO post (name, title, content) VALUES (?, ?, ?)";
try {
    PreparedStatement ps = core.DB.getInstance().getPreparedStatement(sql);
    ps.setString(1, authorField.getText());
    ps.setString(2, titleField.getText());
    ps.setString(3, postTxtArea.getText());
    ps.executeUpdate();
    
    JOptionPane.showMessageDialog(this, "Post has successfully been added.", "Successfully added post!", JOptionPane.INFORMATION_MESSAGE);
    dispatchEvent(new WindowEvent(this, WindowEvent.WINDOW_CLOSING));
} catch (SQLException ex) {
    JOptionPane.showMessageDialog(this, ex.toString(), "Invalid content... or some shit like that", JOptionPane.ERROR_MESSAGE);
}

After:

try {
    Post p = new Post();
    p.setContent(postTxtArea.getText());
    p.setName(authorField.getText());
    p.setTitle(titleField.getText());
    core.DB.getInstance().persist(p);
    
    JOptionPane.showMessageDialog(this, "Post has successfully been added.", "Successfully added post!", JOptionPane.INFORMATION_MESSAGE);
    dispatchEvent(new WindowEvent(this, WindowEvent.WINDOW_CLOSING));
} catch (PersistenceException ex) {
    JOptionPane.showMessageDialog(this, ex.toString(), "Invalid content... or some shit like that", JOptionPane.ERROR_MESSAGE);
}

The following dialog should pop up when the Author field is empty:

PersistenceException

Well, this output doesn’t just happen automatically. There’s still one more issue that I’ll get to next:

Empty Strings are not Null

In my DDL, I have a rule that Post cannot have null value for Name attribute. Yet for some reason a string “” is not a null value in JPA. It is actually stored in the database as “”.

How is “” not null?

stored as empty string

This kind of shit doesn’t happen when simply working with raw SQL strings.

There are workarounds for this online: one of them was using the annotations @Size(min=1) or @NotNull. Unfortunately I’m using Java SE 8 (@Size is currently supported up to Java EE 7 as of this writing) and I’m not using Spring (for @NotNull).

So what I ended up doing was to place the validation in the setter method:

public void setName(String name) {
    if (name.trim().isEmpty()) return; // empty strings begone!
    this.name = name;
}

As you can imagine, I have to do this for every attribute in every entity that doesn’t accept empty strings.

You Need to Manually Manage One To Many Relationships

I had an issue when adding a one-to-many relationship: Post has many Comments, and I add a Comment to a Post. I didn’t see the changes immediately though; I had to restart the application to see the new Comments. I would have thought that JPA would update the Comments collection inside the Post, but it didn’t.

Here’s what the setPostId function in my Comment entity looks like:

public void setPostId(Post postId) {
    this.postId = postId;
}

This is because (as aggravating as it sounds) in JPA it is the responsibility of the application, or the object model, to maintain relationships. This means it is the responsibility of the programmer to manually code in the links when a setter is called:

public void setPostId(Post postId) {
    this.postId = postId;
    postId.getCommentCollection().add(this); // programmer needs to add this himself
}

I’m going to be a bit honest here: this is kinda inconvenient, and retarded. In Rails this sort of thing is done automatically, and it should be. In JPA I need to remind myself that each time I have a one-to-many relationship, I need to manually link the two sides together, or it doesn’t update automatically/only updates after the application is restarted.

Conclusion

There may be other quirks I missed, since I’m only dealing with a very simple project that doesn’t even have update and delete, so feel free to let me know about them in the comments below.

Yea, I know I complain a bit, but working with raw SQL strings is going to be pretty hard to manage in the long run. You’re dealing with plain strings, most errors only pop up at runtime, and your IDE can’t assist you much. With JPA, when you’re writing in the Java Persistence Query Language, Netbeans can actually figure out data types and attribute names:

autocomplete in java persistence language string

So unless you really need that sort of performance boost, using an ORM would be the right call.

1-0 Knapsack Problem – A Hands-on Guide (C++)

The 1-0 knapsack problem is an optimization puzzle famously solved with dynamic programming (dp). This post is merely my take on the problem, in which I hope to provide a more hands-on approach.

To get started, try and attempt The Knapsack Problem (KNAPSACK) from SPOJ. Study the problem closely, as I will be referring to it throughout this guide. I will then explain how the general solution is derived and how dp is applied.

I assume you know a bit of dp as a prerequisite; if you don’t, you can check out my beginner-friendly hands-on intro: 445A – Boredom – CodeForces Tutorial.

The General Solution

A dp solution is usually derived from a recursive solution. So let’s start with that.

We define 2 containers, v and c, that contain the value of each item and the capacity it consumes respectively, starting from index 1. So, for example, v[2] and c[2] return the value and size of the 2nd item.

What are we trying to optimize here? We are trying to maximize the value; the combination of items that yields the highest value. So let us define a function B(i, w) that returns the maximum value given a scope of items (i) and the capacity of the knapsack (w). The solution to KNAPSACK will therefore be B(N, S).

In solving it recursively, imagine we have all N items with us and our knapsack completely empty (current capacity is S), and we consider the items one by one from the last one (the N-th item).

What are the base cases of this problem? When there are no items to put in (i = 0), or when the knapsack cannot contain any items (w = 0).

Before we consider some i-th item, we first need to make sure that it can fit into the knapsack given its capacity; in other words, the i-th item should not be considered if c[i] > w. If this is so, we consider the maximum value of the scope of items excluding the i-th item, which is B(i-1, w).

So what happens when you can put the item in the knapsack? You have 2 choices: To put it in (take), or not put it in (keep).

  1. Keep: you exclude it from the scope of items in which you consider the maximum value, which is again B(i-1, w).
  2. Take: you will get the value of the i-th item you select (v[i]), BUT, we should also consider the remaining space after adding the i-th item inside (w-c[i]). When considering items to add here, we need to exclude the item we already added, so the scope of items we consider is i-1. With this remaining space and this scope of items, we also want to get the maximum value. We can do this recursively via B(i-1, w-c[i]).

Choosing between keep and take is as simple as taking the maximum of the 2.

If you piece all this together, you will get the general solution:

B(i,w) = \begin{cases} 0 \text{ if } i = 0\text{ or }w = 0\\ B(i-1, w) \text{ if } c[i] > w\\ \max\Big( B(i-1, w), v[i] + B(i-1, w-c[i]) \Big) \end{cases}

Recursive Solution

Short and simple:
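
Here is a minimal sketch of the recursive solution, assuming the SPOJ input format of the knapsack size S and the item count N on the first line, followed by each item’s size and value (check the problem statement):

#include <bits/stdc++.h>
using namespace std;

vector<int> v, c; // item values and sizes, 1-indexed

// maximum value considering the first i items with capacity w
int B(int i, int w)
{
    if (i == 0 || w == 0) return 0;   // base cases
    if (c[i] > w) return B(i - 1, w); // item i cannot fit

    // keep vs. take the i-th item
    return max(B(i - 1, w), v[i] + B(i - 1, w - c[i]));
}

int main()
{
    int S, N;
    cin >> S >> N;
    v.assign(N + 1, 0);
    c.assign(N + 1, 0);
    for (int i = 1; i <= N; i++) cin >> c[i] >> v[i];
    cout << B(N, S) << endl;
    return 0;
}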

Plug it in the judge and you will get a TLE (Time Limit Exceeded).

Dynamic Programming Solution

We use a 2D array, DP, containing N+1 rows and S+1 columns. The rows map to the scope of i items we consider (the top row considers 0 items and the bottom row considers all N items). The columns map to the capacity of the knapsack, from 0 on the left to S on the right (an individual column is denoted as the w-th column). A cell DP[i][w] means “this is the maximum value given i items and a capacity of w“. The maximum value for N items and capacity S is therefore DP[N][S].

We first fill DP with base cases: where i = 0 and where w = 0. This means that the first row and first column are all 0. The order in which we solve this is simple: starting where i = 1 and w = 1, for each row from top to bottom, we fill the cells from left to right.

I will continue on with an example using inputs from Tushar Roy’s YouTube video (you should check it out), but I jumbled the order to prove that ordering of items is not important. Here is the DP array:

DP-array-knapsack-1-0

Cells shaded in pale blue are where the item does not fit within capacity w, i.e. where c[i] > w. In the pale green cells we have a choice: keep or take. Notice that with every row we add one more item to consideration, and with every column we increase the capacity by 1.

Here is the code:
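
A sketch of the bottom-up version, under the same input assumptions as the recursive sketch above:

#include <bits/stdc++.h>
using namespace std;

int main()
{
    int S, N;
    cin >> S >> N;
    vector<int> v(N + 1), c(N + 1);
    for (int i = 1; i <= N; i++) cin >> c[i] >> v[i];

    // DP[i][w]: maximum value considering the first i items with capacity w
    vector<vector<int> > DP(N + 1, vector<int>(S + 1, 0)); // first row and column stay 0

    for (int i = 1; i <= N; i++) {
        for (int w = 1; w <= S; w++) {
            if (c[i] > w) {
                DP[i][w] = DP[i - 1][w]; // item does not fit
            } else {
                // keep vs. take
                DP[i][w] = max(DP[i - 1][w], v[i] + DP[i - 1][w - c[i]]);
            }
        }
    }

    cout << DP[N][S] << endl;
    return 0;
}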

Getting the Selected Items

Some people use an auxiliary Boolean array (often called Keep) that keeps track of whether an item is selected or not. This seems to be an unnecessary occupation of space to me, since you can deduce the selected items from DP itself.

You start from DP[N][S], and from there:

  • An item is selected, if the value of the cell directly above it is not equal to the current cell. When this happens, the capacity of the knapsack reduces by the weight of the selected item. With that new capacity you select the next item.
  • An item is not selected, if the value of the cell directly above it is equal to the current cell. So we consider the next item by moving up one row; capacity remains unchanged.

This process continues until either there are no more items remaining, or the knapsack is full.

The table below shows the trail of the algorithm as it selects items (items 1 and 4) from the DP array we constructed before:

knapsack-1-0-pick-trace

Below is the recursive function:

void pick(int i, int w)
{
    if (i <= 0 || w <= 0) return;

    int k = DP[i][w];
    if (k != DP[i - 1][w]) {
        cout << i << " "; // select!
        pick(i - 1, w - c[i]); // capacity decreases
    } else {
        // move on to next item; capacity no change
        pick(i - 1, w);
    }
}

See the full implementation of this function in this gist.

Application

Enough spoon feeding! It is time for you to try out some puzzles on your own. Conveniently UVa grouped a series of 3 questions that are slight variations of the 1-0 knapsack problem. I sort them here in order of difficulty:

  1. 10130 – SuperSale (my solution)
  2. 990 – Diving for Gold (my solution)
    – You will need to list down the items you select in this one.
    – Beware of the tricky formatting! There shouldn’t be a blank line at the end of your output.
  3. 562 – Dividing coins (my solution)
    In my solution, there is a tip (short comment block before the main function) you can check out if you just want some pointers to get started.


Generic Java GUI Netbeans Project with Embedded Database

So your team just wants to create a Java GUI project that requires a database. How do you minimize the hassle of configuring that database? For this reason I created this simple, generic Micropost project to serve as a template to start from.

Link to Project: https://github.com/bruceoutdoors/MicropostsExample


This guide requires that you are already familiar with SQL and java GUI development.

Configuration and Setup

The only thing you need to run this project is Netbeans. Download the ZIP from the site, or git clone the repo if you know git. From there, you open up the project in Netbeans and click Run.

That’s it.

Did I mention configuration? There is zero configuration. Even when you deploy your application as a JAR file, it is the same thing; it runs automatically. Just like that.

Prerequisite Knowledge

You need to understand a few things:

  1. Embedded databases
  2. Migrations

Embedded Database

An embedded database is a database that does not need a server and is embedded in an application. This means the application itself manages the database directly. Here, I am using Derby DB (also known as Java DB). The JDK comes prepackaged with it, though to avoid trouble I have packaged the library into the project itself.


Notice that when you launch the application, a folder called “database” and a text file “derby.log” are created. These are the files you will remove should you want to start clean.

You can view the schema and data in the “database” folder from Netbeans. Under the Services tab, right-click on Databases and select New Connection…

netbeans new connection

Select Java DB (Embedded) as the JDBC driver and click Next. Where you see JDBC URL, append the directory of the database folder. Leave the User Name and Password fields blank. Click Test Connection and it should indicate “Connection Succeeded”.

JDBC Url
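
For example, if the project sits in C:/projects/MicropostsExample (a made-up path; substitute your own project directory), the embedded Derby JDBC URL would look something like:

jdbc:derby:C:/projects/MicropostsExample/database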

Then click Next until you cannot click it anymore, followed by Finish. Now you can peep inside your embedded database, and execute queries as you please:

netbeans database service

IMPORTANT NOTE: Derby DB embedded does not support multiple connections at once. If you are connected to the database via the Netbeans database service, then you can’t run your application. You have to disconnect from the database first (right-click the connection and disconnect it) before running your application again.

Migrations

When you first create your database, you don’t always figure out every entity and relationship in one shot. Inevitably, we want to make incremental changes to add or remove tables, or to modify existing tables, migrating the database to a different version. Chances are you also don’t want to wipe out your existing data in the process. This is why migrations exist.


This project uses the Flyway migration library. It is as simple as using a series of SQL files named V1__blablabla.sql, V2__blablabla.sql and so on; these SQL files are then executed in that order. Therefore, when you make changes to your database, you simply add another migration file (or you can just modify a single migration and wipe out the database every time). If you are working together using a version control system like git, all your databases will be synchronized automatically to the latest structure with existing data intact.
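
For illustration, a follow-up migration could be as small as this (the file name and column here are hypothetical):

-- V2__add_author_email.sql
ALTER TABLE post ADD COLUMN author_email VARCHAR(250);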

Migrations are located in PROJECT_DIR/src/db/migrations:


Each time the application is launched, it automatically migrates the database to the latest version.

That is all you need to know. For everything else, go figure it out on your own. Good luck!

Matrix Chain Multiplication with C++ code – PART 3: Extracting the Sequence

In my previous post, I wrote about finding the minimum cost of multiplying a matrix chain using dynamic programming. In this post I build on top of what we have covered to extract the sequence of multiplying the chain itself.

This is part of a 3 part series:

  1. Analysis and Design – analyze the problem and deduce a solution.
  2. Implementation – implement the algorithm to find the minimum number of operations.
  3. Extracting the Sequence – extract the sequence in which the chain is multiplied. 

Extract the Sequence

To do this we keep track of the point at which we split the chain into a prefix and a suffix: the split point k (which we defined in the previous post). We do this by storing it in another 2D array of the same size as DP, which we call splits:

int ops = DP[i][k] + DP[k + 1][j] + rc[i] * rc[k + 1] * rc[j + 1];
if (ops < DP[i][j]) {
	DP[i][j] = ops;
	splits[i][j] = k;
}

Now let us print out the elements of splits in dark blue. I will be reusing the example input from the previous post:

splits-array

How do we interpret this? Let us start from where we got the main solution; where i = 1 and j = 6. This cell splits[1][6] signifies the last operation of multiplying the matrix chain – in the above table this maps to 3. This means in the last operation, a prefix from a chain A_1 \ldots A_3 multiplies with a suffix from a chain A_{4} \ldots A_6.

Let us now consider just the prefix. How is the prefix A_1 \ldots A_3 multiplied? We find that from splits[1][3], which in this case is 1. Therefore it is formed by multiplying a prefix A_1 \ldots A_1 and a suffix A_2 \ldots A_3. In the prefix A_1 \ldots A_1 we hit a base case where i = j; should this happen we return the matrix A_1. The suffix A_2 \ldots A_3 is split at splits[2][3], which is 2. This in turn hits 2 base cases and returns the matrices A_2 and A_3.

If you repeat this recursive process with the top level suffix, you parenthesize the chain as such:

parenthesized-matrices

We can generalize a function mult that returns the resultant matrix that has been multiplied in the most optimal way from the matrix chain:

mult(i, j)=\begin{cases}A_i \text{ if } i = j\\mult(i, k) \cdot mult(k+1, j)\text{ where } k = splits[i][j]\end{cases}

The main solution is therefore mult(1, 6).

Implementation

Below is an implementation of this:
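
A minimal sketch of the trace function, assuming splits has been filled as in part 2 and that matrices are referred to by their indices:

#include <bits/stdc++.h>
using namespace std;

vector<vector<int> > splits; // splits[i][j] = best split point k for A_i..A_j

// returns the parenthesized expression for A_i..A_j,
// printing every multiplication in the order it happens
string mult(int i, int j)
{
    if (i == j) return to_string(i); // base case: a single matrix

    int k = splits[i][j];
    string prefix = mult(i, k);
    string suffix = mult(k + 1, j);
    cout << "multiply " << prefix << " and " << suffix << endl;

    return "(" + prefix + "*" + suffix + ")";
}

// after filling DP and splits for N matrices: cout << mult(1, N) << endl;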

Because we don’t have the contents of the matrices, I have the program output the parenthesization instead, along with the order in which it parenthesizes them:

./mat-chain-mult-dp-trace < input.txt
multiply 2 and 3
multiply 1 and (2*3)
multiply 4 and 5
multiply (4*5) and 6
multiply (1*(2*3)) and ((4*5)*6)
((1*(2*3))*((4*5)*6))

Note that the numbers here are not matrix contents but indices that map to their respective matrices in the matrix chain.

It is not that hard to convert the above code to multiply matrices. I will leave it as an exercise, should you be interested.

Application

Why is this matrix chain multiplication problem so important that most computer science undergraduate programs must include it in their syllabus? The same reason applies for most algorithms you learn: some problems are just slight variations of another problem. Understanding one solution to a problem in the marrow of its bones may help you solve another similar problem.

Therefore, I encourage you to challenge yourself on these 2 problems to help solidify your understanding:

You can find my solutions to these problems in my github SPOJ repository.

Reference