Dynamic Programming: Tabulation (Bottom-Up Approach) in Java
Understanding the tabulation (bottom-up) approach in dynamic programming, and applying it to solve problems like Longest Common Subsequence and Knapsack problem in Java.
Longest Common Subsequence (LCS) with Tabulation in Java
Introduction
The Longest Common Subsequence (LCS) problem is a classic computer science problem that finds the longest sequence of characters that is common to two given strings. The characters in the subsequence do not need to occupy consecutive positions within the original strings. Dynamic programming, specifically tabulation, is a very efficient way to solve the LCS problem. This document will walk through the tabulation method in Java, covering its logic, implementation, and complexity analysis.
Understanding Longest Common Subsequence (LCS)
Given two strings, string1
and string2
, the LCS is the longest sequence of characters that appears in both strings, but not necessarily contiguously.
For example:string1 = "ABCDGH"
string2 = "AEDFHR"
The LCS is "ADH" (length 3).
Another example:string1 = "AGGTAB"
string2 = "GXTXAYB"
The LCS is "GTAB" (length 4).
LCS with Tabulation (Dynamic Programming)
Tabulation (also known as bottom-up dynamic programming) solves the LCS problem by building a table of solutions to subproblems. The table stores the lengths of LCSs for all prefixes of the two input strings.
Table Structure
A 2D array (or table) dp[][]
is created, where dp[i][j]
represents the length of the LCS of the first i
characters of string1
and the first j
characters of string2
.
The table is constructed as follows:
dp[0][j] = 0
for allj
(the LCS of an empty string and any other string is empty).dp[i][0] = 0
for alli
(the LCS of any string and an empty string is empty).- If
string1[i-1] == string2[j-1]
, thendp[i][j] = dp[i-1][j-1] + 1
(If the last characters of the prefixes match, the LCS is extended by one). - If
string1[i-1] != string2[j-1]
, thendp[i][j] = max(dp[i-1][j], dp[i][j-1])
(If the last characters don't match, the LCS is the longer of the LCSs obtained by either excluding the last character ofstring1
or excluding the last character ofstring2
).
Java Implementation
import java.util.*;
import java.lang.*;
import java.io.*;
class LCSTabulation {
public static int lcs(String s1, String s2) {
int n = s1.length();
int m = s2.length();
int[][] dp = new int[n + 1][m + 1];
// Initialize the first row and column to 0 (already done by default in Java)
// for(int i=0; i<=n; i++) dp[i][0] = 0;
// for(int j=0; j<=m; j++) dp[0][j] = 0;
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
dp[i][j] = 1 + dp[i - 1][j - 1];
} else {
dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
return dp[n][m];
}
public static void main(String args[]) {
String s1 = "AGGTAB";
String s2 = "GXTXAYB";
System.out.println("The Length of Longest Common Subsequence is " + lcs(s1, s2));
}
}
Step-by-Step Code Walkthrough
- Initialization: A 2D array
dp
of size(n+1) x (m+1)
is created, wheren
andm
are the lengths ofs1
ands2
, respectively. The first row and column are implicitly initialized to 0 as that's the default value in Java. Explicitly initializing them is not required but adds to readability. - Iteration: The code iterates through the
dp
table, starting from index(1, 1)
. - Comparison:
- If
s1.charAt(i - 1) == s2.charAt(j - 1)
(the characters at the current indices match), thendp[i][j] = 1 + dp[i - 1][j - 1]
. This means the length of the LCS is increased by 1, based on the LCS of the previous prefixes. - If
s1.charAt(i - 1) != s2.charAt(j - 1)
(the characters don't match), thendp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1])
. This selects the maximum LCS length between excluding a character from s1 and excluding a character from s2.
- If
- Result: After the iteration,
dp[n][m]
contains the length of the LCS of the entire stringss1
ands2
. This value is returned.
Complexity Analysis
Time Complexity
The time complexity of the tabulation approach is O(n*m)
, where n
is the length of string1
and m
is the length of string2
. This is because the algorithm iterates through each cell of the dp
table once.
Space Complexity
The space complexity is O(n*m)
, due to the dp
table used to store the lengths of LCSs for all prefixes. While we can optimize space complexity using only two rows, the standard tabulation approach uses O(n*m) space for clarity.