Dynamic Programming: Tabulation (Bottom-Up Approach) in Java

Understanding the tabulation (bottom-up) approach in dynamic programming, and applying it to solve problems like Longest Common Subsequence and Knapsack problem in Java.


Longest Common Subsequence (LCS) with Tabulation in Java

Introduction

The Longest Common Subsequence (LCS) problem is a classic computer science problem that finds the longest sequence of characters that is common to two given strings. The characters in the subsequence do not need to occupy consecutive positions within the original strings. Dynamic programming, specifically tabulation, is a very efficient way to solve the LCS problem. This document will walk through the tabulation method in Java, covering its logic, implementation, and complexity analysis.

Understanding Longest Common Subsequence (LCS)

Given two strings, string1 and string2, the LCS is the longest sequence of characters that appears in both strings, but not necessarily contiguously.

For example:
string1 = "ABCDGH"
string2 = "AEDFHR"
The LCS is "ADH" (length 3).
Another example:
string1 = "AGGTAB"
string2 = "GXTXAYB"
The LCS is "GTAB" (length 4).

LCS with Tabulation (Dynamic Programming)

Tabulation (also known as bottom-up dynamic programming) solves the LCS problem by building a table of solutions to subproblems. The table stores the lengths of LCSs for all prefixes of the two input strings.

Table Structure

A 2D array (or table) dp[][] is created, where dp[i][j] represents the length of the LCS of the first i characters of string1 and the first j characters of string2.

The table is constructed as follows:

  • dp[0][j] = 0 for all j (the LCS of an empty string and any other string is empty).
  • dp[i][0] = 0 for all i (the LCS of any string and an empty string is empty).
  • If string1[i-1] == string2[j-1], then dp[i][j] = dp[i-1][j-1] + 1 (If the last characters of the prefixes match, the LCS is extended by one).
  • If string1[i-1] != string2[j-1], then dp[i][j] = max(dp[i-1][j], dp[i][j-1]) (If the last characters don't match, the LCS is the longer of the LCSs obtained by either excluding the last character of string1 or excluding the last character of string2).

Java Implementation

 import java.util.*;
import java.lang.*;
import java.io.*;

class LCSTabulation {

    public static int lcs(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();

        int[][] dp = new int[n + 1][m + 1];

        // Initialize the first row and column to 0 (already done by default in Java)
        // for(int i=0; i<=n; i++) dp[i][0] = 0;
        // for(int j=0; j<=m; j++) dp[0][j] = 0;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }

        return dp[n][m];
    }

    public static void main(String args[]) {

        String s1 = "AGGTAB";
        String s2 = "GXTXAYB";

        System.out.println("The Length of Longest Common Subsequence is " + lcs(s1, s2));
    }
} 

Step-by-Step Code Walkthrough

  1. Initialization: A 2D array dp of size (n+1) x (m+1) is created, where n and m are the lengths of s1 and s2, respectively. The first row and column are implicitly initialized to 0 as that's the default value in Java. Explicitly initializing them is not required but adds to readability.
  2. Iteration: The code iterates through the dp table, starting from index (1, 1).
  3. Comparison:
    • If s1.charAt(i - 1) == s2.charAt(j - 1) (the characters at the current indices match), then dp[i][j] = 1 + dp[i - 1][j - 1]. This means the length of the LCS is increased by 1, based on the LCS of the previous prefixes.
    • If s1.charAt(i - 1) != s2.charAt(j - 1) (the characters don't match), then dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]). This selects the maximum LCS length between excluding a character from s1 and excluding a character from s2.
  4. Result: After the iteration, dp[n][m] contains the length of the LCS of the entire strings s1 and s2. This value is returned.

Complexity Analysis

Time Complexity

The time complexity of the tabulation approach is O(n*m), where n is the length of string1 and m is the length of string2. This is because the algorithm iterates through each cell of the dp table once.

Space Complexity

The space complexity is O(n*m), due to the dp table used to store the lengths of LCSs for all prefixes. While we can optimize space complexity using only two rows, the standard tabulation approach uses O(n*m) space for clarity.