Longest Common Subsequence (#1143 Leetcode)

 Problem Statement

Given two strings text1 and text2, return the length of their longest common subsequenceIf there is no common subsequence, return 0.

subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

  • For example, "ace" is a subsequence of "abcde".

common subsequence of two strings is a subsequence that is common to both strings. 

Example 1:

Input: text1 = "abcde", text2 = "ace" 
Output: 3  
Explanation: The longest common subsequence is "ace" and its length is 3.

Example 2:

Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: The longest common subsequence is "abc" and its length is 3.

Example 3:

Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: There is no such common subsequence, so the result is 0.

 

Constraints:

  • 1 <= text1.length, text2.length <= 1000
  • text1 and text2 consist of only lowercase English characters.

Optimal Solution: 

A common subsequence between two sequences define a non-crossing matching between the two sequences. 

Data Structure

  • We can use dynamic programming to solve this complex problem by breaking into smaller sub problems and avoiding redundant calculations. 
  • We use Bottom-Up Version of dynamic programming to solve this problem. That means we will start calculating the length from the first character of each given text. Then we'll proceed to the next character in text 1 and next character in text 2. 
  • As we want to compare each characters in the text 1 with each characters in text 2, we'll use tabulation approach. Eg: dp[m][n]

Code Explanation

Consider the given texts are text1 = "abcde" and text2 = "ace"

  1. Calculate the length of each texts. Since the length is between 1 and 1000, use short as data type.
  2. Now, declare a dp vector with size as length of each texts + 1 calculated at step 1. Now go to step 4.
  3. Have an imaginary row (-1) and column(-1) and set it as 0.



  4. Comparing character by character in each texts.
  • if the characters match in both texts when i = 1 and j = 1
    • Increment the previous value with 1. This means we need to have a pre-set value before this. That's the reason we have step 3 and step 2 with size as m+1 and n+1

  • if the character don't match when i = 2, j = 1
    • Eg: text 1 => b and text 2 => a , then do either of the following:
      • skip or ignore char a from text 1. If you skip a, then you will ignore the calculated length of the common subsequence in dp[0][0]. 
      • skip or ignore char b from text 1. If you skip b, you will still hold the calculated length of the common subsequence in dp[0][0]. (preferred)
    • Must get the max between dp[i-1, j] and dp[i, j-1]

    1. It keeps going in this way. 
    2. To get a good understanding of this formula, When i = 3, j = 3, text1[3] = c, and text2[3]=e don't match.  At this point, the longest common subsequence is bc. dp[2][3] is 2 and dp[3][2] is 1.     
    • Either you need to choose 'bc' by selecting dp[2][3] 
    • or choose 'a' by selecting dp[3][2]. 
    1. As the intention to get the longest, we will choose the maximum dp[2][3]

Dynamic Programming always come with recursive formulation. 

class Solution {
public:
    int longestCommonSubsequence(string text1, string text2) {
        short m = text1.size(), n = text2.size();
        short dp[m+1][n+1];
        for (int i = 0; i <= m; i++) {
           dp[i][0] = 0;
        }
        for (int i = 0; i <= n; i++) {
           dp[0][i] = 0;
        }
        for (int i = 1; i <= m; i++) {
           for (int j = 1; j <= n; j++) {
              dp[i][j] = (text1[i-1] == text2[j-1]) ? (dp[i-1][j-1] + 1) : max(dp[i-1][j], dp[i][j-1]);
              
           }
        }
        return dp[m][n];
    }
};


Dry-Run of the algorithm

Complexity Analysis

  • Time Complexity: O(length1 * length2)
    • The nested loops iterate through each character in both strings, filling the 2D array.
  • Space Complexity: O(length1 * length2)
    • The space used by the 2D array to store intermediate results.


Comments