When tackling problems involving two sequences, like strings or arrays, a powerful mental model is to map them onto a 2D grid. This structure is the foundation for a common class of Dynamic Programming solutions. Instead of thinking recursively, we can think tabularly.
Let's define our state. For two sequences, seq1 of length m and seq2 of length n, we create an (m+1) x (n+1) grid. Each cell, dp[i][j], will store the optimal solution for the subproblem considering the prefix of seq1 of length i and the prefix of seq2 of length j. Specifically, this means seq1[0...i-1] and seq2[0...j-1]. The extra row and column handle the base cases where one of the prefixes is empty.
The value of dp[i][j] is calculated based on the values in adjacent cells that represent smaller subproblems. This is the core of the DP transition. Let's use the (LCS) problem as our example. We want to find the length of the longest subsequence common to both seq1 and seq2.
To compute dp[i][j], we look at the last characters of the current prefixes, seq1[i-1] and seq2[j-1]. There are two possibilities:
The characters match: If seq1[i-1] == seq2[j-1], we've found a new character to add to our common subsequence. This means the LCS of the current prefixes is one character longer than the LCS of the prefixes without these last characters. So, we look at dp[i-1][j-1] and add 1.
The characters don't match: If seq1[i-1] != seq2[j-1], we can't extend the subsequence with both. The longest common subsequence must come from either ignoring the last character of seq1 (the solution for dp[i-1][j]) or ignoring the last character of seq2 (the solution for dp[i][j-1]). We take the maximum of these two options.
By filling the grid cell by cell, typically row by row, you build up the solution from the smallest subproblems (empty prefixes) to the full problem. The final answer for the entire sequences is found in the bottom-right corner of the grid, at dp[m][n]. This tabular approach turns a complex recursive problem into a straightforward grid traversal.
We've established that dp[i][j] holds the optimal solution for a prefix of length i from the first sequence and a prefix of length j from the second. But this definition creates an immediate question: what happens when i or j is zero? How do we handle a prefix of length zero?
A prefix of length zero is simply an empty sequence—an empty string "" or an empty vector. The solution for a problem involving an empty sequence is often trivial. For example, the longest common subsequence between any string and an empty string is always an empty string, with a length of 0.
To handle these base cases gracefully, we pad our DP table. If our sequences have lengths N and M, we create a table of size (N+1) x (M+1). The extra row at the top (row 0) and the extra column on the left (column 0) act as sentinels. These sentinels represent the solutions for subproblems involving an empty prefix.
This structure introduces a crucial mapping: the cell at dp[i][j] corresponds to the subproblem considering the first i elements of the first sequence and the first j elements of the second. In C++, which uses 0-based indexing, this means we're dealing with array1[0...i-1] and array2[0...j-1]. When your code inside the loops refers to dp[i][j], the corresponding elements in the original arrays are array1[i-1] and array2[j-1]. This intentional off-by-one simplifies the logic immensely. You never have to write special if statements to handle i=0 or j=0 within your main loops, as those cases are neatly handled by the pre-filled sentinel boundaries.
The sentinel row and column aren't just empty padding; they must be initialized with the correct base case values for the problem at hand. The values you place in dp[0][j] and dp[i][0] are the starting points from which all other solutions are built.
For Longest Common Subsequence (LCS), the LCS of any string and an empty string is 0. So, the entire first row and first column of the DP table are initialized to 0.
For Edit Distance, the distance between a string of length
iand an empty string isi(representingideletions). So,dp[i][0]is set toi, anddp[0][j]is set toj.
Once these base cases are correctly set, the recurrence relation can fill out the rest of the table, building upon these known starting values. Your loops can then safely run from i=1 to N and j=1 to M, accessing dp[i-1][j] or dp[i][j-1] without ever going out of bounds.
For two strings of length N and M, the DP table is typically sized (N+1) x (M+1) to accommodate the ______.
This technique of using sentinel boundaries is a cornerstone of clean and efficient DP implementations for two-sequence problems.
We've set up our (N+1) x (M+1) grid. Now it's time to fill it in. The value of each cell, dp[i][j], is calculated based on the values of its neighbors. Specifically, the solution for dp[i][j] depends on the solutions already computed in the cells directly above (dp[i-1][j]), to the left (dp[i][j-1]), and diagonally to the upper-left (dp[i-1][j-1]).
This dependency is the core of in 2D DP. The optimal solution for a larger problem (dp[i][j]) is constructed from the optimal solutions of its smaller, constituent subproblems. Let's see how this works using the Longest Common Subsequence (LCS) problem as our guide.
When filling dp[i][j], the logic splits into two cases based on whether the current characters, s1[i-1] and s2[j-1], match.
Case 1: The characters match (
s1[i-1] == s2[j-1]) When the characters are the same, we've found a new element for our common subsequence. This new character extends the longest common subsequence of the prefixes before it (s1up toi-1, ands2up toj-1). Therefore, we look at the solution for the subproblem without these characters, which isdp[i-1][j-1], and add 1.
This is the diagonal move. We carry over the result from the top-left neighbor and increment it.
Case 2: The characters do not match (
s1[i-1] != s2[j-1]) If they don't match, we can't extend the subsequence with both characters. We have to discard one. Our choice is guided by which option yields a better result:
Discard
s1[i-1]and keeps2[j-1]. The LCS is then the LCS ofs1's prefix of lengthi-1ands2's prefix of lengthj. This corresponds to the value in the cell above:dp[i-1][j].Discard
s2[j-1]and keeps1[i-1]. The LCS is the LCS ofs1's prefix of lengthiands2's prefix of lengthj-1. This corresponds to the value in the cell to the left:dp[i][j-1].Since we want the longest subsequence, we take the maximum of these two possibilities.
These are the orthogonal moves. We take the best result from our top or left neighbor.
When the characters match, the DP transition involves a ______ move.
This three-way transition is a powerful pattern. A different problem, like , uses the exact same three neighbor cells but with a different objective. Instead of maximizing a subsequence, we want to minimize the number of operations (insert, delete, replace) to transform one string into another.
The logic changes slightly:
Match (s1[i-1] == s2[j-1]): No operation is needed. The cost is 0. We carry over the result from the diagonal: dp[i][j] = dp[i-1][j-1].
Mismatch (s1[i-1] != s2[j-1]): We must perform an operation. We choose the one with the minimum cost:
s1[i-1] to s2[j-1]. Cost is 1 + the cost for the prefixes before, so 1 + dp[i-1][j-1].s1[i-1]. Cost is 1 + the cost of transforming s1's prefix of i-1 into s2's prefix of j, so 1 + dp[i-1][j].s2[j-1] to s1. Cost is 1 + the cost of transforming s1's prefix of i into s2's prefix of j-1, so 1 + dp[i][j-1].The structure is the same, but the formula adapts to the problem's goal—minimizing cost instead of maximizing length. By understanding the meaning of the diagonal, top, and left moves, you can adapt this template to a wide range of problems involving two sequences.
Now we translate the conceptual DP grid into a robust C++ implementation. The natural choice for a 2D grid in C++ is a vector of vectors. We'll size it to (n1 + 1) x (n2 + 1) to accommodate the sentinel row and column, which handle our base cases for empty prefixes.
The core of the implementation is a double-nested loop. The outer loop iterates through the rows, representing prefixes of the first sequence, while the inner loop handles the columns, or prefixes of the second. This structure systematically fills the entire DP table, building upon previously computed subproblems.
#include <vector>
#include <string>
#include <algorithm> // For std::max
int longestCommonSubsequence(std::string text1, std::string text2) {
int n1 = text1.length();
int n2 = text2.length();
// dp[i][j]: LCS of text1's first i chars and text2's first j chars.
// We initialize all values to 0, which correctly sets our base cases.
std::vector<std::vector<int>> dp(n1 + 1, std::vector<int>(n2 + 1, 0));
// Start loops at 1 because row 0 and column 0 are sentinels.
for (int i = 1; i <= n1; ++i) {
for (int j = 1; j <= n2; ++j) {
// Compare the characters corresponding to the current grid cell.
// Note the [i-1] and [j-1] to map from 1-based DP to 0-based string indices.
if (text1[i - 1] == text2[j - 1]) {
// If characters match, extend the LCS from the diagonal.
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
// If they don't match, take the best result from excluding
// either the current char of text1 (top) or text2 (left).
dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
// The result for the full strings is in the bottom-right corner.
return dp[n1][n2];
}
For problems like Longest Common Subsequence, the final answer is neatly located at the bottom-right corner of the grid: dp[n1][n2]. This cell represents the optimal solution considering the entirety of both sequences.
This clean result is a direct consequence of how we defined our state
dp[i][j], where each cell's value is guaranteed to be optimal for the prefixes it represents.
However, not all problems have such a straightforward conclusion. In some variations, the optimal solution might not involve the full length of both sequences. For example, in a problem like finding the longest common substring, a match could end anywhere in the grid. In such cases, you would need to track the maximum value found across all dp[i][j] cells during the calculation, instead of just returning the final corner value.
The order of the nested loops generally doesn't impact correctness, as long as dp[i][j] is calculated after its dependencies. However, it can affect performance due to how memory is accessed. C++ stores 2D vectors in , meaning all elements of one row are stored contiguously in memory.
Traversing row by row, as shown in the code, is typically more cache-friendly and can be slightly faster than iterating through columns in the outer loop. While modern compilers are very smart, sticking to this standard row-by-row traversal is a good habit for predictable performance.
"One of the most common applications of nested loops is traversing 2D arrays or matrices."
With the grid logic, sentinel padding, transition formula, and now the C++ implementation locked in, you have a complete and powerful template for a wide range of DP problems.
You've mastered the standard 2D dynamic programming template, building a full O(NM) matrix to find the optimal solution. This is a robust way to solve problems like Longest Common Subsequence. But in competitive programming, memory limits can be tight. An O(NM) table for N, M up to 10,000 would require hundreds of megabytes, which often exceeds the allowed memory.
Look closely at the transition formula. To calculate any cell dp[i][j], you only need values from its neighbors: dp[i-1][j-1], dp[i-1][j], and dp[i][j-1]. Notice a pattern? The entire calculation for row i only ever depends on values from row i and row i-1. We never need to look at row i-2 or earlier. This observation is the key to a massive space optimization.
Instead of storing the whole matrix, we only need to keep track of the previous row and the current row we are calculating. This is often called the rolling array technique. We can use two 1D arrays, let's call them prev and curr, each of size M+1.
prevstores the results of thei-1row.curris used to compute the results for theirow.- After row
iis fully computed,previs updated with the values fromcurrbefore moving to the next iteration for rowi+1.
// Using the LCS problem as an example
// s1 and s2 are the input strings
int n1 = s1.length();
int n2 = s2.length();
// We only need two rows of size n2+1
vector<int> prev(n2 + 1, 0);
vector<int> curr(n2 + 1, 0);
for (int i = 1; i <= n1; ++i) {
for (int j = 1; j <= n2; ++j) {
if (s1[i - 1] == s2[j - 1]) {
// Corresponds to dp[i-1][j-1]
curr[j] = prev[j - 1] + 1;
} else {
// Corresponds to dp[i-1][j] and dp[i][j-1]
curr[j] = max(prev[j], curr[j - 1]);
}
}
// The current row becomes the previous row for the next iteration
prev = curr;
}
// The final answer is the last element of the last computed row
// return prev[n2]; or curr[n2]; they are the same at the end
This simple change reduces the space complexity from O(NM) to O(2M), which is just O(M). If M is smaller than N, you can swap the roles of the two sequences to get O(min(N, M)) space. But we can do even better.
Can we get rid of one more array? Yes, but it requires careful thought about the order of operations. Let's try to use a single 1D array, dp, of size M+1. When we compute the values for the current row i, we will be overwriting the values from the previous row i-1 in the same dp array.
The problem arises with the diagonal dependency. The formula dp[i][j] = dp[i-1][j-1] + ... needs the value from the previous row at column j-1. If we iterate through the inner loop from left to right ( j from 1 to M), when we compute dp[j], the value at dp[j-1] has already been updated for the current row. We've lost the dp[i-1][j-1] we needed.
The solution is to reverse the inner loop's direction. By iterating j from M down to 1, when we compute dp[j], the value at dp[j-1] still holds the value from the previous row, dp[i-1][j-1]. We must also temporarily store the old dp[i-1][j-1] value before it gets updated in the j-1 iteration. A single variable can hold this diagonal value for us.
int n1 = s1.length();
int n2 = s2.length();
// Ensure n2 is the smaller dimension if possible
if (n1 < n2) {
swap(s1, s2);
swap(n1, n2);
}
vector<int> dp(n2 + 1, 0);
for (int i = 1; i <= n1; ++i) {
int prev_val = 0; // Stores dp[i-1][j-1]
for (int j = 1; j <= n2; ++j) {
int temp = dp[j]; // Stores dp[i-1][j] before it's overwritten
if (s1[i - 1] == s2[j - 1]) {
dp[j] = prev_val + 1;
} else {
// dp[j] is dp[i-1][j] and dp[j-1] is dp[i][j-1]
dp[j] = max(dp[j], dp[j - 1]);
}
prev_val = temp; // Update prev_val for the next j
}
}
// Final answer
// return dp[n2];
This final version achieves O(min(N, M)) space complexity. While it might look slightly more complex, the performance gain in memory-constrained environments is substantial. The time complexity remains O(N*M), but due to better , this optimized version can sometimes run slightly faster in practice.
"If we closely look the relation,
dp[ind][target] = dp[ind-1][target] || dp[ind-1][target-arr[ind]] We see that to calculate a value of a cell of the dp array, we need only the previous row values (say prev). So, we don’t need to store an entire array. Hence we can space optimize it."
Knowing how to apply these space-saving techniques is a vital skill. It can turn an unsolvable problem into a solvable one and demonstrates a deeper understanding of the underlying data dependencies in dynamic programming.
You've mastered the 'Two-Array' template for Longest Common Subsequence (LCS). You know how to define the state dp[i][j], initialize boundary conditions, and implement the transition logic. Now, let's see how powerful this mental model really is. The core structure—a 2D grid where each cell depends on its top, left, and top-left neighbors—is not limited to LCS. By tweaking the transition rules and boundary conditions, we can solve a whole family of related problems.
This pattern recognition is key in competitive programming. When you see a problem involving optimal solutions for two sequences, your first thought should be this 2D grid. The next step is to figure out exactly how the three-neighbor dependency needs to be modified.
A classic variation is calculating the between two strings. This is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other. The state definition remains the same: dp[i][j] is the minimum edit distance between the first i characters of s1 and the first j characters of s2.
The key difference lies in the transition logic. Unlike LCS where we maximize a score, here we minimize a cost. If s1[i-1] == s2[j-1], no operation is needed, so the cost is 0, and we take the value from dp[i-1][j-1]. If they don't match, we must perform an operation. We have three choices, each adding a cost of 1 to a previous state:
- Substitution: Change
s1[i-1]tos2[j-1]. The cost is1 + dp[i-1][j-1].- Deletion: Delete
s1[i-1]. The cost is1 + dp[i-1][j].- Insertion: Insert
s2[j-1]. The cost is1 + dp[i][j-1].
We take the minimum of these three options. The boundary conditions also change. The edit distance from a string of length i to an empty string is i (i deletions), so dp[i][0] = i. Similarly, dp[0][j] = j.
Another common problem is finding the length of the Longest Common Substring. This sounds similar to LCS, but has a crucial difference: a substring must be contiguous. An can have gaps, but a substring cannot.
This single constraint dramatically changes the transition. The state dp[i][j] now represents the length of the common substring ending at s1[i-1] and s2[j-1].
If s1[i-1] == s2[j-1], we can extend the common substring ending at the previous characters. So, dp[i][j] = dp[i-1][j-1] + 1.
If s1[i-1] != s2[j-1], the contiguity is broken. A common substring cannot end here. We reset the count: dp[i][j] = 0. This is the key insight. Unlike LCS, there's no dependency on the top or left cells when there's a mismatch.
Because a mismatch resets the count, the longest common substring might not end at the very last characters of the strings. The final answer isn't necessarily
dp[n1][n2]. Instead, it's the maximum value found anywhere in the entiredptable.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
int longestCommonSubstring(const std::string& s1, const std::string& s2) {
int n1 = s1.size();
int n2 = s2.size();
// dp[i][j]: length of common substring ending at s1[i-1] and s2[j-1]
std::vector<std::vector<int>> dp(n1 + 1, std::vector<int>(n2 + 1, 0));
int maxLength = 0; // The answer can be anywhere in the table
for (int i = 1; i <= n1; ++i) {
for (int j = 1; j <= n2; ++j) {
if (s1[i - 1] == s2[j - 1]) {
// If characters match, extend the previous common substring
dp[i][j] = dp[i - 1][j - 1] + 1;
// Update the global maximum length found so far
maxLength = std::max(maxLength, dp[i][j]);
} else {
// If they don't match, the streak is broken.
dp[i][j] = 0;
}
}
}
return maxLength;
}
The 'Two-Array' template provides a powerful skeleton. Recognizing that a new problem fits this mold is half the battle. The other half is correctly defining the state and tweaking the transition rules to match the problem's specific constraints—whether that means minimizing costs, resetting on mismatches, or simply counting matches.
When applying the general 'Two-Array' dynamic programming template for sequence problems, the calculation for a cell dp[i][j] most commonly relies on which other cells?
Its top (dp[i-1][j]), left (dp[i][j-1]), and top-left (dp[i-1][j-1]) neighbors.
Only the top (dp[i-1][j]) and left (dp[i][j-1]) neighbors.
Only the top-left neighbor (dp[i-1][j-1]).
The entire previous row and the entire previous column.
In the context of the Longest Common Substring problem, the state dp[i][j] represents the length of the common substring ending at s1[i-1] and s2[j-1]. What happens if s1[i-1] != s2[j-1]?
dp[i][j] = 0
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1])
dp[i][j] = dp[i-1][j-1]
When calculating the Edit Distance between a non-empty string s1 of length m and an empty string, what is the value of dp[m][0]?
0
The value is undefined and not needed for the algorithm.
1
m
To solve the Edit Distance problem using the 'Two-Array' template, how should you calculate dp[i][j] when the characters s1[i-1] and s2[j-1] are different?
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])
dp[i][j] = 1 + dp[i-1][j-1]
When applying the general 'Two-Array' dynamic programming template for sequence problems, the calculation for a cell dp[i][j] most commonly relies on which other cells?
Its top (dp[i-1][j]), left (dp[i][j-1]), and top-left (dp[i-1][j-1]) neighbors.
In the context of the Longest Common Substring problem, the state dp[i][j] represents the length of the common substring ending at s1[i-1] and s2[j-1]. What happens if s1[i-1] != s2[j-1]?
dp[i][j] = 0
When calculating the Edit Distance between a non-empty string s1 of length m and an empty string, what is the value of dp[m][0]?
m
To solve the Edit Distance problem using the 'Two-Array' template, how should you calculate dp[i][j] when the characters s1[i-1] and s2[j-1] are different?
dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])
By mastering these variations, you move from simply knowing an algorithm to understanding a fundamental problem-solving pattern.
2D DP Matrix
Map two sequences to a 2D grid where dp[i][j] stores the optimal solution for prefixes of length i and j, enabling subproblem analysis.
Sentinel Padding Use an (N+1)x(M+1) matrix with a padded row and column to handle empty prefixes, simplifying base cases and preventing index errors during transitions.
DP Transitions
Calculate each cell dp[i][j] using values from its top (dp[i-1][j]), left (dp[i][j-1]), and diagonal (dp[i-1][j-1]) neighbors based on problem-specific logic.
C++ Implementation
Translate the logic into code using a vector<vector<int>> and nested loops from 1 to N/M, with dp[n][m] often holding the final answer.
Space Optimization Reduce space from O(N*M) to O(min(N, M)) by using a "rolling array," since each new row calculation only depends on the immediately preceding row.
Template Application Adapt the core 2D matrix structure to solve related problems like Edit Distance by modifying only the transition logic and base cases.
dp[i][j] represents the best solution for seq1's prefix of length i and seq2's prefix of length j.(i, j) is built upon solutions for (i-1, j), (i, j-1), and (i-1, j-1).seq1[i-1] == seq2[j-1].(N+1) x (M+1) for sequences of length N and M.dp[i][j] corresponds to seq1[i-1] and seq2[j-1] due to 0-based array indexing in C++.dp[i-1][j-1] when i or j is 1.dp[i-1][j-1] cell is used when considering both seq1[i-1] and seq2[j-1] together.dp[i-1][j] (top) and dp[i][j-1] (left) represent excluding the last element from one of the sequences.seq1[i-1] == seq2[j-1], the answer extends the diagonal: dp[i][j] = 1 + dp[i-1][j-1].dp[i][j] = max(dp[i-1][j], dp[i][j-1]).dp[i][j] can only be constructed from the optimal solutions of its neighbors.std::vector<std::vector<int>> dp(n1 + 1, std::vector<int>(n2 + 1, 0)); for initialization.for (int i = 1; i <= n1; ++i) and for (int j = 1; j <= n2; ++j).s1[i-1] and s2[j-1] to update dp[i][j].dp[n1][n2].n1, n2 instead of just n, m) to improve code readability.dp[i][j] only relies on dp[i-1] values (the previous row) and dp[i] values (the current row).dp[0] and dp[1], using i % 2 to alternate between them.dp[i-1][j-1] value in a temporary variable before it's overwritten.(i, j) means taking 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]).dp[i][j] = 0. The final answer is the maximum value in the whole table, not just dp[n1][n2].(N+1)x(M+1) table and nested loops form a reusable skeleton.if/else block that defines the transition.dp[i][j]
A state in a dynamic programming table that stores the optimal solution for the subproblem involving the first i elements of the first sequence and the first j elements of the second sequence.
dp[i][j] holds the length of the LCS between s1[0...i-1] and s2[0...j-1].Sentinel Padding
The practice of adding an extra row and column to a DP table (making it (N+1)x(M+1)) to represent base cases involving empty prefixes, which simplifies transition logic.
i can start at 1, and dp[i-1] is always a valid access.Transition Formula
The set of equations that defines how to compute a DP state dp[i][j] based on the values of previously computed states, typically its neighbors.
dp[i][j] = max(dp[i-1][j], dp[i][j-1]).Rolling Array A space optimization technique where only a small, constant number of rows (or columns) of the DP table are kept in memory, overwriting older rows as the computation proceeds.
Levenshtein Distance A specific name for the Edit Distance problem, which measures the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other.
Step 1: Initialization
(N+1) x (M+1) table and fill it with zeros. This sets up the sentinel row and column.dp[0][j] = 0 and dp[i][0] = 0 for all i, j. This correctly states that the LCS with an empty string is 0.Step 2: Loop Through Rows (Outer Loop)
i from 1 to N.for (int i = 1; i <= n1; ++i) starts. Each i corresponds to a new prefix of the first sequence.Step 3: Loop Through Columns (Inner Loop)
i, iterate through the second sequence with an index j from 1 to M. At this point, you are ready to calculate dp[i][j].for (int j = 1; j <= n2; ++j) starts. You are at cell (i, j).Step 4: Check for Match
s1[i-1] and s2[j-1].if (s1[i-1] == s2[j-1]) condition.i-1 and j-1 to access the 0-indexed strings.Step 5a: Apply Match Transition (Diagonal)
dp[i][j] = 1 + dp[i-1][j-1].Step 5b: Apply Mismatch Transition (Top/Left)
s1[i-1] or s2[j-1]. Set dp[i][j] = max(dp[i-1][j], dp[i][j-1]).Step 6: Extract Final Answer
dp[n1][n2], is the length of the LCS for the entire sequences.dp[n1][n2].Topic: 2D DP Matrix
Formula:
Topic: DP Transitions
Formula:
Topic: Template Application
Problem: Given two strings text1 and text2, return the length of their longest common subsequence.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int longestCommonSubsequence(std::string text1, std::string text2) {
int n1 = text1.length();
int n2 = text2.length();
// Step 1: Initialize (N+1)x(M+1) DP table with sentinel padding.
// The vector is auto-filled with 0, which is the correct base case.
std::vector<std::vector<int>> dp(n1 + 1, std::vector<int>(n2 + 1, 0));
// Step 2 & 3: Loop through rows and columns starting from 1.
for (int i = 1; i <= n1; ++i) {
for (int j = 1; j <= n2; ++j) {
// Step 4: Check for a character match.
// Note: Accessing strings with i-1 and j-1.
if (text1[i - 1] == text2[j - 1]) {
// Step 5a: Match case - extend from diagonal.
dp[i][j] = 1 + dp[i - 1][j - 1];
} else {
// Step 5b: Mismatch case - take max of top and left.
dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
// Step 6: The final answer is in the bottom-right corner.
return dp[n1][n2];
}
int main() {
std::string s1 = "abcde";
std::string s2 = "ace";
std::cout << "LCS of '" << s1 << "' and '" << s2 << "' is: "
<< longestCommonSubsequence(s1, s2) << std::endl; // Output: 3
return 0;
}
⚠️ Common pitfall: Starting loops from i=0 and j=0 without special handling for boundary conditions, which leads to i-1 or j-1 being -1 and causing an error. The sentinel padding approach avoids this entirely.
Topic: C++ Implementation
Problem: Same as above, but implement with O(min(N, M)) space complexity.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int longestCommonSubsequenceSpaceOptimized(std::string text1, std::string text2) {
// Ensure text1 is the smaller string to optimize space.
if (text1.length() < text2.length()) {
std::swap(text1, text2);
}
int n1 = text1.length();
int n2 = text2.length();
// Use two rows: 'prev' for dp[i-1] and 'curr' for dp[i].
std::vector<int> prev(n2 + 1, 0);
std::vector<int> curr(n2 + 1, 0);
for (int i = 1; i <= n1; ++i) {
for (int j = 1; j <= n2; ++j) {
if (text1[i - 1] == text2[j - 1]) {
// Match: Depends on the diagonal from the PREVIOUS row.
curr[j] = 1 + prev[j - 1];
} else {
// Mismatch: Depends on top (prev row) and left (curr row).
curr[j] = std::max(prev[j], curr[j - 1]);
}
}
// The current row becomes the previous row for the next iteration.
prev = curr;
}
// The final answer is the last element of the last computed row.
return prev[n2];
}
int main() {
std::string s1 = "abcde";
std::string s2 = "ace";
std::cout << "Space-optimized LCS of '" << s1 << "' and '" << s2 << "' is: "
<< longestCommonSubsequenceSpaceOptimized(s1, s2) << std::endl; // Output: 3
return 0;
}
⚠️ Common pitfall: Mixing up prev and curr arrays. When calculating curr[j], prev[j] represents the cell directly above (dp[i-1][j]), prev[j-1] is the diagonal (dp[i-1][j-1]), and curr[j-1] is the cell to the left (dp[i][j-1]). Getting this mapping wrong breaks the logic.
Topic: Space Optimization