Given 2 strings, find the minimum number of operations required to change one string to the other. An operation is defined as either replacement of a character, deletion of a character, or addition of a character. This problem is also called Levenshtein Distance. Levenshtein distance is a string metric for measuring the difference between two sequences.
Levenshtein Distance can be solved with the help of dynamic programming.
Here, we are going to create a 2D array / matrix called edits
, this matrix will hold the minimum number of substitutions, deletions and additions of creating specific substrings.
In the given image, empty strings are added and those will always contain 0 values.
Looking at row[0][0]
- The question is:
How many operations are required to convert an
empty string
to anempty string
, it will always be 0.
Looking at row[0][1]
- The question is:
How many operations are required to convert an
empty
string to the charactery
- 1 operation ofaddition
is required.
and so on..
This is how we are going to create the first row and the first column.
Let's look at row[1][1]
:
How many operations are required to convert
a
toy
- 1 substitute operation.
Let's look at row[1][2]
:
How many operations are required to convert
a
toya
- It is 1, because we can either removea
or adda
.
Keeping in mind that str[i] === str[j]
here, we take the previous value we encounter and update the current value at index i
- From here, we can devise an algorithm for Dynamic Programming.
Finally, the matrix looks like this:
The solution is edits[str2.length][str1.length]
- This number will be the minimum number of operations required to change one string to the other.
Pseudo code
if (str[i] === str[j]) {
edits[i][j] = edits[i-1][j-1] // Take the previous value
} else {
// Take 1 + Minimum value of neighbouring index.
// We add 1 because atleast one addition or deletion
// operation will be required.
}
O(mn)
where m = length of first string, n = length of second string. Because we are traversing a 2D array.O(mn)
where m = length of first string, n = length of second string. Because we are strong the 2D array / creating it.