Use StringUtils.difference(), StringUtils.indexOfDifference(), and StringUtils.getLevenshteinDistance(). StringUtils.difference() returns the portion of the second string that differs from the first, StringUtils.indexOfDifference() returns the index at which two strings begin to differ, and StringUtils.getLevenshteinDistance() returns the “edit distance” between two strings. The following example demonstrates all three of these methods:
int dist = StringUtils.getLevenshteinDistance( "Word", "World" );
String diff = StringUtils.difference( "Word", "World" );
int index = StringUtils.indexOfDifference( "Word", "World" );
System.out.println( "Edit Distance: " + dist );
System.out.println( "Difference: " + diff );
System.out.println( "Diff Index: " + index );
This code compares the strings “Word” and “World,” producing the following output:
Edit Distance: 1
Difference: ld
Diff Index: 3
StringUtils.difference() returns the portion of the second string that begins where it first differs from the first string. StringUtils.indexOfDifference() returns the index at which the second string starts to diverge from the first. The difference between “ABC” and “ABE” is “E,” and the index of the difference is 2. Here’s a more complex example:
String a = "Strategy";
String b = "Strategic";
String difference = StringUtils.difference( a, b );
int differenceIndex = StringUtils.indexOfDifference( a, b );
System.out.println( "difference(Strategy, Strategic) = " + difference );
System.out.println( "index(Strategy, Strategic) = " + differenceIndex );

a = "The Secretary of the UN is Kofi Annan.";
b = "The Secretary of State is Colin Powell.";
difference = StringUtils.difference( a, b );
differenceIndex = StringUtils.indexOfDifference( a, b );
System.out.println( "difference(..., ...) = " + difference );
System.out.println( "index(..., ...) = " + differenceIndex );
This produces the following output, showing the differences between two strings:
difference(Strategy, Strategic) = ic
index(Strategy, Strategic) = 7
difference(..., ...) = State is Colin Powell.
index(..., ...) = 17
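To make the relationship between the two methods concrete, here is a minimal plain-Java sketch of the same comparisons. It is not the Commons Lang source: the class name DifferenceSketch is hypothetical, and the sketch assumes both arguments are non-null, whereas the library methods also accept null.

```java
// Illustrative sketch only; not the Commons Lang implementation.
// Assumes both arguments are non-null.
final class DifferenceSketch {

    // Returns the first index at which a and b differ, or -1 if they are equal.
    static int indexOfDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // Either the strings are identical, or one is a prefix of the other.
        return a.length() == b.length() ? -1 : limit;
    }

    // Returns the tail of b, starting where b first differs from a.
    static String difference(String a, String b) {
        int index = indexOfDifference(a, b);
        return index == -1 ? "" : b.substring(index);
    }

    public static void main(String[] args) {
        System.out.println(indexOfDifference("ABC", "ABE")); // 2
        System.out.println(difference("ABC", "ABE"));        // E
        System.out.println(indexOfDifference("ABC", "ABC")); // -1
    }
}
```

Note that the result depends on argument order: the returned text always comes from the second string, so swapping the arguments returns the tail of the other string.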
The Levenshtein distance is the minimum number of single-character insertions, deletions, and replacements it takes to get from one string to another. The distance between “Boat” and “Coat” is 1, a one-letter replacement, and the distance between “Remember” and “Alamo” is 7: four letter replacements and three deletions, with one “m” left unchanged. Levenshtein distance is also known as the edit distance, which is the number of changes one needs to make to get from string A to string B. The following example demonstrates the getLevenshteinDistance() method:
int distance1 = StringUtils.getLevenshteinDistance( "Boat", "Coat" );
int distance2 = StringUtils.getLevenshteinDistance( "Remember", "Alamo" );
int distance3 = StringUtils.getLevenshteinDistance( "Steve", "Stereo" );
System.out.println( "distance(Boat, Coat): " + distance1 );
System.out.println( "distance(Remember, Alamo): " + distance2 );
System.out.println( "distance(Steve, Stereo): " + distance3 );
This produces the following output, showing the Levenshtein (or edit) distance between various strings:
distance(Boat, Coat): 1
distance(Remember, Alamo): 7
distance(Steve, Stereo): 2
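For readers curious how an edit distance can be computed, the following is a minimal sketch of the standard dynamic-programming approach, keeping only two rows of the table. It is an illustration under stated assumptions, not the Commons Lang implementation: the class name LevenshteinSketch and the method name distance are hypothetical, and both arguments are assumed to be non-null.

```java
// Illustrative sketch only; not the Commons Lang implementation.
// Assumes both arguments are non-null.
final class LevenshteinSketch {

    // Returns the minimum number of single-character insertions,
    // deletions, and replacements needed to turn s into t.
    static int distance(String s, String t) {
        // prev[j] holds the distance between the first i-1 chars of s
        // and the first j chars of t; curr[j] is the row being filled.
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];

        // Turning an empty prefix of s into j chars of t takes j insertions.
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= s.length(); i++) {
            // Turning i chars of s into an empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int replacement = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), replacement);
            }
            // Reuse the two arrays for the next row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("Boat", "Coat"));      // 1
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```

The sketch uses time proportional to the product of the two string lengths and memory proportional to the length of the second string, which is usually acceptable for short inputs such as words in a spelling checker.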
The Levenshtein distance has a number of different applications, including pattern recognition and correcting spelling mistakes. For more information about the Levenshtein distance, see http://www.merriampark.com/ld.htm, which explains the algorithm and provides links to implementations of this algorithm in 15 different languages.