Overview

Computes the Levenshtein edit distance between two strings and reports the insertion, deletion, and substitution counts, a similarity percentage, and a normalized distance between 0 and 1.

📘 How to Use

  1. Enter the two strings you want to compare into String A and String B
  2. Toggle Case sensitive and Trim whitespace to match your data
  3. Read the edit distance, similarity, normalized distance, and the insert/delete/substitute breakdown

Levenshtein Distance Calculator

Edit Distance: 0 edits
Similarity: 100.0 %
Normalized Distance: 0.000 (0-1)

Operations Breakdown

Insertions: 0
Deletions: 0
Substitutions: 0

※ Algorithm: Standard Levenshtein distance via dynamic programming (insertion, deletion, substitution each cost 1).
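
The recurrence behind this note can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's own source: the function name `levenshtein` is an assumption, and Python compares Unicode code points, which may differ from how the tool counts characters.

```python
def levenshtein(a: str, b: str) -> int:
    """Fewest unit-cost insertions, deletions, and substitutions turning a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    # dp[i][j] = distance between the first i chars of a and the first j chars of b
    dp = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dp[i][0] = i  # delete all i characters of a
    for j in range(cols):
        dp[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
    return dp[-1][-1]


print(levenshtein("kitten", "sitting"))  # 3
```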

※ Limit: up to 2000 characters per string. A backtrace through the table separates the insertion, deletion, and substitution counts along one optimal path.
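
One way to get that breakdown is to fill the full table and then walk from the bottom-right cell back to the top-left. The sketch below assumes a tie-breaking order (substitution, then deletion, then insertion) that the tool may not share, and `edit_breakdown` is a hypothetical name. Counts describe turning string A into string B.

```python
def edit_breakdown(a: str, b: str) -> dict:
    """Distance plus insert/delete/substitute counts along one optimal path."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)

    counts = {"distance": dp[n][m], "insert": 0, "delete": 0, "substitute": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1              # characters match, no edit
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            counts["substitute"] += 1        # assumed tie-break: prefer substitution
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            counts["delete"] += 1            # character of a is dropped
            i -= 1
        else:
            counts["insert"] += 1            # character of b is added
            j -= 1
    return counts


print(edit_breakdown("kitten", "sitting"))
# {'distance': 3, 'insert': 1, 'delete': 0, 'substitute': 2}
```

Because the backtrace needs every cell, this version keeps the whole table in memory instead of only the last two rows.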

Levenshtein Distance Calculator | Edit Distance With Insert, Delete, and Substitute Counts

A calculator that returns how different two strings are as a minimum edit distance, then breaks that distance down into insertions, deletions, and substitutions alongside a similarity percent and a 0–1 normalized value.

💡 About This Tool

How many edits turn "kitten" into "sitting"? The answer is 3, and when you build spell-checkers, fuzzy search, or autocorrect, that minimum-edit count is the number you compare against your matching threshold. Many online calculators stop at the bare distance, so you never see whether the gap is mostly substitutions or mostly insertions and deletions, which is exactly what you need to know when you are tuning a matcher.

Levenshtein distance counts insertions, deletions, and substitutions as one edit each, and returns the fewest edits that transform one string into another. This tool computes the distance with dynamic programming, then walks back along an optimal path to separate the insertion, deletion, and substitution counts. When several optimal paths exist, the total stays the same but the split between operations can depend on which path the backtrace follows. You can paste real strings from your own data and watch how the breakdown shifts, so you can pick a sensible cutoff before wiring it into code. Each string can be up to 2000 characters long.
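
Once you have a cutoff in mind, wiring it into code can look roughly like the sketch below. Everything here is illustrative: `is_fuzzy_match` and its `max_normalized` parameter are hypothetical names, and 0.3 is an arbitrary example threshold, not a recommendation from this tool.

```python
def levenshtein(a: str, b: str) -> int:
    """Two-row dynamic programming; enough when only the distance is needed."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def is_fuzzy_match(a: str, b: str, max_normalized: float = 0.3) -> bool:
    """True when distance / longer length is at or below the cutoff (hypothetical helper)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return True  # assumption: two empty strings count as a match
    return levenshtein(a, b) / longest <= max_normalized


print(is_fuzzy_match("colour", "color"))    # True  (1 / 6 is about 0.17)
print(is_fuzzy_match("kitten", "sitting"))  # False (3 / 7 is about 0.43)
```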

🧐 Frequently Asked Questions

How is the similarity percent calculated? It is (1 - distance / max(lenA, lenB)) * 100. Identical strings score 100%, and the score drops toward 0% as the distance approaches the length of the longer string.
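
As a quick check of that formula, here is a small sketch. The function name `similarity_percent` is an assumption, and returning 100 for two empty strings is a choice made here to avoid dividing by zero.

```python
def similarity_percent(distance: int, len_a: int, len_b: int) -> float:
    """(1 - distance / max(lenA, lenB)) * 100, with two empty strings treated as identical."""
    longest = max(len_a, len_b)
    if longest == 0:
        return 100.0  # assumption: avoids division by zero for empty input
    return (1 - distance / longest) * 100


# "kitten" (6 chars) vs "sitting" (7 chars) has distance 3
print(round(similarity_percent(3, 6, 7), 1))  # 57.1
```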

What is the normalized distance for? It is the edit distance divided by the longer string's length, giving a 0–1 value. Use it to compare pairs of different lengths fairly: a distance of 3 means something different across a 7-character pair versus a 70-character one.
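
The 7-versus-70 comparison works out as follows. This is a sketch with a hypothetical helper name, `normalized_distance`, and it treats two empty strings as distance 0 by assumption.

```python
def normalized_distance(distance: int, len_a: int, len_b: int) -> float:
    """Edit distance divided by the longer string's length, in the range 0 to 1."""
    longest = max(len_a, len_b)
    return distance / longest if longest else 0.0  # assumption for empty input


# The same raw distance of 3 on a short pair and on a long pair
print(round(normalized_distance(3, 7, 7), 3))    # 0.429
print(round(normalized_distance(3, 70, 70), 3))  # 0.043
```

The short pair differs in almost half of its characters, while the long pair is nearly identical, even though both report 3 edits.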

When should I use Levenshtein instead of Jaro-Winkler? Levenshtein weights every character edit equally, which suits addresses and long company names. Jaro-Winkler rewards a matching prefix and suits short personal names, so pick the metric that matches your data's shape.

How is this different from Damerau-Levenshtein? Plain Levenshtein counts a swap of two adjacent characters (a transposition) as two edits. If you want "conversion" vs "convresion" treated as a single typo, use a Damerau-Levenshtein variant that allows transpositions.
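
To see the difference on that example, the sketch below adds a transposition case to the usual recurrence. It implements the optimal string alignment variant (sometimes called restricted Damerau-Levenshtein), which is one common choice and not something this tool provides; the function name and the `allow_transposition` flag are assumptions.

```python
def edit_distance(a: str, b: str, allow_transposition: bool = False) -> int:
    """Levenshtein distance, optionally counting an adjacent swap as one edit (OSA variant)."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
            # Extra case: the last two characters are the same pair in swapped order
            if (allow_transposition and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                dp[i][j] = min(dp[i][j], dp[i - 2][j - 2] + 1)
    return dp[n][m]


print(edit_distance("conversion", "convresion"))                            # 2
print(edit_distance("conversion", "convresion", allow_transposition=True))  # 1
```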

How are case and emoji handled? Turning off Case sensitive treats "A" and "a" as equal. Comparison runs on UTF-16 code units, so an emoji stored as a surrogate pair is counted internally as two units, which can inflate the distance for emoji-heavy text.
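
The effect is easy to reproduce. Python strings are sequences of code points, so the sketch below converts them to UTF-16 code units by hand to mimic the behavior described above. The helper names are hypothetical, and using `lower()` and `strip()` for the two toggles is an assumption about how the tool normalizes input.

```python
def levenshtein(a, b) -> int:
    """Works on any two sequences: strings, or lists of code units."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def utf16_units(s: str) -> list:
    """Split a string into UTF-16 code units (two bytes each)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def normalize(s: str, case_sensitive: bool = True, trim: bool = True) -> str:
    """Assumed preprocessing for the Case sensitive and Trim whitespace toggles."""
    if trim:
        s = s.strip()
    return s if case_sensitive else s.lower()


a, b = "ok \U0001F600", "ok"  # U+1F600 is a grinning-face emoji
print(levenshtein(a, b))                            # 2 when counting code points
print(levenshtein(utf16_units(a), utf16_units(b)))  # 3 when counting UTF-16 code units
print(levenshtein(normalize("Kitten", case_sensitive=False),
                  normalize("kitten ", case_sensitive=False)))  # 0
```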

What happens past 2000 characters? Input is clipped to 2000 characters per string. The dynamic-programming step grows with the product of the two lengths, so the cap keeps the calculation practical inside a browser.
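
A rough sense of the numbers, as a sketch: the table has one cell per pair of prefixes, so two strings at the cap need about four million cells. The names `MAX_CHARS`, `clip`, and `table_cells` are hypothetical, and Python slices by code point, which may not match how the tool counts characters.

```python
MAX_CHARS = 2000  # per-string cap stated above


def clip(s: str, limit: int = MAX_CHARS) -> str:
    """Keep at most `limit` characters from the start of the string."""
    return s[:limit]


def table_cells(a: str, b: str) -> int:
    """Cells in the full DP table, including the empty-prefix row and column."""
    return (len(a) + 1) * (len(b) + 1)


a = clip("x" * 5000)
b = clip("y" * 2000)
print(len(a), table_cells(a, b))  # 2000 4004001
```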

📚 Fun Facts

Soviet mathematician Vladimir Levenshtein defined this distance in 1965 in a paper on error-correcting codes, long before it became a staple of spell-checkers, DNA sequence alignment, and OCR correction, and a close relative of the edit-script algorithms behind diff tools in version control. A neat property: the distance is always at least the difference in lengths and never more than the longer string's length, which is why the normalized value stays cleanly between 0 and 1. When you debug a fuzzy matcher, reading the operation breakdown often tells you more than the raw number: a pile of substitutions points to mistyped or replaced characters, while lopsided insertions or deletions usually mean one string simply has extra content.
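
If you want to convince yourself of the bounds, a brute-force check over a handful of words is enough. This is only a sketch, and the word list and the name `within_bounds` are arbitrary choices made here.

```python
import itertools


def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def within_bounds(a: str, b: str) -> bool:
    """|len(a) - len(b)| <= distance <= max(len(a), len(b))"""
    d = levenshtein(a, b)
    return abs(len(a) - len(b)) <= d <= max(len(a), len(b))


words = ["", "a", "kitten", "sitting", "conversion", "convresion"]
print(all(within_bounds(a, b) for a, b in itertools.product(words, repeat=2)))  # True
```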