search

Found

info Overview

Build the KMP failure-function (LPS) table, list every match index, and compare the linear-time KMP comparison count with a naive search on the same text.

📘 How to Use

  1. Type the substring you want to find into the Pattern field
  2. Paste or type the text you want to scan into the Haystack field
  3. Read the match count, KMP compares, naive compares, and speedup ratio
  4. Inspect the LPS failure table and the list of match start indexes

KMP Pattern Search Step Counter

String to search for. No length limit, shorter is faster.

Text to scan. Character-level search; whitespace and symbols are significant.

Matches
0
KMP compares
0
Naive compares
0
Speedup

LPS[i] = length of the longest proper prefix of p[0..i] that is also a suffix; used to skip on mismatch.

※ KMP is O(n+m); naive matching is O(n·m) in the worst case. LPS preprocessing is O(m).

※ A step is one character comparison with the text. LPS fallback moves are not counted as steps.

Article

KMP Pattern Search Step Counter | See the Failure Table and Step Savings

Watch the Knuth-Morris-Pratt algorithm work on your own text. This tool builds the LPS failure function table for your pattern, runs the linear-time KMP scan, and counts how many character comparisons KMP needs versus a naive nested-loop search on the same input.

💡 About this tool

Most KMP explanations stop at "it's O(n+m), trust me." That leaves a gap: you can recite the complexity bound but still not feel where the savings come from, or why the failure function matters. When you sit down to implement KMP for a class assignment or a coding interview, the LPS table is exactly the part that trips people up.

This counter closes that gap with concrete numbers from your input. Type any pattern and text, and you see the LPS array row by row (i, p[i], lps[i]), the exact match positions, and the head-to-head comparison count. Because both the KMP and naive figures come from the same haystack, you can find the break-even point yourself: on random text the gap is small, but on a pattern like aaaab against a long run of as, naive search blows up while KMP stays linear. That contrast is the whole point of the algorithm, and here you can produce it on demand.

🧐 Frequently Asked Questions

What does the LPS table actually store? LPS[i] is the length of the longest proper prefix of p[0..i] that is also a suffix of p[0..i]. On a mismatch, KMP jumps the pattern pointer back to LPS[j-1] instead of restarting at zero, which is what removes the redundant comparisons.

How is a "compare" counted here? One step equals one comparison between a pattern character and a text character. The pointer rewinds that KMP does through the LPS table are not counted as steps, because they touch no text character. Naive search counts every inner-loop character comparison.

Why is the speedup sometimes barely above 1x? On natural-language or random text, mismatches usually happen on the first character, so naive search rarely backtracks far. KMP's advantage shows up on patterns with long repeated prefixes against adversarial text, where naive search re-scans the same characters many times.

Does it find overlapping matches? Yes. After a full match KMP resets j to LPS[m-1], so overlapping occurrences (like aa inside aaaa) are all reported. Match indexes are 0-based.

Are whitespace and symbols significant? Yes. The search runs at the character level, so spaces, line breaks, and punctuation are matched literally. There is no length limit on either field, though shorter patterns build their LPS table faster.

📚 Fun facts

KMP was published in 1977 by Donald Knuth, James Morris, and Vaughan Pratt, and the story goes that Morris discovered the linear-time idea first while building a text editor, before Knuth formalized it via automata theory. The failure function is essentially a tiny finite-state machine compiled from the pattern. The same prefix-function idea reappears far from string search: it powers competitive-programming tricks like counting the distinct borders of a string or finding the shortest period of a repeating sequence.