System and method for bit-map based keyword spotting in communication traffic
First Claim
1. A method for searching a text that comprises a sequence of symbols for occurrences of a pattern of the symbols, using a sequence of evaluations at respective positions in the text, the method comprising:
- performing a first evaluation of whether the pattern occurs in a current position in the text, wherein the performing comprises;
positioning the pattern so as to end at the current position in the text,comparing the symbols of the pattern to the corresponding symbols of the text while progressing from an end to a beginning of the pattern, andupon finding a mismatch between the pattern and the text, calculating the jump to the subsequent position depending at least on the mismatch;
if the pattern does not occur in the current position, calculating a size of a jump to a subsequent position in the text based on results of the first evaluation; and
performing a second evaluation of whether the pattern occurs in the subsequent position in the text.
3 Assignments
0 Petitions
Accused Products
Abstract
Methods and systems for locating occurrences of a search pattern in a body of text. A processor searches the text for one or more occurrences of a pattern. Both the text and the pattern comprise symbols of some alphabet. In preparation for the search, the processor defines a respective bit-map for each alphabet symbol. Using the bit-maps, the processor carries out a highly efficient process of searching the text for occurrences of the pattern. The processor then scans the pattern backwards using the bit-maps, symbol by symbol, attempting to match the symbols of the pattern to the corresponding symbols of the text. If a match is not found, the processor calculates the size of the jump to the next position in the text based on the accumulated results of the evaluations up to the current position.
-
Citations
18 Claims
-
1. A method for searching a text that comprises a sequence of symbols for occurrences of a pattern of the symbols, using a sequence of evaluations at respective positions in the text, the method comprising:
-
performing a first evaluation of whether the pattern occurs in a current position in the text, wherein the performing comprises; positioning the pattern so as to end at the current position in the text, comparing the symbols of the pattern to the corresponding symbols of the text while progressing from an end to a beginning of the pattern, and upon finding a mismatch between the pattern and the text, calculating the jump to the subsequent position depending at least on the mismatch; if the pattern does not occur in the current position, calculating a size of a jump to a subsequent position in the text based on results of the first evaluation; and performing a second evaluation of whether the pattern occurs in the subsequent position in the text. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. Apparatus for searching a text that comprises a sequence of symbols for occurrences of a pattern of the symbols, using a sequence of evaluations at respective positions in the text, the apparatus comprising:
-
a memory, which is configured to store the text; and a processor, which is configured to; hold the pattern to be found in the text, perform a first evaluation of whether the pattern occurs in a current position in the text by positioning the pattern so as to end at the current position in the text and comparing the symbols of the pattern to the corresponding symbols of the text while progressing from an end to a beginning of the pattern, and, upon finding a mismatch between the pattern and the text, to calculate the jump to the subsequent position depending at least on the mismatch, calculate, if the pattern does not occur in the current position, a size of a jump to a subsequent position in the text based on results of the first evaluation, and perform a second evaluation of whether the pattern occurs in the subsequent position in the text. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
Specification