System and method for keyword spotting using representative dictionary
First Claim
1. A method comprising operations performed by a computer, the operations comprising:
- holding a first dictionary comprising first textual phrases for searching in data;
- deriving from the first dictionary a second dictionary, which comprises second textual phrases and has a smaller data size than the first dictionary, such that occurrence of any of the first textual phrases in the data corresponds to the occurrence of at least one of the second textual phrases in the data;
- searching input data with the second dictionary; and
- in response to identifying in the input data a second textual phrase from the second dictionary, locating in the input data a first textual phrase from the first dictionary corresponding to the identified second textual phrase.
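The claim does not say how the second dictionary is derived, so the following is only a minimal sketch under one assumption: each second phrase is a short substring shared by one or more first phrases, chosen greedily. The function names `derive_second_dictionary` and `spot`, the substring length `k`, and the use of plain `str.find` in place of a real multi-pattern matcher are all hypothetical choices made for illustration, not the patented method.

```python
from collections import defaultdict


def derive_second_dictionary(first_phrases, k=4):
    """Greedily pick substrings (length k, an assumed parameter) so that
    every first phrase contains at least one chosen substring."""
    # Map each candidate substring to the first phrases that contain it.
    containing = defaultdict(set)
    for phrase in first_phrases:
        if len(phrase) < k:
            containing[phrase].add(phrase)  # short phrases represent themselves
            continue
        for i in range(len(phrase) - k + 1):
            containing[phrase[i:i + k]].add(phrase)

    uncovered = set(first_phrases)
    second = {}  # second phrase -> first phrases it stands for
    while uncovered:
        # Take the substring covering the most uncovered phrases;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(containing), key=lambda s: len(containing[s] & uncovered))
        covered = containing[best] & uncovered
        second[best] = sorted(covered)
        uncovered -= covered
    return second


def spot(data, second):
    """Search data with the second dictionary; on each hit, look for the
    corresponding first phrases around the hit position."""
    matches = set()
    for rep, candidates in second.items():
        start = data.find(rep)
        while start != -1:
            for phrase in candidates:
                # Any occurrence of phrase that contains this hit lies in [lo, hi).
                lo = max(0, start - (len(phrase) - len(rep)))
                hi = start + len(phrase)
                pos = data.find(phrase, lo, hi)
                while pos != -1:
                    matches.add((pos, phrase))
                    pos = data.find(phrase, pos + 1, hi)
            start = data.find(rep, start + 1)
    return sorted(matches)


first = ["alice@example.com", "bob@example.com", "carol@example.org"]
second = derive_second_dictionary(first)
hits = spot("mail from bob@example.com to dave@example.net", second)
```

In this sketch the second dictionary is never larger than the first, because each chosen substring stands for at least one first phrase that is at least as long. It is strictly smaller only when some phrases share a substring or exceed `k` characters. A hit on a second phrase is just a candidate: `dave@example.net` triggers the shared substring but matches no first phrase, so the second stage discards it.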
Abstract
Methods and systems for keyword spotting, i.e., for identifying textual phrases of interest in input data. In the embodiments described herein, the input data comprises communication packets exchanged in a communication network. The disclosed keyword spotting techniques can be used, for example, in applications such as Data Leakage Prevention (DLP), Intrusion Detection Systems (IDS) or Intrusion Prevention Systems (IPS), and spam e-mail detection. A keyword spotting system holds a dictionary of textual phrases for searching input data. In a communication analytics system, for example, the dictionary defines textual phrases to be located in communication packets—such as e-mail addresses or Uniform Resource Locators (URLs).
20 Claims
1. A method comprising operations performed by a computer, the operations comprising:
- holding a first dictionary comprising first textual phrases for searching in data;
- deriving from the first dictionary a second dictionary, which comprises second textual phrases and has a smaller data size than the first dictionary, such that occurrence of any of the first textual phrases in the data corresponds to the occurrence of at least one of the second textual phrases in the data;
- searching input data with the second dictionary; and
- in response to identifying in the input data a second textual phrase from the second dictionary, locating in the input data a first textual phrase from the first dictionary corresponding to the identified second textual phrase.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. Apparatus, comprising:
- a memory, which is configured to hold a first dictionary comprising first textual phrases for searching in data; and
- a processor, which is configured to derive from the first dictionary a second dictionary, which comprises second textual phrases and has a smaller data size than the first dictionary, such that occurrence of any of the first textual phrases in the data corresponds to the occurrence of at least one of the second textual phrases in the data, to search input data with the second dictionary, and, in response to identifying in the input data a second textual phrase from the second dictionary, to locate in the input data a first textual phrase from the first dictionary corresponding to the identified second textual phrase.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification