REAL TIME SINGLE-SWEEP DETECTION OF KEY WORDS AND CONTENT ANALYSIS
First Claim
1. A method for content analysis of a text, the analysis depending on a presence of a plurality of keywords in the text, the method comprising:
A) providing an incidence database including at least a first counter and a second counter, said first counter for indicating an incidence of a first keyword of the plurality of keywords, said second counter for indicating an incidence of a second keyword of the plurality of keywords;
B) supplying a detection tree including a plurality of branches, each branch of said plurality of branches matching a string of characters, and said detection tree also including a plurality of sites on said plurality of branches, each site associated with a keyword of the plurality of keywords;
said plurality of sites including a first site associated with said first keyword and a second site associated with said second keyword;
C) reading at least one character from the text;
D) selecting from said plurality of branches, a current branch matching said at least one character;
E) reading at least one more character from the text;
F) selecting a sub-branch of said current branch, said sub-branch of said current branch matching said at least one more character;
G) reaching said first site;
H) incrementing said first counter upon said reaching said first site;
I) updating a score upon said incrementing, said updating being dependent on a condition, said condition including a limitation on a value of said first counter and said condition also including a limitation on a value of said second counter.
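Steps A) through I) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `build_tree`, `sweep`, `cap` and `weight`, the dictionary-based trie, and the particular condition (first counter at most `cap`, second counter equal to zero) are all assumptions chosen for illustration. On a mismatch the sketch simply restarts from the root, so it can miss some overlapping matches; a fuller version might add failure links in the style of Aho-Corasick.

```python
from collections import Counter

def build_tree(keywords):
    """Step B: build a character trie; the "site" key marks where a keyword ends."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        # "site" is longer than one character, so it cannot collide with a branch key.
        node["site"] = word
    return root

def sweep(text, root, first_kw, second_kw, cap=2, weight=1):
    """Steps C-I: read the text once, count keyword hits, update the score conditionally."""
    counters = Counter()          # step A: hypothetical incidence database
    score = 0
    node = root
    for ch in text:               # steps C and E: read one character at a time
        nxt = node.get(ch)        # steps D and F: pick the matching (sub-)branch
        if nxt is None:
            nxt = root.get(ch)    # assumption: on mismatch, restart from the root
        node = nxt if nxt is not None else root
        site = node.get("site")
        if site is not None:      # step G: a site was reached
            counters[site] += 1   # step H: increment that keyword's counter
            # Step I: assumed condition limiting both counters.
            if (site == first_kw
                    and counters[first_kw] <= cap
                    and counters[second_kw] == 0):
                score += weight
    return score, counters

tree = build_tree(["loan", "river"])
score, counters = sweep("loan loan loan river loan", tree, "loan", "river")
print(score, dict(counters))
```

Here only the first two occurrences of "loan" raise the score: the third exceeds the assumed cap, and the fourth arrives after "river" has been counted.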
Abstract
A system and method are provided for real-time analysis of text. During a single sweep through the text, a detection tree is used to identify a sequence of characters in the text from a large dictionary of keywords. When a keyword is detected, a rule tally database is updated. An intermediate score may be available during the sweep, and a final score of the text may be available substantially immediately upon finishing the single sweep. A second text may be analyzed immediately using the same score buffer and rule tally database without updating the rule tally database.
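The idea of an intermediate score that is available during the sweep can be sketched as a Python generator. This is an illustration under stated assumptions, not the system described in the specification: `make_tree`, `running_scores`, the `(condition, points)` rule format, and the sample rules are all hypothetical, and the sketch recomputes the score from the tallies at every detection for simplicity.

```python
def make_tree(keywords):
    """Build a character trie; the "site" key marks the end of a keyword."""
    root = {}
    for kw in keywords:
        node = root
        for ch in kw:
            node = node.setdefault(ch, {})
        node["site"] = kw
    return root

def running_scores(text, tree, rules):
    """Yield (index, keyword, score) each time a keyword is detected.

    rules is an assumed format: a list of (condition, points) pairs, where
    condition is a function of the tally dictionary.
    """
    tallies = {}                      # hypothetical rule tally store
    node = tree
    for i, ch in enumerate(text):     # single pass over the text
        nxt = node.get(ch)
        if nxt is None:
            nxt = tree.get(ch)        # assumption: restart from the root on mismatch
        node = nxt if nxt is not None else tree
        kw = node.get("site")
        if kw is not None:
            tallies[kw] = tallies.get(kw, 0) + 1
            # Intermediate score: sum the points of every rule currently satisfied.
            score = sum(points for cond, points in rules if cond(tallies))
            yield i, kw, score

tree = make_tree(["free", "prize", "invoice"])
rules = [
    (lambda t: t.get("free", 0) >= 2, 5),
    (lambda t: t.get("prize", 0) >= 1 and t.get("invoice", 0) == 0, 3),
]
for index, keyword, score in running_scores("free offer, free prize", tree, rules):
    print(index, keyword, score)
```

The last value yielded is the final score, so it is available as soon as the sweep reaches the end of the text.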
15 Citations
24 Claims
1. A method for content analysis of a text, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system for content analysis of a text, the analysis is depending on presence of a plurality of keywords in the text, the system comprising:
A) an incidence database including
a) a first counter configured for indicating an incidence of a first keyword of the plurality of keywords;
b) a second counter configured for indicating an incidence of a second keyword of the plurality of keywords;
B) a detection tree including;
a) a plurality of branches, each branch of said plurality of branches matching a string of characters;
b) a plurality of sites on said plurality of branches, each site of said plurality of sites associated with a keyword of the plurality of keywords;
c) said plurality of sites including a first site associated with said first keyword;
wherein said detection tree is configured for navigating to reach said first site upon reading said first keyword in the text and wherein said first counter is configured for incrementing upon said reaching said first site;
C) a score buffer configured for updating upon said incrementing, said updating being dependent on a condition, said condition including a limitation on a value of said first counter and said condition also including a limitation on a value of said second counter.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
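The three components of the system claim (incidence database, detection tree, score buffer) and the way each one triggers the next can be sketched as small Python classes. The class names, method names, and the sample condition are assumptions made for illustration. The sketch also updates the score buffer on every counter increment rather than only on the first counter's, which is a simplification.

```python
class ScoreBuffer:
    """Holds the score; adds points whenever the assumed condition holds."""
    def __init__(self, condition, points):
        self.value = 0
        self.condition = condition
        self.points = points

    def update(self, counts):
        if self.condition(counts):
            self.value += self.points

class IncidenceDatabase:
    """One counter per keyword; each increment notifies the score buffer."""
    def __init__(self, keywords, score_buffer):
        self.counts = {kw: 0 for kw in keywords}
        self.score_buffer = score_buffer

    def increment(self, keyword):
        self.counts[keyword] += 1
        self.score_buffer.update(self.counts)

class DetectionTree:
    """Character trie whose sites increment the database when reached."""
    def __init__(self, keywords, database):
        self.root = {}
        self.database = database
        for kw in keywords:
            node = self.root
            for ch in kw:
                node = node.setdefault(ch, {})
            node["site"] = kw
        self.node = self.root

    def feed(self, ch):
        nxt = self.node.get(ch)
        if nxt is None:
            nxt = self.root.get(ch)   # assumption: restart from the root on mismatch
        self.node = nxt if nxt is not None else self.root
        site = self.node.get("site")
        if site is not None:
            self.database.increment(site)

keywords = ["virus", "vaccine"]
# Assumed condition: limits on both counters.
buffer = ScoreBuffer(lambda c: c["virus"] >= 2 and c["vaccine"] <= 1, points=10)
database = IncidenceDatabase(keywords, buffer)
tree = DetectionTree(keywords, database)
for ch in "virus virus vaccine virus":
    tree.feed(ch)
print(buffer.value, database.counts)
```

Wiring the components through method calls keeps the chain explicit: reaching a site increments a counter, and the increment prompts the score buffer to re-check its condition.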
Specification