Data disambiguation systems and methods

US 7,565,368 B2
Filed: 05/04/2004
Issued: 07/21/2009
Est. Priority Date: 05/04/2004
Status: Active Grant

8 Assignments

0 Petitions

Abstract

Various embodiments provide a state-based, regular expression parser in which data, such as generally unstructured text, is received into the system and undergoes a tokenization process which permits structure to be imparted to the data. Tokenization of the data effectively enables various patterns in the data to be identified. In some embodiments, one or more components can utilize stimulus/response paradigms to recognize and react to patterns in the data.
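
The flow described in the abstract can be illustrated with a minimal sketch, which is not the patented implementation. It assumes two hypothetical matching rules (`NUMBER`, `WORD`) and a hypothetical one-entry knowledge base; none of these names come from the patent. Unstructured text is tokenized with regular expressions, and the resulting symbol sequence acts as a stimulus that is compared against stored cases to select a response.

```python
import re

# Hypothetical matching rules: (regular expression, output symbol).
RULES = [
    (re.compile(r"\d+"), "NUMBER"),
    (re.compile(r"[A-Za-z]+"), "WORD"),
]


def tokenize(text):
    """Scan left to right and emit (symbol, lexeme) pairs for matched spans."""
    tokens, pos = [], 0
    while pos < len(text):
        for pattern, symbol in RULES:
            m = pattern.match(text, pos)
            if m:
                tokens.append((symbol, m.group()))
                pos = m.end()
                break
        else:
            pos += 1  # no rule matched at this position; skip one character
    return tokens


# Hypothetical knowledge base: each case (a symbol sequence) has a response.
KNOWLEDGE_BASE = [
    (("WORD", "NUMBER"), "looks like an item followed by a quantity"),
]


def respond(tokens):
    """Return the response of the first case matching the token symbols."""
    symbols = tuple(symbol for symbol, _ in tokens)
    for case, response in KNOWLEDGE_BASE:
        if symbols == case:
            return response
    return None


print(respond(tokenize("order 66")))
```

Under these assumptions, `tokenize("order 66")` yields a `WORD` token followed by a `NUMBER` token, which matches the single stored case and triggers its response.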

12 Claims

1. A computer-implemented method comprising:
- receiving text with a computer system comprising a computer-readable medium configured with a functional presence engine, the functional presence engine configured as a probabilistic parser;
  
  performing with the computer system, lexical analysis on the text effective to tokenize text portions to produce tokenized content in a format specified in one or more interpreted lexical files specifying one or more matching rules and corresponding output symbols; and
  
  with a computer system configured with a knowledge base component operably associated with the functional presence engine, defining:
  
  cases of text matchable to text received by the functional presence engine; and
  
  responses that are triggered in an event of a match, wherein individual lexical files comprise a macro section that specifies macro values that are substitutable for macro names, and a lex section that specifies lexical rewrite rules, and wherein the lex section comprises a main section that contains rules that are executed at a top level of a tokenization process, and a sub-section associated with a rule in the main section, the sub-section containing a group of rules that get executed only if the associated main section rule produces the best match.
  - 2. The computer-implemented method of claim 1, comprising, with the functional presence engine configured to use a lexical analysis program to process text, producing tokenized text portions in accordance with the one or more matching rules specified by the one or more lexical files.
  - 3. The computer-implemented method of claim 1, wherein the one or more rules are specified as regular expressions.
  - 4. The computer-implemented method of claim 3, wherein the functional presence engine is configured to attempt to match all regular expressions and then select the rule that produces the best match.
  - 5. The computer-implemented method of claim 3, wherein the functional presence engine selects a first successfully matched rule to determine which output symbol will be utilized.
  - 6. The computer-implemented method of claim 1, wherein the lexical analysis program is configured to select a rule that produces a best match and responsive thereto, utilize the output symbol associated with the rule that produced the best match.
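
The lexical file layout recited in claim 1 (a macro section, plus a lex section with main rules and rule-specific sub-sections) can be sketched as follows. This is an illustration under stated assumptions, not the file format used by the patent: the dictionary layout, the `{NAME}` macro syntax, the rule names, and the choice to run sub-section rules over the span matched by the winning main rule are all hypothetical. "Best match" is assumed here to mean the longest match.

```python
import re

# Hypothetical in-memory stand-in for one interpreted lexical file.
LEXICAL_FILE = {
    # Macro section: macro names mapped to substitutable macro values.
    "macros": {"DIGIT": r"[0-9]", "ALPHA": r"[A-Za-z]"},
    # Main section: rules executed at the top level of tokenization.
    "main": [
        {
            "pattern": r"{DIGIT}+-{DIGIT}+",
            "symbol": "RANGE",
            # Sub-section: rules run only if this main rule wins.
            "sub": [{"pattern": r"{DIGIT}+", "symbol": "BOUND"}],
        },
        {"pattern": r"{ALPHA}+", "symbol": "WORD"},
    ],
}


def expand(pattern, macros):
    """Substitute each {NAME} in the pattern with its macro value."""
    for name, value in macros.items():
        pattern = pattern.replace("{" + name + "}", value)
    return pattern


def best_rule(rules, macros, text, pos):
    """Return (match, rule) for the longest non-empty match at pos, or None."""
    best = None
    for rule in rules:
        m = re.compile(expand(rule["pattern"], macros)).match(text, pos)
        if m and m.end() > pos and (best is None or m.end() > best[0].end()):
            best = (m, rule)
    return best


def tokenize(text, lex=LEXICAL_FILE):
    macros, out, pos = lex["macros"], [], 0
    while pos < len(text):
        hit = best_rule(lex["main"], macros, text, pos)
        if hit is None:
            pos += 1  # nothing matched here; skip one character
            continue
        m, rule = hit
        out.append((rule["symbol"], m.group()))
        # Assumption: sub-section rules are applied to the winning span only.
        span, sub_rules, sub_pos = m.group(), rule.get("sub", []), 0
        while sub_rules and sub_pos < len(span):
            sub_hit = best_rule(sub_rules, macros, span, sub_pos)
            if sub_hit is None:
                sub_pos += 1
                continue
            sm, sub_rule = sub_hit
            out.append((sub_rule["symbol"], sm.group()))
            sub_pos = sm.end()
        pos = m.end()
    return out


print(tokenize("pages 10-20"))
```

With these hypothetical rules, the input `"pages 10-20"` produces a `WORD` token, then a `RANGE` token, then two `BOUND` tokens emitted by the sub-section attached to the `RANGE` rule. The `BOUND` rule never fires for the `WORD` span because its sub-section belongs to a different main rule.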

7. A computer readable medium having instructions stored thereon which when executed by a processor cause the processor to:
- receive text with a computer system configured with a functional presence engine, the functional presence engine configured as a probabilistic parser;
  
  perform lexical analysis on the text effective to tokenize text portions to produce tokenized content in a format specified in one or more interpreted lexical files specifying one or more matching rules and corresponding output symbols; and
  
  with a knowledge base component operably associated with the functional presence engine, define:
  
  cases of text matchable to text received by the functional presence engine; and
  
  responses that are triggered in an event of a match, wherein individual lexical files comprise a macro section that specifies macro values that are substitutable for macro names, and a lex section that specifies lexical rewrite rules, and wherein the lex section comprises a main section that contains rules that are executed at a top level of a tokenization process, and a sub-section associated with a rule in the main section, the sub-section containing a group of rules that get executed only if the associated main section rule produces the best match.
  - 8. The computer readable medium of claim 7, with the functional presence engine configured to use a lexical analysis program to process text, producing tokenized text portions in accordance with the one or more matching rules specified by the one or more lexical files.
  - 9. The computer readable medium of claim 7, wherein the one or more rules are specified as regular expressions.
  - 10. The computer readable medium of claim 9, wherein the functional presence engine is configured to attempt to match all regular expressions and then select the rule that produces the best match.
  - 11. The computer readable medium of claim 9, wherein the functional presence engine selects a first successfully matched rule to determine which output symbol will be utilized.
  - 12. The computer readable medium of claim 7, wherein the lexical analysis program is configured to select a rule that produces a best match and responsive thereto, utilize the output symbol associated with the rule that produced the best match.
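
The dependent claims describe two ways of choosing an output symbol: trying every regular expression and keeping the best match (claims 4 and 10), or taking the first rule that matches (claims 5 and 11). The sketch below contrasts the two. The rules `WORD` and `IDENT` are hypothetical, and "best" is assumed to mean the longest match, with ties going to the earlier rule; the claims themselves do not define the criterion.

```python
import re

# Hypothetical rules in priority order: (regular expression, output symbol).
RULES = [
    (re.compile(r"[a-z]+"), "WORD"),
    (re.compile(r"[a-z]+[0-9]+"), "IDENT"),
]


def first_match(text, rules=RULES):
    """Return (symbol, lexeme) for the first rule that matches at the start."""
    for pattern, symbol in rules:
        m = pattern.match(text)
        if m:
            return symbol, m.group()
    return None


def best_match(text, rules=RULES):
    """Try every rule and keep the longest match (assumed meaning of 'best')."""
    best = None
    for pattern, symbol in rules:
        m = pattern.match(text)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (symbol, m.group())
    return best


print(first_match("abc123"))  # ('WORD', 'abc')
print(best_match("abc123"))   # ('IDENT', 'abc123')
```

The two strategies diverge whenever an earlier rule matches a shorter prefix than a later one, so rule order matters for first-match selection but only breaks ties under the longest-match assumption.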

Current Assignee: Verint Americas Incorporated (Verint Systems Incorporated)
Original Assignee: Next IT Corporation (Verint Systems Incorporated)
Inventors: Hust, Robert; Zartler, Mark
Primary Examiner(s): Corrielus, Jean M
Assistant Examiner(s): Jami, Hares
Application Number: US10/839,425
Publication Number: US 20060004826A1
Time in Patent Office: 1,904 Days
Field of Search: 707/102, 707/103 R-103 Z, 704/9-10
US Class Current: 1/1
CPC Class Codes:
G06F 16/3329   Natural language query form...
G06F 16/3344   using natural language anal...
G06F 40/284   Lexical analysis, e.g. toke...
Y10S 707/99943   Generating database or data...
Y10S 707/99944   Object-oriented database st...
