Technique and tools for high-level rule-based customizable data extraction

  • US 20020091818A1
  • Filed: 01/05/2001
  • Published: 07/11/2002
  • Est. Priority Date: 01/05/2001
  • Status: Abandoned Application
First Claim

1. A computer program product for efficiently extracting data from a data stream, the computer program product embodied on one or more computer-readable media and comprising:

  • computer-readable program code means for defining one or more data extraction rules, each of the rules comprising one or more rule components;

  • computer-readable program code means for defining one or more output document templates for storing extracted data, wherein each of the templates comprises one or more tags which are hierarchically structured and wherein each template is to be associated with one or more of the data extraction rules;

  • computer-readable program code means for associating at least one of the templates with at least one of the rules;

  • computer-readable program code means for storing the rules, the templates, and the associations;

  • computer-readable program code means for monitoring at least one data stream for arrival of incoming data;

  • computer-readable program code means for comparing the incoming data to selected ones of the stored rules until detecting a matching rule;

  • computer-readable program code means for extracting data from the incoming data, upon detecting the matching rule, according to the matching rule; and

  • computer-readable program code means for storing the extracted data in an extensible document which is created according to the tags and structure of a selected one of the templates that is associated with the matching rule.
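
The elements of claim 1 can be illustrated with a minimal Python sketch. It is not the application's implementation, and every name in it (Rule, Store, build_document, process, the sample order rule and template) is hypothetical. It assumes that a rule is a regular expression whose named groups play the role of rule components, that a template is a nested dictionary of tag names whose string leaves name the extracted field to insert, that the data stream is any iterable of text items, and that the extensible document is XML held in memory.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Rule:
    """An extraction rule; its named regex groups stand in for 'rule components'."""
    name: str
    pattern: re.Pattern


@dataclass
class Store:
    """Holds the rules, the templates, and the rule-to-template associations."""
    rules: list = field(default_factory=list)
    templates: dict = field(default_factory=dict)      # template name -> nested tag dict
    associations: dict = field(default_factory=dict)   # rule name -> template name

    def add(self, rule, template_name, template):
        self.rules.append(rule)
        self.templates[template_name] = template
        self.associations[rule.name] = template_name


def _fill(parent, body, values):
    """Recursively create child tags; a string leaf names the extracted field to insert."""
    if isinstance(body, dict):
        for tag, child in body.items():
            _fill(ET.SubElement(parent, tag), child, values)
    else:
        parent.text = values.get(body, "")


def build_document(template, values):
    """Create an XML element tree that follows the template's tags and nesting."""
    (root_tag, body), = template.items()
    root = ET.Element(root_tag)
    _fill(root, body, values)
    return root


def process(stream, store):
    """Compare each incoming item to the stored rules until one matches, then extract."""
    for item in stream:
        for rule in store.rules:
            match = rule.pattern.search(item)
            if match:
                template = store.templates[store.associations[rule.name]]
                yield build_document(template, match.groupdict())
                break  # stop at the first matching rule


# Hypothetical rule and template, invented for this illustration only.
store = Store()
store.add(
    Rule("order", re.compile(r"ORDER (?P<id>\d+) by (?P<name>\w+) total (?P<amount>[\d.]+)")),
    "order_doc",
    {"order": {"id": "id", "customer": {"name": "name"}, "total": "amount"}},
)

incoming = ["heartbeat", "ORDER 17 by alice total 42.50"]
documents = [ET.tostring(doc, encoding="unicode") for doc in process(incoming, store)]
print(documents[0])
# <order><id>17</id><customer><name>alice</name></customer><total>42.50</total></order>
```

In this sketch the "heartbeat" item matches no rule and is skipped, while the order line matches the single stored rule and is written into a document whose nesting comes from the associated template rather than from the rule itself. Keeping the template separate from the rule is what lets one rule feed differently structured output documents.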
