Techniques for Extracting Unstructured Data

  • US 20120078950A1
  • Filed: 09/29/2010
  • Published: 03/29/2012
  • Est. Priority Date: 09/29/2010
  • Status: Abandoned Application
First Claim

1. A method comprising:

    receiving a plurality of extensible grammar expressions, wherein each extensible grammar expression includes a regular expression that searches for a set of information;

    receiving a given document including unstructured data;

    tokenizing the given document;

    searching the tokenized given document using the regular expressions to determine if the unstructured data in the document matches one or more of the extensible grammar expressions;

    extracting one or more sets of information from the unstructured data using one or more heuristics; and

    outputting the one or more sets of extracted information.
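
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the implementation described in the application. The `GrammarExpression` class, the whitespace tokenizer, the punctuation-stripping heuristic in `clean`, and the sample patterns and document are all assumptions introduced here for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass
class GrammarExpression:
    """Hypothetical container: a label plus the regular expression it searches with."""
    name: str
    pattern: str


def tokenize(document):
    # Assumed tokenizer: split on any run of whitespace.
    return document.split()


def clean(value):
    # Assumed heuristic: drop trailing punctuation left over from sentence ends.
    return value.rstrip(".,;:")


def extract(grammars, document):
    """Search the tokenized document with each grammar's regular expression."""
    tokens = tokenize(document)
    # Rejoin tokens with single spaces so irregular whitespace cannot break a match.
    normalized = " ".join(tokens)
    results = {}
    for grammar in grammars:
        matches = [clean(m.group(0)) for m in re.finditer(grammar.pattern, normalized)]
        if matches:
            results[grammar.name] = matches
    return results


# Illustrative grammar expressions and input; none of these come from the application.
grammars = [
    GrammarExpression("email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    GrammarExpression("date", r"\b\d{2}/\d{2}/\d{4}\b"),
]
document = "Contact  jane.doe@example.com\nbefore 09/29/2010."

# Output step: print the extracted sets of information, keyed by grammar name.
print(extract(grammars, document))
```

Keeping each grammar expression as plain data is what makes the set extensible in this sketch: supporting a new kind of information means appending another `GrammarExpression` to the list, with no change to `extract`.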
