System, Method, and Apparatus for Information Extraction of Textual Documents

  • US 20100169309A1
  • Filed: 12/30/2008
  • Published: 07/01/2010
  • Est. Priority Date: 12/30/2008
  • Status: Active Grant
First Claim

1. A method for extraction of text from a set of text documents, the method comprising the steps of:

    a) identifying a plurality of document segments within a given text document;

    b) for each given document segment identified in a), generating and storing at least one structured annotation embedded within the document and associated with the given segment, the at least one structured annotation specifying the start and end of the given document segment and a rhetorical relation associated with the given segment;

    c) processing the structured annotations generated and stored in b) to generate a plurality of variables that represent document segments and associated rhetorical relations as specified by the structured annotations;

    d) storing the variables generated in c) in a repository;

    e) receiving query input from a user that specifies at least one rhetorical relation of interest; and

    f) in response to receipt of said query input, querying the variables stored in the repository to identify zero or more document segments that are associated with a rhetorical relation that matches the at least one rhetorical relation of interest specified by said query input for output to the user.
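
For illustration only, steps a) through f) of the claim can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation described in the patent. The inline `<seg rel="...">` tag syntax, the names `annotate`, `extract_variables`, `SegmentVariable`, and `Repository`, and the in-memory dictionary used as the repository are all hypothetical. Segment identification (step a) is assumed to be done elsewhere and is passed in as character offsets.

```python
import re
from dataclasses import dataclass

# Hypothetical embedded annotation syntax (not taken from the patent):
# the opening and closing tags mark the start and end of a segment,
# and the rel attribute names its rhetorical relation.
SEG_PATTERN = re.compile(
    r'<seg rel="(?P<rel>[^"]+)">(?P<text>.*?)</seg>', re.DOTALL
)


@dataclass(frozen=True)
class SegmentVariable:
    """One variable per annotated segment (step c)."""
    doc_id: str
    start: int      # offset of the segment text in the annotated document
    end: int
    relation: str
    text: str


def annotate(text: str, segments: list[tuple[int, int, str]]) -> str:
    """Steps a) and b): embed one annotation per identified segment.

    `segments` holds (start, end, relation) tuples; they are assumed
    to be non-overlapping and are supplied by the caller.
    """
    parts, cursor = [], 0
    for start, end, relation in sorted(segments):
        parts.append(text[cursor:start])
        parts.append(f'<seg rel="{relation}">{text[start:end]}</seg>')
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


def extract_variables(doc_id: str, annotated: str) -> list[SegmentVariable]:
    """Step c): turn embedded annotations into variables."""
    return [
        SegmentVariable(doc_id, m.start("text"), m.end("text"),
                        m.group("rel"), m.group("text"))
        for m in SEG_PATTERN.finditer(annotated)
    ]


class Repository:
    """Steps d) through f): store variables and query them by relation."""

    def __init__(self) -> None:
        self._by_relation: dict[str, list[SegmentVariable]] = {}

    def store(self, variables: list[SegmentVariable]) -> None:
        for var in variables:
            self._by_relation.setdefault(var.relation, []).append(var)

    def query(self, relations: list[str]) -> list[SegmentVariable]:
        # Returns zero or more segments whose relation matches the query.
        return [v for r in relations for v in self._by_relation.get(r, [])]


if __name__ == "__main__":
    text = "The pump failed because the seal was worn. Replace the seal."
    cause = "because the seal was worn"
    start = text.index(cause)
    annotated = annotate(text, [(start, start + len(cause), "cause")])

    repo = Repository()
    repo.store(extract_variables("doc1", annotated))
    for hit in repo.query(["cause"]):
        print(hit.doc_id, hit.relation, hit.text)
```

Indexing the repository by relation keeps the query in step f) to a dictionary lookup, and a query for a relation with no stored segments returns an empty list, which corresponds to the "zero or more" wording of the claim.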
