Systems and methods of semantically annotating documents of different structures

  • US 8,924,374 B2
  • Filed: 02/22/2008
  • Issued: 12/30/2014
  • Est. Priority Date: 02/22/2008
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

  • at a computer having memory and one or more processors:

    receiving one or more search keywords from a user;

    selecting a plurality of candidate document identifiers in accordance with the one or more search keywords, each candidate document identifier corresponding to a respective document at a respective data source;

    for a respective candidate document identifier of the plurality of candidate document identifiers:

    retrieving a document corresponding to the respective candidate document identifier from a data source, wherein the document has a structure type;

    converting the document into a node stream, wherein the document conversion is initiated immediately after retrieving a portion of the document;

    generating a customized data model for the document using the node stream in accordance with the structure type of the document;

    identifying one or more candidate chunks within the customized data model in accordance with a set of heuristic rules associated with the structure type; and

    selecting one or more chunks of the candidate chunks that satisfy the one or more search keywords; and

    providing at least one of the selected one or more chunks for display to the user.
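
The per-document steps of the claim (node stream, data model, candidate chunks, keyword selection) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the `CHUNK_TAGS` table stands in for the "set of heuristic rules associated with the structure type", the flat list of tag/text entries stands in for the "customized data model", and a chunk is taken to "satisfy" the keywords when it contains all of them as case-insensitive substrings. The sketch also assumes well-nested HTML without void elements such as `<br>`.

```python
from html.parser import HTMLParser
from typing import Iterable, Iterator

# Hypothetical heuristic rules: elements that may hold a chunk, per structure type.
CHUNK_TAGS = {"html": {"p", "li", "td"}}


class NodeCollector(HTMLParser):
    """Collects (kind, value) nodes as document text is fed in."""

    def __init__(self) -> None:
        super().__init__()
        self.nodes: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        self.nodes.append(("start", tag))

    def handle_endtag(self, tag):
        self.nodes.append(("end", tag))

    def handle_data(self, data):
        # Text may arrive in pieces when a portion boundary splits it,
        # so keep it raw and normalize whitespace later.
        self.nodes.append(("text", data))


def to_node_stream(portions: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield nodes as soon as each retrieved portion has been parsed."""
    parser = NodeCollector()
    for portion in portions:
        parser.feed(portion)
        yield from parser.nodes
        parser.nodes.clear()
    parser.close()  # flush any text still buffered by the parser
    yield from parser.nodes


def build_model(nodes: Iterable[tuple[str, str]]) -> list[dict]:
    """Assumed data model: one entry per element, holding its direct text."""
    model: list[dict] = []
    open_elements: list[dict] = []
    for kind, value in nodes:
        if kind == "start":
            entry = {"tag": value, "text": ""}
            model.append(entry)
            open_elements.append(entry)
        elif kind == "end" and open_elements:
            open_elements.pop()
        elif kind == "text" and open_elements:
            open_elements[-1]["text"] += value
    return model


def candidate_chunks(model: list[dict], structure_type: str) -> list[str]:
    """Apply the assumed rules for this structure type to the model."""
    tags = CHUNK_TAGS[structure_type]
    chunks = (" ".join(e["text"].split()) for e in model if e["tag"] in tags)
    return [c for c in chunks if c]


def select_chunks(chunks: list[str], keywords: list[str]) -> list[str]:
    """Keep chunks that contain every keyword (assumed matching rule)."""
    wanted = [k.lower() for k in keywords]
    return [c for c in chunks if all(k in c.lower() for k in wanted)]


def search_document(portions, structure_type, keywords):
    """Run the per-document steps in the order the claim lists them."""
    model = build_model(to_node_stream(portions))
    return select_chunks(candidate_chunks(model, structure_type), keywords)
```

Passing the document as an iterable of portions is one way to mirror the claim's requirement that conversion start after only part of the document has been retrieved: `to_node_stream` emits nodes for each portion before the next one is read. For example, calling `search_document` with the portions `"<p>Semantic anno"` and `"tation of documents</p><p>Unrelated text</p>"`, the structure type `"html"`, and the keywords `["semantic", "annotation"]` returns only the first paragraph.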
