
Methods and systems to train models to extract and integrate information from data sources

  • US 8,805,861 B2
  • Filed: 05/15/2009
  • Issued: 08/12/2014
  • Est. Priority Date: 12/09/2008
  • Status: Expired due to Fees
First Claim

1. A non-transitory computer readable storage medium storing at least one program configured for execution by at least one processor of a computer system, the at least one program comprising instructions to:

  • obtain a domain model comprising a set of entity types having corresponding properties and relationships between entities in a set of entities, wherein the domain model is characterized by a domain grammar;

    receive a first tag layout of a first source document obtained from a first information source associated with the domain model, the first tag layout comprising;

    (i) a plurality of user-provided navigational tags, wherein a user-provided navigational tag in the plurality of a user-provided navigational tags indicates a navigational position of the first source document relative to a second source document, from the first information source, navigationally connected with the first source document, and

    (ii) a plurality of corresponding user-identified tokens in the first source document, wherein a user-identified token in the plurality of corresponding user-identified tokens includes a portion of content of the first source document;

    select a page grammar in plurality of page grammars for the first source document in accordance with the plurality of user provided navigational tags;

    extract information from a third source document having a predefined degree of tag layout similarity to the first source document using the page grammar, wherein the second source document is obtained from a second information source; and

    transform the information extracted from the second source document in accordance with the domain grammar, thereby extracting and integrating information from a plurality of information sources.
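
The steps recited in the claim (receive a tag layout, select a page grammar from the navigational tags, extract from a sufficiently similar document, transform into the domain model) can be illustrated with a minimal sketch. This is not the patented implementation. Every name here (TagLayout, PAGE_GRAMMARS, select_page_grammar, layout_similarity, extract, transform), the overlap-based grammar selection, and the Jaccard similarity measure are assumptions made purely for illustration; the claim itself does not specify how selection or similarity is computed.

```python
from dataclasses import dataclass


@dataclass
class TagLayout:
    """Hypothetical container for a user-tagged source document."""
    nav_tags: tuple   # user-provided navigational tags
    tokens: dict      # label -> user-identified token (portion of content)


# Hypothetical page grammars: the navigational tags each one expects
# and the fields it knows how to pull out of a page.
PAGE_GRAMMARS = {
    "listing": {"nav": {"next", "detail"}, "fields": ["title", "price"]},
    "detail": {"nav": {"back"}, "fields": ["title", "price", "description"]},
}


def select_page_grammar(layout):
    """Pick the grammar whose navigational tags overlap the layout's most
    (an assumed selection rule)."""
    tags = set(layout.nav_tags)
    return max(PAGE_GRAMMARS, key=lambda name: len(PAGE_GRAMMARS[name]["nav"] & tags))


def layout_similarity(a, b):
    """Jaccard similarity over navigational tags (an assumed measure)."""
    sa, sb = set(a.nav_tags), set(b.nav_tags)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def extract(layout, grammar_name):
    """Keep only the tokens that the chosen page grammar defines as fields."""
    fields = PAGE_GRAMMARS[grammar_name]["fields"]
    return {f: layout.tokens[f] for f in fields if f in layout.tokens}


def transform(record, domain_mapping):
    """Rename extracted fields to domain-model properties (assumed mapping)."""
    return {domain_mapping[k]: v for k, v in record.items() if k in domain_mapping}


# A tagged training page and a similar page from another source.
first = TagLayout(("next", "detail"), {"title": "Widget", "price": "9.99"})
other = TagLayout(("next",), {"title": "Gadget", "price": "4.50", "sku": "X1"})

grammar = select_page_grammar(first)
if layout_similarity(first, other) >= 0.5:  # assumed "predefined degree"
    raw = extract(other, grammar)
    integrated = transform(raw, {"title": "Product.name", "price": "Product.price"})
```

In this sketch the threshold 0.5 stands in for the claim's "predefined degree of tag layout similarity", and the mapping dictionary stands in for the domain grammar; a real system would presumably use far richer structures for both.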
