
System and method for block segmenting, identifying and indexing visual elements, and searching documents

  • US 10,223,455 B2
  • Filed: 10/04/2010
  • Issued: 03/05/2019
  • Est. Priority Date: 10/02/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method for processing and identifying documents in accordance with their relevance to a search query, the method comprising:

    generating, by a computer, preliminary metadata for a document of visual element types, the preliminary metadata generated from source content, implicit presentation semantics, and explicit presentation semantics of the document and comprising key/value pairs indicative of the source content, the implicit presentation semantics and the explicit presentation semantics, wherein the preliminary metadata is used to determine different blocks in the document of visual element types;

    identifying, by the computer, one or more blocks as a logical unit in the document that is visually displayed to a display device, by comparing the key/value pairs in the generated preliminary metadata to key/value pairs in a block identifying criterion set, wherein the comparison of the key/value pairs in the block identifying criterion set to the key/value pairs in the generated preliminary metadata evaluates to a true indication when the preliminary metadata indicates a block or a false indication when the preliminary metadata does not indicate a block;

    preparing, by the computer, a block list of the one or more identified blocks;

    processing, by the computer, blocks in the prepared block list of the identified blocks using block operations selected from removing empty block(s), removing overlapping block(s), removing overlapped block(s), removing intermediate block(s), merging block(s), splitting block(s) and combinations thereof;

    identifying based on a profile, by the computer, a visual element within a block in the prepared block list by starting with a block having highest level in the block list to determine if the block in the block list satisfies all rules of the profile, the block in the prepared block list is identified as the visual element of a visual element type by the profile, wherein the visual element is an inline visual element or a block visual element that is selected from paragraph, table, list, menu, fixed width text, key/value, graph/chart, question/answer, timeline, and interactive, the profile is a set of rules that identifies and classifies a matching block into the visual element type;

    generating, by the computer, an index of the identified visual element and storing the index in a data store in a memory;

    receiving, by the computer, a user search query via a Graphical User Interface (GUI) on a computer, wherein the user search query comprises at least one selected visual element type selected from paragraph, table, list, menu, fixed width text, key/value, graph/chart, question/answer, timeline, and interactive;

    upon receiving the user search query, determining, by the computer, whether the data store contains terms that match the user search query by the at least one selected visual element type in the user search query; and

    generating, based on the determining for the data store by the computer, prioritized search results in a response to the user search query with identification of document(s) that comprise of visual element(s) of the user search query's the at least one selected visual element type.
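
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`BLOCK_CRITERIA`, `PROFILES`, `is_block`, `prepare_block_list`, `classify`, `build_index`, `search`), the metadata keys (`tag`, `display`, `level`, `text`), and the scoring scheme are assumptions made for illustration. Only one block operation (removing empty blocks) and three of the listed visual element types are shown, and a lower `level` number is assumed to mean a higher-level block.

```python
from collections import defaultdict

# Hypothetical block identifying criterion set: a node is treated as a block
# when all key/value pairs of at least one criterion match its metadata.
BLOCK_CRITERIA = [{"display": "block"}, {"tag": "table"}, {"tag": "ul"}]

# Hypothetical profiles: each visual element type maps to a list of rules,
# and a block is classified only if it satisfies all rules of the profile.
PROFILES = {
    "table": [lambda m: m.get("tag") == "table"],
    "list": [lambda m: m.get("tag") in ("ul", "ol")],
    "paragraph": [lambda m: m.get("tag") == "p"],
}


def is_block(metadata):
    """Compare metadata key/value pairs to the criterion set (True or False)."""
    return any(
        all(metadata.get(key) == value for key, value in criterion.items())
        for criterion in BLOCK_CRITERIA
    )


def prepare_block_list(nodes):
    """Identify blocks, drop empty ones, and order them highest level first."""
    blocks = [node for node in nodes if is_block(node)]
    blocks = [b for b in blocks if b.get("text", "").strip()]  # remove empty blocks
    return sorted(blocks, key=lambda b: b.get("level", 0))  # assumption: 0 is highest


def classify(block):
    """Return the first visual element type whose rules all pass, else None."""
    for element_type, rules in PROFILES.items():
        if all(rule(block) for rule in rules):
            return element_type
    return None


def build_index(documents):
    """Map (visual element type, term) to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, nodes in documents.items():
        for block in prepare_block_list(nodes):
            element_type = classify(block)
            if element_type is None:
                continue
            for term in block["text"].lower().split():
                index[(element_type, term)].add(doc_id)
    return index


def search(index, query, element_types):
    """Rank documents by how many query terms match in the selected types."""
    scores = defaultdict(int)
    for element_type in element_types:
        for term in query.lower().split():
            for doc_id in index.get((element_type, term), ()):
                scores[doc_id] += 1
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

With this sketch, a query restricted to the `table` type returns only documents in which the term appears inside a block classified as a table, which mirrors the claim's restriction of results to the selected visual element type.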
