Extensible database framework for management of unstructured and semi-structured documents

  • US 6,968,338 B1
  • Filed: 08/29/2002
  • Issued: 11/22/2005
  • Est. Priority Date: 08/29/2002
  • Status: Expired due to Term
First Claim

1. A computer-implemented method for querying a collection of unstructured documents, the method comprising:

  • (1) providing an unstructured collection including at least one document;

    (2) associating with each document in the collection a connected node structure including an ordered sequence of document nodes, with each node being labeled by a document node indicium that provides information on at least four of the following attributes associated with the node and corresponding to at least one document:

        (1) a first attribute that allows identification of a unique number associated with the node;

        (2) a second attribute that specifies a descriptive label for the node;

        (3) a third attribute that specifies data type for the node, from among at least two selected data types, and indicates processing requirements for the node;

        (4) a fourth attribute that provides text data, if any, associated with the node;

        (5) a fifth attribute that specifies a node label, if any, for a node, if any, that serves as a parent node for the node; and

        (6) a sixth attribute that specifies a node label, if any, for a node, if any, that serves as a sibling node for the node, where information from the fourth attribute is included in the node indicium;

    (3) receiving a query, including at least one query keyword, for the collection of documents, and specifying at least one of keyword context and keyword content;

    (4) determining a set of query nodes in the node structure, each of which contains at least one occurrence of the keyword in the fourth attribute;

    (5) providing information on at least one selected fourth attribute containing the keyword, for at least one query node in the query node set;

    (6) determining if the query specifies context for the keyword;

    (7) when the query specifies context for the keyword, determining if the query node provides context for the keyword;

    (8) when the query node does not provide context for the keyword, replacing the query node by a left-adjacent node as a new query node, and returning to step (7) at least once;

    (9) when the query node provides context for the keyword, adding the query node to a context list, and returning to step (5) at least once;

    (10) determining if the query specifies content for the keyword;

    (11) when the query specifies content for the keyword, determining if the query node provides content for the keyword;

    (12) when the query node does not provide content for the keyword, replacing the query node by at least one of a right-adjacent node and a selected child node as a new query node, and returning to step (11) at least once; and

    (13) when the query node provides content for the keyword, adding the query node to a content list, and returning to said step (5) at least once.
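
The claimed steps can be illustrated with a minimal Python sketch. It is one possible reading, not the patented implementation. The names `Node`, `query`, and `doc`, the two data types `"context"` and `"content"`, and the interpretation of "left-adjacent" and "right-adjacent" as the previous and next positions in the ordered node sequence are all assumptions made for illustration. The sketch also omits the "selected child node" alternative of step (12).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical node record holding the six attributes of step (2)."""
    node_id: int                      # first attribute: unique number
    name: str                         # second attribute: descriptive label
    node_type: str                    # third attribute: "context" or "content" (assumed types)
    data: str                         # fourth attribute: text data
    parent_id: Optional[int] = None   # fifth attribute: parent node label
    sibling_id: Optional[int] = None  # sixth attribute: sibling node label


def query(nodes: List[Node], keyword: str,
          want_context: bool = False,
          want_content: bool = False) -> Tuple[List[Node], List[Node]]:
    """Return (context_list, content_list) for a keyword query."""
    kw = keyword.lower()
    context_list: List[Node] = []
    content_list: List[Node] = []
    for start, node in enumerate(nodes):
        # Step (4): keep only nodes whose text data contains the keyword.
        if kw not in node.data.lower():
            continue
        if want_context:
            # Steps (7)-(9): move left until a context-type node is found.
            i = start
            while i >= 0 and nodes[i].node_type != "context":
                i -= 1
            if i >= 0 and nodes[i] not in context_list:
                context_list.append(nodes[i])
        if want_content:
            # Steps (11)-(13): move right until a content-type node is found.
            i = start
            while i < len(nodes) and nodes[i].node_type != "content":
                i += 1
            if i < len(nodes) and nodes[i] not in content_list:
                content_list.append(nodes[i])
    return context_list, content_list


# A toy document: two headings (context), each followed by a paragraph (content).
doc = [
    Node(1, "heading", "context", "Budget"),
    Node(2, "paragraph", "content", "Travel costs rose in 2002.", parent_id=1),
    Node(3, "heading", "context", "Travel policy", sibling_id=1),
    Node(4, "paragraph", "content", "Book flights early.", parent_id=3),
]

ctx, cnt = query(doc, "travel", want_context=True, want_content=True)
print([n.node_id for n in ctx])  # [1, 3]
print([n.node_id for n in cnt])  # [2, 4]
```

In this toy document the keyword "travel" occurs in nodes 2 and 3. The context search walks left from node 2 to heading 1 and stops immediately at heading 3. The content search stops immediately at paragraph 2 and walks right from heading 3 to paragraph 4.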

  • 4 Assignments