Document retrieval device

  • US 5,943,669 A
  • Filed: 11/21/1997
  • Issued: 08/24/1999
  • Est. Priority Date: 11/25/1996
  • Status: Expired due to Term
First Claim

1. A document retrieval device that classifies a group of documents, each document having structural elements arranged in a logical hierarchical format, each structural element including at least one of a heading and a content, the documents stored in a document storage device, the retrieval device comprising:

    logical structure analysis means for analyzing the logical hierarchical format of the documents and for obtaining structural elements and a hierarchical relationship between the structural elements within each document;

    classification unit designation means for designating a classification unit showing which level the structural elements to be classified are at within the hierarchical relationship;

    fundamental vector generation means for extracting a key word from the content of each structural element of the classification unit that is designated by the classification unit designation means, and for generating a fundamental vector based on the extracted key word;

    heading vector generation means for extracting a key word from the heading of each structural element that is superordinate to the structural element used to generate the fundamental vector within said hierarchical relationship, and for generating a heading vector based on the extracted key word;

    vector synthesis means for generating, for each structural element of the classification unit, a composite vector based on the corresponding fundamental and heading vectors; and

    classification means for calculating a degree of similarity among the composite vectors and for classifying the structural elements of the documents based on the degree of similarity.
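
The elements recited in claim 1 can be read as a processing pipeline, and a small sketch may make the data flow easier to follow. The Python below is an illustration only, not the patented implementation. The claim does not specify how key words are extracted, how the fundamental and heading vectors are combined, which similarity measure is used, or how classification proceeds, so every such choice here is an assumption: the nested-dict document layout, the stopword-based `keywords` helper, term-frequency vectors, the `heading_weight` parameter, cosine similarity, and the greedy threshold grouping in `classify` are all hypothetical.

```python
import math
import re
from collections import Counter

# Assumed stopword list; the claim does not say how key words are chosen.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on"}


def keywords(text):
    """Hypothetical key word extractor: lowercase tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def elements_at_level(node, level, ancestors=()):
    """Logical structure analysis plus classification unit designation:
    yield (element, superordinate headings) for elements at `level` (root = 0)."""
    if level == 0:
        yield node, ancestors
        return
    for child in node.get("children", []):
        yield from elements_at_level(
            child, level - 1, ancestors + (node.get("heading", ""),)
        )


def composite_vector(element, ancestor_headings, heading_weight=0.5):
    """Build the fundamental vector from the element's content, the heading
    vector from superordinate headings, and add them (assumed weighted sum)."""
    fundamental = Counter(keywords(element.get("content", "")))
    heading = Counter(kw for h in ancestor_headings for kw in keywords(h))
    composite = Counter()
    for term, count in fundamental.items():
        composite[term] += count
    for term, count in heading.items():
        composite[term] += heading_weight * count
    return composite


def cosine(u, v):
    """Cosine similarity between two sparse term vectors (assumed measure)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def classify(vectors, threshold=0.2):
    """Greedy grouping (assumed): an element joins the first group whose
    first member is at least `threshold` similar, else starts a new group."""
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def classify_documents(docs, level, heading_weight=0.5, threshold=0.2):
    """Run the whole pipeline over a list of document trees."""
    vectors = [
        composite_vector(element, ancestors, heading_weight)
        for doc in docs
        for element, ancestors in elements_at_level(doc, level)
    ]
    return classify(vectors, threshold)


# Made-up sample documents: a root heading with one level of sections.
docs = [
    {"heading": "Database systems", "children": [
        {"heading": "Indexing", "content": "B-tree index speeds query lookup"},
        {"heading": "Transactions", "content": "Locking protects concurrent query updates"},
    ]},
    {"heading": "Cooking basics", "children": [
        {"heading": "Baking", "content": "Bread dough needs yeast and flour"},
    ]},
]

print(classify_documents(docs, level=1))
```

With this sample data the two sections under "Database systems" share only one content key word, so they are grouped together only because the superordinate heading contributes to both composite vectors. Setting `heading_weight=0` leaves each section in its own group, which illustrates why the claim combines a heading vector with the fundamental vector.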
