Information processing using a hierarchy structure of randomized samples

  • US 7,216,129 B2
  • Filed: 02/19/2003
  • Issued: 05/08/2007
  • Est. Priority Date: 02/15/2002
  • Status: Active Grant
First Claim

1. A method for information processing, said information being stored in a database of documents and including attributes, said information at least including a vector of numeral elements and information identifiers to form a matrix, said vector being a node in a hierarchy structure of said information, said method comprising the steps of:

    transforming documents in the database into vectors using a vector space model to create a document-keyword matrix;

    reducing a dimension of said matrix to a predetermined order to provide a dimension reduced matrix;

    randomly assigning vectors of said dimension-reduced matrix to a set of nodes;

    constructing a hierarchy structure of said nodes, where the document-keyword vectors are introduced with the hierarchy structure using distance between the document-keyword vectors, said hierarchy structure being layered with hierarchy levels starting from a top node;

    determining parent nodes and child nodes thereof between adjacent hierarchy levels, said parent nodes being included in an upper level and said child nodes being included in a lower level;

    generating relations between said parent nodes and said child nodes by providing pointers to said parent nodes and said child nodes in relation to said distance;

    registering pointers by starting from a node pair having closest distance until a predetermined number of pairs being generated, providing a similarity-based query to rank said nodes with respect to said query;

    executing a similarity-based information retrieval using the document-keyword matrix;

    selecting said nodes to generate a cluster including said ranked nodes with respect to said query.
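
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and every function and parameter name (`doc_keyword_matrix`, `reduce_dimension`, `build_hierarchy`, `search`, `parents_per_child`, `beam`, `top_k`) is hypothetical. Several choices are assumptions the claim leaves open: raw term counts for the vector space model, a Gaussian random projection for the dimension reduction, levels formed by repeatedly taking a random half of the level below, Euclidean distance, and a per-child cap as the "predetermined number" of registered pairs.

```python
import math
import random
from collections import Counter


def doc_keyword_matrix(docs):
    """Vector space model: one row per document, one column per keyword (raw counts)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    rows = []
    for d in docs:
        counts = Counter(d.lower().split())
        rows.append([float(counts[w]) for w in vocab])
    return rows, vocab


def reduce_dimension(rows, k, rng):
    """Reduce each row to k dimensions (assumption: Gaussian random projection)."""
    n_cols = len(rows[0])
    proj = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)]
            for _ in range(n_cols)]
    return [[sum(r[j] * proj[j][i] for j in range(n_cols)) for i in range(k)]
            for r in rows]


def build_hierarchy(vectors, rng, parents_per_child=2):
    """Randomly assign vectors to layered nodes and register parent-to-child pointers."""
    # Bottom level holds every node; each upper level is a random half of the one below.
    levels = [list(range(len(vectors)))]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append(rng.sample(below, len(below) // 2))
    levels.reverse()  # levels[0] is now the single top node

    # children[depth] maps a parent at `depth` to its children at `depth + 1`.
    children = [dict() for _ in levels[:-1]]
    for depth in range(len(levels) - 1):
        for c in levels[depth + 1]:
            # Closest pairs first, stopping at the predetermined cap per child.
            nearest = sorted(levels[depth],
                             key=lambda p: math.dist(vectors[p], vectors[c]))
            for p in nearest[:parents_per_child]:
                children[depth].setdefault(p, []).append(c)
    return levels, children


def search(query, vectors, levels, children, beam=3, top_k=3):
    """Rank nodes against the query top-down and return the closest ones as a cluster."""
    def rank(nodes):
        return sorted(set(nodes),
                      key=lambda n: (math.dist(vectors[n], query), n))

    current = rank(levels[0])
    visited = set(current)
    for depth in range(len(levels) - 1):
        candidates = [c for p in current for c in children[depth].get(p, [])]
        current = rank(candidates)[:beam]  # keep only the best-ranked nodes per level
        visited.update(current)
    return rank(visited)[:top_k]


docs = [
    "vector space model for document retrieval",
    "random sampling of document vectors",
    "hierarchy of nodes with parent and child pointers",
    "similarity query ranks nodes by distance",
    "cluster of ranked nodes for a query",
    "dimension reduction of a keyword matrix",
]
rng = random.Random(0)
rows, vocab = doc_keyword_matrix(docs)
vectors = reduce_dimension(rows, k=3, rng=rng)
levels, children = build_hierarchy(vectors, rng)
# With beam=len(docs) nothing is pruned, so the search is exhaustive.
cluster = search(vectors[0], vectors, levels, children, beam=len(docs), top_k=2)
```

A smaller `beam` visits fewer nodes per level and trades recall for speed, which is the usual motivation for searching a hierarchy instead of scanning every document vector.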
