Information data retrieval, where the data is organized in terms, documents and document corpora

  • US 7,593,932 B2
  • Filed: 01/13/2003
  • Issued: 09/22/2009
  • Est. Priority Date: 01/16/2002
  • Status: Active Grant
First Claim

1. A method of processing digitized textual information in a computerized database system, the information being organized in terms, documents and document corpora, where each document contains at least one term and each document corpus contains at least one document, the method comprising:

  • generating, by using a computer, a concept vector for each document in a document corpus wherein the concept vector conceptually classifying contents of the document on a relatively compact format,

  • generating, for each term in the document corpus, a term-to-concept vector describing a relationship between the term and each of the concept vectors wherein the term-to-concept vectors being generated on basis of the concept vectors, comprises;

    receiving the term-to-concept vectors for the document corpus and on basis thereof generating a term-term matrix describing a term-to-term relationship between the terms in the document corpus, wherein the generation of the term-term matrix comprises;

      retrieving, for each term in each combination of two unique terms in the document corpus, a respective term-to-concept vector,

      generating a relation vector describing the relationship between the terms in the each combination of two unique terms, each component in the relation vector being equal to a lowest component value of corresponding component values in the term-to-concept vectors,

      generating a relationship value for the each combination of two unique terms as the sum of all component values in the corresponding relation vector, and

      generating a matrix containing the relationship values of all combinations of two unique terms in the document corpus,

  • processing the term-term matrix into processed textual information and displaying the processed textual information via a user output interface, and

  • displaying the processed textual information as a distance graph in which each term constitutes a node wherein the node representing a first term is connected to one or more other nodes representing secondary terms to which the first term has a conceptual relationship of at least a specific strength, and a relevance measure between the first term and at least one second term is represented by a minimum number of node hops between the first term and the at least one second term.
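The term-term matrix and distance graph steps recited in the claim can be illustrated with a minimal Python sketch. It is only one possible reading of the claim language, not the patented implementation. The claim does not say how concept vectors or term-to-concept vectors are computed, so the sketch starts from term-to-concept vectors supplied by the caller. All function names, the `min_strength` parameter (standing in for the "specific strength" in the claim), and the use of breadth-first search to count node hops are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def relationship_value(vec_a, vec_b):
    """Relationship value of two terms (hypothetical helper).

    The relation vector takes, per component, the lowest of the two
    corresponding term-to-concept values; the relationship value is
    the sum of the relation vector's components.
    """
    relation_vector = [min(a, b) for a, b in zip(vec_a, vec_b)]
    return sum(relation_vector)


def term_term_matrix(term_to_concept):
    """Relationship values for every combination of two unique terms.

    The matrix is stored as a dict keyed by (term, term) pairs and is
    filled symmetrically, since min and sum do not depend on order.
    """
    matrix = {}
    for t1, t2 in combinations(sorted(term_to_concept), 2):
        value = relationship_value(term_to_concept[t1], term_to_concept[t2])
        matrix[(t1, t2)] = value
        matrix[(t2, t1)] = value
    return matrix


def build_distance_graph(matrix, min_strength):
    """Connect two term nodes when their relationship value is at
    least min_strength (an assumed threshold parameter)."""
    graph = {}
    for (t1, t2), value in matrix.items():
        neighbours = graph.setdefault(t1, set())
        if value >= min_strength:
            neighbours.add(t2)
    return graph


def hop_distance(graph, start, goal):
    """Minimum number of node hops between two terms, found by
    breadth-first search; returns None if no path exists."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None
```

As a usage illustration with made-up vectors over three concepts: if "engine" is `[0.5, 0.25, 0.0]` and "wheel" is `[0.0, 0.25, 0.5]`, the relation vector is `[0.0, 0.25, 0.0]` and the relationship value is 0.25. Terms that share no concept get a value of 0 and are linked in the graph only indirectly, through intermediate terms, which is what the hop count expresses.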
