Information retrieval and text mining using distributed latent semantic indexing

  • US 7,152,065 B2
  • Filed: 05/01/2003
  • Issued: 12/19/2006
  • Est. Priority Date: 05/01/2003
  • Status: Active Grant
First Claim

1. A computer-implemented method for processing a collection of data objects for use in information retrieval and data mining operations comprising the steps of:

  • generating a frequency count for each term in each data object in the collection;

  • partitioning the collection of data objects into a plurality of sub-collections using the term-by-data object information, wherein each sub-collection is based on the conceptual dependence of the data objects within;

  • generating a term-by-data object matrix for each sub-collection;

  • decomposing the term-by-data object matrix of each sub-collection into a reduced singular value representation;

  • determining the centroid vectors of each sub-collection;

  • finding a predetermined number of terms in each sub-collection closest to centroid vector; and

  • developing a similarity graph network to establish similarity between sub-collections.
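
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes NumPy is available, and all function names (`term_counts`, `partition`, `process`) and parameters (`rank`, `n_terms`, `threshold`) are hypothetical. Two steps are simplified stand-ins: the partitioning is a greedy cosine-similarity grouping rather than any specific measure of conceptual dependence, and the similarity graph weights each pair of sub-collections by the Jaccard overlap of their key terms.

```python
from collections import Counter
import numpy as np

def term_counts(docs):
    """Frequency count for each term in each data object."""
    return [Counter(d.lower().split()) for d in docs]

def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def partition(counts, threshold=0.2):
    """ASSUMPTION: greedy stand-in for partitioning by conceptual dependence.
    A document joins the first group whose seed document is similar enough."""
    groups = []
    for j, c in enumerate(counts):
        for g in groups:
            if cosine(counts[g[0]], c) >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    return groups

def term_doc_matrix(counts, vocab):
    """Term-by-data-object matrix: rows are terms, columns are documents."""
    index = {t: i for i, t in enumerate(vocab)}
    m = np.zeros((len(vocab), len(counts)))
    for j, c in enumerate(counts):
        for t, n in c.items():
            m[index[t], j] = n
    return m

def process(docs, rank=2, n_terms=2, threshold=0.2):
    counts = term_counts(docs)
    groups = partition(counts, threshold)
    key_terms = []
    for g in groups:
        sub = [counts[j] for j in g]
        vocab = sorted(set().union(*sub))
        a = term_doc_matrix(sub, vocab)
        # Reduced singular value representation: keep the top-k factors.
        u, s, vt = np.linalg.svd(a, full_matrices=False)
        k = min(rank, len(s))
        term_vecs = u[:, :k] * s[:k]   # one row per term
        doc_vecs = vt[:k].T * s[:k]    # one row per document
        centroid = doc_vecs.mean(axis=0)
        # Rank terms by cosine similarity to the centroid vector.
        norms = np.linalg.norm(term_vecs, axis=1) * np.linalg.norm(centroid)
        sims = term_vecs @ centroid / np.maximum(norms, 1e-12)
        top = np.argsort(-sims)[:n_terms]
        key_terms.append({vocab[i] for i in top})
    # ASSUMPTION: edge weight = Jaccard overlap of key-term sets.
    graph = {
        (i, j): len(key_terms[i] & key_terms[j]) / len(key_terms[i] | key_terms[j])
        for i in range(len(groups))
        for j in range(i + 1, len(groups))
    }
    return groups, key_terms, graph
```

Calling `process(["cat dog pet", "dog pet food", "stock market trade", "market trade price"])` returns the document groups, the key terms of each group, and the pairwise graph weights. A real system would likely replace the greedy grouping with a proper clustering step and use a sparse truncated SVD for large collections.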

  • 10 Assignments