
TECHNIQUES FOR COMPUTING SIMILARITY MEASUREMENTS BETWEEN SEGMENTS REPRESENTATIVE OF DOCUMENTS

  • US 20090300006A1
  • Filed: 05/28/2009
  • Published: 12/03/2009
  • Est. Priority Date: 05/29/2008
  • Status: Active Grant
First Claim

1. In a system for navigating a document repository in which each document in the document repository comprises at least one segment, a method for computing similarity measurements between various ones of a plurality of segments comprising:

  • populating a matrix representative of the plurality of segments in which each segment of the plurality of segments is represented by keyword frequency data spanning a plurality of keywords, the matrix comprising a plurality of sub-matrices in which each sub-matrix of the plurality of sub-matrices corresponds to a non-overlapping portion of the plurality of keywords;

  • for each sub-matrix of the plurality of sub-matrices, calculating a sub-matrix dot product between a first segment of the plurality of segments and a second segment of the plurality of segments, the sub-matrix dot product spanning at least a portion of the non-overlapping portion of the plurality of keywords, to provide a plurality of sub-matrix dot products; and

  • summing the plurality of sub-matrix dot products to provide a similarity measurement between the first segment and the second segment.
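
The three steps recited in the claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `split_keywords`, `submatrix_dot`, and `similarity`, the `block_size` parameter, and the sample frequency data are all hypothetical. The sketch also assumes the sub-matrices are contiguous column blocks of equal width, which the claim does not require.

```python
from typing import List, Sequence


def split_keywords(num_keywords: int, block_size: int) -> List[range]:
    """Partition keyword column indices into non-overlapping blocks.

    Assumption: contiguous, equal-width blocks (the last may be shorter).
    """
    return [
        range(start, min(start + block_size, num_keywords))
        for start in range(0, num_keywords, block_size)
    ]


def submatrix_dot(matrix: Sequence[Sequence[float]], i: int, j: int,
                  cols: range) -> float:
    """Dot product of segments i and j restricted to one block of keywords."""
    return sum(matrix[i][k] * matrix[j][k] for k in cols)


def similarity(matrix: Sequence[Sequence[float]], i: int, j: int,
               block_size: int) -> float:
    """Sum the per-block dot products to get the similarity of i and j."""
    blocks = split_keywords(len(matrix[0]), block_size)
    partials = [submatrix_dot(matrix, i, j, cols) for cols in blocks]
    return sum(partials)


# Hypothetical data: rows are segments, columns are keyword frequencies.
segments = [
    [2, 0, 1, 3, 0],
    [1, 1, 0, 2, 4],
    [0, 5, 0, 0, 1],
]

print(similarity(segments, 0, 1, block_size=2))
```

Because the blocks do not overlap and together cover every keyword, the summed partial products equal the ordinary dot product of the two full rows. The block size therefore affects only how the work is divided, not the resulting measurement.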
