
Document similarity evaluation system, document similarity evaluation method, and computer program

  • US 9,235,624 B2
  • Filed: 11/09/2012
  • Issued: 01/12/2016
  • Est. Priority Date: 01/19/2012
  • Status: Active Grant
First Claim

1. A computer-implemented apparatus evaluating document similarity comprising:

  • a processor; and

    a memory capable of storing instructions to be executed by the processor by causing the processor to execute;

    a segment search unit implemented by hardware including the processor and the memory and which finds common segments in both a first segment string and a second segment string, counts the number of the common segments that are found, and identifies an appearance range within which the common segments appear; and

    a similarity index calculation unit implemented by the hardware and which calculates a second sum that is a sum of the numbers of characters of each segment included in the appearance range identified by the segment search unit, calculates a first sum that is a sum of the numbers of characters of each segment identified as the common segments, and calculates the similarity index indicating the similarity between the first segment string and the second segment string by using the following equation,
    similarity index = F(NTC) / G(NCC) × NS

    (where, in the above-mentioned equation, NTC is the first sum, NCC is the second sum, NS is the number of the common segments, and a function F and a function G are monotonically increasing functions by which a certain integer value is associated with a positive real value).
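
The calculation recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The function name `similarity_index` is hypothetical, and several choices here are assumptions the claim does not fix: segments are compared by exact string equality, the appearance range is measured in the first segment string (from the first to the last common segment), empty segments are ignored, and F and G default to a simple conversion to float, which is monotonically increasing.

```python
from typing import Callable, Sequence


def similarity_index(
    first: Sequence[str],
    second: Sequence[str],
    f: Callable[[int], float] = float,  # assumed F: plain int -> float
    g: Callable[[int], float] = float,  # assumed G: plain int -> float
) -> float:
    """Illustrative similarity index = F(NTC) / G(NCC) * NS."""
    second_set = set(second)

    # Positions in the first segment string whose (non-empty) segment
    # also occurs in the second segment string.
    common_positions = [
        i for i, seg in enumerate(first) if seg and seg in second_set
    ]
    if not common_positions:
        return 0.0  # assumption: no common segments means zero similarity

    # NS: number of common segments found.
    ns = len(common_positions)

    # NTC (first sum): total characters in the common segments.
    ntc = sum(len(first[i]) for i in common_positions)

    # Appearance range: first to last common segment, inclusive.
    start, end = common_positions[0], common_positions[-1]

    # NCC (second sum): total characters of every segment in that range.
    ncc = sum(len(seg) for seg in first[start:end + 1])

    return f(ntc) / g(ncc) * ns
```

For example, with `first = ["abc", "de", "xyz", "fg"]` and `second = ["abc", "fg", "q"]`, the common segments are `"abc"` and `"fg"`, so NS = 2, NTC = 5, and NCC = 10 (all four segments fall inside the appearance range), giving 5 / 10 × 2 = 1.0 under these assumptions.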
