Determining the relationship between source code bases

  • US 8,290,962 B1
  • Filed: 09/28/2005
  • Issued: 10/16/2012
  • Est. Priority Date: 09/28/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method for comparing a first set of documents to a second set of documents, the method comprising:

    identifying, using a computer-implemented device, the first set of documents based on a first criterion;

    identifying, using the computer-implemented device, the second set of documents based on a second criterion, where the second criterion is different from the first criterion;

    constructing a matrix using the computer-implemented device, the matrix containing information regarding pairs of documents from the first and second sets of documents, where constructing the matrix further comprises:

      mapping, using the computer-implemented device and to each of the pairs of documents, a value representative of a number of lines that are common to both documents in each of the pairs of documents, where each of the pairs of documents comprises a document from the first set of documents and a document from the second set of documents, and

      excluding, using the computer-implemented device, a pair of documents from the matrix, when the documents, in the pair of documents, differ in size by at least a particular amount;

    calculating similarity scores, using the computer-implemented device, for each of the pairs of documents based on the matrix, where the similarity score is calculated for a pair of documents, of the pairs of documents, by:

      determining a first ratio of the number of lines that are common to the pair of documents and a number of lines of a first document of the pair of documents,

      determining a second ratio of the number of lines that are common to the pair of documents and a number of lines of a second document of the pair of documents, where the first ratio is different than the second ratio,

      selecting the first ratio or the second ratio as a selected ratio, and

      determining the similarity score based on the selected ratio; and

    outputting the similarity scores, using the computer-implemented device.
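
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading under stated assumptions: documents are held as lists of lines keyed by name; a "criterion" is a predicate on the document name; "common lines" are counted as a multiset intersection; "differ in size by at least a particular amount" is read as an absolute difference in line count (`max_size_diff`); and the "selected ratio" is taken to be the larger of the two ratios. All function and parameter names (`select`, `common_line_count`, `build_matrix`, `similarity_scores`, `max_size_diff`) are hypothetical.

```python
from collections import Counter


def select(documents, criterion):
    """Identify a set of documents: keep those whose name satisfies the criterion."""
    return {name: lines for name, lines in documents.items() if criterion(name)}


def common_line_count(a_lines, b_lines):
    """Count lines common to both documents (multiset intersection; an assumption)."""
    return sum((Counter(a_lines) & Counter(b_lines)).values())


def build_matrix(first_set, second_set, max_size_diff):
    """Map each (first, second) document pair to its number of common lines.

    Pairs whose line counts differ by at least max_size_diff are excluded.
    Empty documents are skipped so later ratios are well defined (an assumption).
    """
    matrix = {}
    for name_a, a_lines in first_set.items():
        for name_b, b_lines in second_set.items():
            if not a_lines or not b_lines:
                continue
            if abs(len(a_lines) - len(b_lines)) >= max_size_diff:
                continue
            matrix[(name_a, name_b)] = common_line_count(a_lines, b_lines)
    return matrix


def similarity_scores(matrix, first_set, second_set):
    """Score each pair from the two common-line ratios.

    The claim only says one ratio is selected; taking the larger is an assumption.
    """
    scores = {}
    for (name_a, name_b), common in matrix.items():
        first_ratio = common / len(first_set[name_a])
        second_ratio = common / len(second_set[name_b])
        scores[(name_a, name_b)] = max(first_ratio, second_ratio)
    return scores
```

For example, a four-line document that shares two lines with a two-line document yields ratios of 0.5 and 1.0. Under the "take the larger" assumption, the pair scores 1.0, which flags the smaller document as fully contained in the larger one.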
