Apparatus and Method for Efficient Identification of Code Similarity

  • US 20160127398A1
  • Filed: 10/28/2015
  • Published: 05/05/2016
  • Est. Priority Date: 10/30/2014
  • Status: Active Grant
First Claim

1. An apparatus comprising processing circuitry configured to execute instructions for:

    receiving a first threshold and a second threshold;

    receiving a plurality of binary reference samples;

    processing each reference sample of the plurality of reference samples via operations including:

    assigning each reference sample a respective unique identifier;

    producing a reference sample fingerprint for each reference sample; and

    registering each respective unique identifier to reference sample fingerprint pair in a reference library via operations including:

    scoring the reference sample fingerprint with each previously stored fingerprint in the reference library to produce a first matching score;

    if the first matching score meets or exceeds the first threshold for a previously stored fingerprint, determining the reference sample fingerprint to be a duplicate of the previously stored fingerprint, and recording only a unique identifier associated with the reference sample fingerprint in the reference library, the unique identifier being marked as a duplicate of the previously stored fingerprint; and

    otherwise, if the first matching score for each previously stored fingerprint is less than the first threshold, storing a corresponding reference sample unique identifier to reference sample fingerprint pair in the reference library;

    receiving a binary query sample;

    processing the binary query sample via operations including:

    producing a query sample fingerprint from the binary query sample;

    scoring the query sample fingerprint with each previously stored fingerprint in the reference library to produce a second matching score;

    for each previously stored fingerprint for which the second matching score meets or exceeds the second threshold, reporting a corresponding reference sample unique identifier associated with the previously stored fingerprint and the second matching score.
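
The registration and query flow recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not specify how a fingerprint is produced or how two fingerprints are scored, so the byte n-gram set and the Jaccard score below are assumptions chosen only for illustration, as are all names (`fingerprint`, `score`, `ReferenceLibrary`, `register`, `query`). This is not the patented implementation.

```python
from dataclasses import dataclass, field
from itertools import count


def fingerprint(sample: bytes, n: int = 4) -> frozenset:
    """Hypothetical fingerprint: the set of distinct byte n-grams (assumption)."""
    return frozenset(sample[i:i + n] for i in range(max(len(sample) - n + 1, 1)))


def score(a: frozenset, b: frozenset) -> float:
    """Hypothetical matching score: Jaccard similarity in [0, 1] (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class ReferenceLibrary:
    first_threshold: float   # duplicate cutoff applied during registration
    second_threshold: float  # reporting cutoff applied during queries
    fingerprints: dict = field(default_factory=dict)  # uid -> stored fingerprint
    duplicates: dict = field(default_factory=dict)    # uid -> uid it duplicates
    _ids: count = field(default_factory=count)        # source of unique identifiers

    def register(self, sample: bytes) -> int:
        """Assign an identifier, fingerprint the sample, and store or mark it."""
        uid = next(self._ids)
        fp = fingerprint(sample)
        for stored_uid, stored_fp in self.fingerprints.items():
            if score(fp, stored_fp) >= self.first_threshold:
                # Duplicate: record only the identifier, not the fingerprint.
                self.duplicates[uid] = stored_uid
                return uid
        # Below the first threshold for every stored fingerprint: store the pair.
        self.fingerprints[uid] = fp
        return uid

    def query(self, sample: bytes) -> list:
        """Return (uid, score) for each stored fingerprint meeting the second threshold."""
        fp = fingerprint(sample)
        hits = []
        for uid, stored_fp in self.fingerprints.items():
            s = score(fp, stored_fp)
            if s >= self.second_threshold:
                hits.append((uid, s))
        return hits
```

In this sketch a duplicate is marked against the first stored fingerprint that meets the first threshold, which keeps the library from growing when near-identical reference samples are registered. Both loops compare against every stored fingerprint, mirroring the claim language; an efficient implementation would presumably index the library rather than scan it.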
