System and method for fingerprinting datasets
Abstract
Systems and methods for the matching of datasets, such as input audio segments, with known datasets in a database are disclosed. In an illustrative embodiment, the use of the presently disclosed systems and methods is described in conjunction with recognizing known network message recordings encountered during an outbound telephone call. The methodologies include creation of a ternary fingerprint bitmap to make the comparison process more efficient. Also disclosed are automated methodologies for creating the database of known datasets from a larger collection of datasets.
41 Claims
1. A method for identifying a candidate audio segment from an outbound telephone call, the method comprising the steps of:
a) creating a spectrogram of the candidate audio segment;
b) creating a candidate binary acoustic fingerprint bitmap of the spectrogram;
c) comparing the candidate binary acoustic fingerprint bitmap to at least one known binary acoustic fingerprint bitmap of a known network message;
d) if the candidate binary acoustic fingerprint bitmap matches one of said at least one known binary acoustic fingerprint bitmap within a predetermined threshold, declaring the match; and
e) if the candidate binary acoustic fingerprint bitmap does not match one of said at least one known binary acoustic fingerprint bitmap within the predetermined threshold, using an answering machine detection algorithm to analyze the candidate audio segment.
Dependent claims: 2, 3, 4, 5
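
Steps (c) through (e) of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: it assumes the fingerprint bitmaps have already been computed and packed into Python integers, it reads "within a predetermined threshold" as "Hamming distance no greater than `max_bit_errors`", and the names `identify_segment`, `max_bit_errors` and `fallback` are hypothetical.

```python
from typing import Callable, Dict


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two packed bitmaps differ."""
    return bin(a ^ b).count("1")


def identify_segment(
    candidate: int,
    known: Dict[str, int],
    max_bit_errors: int,
    fallback: Callable[[], str],
) -> str:
    """Return the label of the closest known bitmap, or defer to a fallback.

    Assumption: a match means the Hamming distance to the best known
    bitmap is at most max_bit_errors. The fallback stands in for the
    answering machine detection step and is supplied by the caller.
    """
    best_label = None
    best_dist = None
    for label, bitmap in known.items():
        dist = hamming_distance(candidate, bitmap)
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist is not None and best_dist <= max_bit_errors:
        return "match:" + best_label
    return fallback()


if __name__ == "__main__":
    known_messages = {"disconnected": 0b11110000, "busy": 0b00001111}
    print(identify_segment(0b11110001, known_messages, 2, lambda: "run_amd"))
    print(identify_segment(0b10101010, known_messages, 2, lambda: "run_amd"))
```

Packing each bitmap into one integer keeps the comparison to a single XOR and a bit count per known message; a real fingerprint would be far longer than the 8 bits used here.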

6. A method for identifying a candidate audio segment from an outbound telephone call, the method comprising the steps of:
a) creating a spectrogram of the candidate audio segment;
b) creating a candidate binary fingerprint bitmap of the spectrogram;
c) comparing the candidate binary fingerprint bitmap to at least one known binary fingerprint bitmap of a known recording;
d) if the candidate binary fingerprint bitmap matches one of said at least one known binary fingerprint bitmaps within a predetermined threshold, declaring the match; and
e) if the candidate binary fingerprint bitmap does not match one of said at least one known binary fingerprint bitmap within the predetermined threshold, using an alternate process to analyze the candidate audio segment.
Dependent claims: 7, 8, 9, 10, 11, 12, 13

14. A method for creating a ternary bitmap of an audio database from an outbound call, the method comprising the steps of:
a) computing a binary fingerprint bitmap of the dataset;
b) deleting a first number of samples from the dataset;
c) after step (b), computing another binary fingerprint bitmap of the dataset;
d) repeating steps (b) and (c) a plurality of times to create a plurality of binary fingerprint bitmaps; and
e) combining the plurality of binary fingerprint bitmaps into the ternary bitmap, where each bit in the ternary bitmap is determined as follows:
e.1) If a bit is 0 (zero) in a first predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 0 (zero);
e.2) If a bit is 1 (one) in a second predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 1 (one); and
e.3) Otherwise, set the bit of the ternary bitmap to *, wherein * is a Don't Care bit.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
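
The procedure in claim 14 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the fingerprint function is supplied by the caller (the claim does not say how it is computed), samples are assumed to be deleted from the start of the dataset, "in a first/second predetermined number" is read as "in at least that many" bitmaps, and the ternary bitmap is stored as a hypothetical `(value, care_mask)` pair of integers in which a cleared mask bit means Don't Care.

```python
from typing import Callable, List, Sequence, Tuple


def combine_to_ternary(
    bitmaps: Sequence[int], n_bits: int, min_zero_votes: int, min_one_votes: int
) -> Tuple[int, int]:
    """Step (e): merge binary bitmaps into (value, care_mask)."""
    value = 0
    care_mask = 0
    for bit in range(n_bits):
        ones = sum((bm >> bit) & 1 for bm in bitmaps)
        zeros = len(bitmaps) - ones
        if zeros >= min_zero_votes:      # e.1: stable 0, value bit stays 0
            care_mask |= 1 << bit
        elif ones >= min_one_votes:      # e.2: stable 1
            value |= 1 << bit
            care_mask |= 1 << bit
        # e.3: otherwise the mask bit stays 0, meaning Don't Care
    return value, care_mask


def build_ternary_bitmap(
    samples: Sequence[float],
    fingerprint: Callable[[Sequence[float]], int],
    drop: int,
    repeats: int,
    n_bits: int,
    min_zero_votes: int,
    min_one_votes: int,
) -> Tuple[int, int]:
    """Steps (a) to (e), with a caller-supplied fingerprint function."""
    data: List[float] = list(samples)
    bitmaps = [fingerprint(data)]             # step (a)
    for _ in range(repeats):                  # step (d)
        data = data[drop:]                    # step (b), assumed: drop from the start
        bitmaps.append(fingerprint(data))     # step (c)
    return combine_to_ternary(bitmaps, n_bits, min_zero_votes, min_one_votes)


def ternary_to_string(value: int, care_mask: int, n_bits: int) -> str:
    """Render the ternary bitmap most significant bit first, using 0, 1 and *."""
    chars = []
    for bit in reversed(range(n_bits)):
        if not (care_mask >> bit) & 1:
            chars.append("*")
        else:
            chars.append(str((value >> bit) & 1))
    return "".join(chars)


if __name__ == "__main__":
    value, mask = combine_to_ternary([0b1100, 0b1101, 0b1001], 4, 3, 3)
    print(ternary_to_string(value, mask, 4))  # prints 1*0*
```

In this sketch, when both vote thresholds are set low enough that a bit could qualify as both 0 and 1, the 0 rule wins because it is checked first; the claim does not address that case.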

22. A method for identifying a candidate audio dataset, the method comprising the steps of:
a) computing a binary fingerprint bitmap of a known dataset in a known dataset database;
b) deleting a first number of samples from the known dataset;
c) after step (b), computing another binary fingerprint bitmap of the known dataset;
d) repeating steps (b) and (c) a plurality of times to create a plurality of binary fingerprint bitmaps; and
e) combining the plurality of binary fingerprint bitmaps into a ternary bitmap, where each bit in the ternary bitmap is determined as follows:
e.1) If a bit is 0 in a first predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 0;
e.2) If a bit is 1 in a second predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 1; and
e.3) Otherwise, set the bit of the ternary bitmap to *, wherein * is a Don't Care bit;
f) saving the ternary bitmap into a ternary bitmap database;
g) repeating steps (a)-(f) for all known datasets in the known dataset database;
h) creating a candidate dataset binary fingerprint bitmap from the candidate dataset; and
i) comparing the candidate dataset binary fingerprint bitmap to each ternary bitmap in the ternary bitmap database, wherein said comparison ignores the Don't Care bit.
Dependent claims: 23, 24, 25, 26, 27, 28, 29
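
Step (i) of claim 22, the comparison that ignores Don't Care bits, might look like the sketch below. It is an assumption-laden illustration: each ternary bitmap is represented as a hypothetical `(value, care_mask)` integer pair in which a cleared mask bit marks a Don't Care position, and the choice to report the entry with the fewest mismatches is ours, since the claim only requires that the comparison skip those bits.

```python
from typing import Dict, Optional, Tuple


def masked_distance(candidate: int, value: int, care_mask: int) -> int:
    """Count mismatching bits, skipping positions where care_mask is 0."""
    return bin((candidate ^ value) & care_mask).count("1")


def best_ternary_match(
    candidate: int, database: Dict[str, Tuple[int, int]]
) -> Optional[Tuple[str, int]]:
    """Compare the candidate to every ternary bitmap and return
    (label, mismatches) for the closest one, or None if the database is empty."""
    best: Optional[Tuple[str, int]] = None
    for label, (value, care_mask) in database.items():
        dist = masked_distance(candidate, value, care_mask)
        if best is None or dist < best[1]:
            best = (label, dist)
    return best


if __name__ == "__main__":
    ternary_db = {
        "msg_a": (0b1000, 0b1010),  # reads 1*0* most significant bit first
        "msg_b": (0b0010, 0b1111),  # reads 0010, no Don't Care bits
    }
    print(best_ternary_match(0b1101, ternary_db))
```

Masking before counting means a candidate is never penalized for bits that were unstable across the shifted fingerprints of the known dataset.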

30. A method for creating a ternary bitmap of an audio segment from an outbound call, the method comprising the steps of:
a) computing a binary acoustic fingerprint bitmap of the audio segment;
b) deleting a first number of samples from the audio segment;
c) after step (b), computing another binary acoustic fingerprint bitmap of the audio segment;
d) repeating steps (b) and (c) a plurality of times to create a plurality of binary acoustic fingerprint bitmaps; and
e) combining the plurality of binary acoustic fingerprint bitmaps into the ternary bitmap, where each bit in the ternary bitmap is determined as follows:
e.1) If a bit is 0 in a first predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 0;
e.2) If a bit is 1 in a second predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 1; and
e.3) Otherwise, set the bit of the ternary bitmap to *, wherein * is a Don't Care.
Dependent claims: 31, 32, 33, 34, 35

36. A method for identifying a candidate audio segment from an outbound call, the method comprising the steps of:
a) computing a binary acoustic fingerprint bitmap of a known audio segment in a known audio segment database;
b) deleting a first number of samples from the known audio segment;
c) after step (b), computing another binary acoustic fingerprint bitmap of the known audio segment;
d) repeating steps (b) and (c) a plurality of times to create a plurality of binary acoustic fingerprint bitmaps; and
e) combining the plurality of binary acoustic fingerprint bitmaps into a ternary bitmap, where each bit in the ternary bitmap is determined as follows:
e.1) If a bit is 0 in a first predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 0;
e.2) If a bit is 1 in a second predetermined number of the plurality of binary bitmaps, set the bit in the ternary bitmap to 1; and
e.3) Otherwise, set the bit of the ternary bitmap to *, wherein * is a Don't Care;
f) saving the ternary bitmap into a ternary bitmap database;
g) repeating steps (a)-(f) for all known audio segments in the known audio segment database;
h) creating a candidate audio segment binary acoustic fingerprint bitmap from the candidate audio segment; and
i) comparing the candidate audio segment binary acoustic fingerprint bitmap to each ternary bitmap in the ternary bitmap database, wherein said comparison ignores the Don't Care bit.
Dependent claims: 37, 38, 39, 40, 41

Specification