Methods and apparatus for identifying fraudulent callers

US 9,837,078 B2
Filed: 11/09/2012
Issued: 12/05/2017
Est. Priority Date: 11/09/2012
Status: Active Grant

4 Assignments

0 Petitions

Abstract

The methods, apparatus, and systems described herein are designed to identify fraudulent callers. A voice print of a call is created and compared to known voice prints to determine if it matches one or more of the known voice prints. The methods include a pre-processing step to separate speech from non-speech, selecting a number of elements that affect the voice print the most, and/or computing an adjustment factor based on the scores of each received voice print against known voice prints.

38 Claims

1. A method of voice print matching which comprises:
- receiving a telephonic communication from an unknown caller;
  
  separating a first portion of the telephonic communication into silent and non-silent segments;
  
  evaluating the non-silent segments to determine which portions thereof are speech or non-speech;
  
  generating a plurality of parameters that determine what is speech and non-speech in the non-silent segments;
  
  using the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  comparing the speech to a Universal Background Model (UBM);
  
  selecting a number of audio elements of the UBM that characterize the speech of the unknown caller relative to other audio elements of the UBM;
  
  selecting audio elements of the speech that correspond to the selected audio elements of the UBM; and
  
  comparing the selected audio elements of the speech to matching audio elements of a plurality of recorded voice prints from a plurality of fraudulent speakers to determine whether the speech belongs to a fraudulent speaker.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein the first portion comprises a pre-selected time period.
  - 3. The method of claim 2, wherein the pre-selected time period is the first 30 seconds to 1 minute of the telephonic communication.
  - 4. The method of claim 1, wherein the plurality of parameters are generated for each communication received.
  - 5. The method of claim 1, wherein evaluating the non-silent segments comprises grouping all non-speech sounds together.
  - 6. The method of claim 1, which further comprises identifying a fraudulent speaker if the speech at least substantially matches any of the voice prints of the plurality of fraudulent speakers.
  - 7. The method of claim 6, wherein identifying the fraudulent speaker comprises scoring each of a group of telephonic communications in a range of probabilities that there is a match with a known fraudulent speaker.
  - 8. The method of claim 1, which further comprises recording and storing the telephonic communication in an uncompressed audio format.
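
The pre-processing steps of claim 1 (calibrate on a first portion of the call, then reuse the resulting parameters for the remainder) can be illustrated with a minimal sketch. The claims do not say which features or parameters are used, so everything here is an assumption made for illustration: the frame length, the fixed energy floor for silence, the use of zero-crossing rate to tell speech from non-speech, and the helper names `frame_features`, `calibrate`, and `label_frames`.

```python
import math

FRAME_LEN = 160  # assumed: 20 ms frames at 8 kHz telephone audio


def frame_features(samples, frame_len=FRAME_LEN):
    """Return (energy, zero_crossing_rate) for each full frame of samples."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def calibrate(first_portion_feats, silence_energy=1e-4):
    """Derive per-call parameters from the first portion of the call.

    Hypothetical rule: frames below the energy floor are silent; among the
    non-silent frames, the speech/non-speech boundary is the midpoint between
    the lowest and highest zero-crossing rate observed.
    """
    zcrs = [zcr for energy, zcr in first_portion_feats if energy >= silence_energy]
    if not zcrs:
        raise ValueError("first portion contains no non-silent frames")
    return {
        "silence_energy": silence_energy,
        "zcr_threshold": (min(zcrs) + max(zcrs)) / 2,
    }


def label_frames(feats, params):
    """Apply the calibrated parameters to any frames of the same call."""
    labels = []
    for energy, zcr in feats:
        if energy < params["silence_energy"]:
            labels.append("silence")
        elif zcr <= params["zcr_threshold"]:
            labels.append("speech")
        else:
            labels.append("non-speech")  # all non-speech sounds grouped together
    return labels


# Synthetic stand-ins: a slow sine for voiced speech, a sign-flipping signal for noise.
speech = [0.5 * math.sin(2 * math.pi * k / 80) for k in range(FRAME_LEN)]
noise = [0.5 if k % 2 == 0 else -0.5 for k in range(FRAME_LEN)]
silence = [0.0] * FRAME_LEN

params = calibrate(frame_features(silence + speech + noise))  # first portion
labels = label_frames(frame_features(speech + silence + noise), params)  # remainder
```

Because `calibrate` runs once per call, the parameters are specific to each communication, in the spirit of claim 4. A production system would presumably use richer features than a single zero-crossing threshold.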

9. An audible fraud detection system, comprising:
- a node comprising a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored therein that are accessible to, and executable by, the processor, wherein the plurality of instructions comprises:
  
  instructions, that when executed, receive a telephonic communication from an unknown caller via a network;
  
  instructions, that when executed, separate a first portion of the communication into silent and non-silent segments;
  
  instructions, that when executed, evaluate the non-silent segments to determine which portions are speech or non-speech;
  
  instructions, that when executed, generate a plurality of parameters based on the evaluated non-silent segments that determine what is speech and non-speech;
  
  instructions, that when executed, use the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  instructions, that when executed, compare the speech to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that characterize the speech of the unknown caller relative to other audio elements of the UBM;
  
  instructions, that when executed, select audio elements of the speech that correspond to the selected audio elements of the UBM; and
  
  instructions, that when executed, compare the selected audio elements of the speech to matching audio elements of a plurality of recorded voice prints from a plurality of fraudulent speakers to determine whether the speech belongs to a fraudulent speaker.
- Dependent claims (10, 11, 12):
  - 10. The system of claim 9, wherein the first portion comprises a pre-selected time period.
  - 11. The system of claim 9, wherein the plurality of parameters are generated for each communication received.
  - 12. The system of claim 9, wherein the instructions, that when executed, evaluate the non-silent segments comprise instructions to group all non-speech sounds together.

13. A non-transitory computer readable medium comprising a plurality of instructions stored therein, the plurality of instructions comprising:
- instructions, that when executed, receive a telephonic communication from an unknown caller;
  
  instructions, that when executed, separate a first portion at the beginning of the communication into silent and non-silent segments;
  
  instructions, that when executed, evaluate the non-silent segments to determine which portions are speech and non-speech;
  
  instructions, that when executed, generate a plurality of parameters based on the evaluated non-silent segments that determine what is speech and non-speech;
  
  instructions, that when executed, use the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  instructions, that when executed, compare the speech to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that characterize the speech of the unknown caller relative to other audio elements of the UBM;
  
  instructions, that when executed, select audio elements of the speech that correspond to the selected audio elements of the UBM; and
  
  instructions, that when executed, compare the selected audio elements of the speech to matching audio elements of a plurality of recorded voice prints to determine whether the speech belongs to a fraudulent speaker.
- Dependent claims (14, 15, 16):
  - 14. The non-transitory computer readable medium of claim 13, wherein the first portion comprises a pre-selected time period.
  - 15. The non-transitory computer readable medium of claim 13, wherein the plurality of parameters are generated for each communication received.
  - 16. The non-transitory computer readable medium of claim 13, wherein the instructions, that when executed, evaluate the non-silent segments, comprise instructions to group all non-speech together.

17. A method of detecting a fraudulent speaker comprising:
- receiving a telephonic communication from an unknown caller;
  
  separating a first portion of the telephonic communication into silent and non-silent segments;
  
  evaluating the non-silent segments to determine which portions thereof are speech or non-speech;
  
  generating a plurality of parameters that determine what is speech and non-speech in the non-silent segments;
  
  using the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  comparing the speech of the unknown caller to a Universal Background Model (UBM);
  
  selecting a number of audio elements of the UBM that most characterize the creation of a voice print for the unknown caller relative to other audio elements of the UBM;
  
  selecting audio elements of the voice print that correspond to the selected audio elements of the UBM;
  
  comparing the selected audio elements of the voice print to matching audio elements of voice prints of a plurality of fraudulent speakers stored in a database; and
  
  determining if the voice print belongs to a fraudulent speaker.
- Dependent claims (18, 19, 20, 21):
  - 18. The method of claim 17, wherein the number of selected audio elements of the UBM is from about 10 to 30.
  - 19. The method of claim 17, which further comprises identifying the gender of the speaker in a voice print.
  - 20. The method of claim 19, which further comprises accessing a library of voice prints for the identified gender.
  - 21. The method of claim 17, wherein determining if the voice print belongs to a fraudulent speaker comprises:
    - scoring a group of telephonic communications within a range of probabilities that a match exists; and
      
      isolating communications with a score above a pre-selected match-probability threshold.
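
The element-selection and matching steps of claims 17, 18, and 21 can be sketched as follows. This is only an illustration under stated assumptions: each "audio element" is reduced to a single number per UBM component, "most characterize" is interpreted as the largest absolute deviation from the UBM, and the similarity `1 / (1 + mean squared difference)` is a stand-in score in (0, 1], not a calibrated probability. The function names and the default threshold are hypothetical.

```python
def select_elements(ubm, caller, top_n):
    """Indices of the top_n UBM elements where the caller deviates most from the UBM."""
    deviations = [abs(c - u) for c, u in zip(caller, ubm)]
    ranked = sorted(range(len(ubm)), key=lambda i: deviations[i], reverse=True)
    return sorted(ranked[:top_n])


def match_score(caller, stored, indices):
    """Compare only the selected elements; 1.0 means identical on those elements."""
    mse = sum((caller[i] - stored[i]) ** 2 for i in indices) / len(indices)
    return 1.0 / (1.0 + mse)


def flag_matches(ubm, caller, fraud_prints, top_n=10, threshold=0.9):
    """Score the caller against every stored print and keep scores above the threshold."""
    indices = select_elements(ubm, caller, top_n)
    scores = {name: match_score(caller, vp, indices) for name, vp in fraud_prints.items()}
    return {name: s for name, s in scores.items() if s > threshold}


# Toy data: six elements instead of a realistic UBM, and top_n=3 instead of 10 to 30.
ubm = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
caller = [0.0, 2.0, 0.1, -3.0, 0.0, 1.5]
fraud_prints = {
    "fraudster-A": [9.0, 2.0, 9.0, -3.0, 9.0, 1.5],  # agrees on the selected elements
    "fraudster-B": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}
selected = select_elements(ubm, caller, top_n=3)
flagged = flag_matches(ubm, caller, fraud_prints, top_n=3)
```

Restricting the comparison to a small subset of elements is what keeps matching against a large library of stored prints cheap: `fraudster-A` differs wildly from the caller on the unselected elements, but those never enter the score.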

22. An audible fraud detection system, comprising:
- a node comprising a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored therein and being accessible to, and executable by, the processor, where the plurality of instructions comprises:
  
  instructions, that when executed, receive a voice audio communication from a telephonic communication from an unknown caller via a network;
  
  instructions, that when executed, separate a first portion of the telephonic communication into silent and non-silent segments;
  
  instructions, that when executed, evaluate the non-silent segments to determine which portions thereof are speech or non-speech;
  
  instructions, that when executed, generate a plurality of parameters that determine what is speech and non-speech in the non-silent segments;
  
  instructions, that when executed, use the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  instructions, that when executed, compare the speech of the unknown caller to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that most characterize creation of a voice print for the unknown caller relative to other audio elements of the UBM to create a voice print from the voice audio communication;
  
  instructions, that when executed, select audio elements of the voice print that correspond to the selected audio elements of the UBM;
  
  instructions, that when executed, compare the selected audio elements of the voice print to matching audio elements of one or more stored voice prints of a plurality of fraudulent speakers stored in a database; and
  
  instructions, that when executed, determine if the voice print belongs to a fraudulent speaker.
- Dependent claims (23, 24, 25):
  - 23. The system of claim 22, wherein the number of selected audio elements of the UBM is from about 10 to 30.
  - 24. The system of claim 22, further comprising instructions, that when executed, identify the gender of the voice print.
  - 25. The system of claim 23, wherein the instructions, that when executed, determine if the voice print belongs to a fraudulent speaker, comprise:
    - instructions to score a group of telephonic communications within a range of probabilities that a match exists; and
      
      isolate communications with a score above a pre-selected match-probability threshold.

26. A non-transitory computer readable medium comprising a plurality of instructions stored therein, the plurality of instructions comprising:
- instructions, that when executed, receive a voice audio communication through a telephonic communication from an unknown caller;
  
  instructions, that when executed, separate a first portion of the telephonic communication into silent and non-silent segments;
  
  instructions, that when executed, evaluate the non-silent segments to determine which portions thereof are speech or non-speech;
  
  instructions, that when executed, generate a plurality of parameters that determine what is speech and non-speech in the non-silent segments;
  
  instructions, that when executed, use the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  instructions, that when executed, compare the speech of the unknown caller to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that most characterize creation of a voice print for the unknown caller relative to other audio elements of the UBM to create a voice print from the voice audio communication;
  
  instructions, that when executed, select audio elements of the voice print that correspond to audio elements of the UBM;
  
  instructions, that when executed, compare the selected audio elements of the voice print to matching audio elements of one or more stored voice prints of a plurality of fraudulent speakers in a database; and
  
  instructions, that when executed, determine if the voice print belongs to a fraudulent speaker.
- Dependent claims (27, 28):
  - 27. The non-transitory computer readable medium of claim 26, further comprising instructions, that when executed, identify the gender of the voice print.
  - 28. The non-transitory computer readable medium of claim 27, wherein the instructions, that when executed, determine if the voice print belongs to a fraudulent speaker, comprise:
    - instructions to score a group of telephonic communications within a range of probabilities that a match exists; and
      
      isolate communications with a score above a pre-selected match-probability threshold.

29. A method of detecting a fraudulent speaker, which comprises:
- creating a voice print from a received telephonic communication from an unknown caller;
  
  comparing the voice print to a Universal Background Model (UBM);
  
  selecting a number of audio elements of the UBM that characterize the voice print of the unknown caller relative to other audio elements of the UBM;
  
  selecting audio elements of the voice print that correspond to the selected audio elements of the UBM;
  
  scoring the selected audio elements of the voice print against matching audio elements of one or more voice prints of a plurality of fraudulent speakers that are stored in a database;
  
  calculating an adjustment factor based on the scores of the voice print against the stored voice prints and the scores of other unknown voice prints against the stored voice prints; and
  
  comparing the adjustment factor of the voice print to adjustment factors of the other unknown voice prints to determine the probability that the voice print belongs to a fraudulent speaker.
- Dependent claims (30, 31, 32):
  - 30. The method of claim 29, which further comprises isolating the scores that meet a threshold value.
  - 31. The method of claim 30, wherein the threshold value is set dynamically.
  - 32. The method of claim 30, wherein the adjustment factor is calculated for each communication received.
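
Claim 29 leaves the formula for the "adjustment factor" open; it only requires that the factor depend on this call's scores and on the scores of other unknown voice prints against the same stored prints. One plausible reading, sketched below purely as an assumption, is a cohort-style normalization: for each stored print, express a call's score as a z-score relative to all unknown calls, and take the call's largest z-score as its adjustment factor. The function name and the sample scores are hypothetical.

```python
from statistics import mean, pstdev


def adjustment_factors(scores):
    """scores maps call_id -> list of raw scores, one per stored fraud voice print.

    Returns call_id -> adjustment factor, here assumed to be the call's highest
    z-score against any stored print, measured relative to the other unknown calls.
    """
    call_ids = list(scores)
    n_prints = len(scores[call_ids[0]])
    factors = {call: float("-inf") for call in call_ids}
    for j in range(n_prints):
        column = [scores[call][j] for call in call_ids]
        mu, sigma = mean(column), pstdev(column)
        for call in call_ids:
            # A stored print that scores every call identically carries no signal.
            z = (scores[call][j] - mu) / sigma if sigma > 0 else 0.0
            factors[call] = max(factors[call], z)
    return factors


raw_scores = {
    "call-1": [0.20, 0.30],
    "call-2": [0.25, 0.90],
    "call-3": [0.15, 0.30],
}
factors = adjustment_factors(raw_scores)
most_suspicious = max(factors, key=factors.get)
```

Comparing factors across calls, rather than raw scores, compensates for stored prints that tend to score high or low against everyone. A threshold on the factor (claims 30 and 31) could then be fixed or set dynamically from the distribution of factors.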

33. An audible fraud detection system, which comprises:
- a node comprising a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored therein that are accessible to, and executable by, the processor, where the plurality of instructions comprises:
  
  instructions, that when executed, receive a telephonic communication from an unknown caller via a network and create an unknown voice print;
  
  instructions, that when executed, compare the unknown voice print to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that characterize the unknown voice print of the unknown caller relative to other audio elements of the UBM;
  
  instructions, that when executed, select audio elements of the unknown voice print that correspond to the selected audio elements of the UBM;
  
  instructions, that when executed, score the unknown voice print against stored voice prints in a database by comparing the selected audio elements of the unknown voice print to matching audio elements of the stored voice prints;
  
  instructions, that when executed, compute an adjustment factor for each telecommunication received that is based on the score of each unknown voice print compared to the stored voice prints; and
  
  instructions, that when executed, compare the adjustment factors for each unknown voice print to determine which voice print is from a fraudulent speaker.
- Dependent claims (34, 35):
  - 34. The system of claim 33, further comprising instructions, that when executed, isolate the scores that meet or exceed a pre-set, universal threshold value that is indicative of the probability the unknown voice print was created by a fraudster.
  - 35. The system of claim 33, wherein the adjustment factor is calculated for each communication received.

36. A non-transitory computer readable medium comprising a plurality of instructions stored therein, the plurality of instructions comprising:
- instructions, that when executed, receive a telephonic communication from an unknown caller;
  
  instructions, that when executed, separate a first portion of the telephonic communication into silent and non-silent segments;
  
  instructions, that when executed, evaluate the non-silent segments to determine which portions thereof are speech or non-speech;
  
  instructions, that when executed, generate a plurality of parameters that determine what is speech and non-speech in the non-silent segments;
  
  instructions, that when executed, use the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication;
  
  instructions, that when executed, compare the speech of the unknown caller to a Universal Background Model (UBM);
  
  instructions, that when executed, select a number of audio elements of the UBM that characterize an unknown voice print created from the communication from the unknown caller relative to other audio elements of the UBM;
  
  instructions, that when executed, select audio elements of the unknown voice print that correspond to the selected audio elements of the UBM;
  
  instructions, that when executed, compare the selected audio elements of the unknown voice print to matching audio elements of voice prints stored in a database to create a score for each unknown voice print;
  
  instructions, that when executed, compute an adjustment factor based on the score of each voice print against stored voice prints; and
  
  instructions, that when executed, compare the adjustment factors for each unknown voiceprint to determine which voice print is a fraudster.
- Dependent claims (37, 38):
  - 37. The non-transitory computer readable medium of claim 36, further comprising instructions, that when executed, isolate the scores that meet or exceed a pre-set, universal threshold value that is indicative of the probability the voice print was created by a fraudster.
  - 38. The non-transitory computer readable medium of claim 36, wherein the adjustment factor is calculated for each communication received.

Current Assignee
Mattersight Corporation (Nice Ltd)
Original Assignee
Mattersight Corporation (Nice Ltd)
Inventors
Warford, Roger; Brown, Douglas; Danson, Christopher; Gustafson, David
Primary Examiner(s)
Sharma, Neeraj

Application Number

US13/673,187
Publication Number

US 20140136194A1
Time in Patent Office

1,852 Days
Field of Search

704/233, 704/246, 704/210, 704/244, 704/236, 704/219, 704/243, 704/249-250, 704/273, 700/94, 381/312, 381/59, 379/114.14
CPC Class Codes

G10L 17/00   Speaker identification or v...

G10L 17/02   Preprocessing operations, e...

G10L 17/04   Training, enrolment or mode...

G10L 17/06   Decision making techniques;...

G10L 2025/783   based on threshold decision

G10L 25/27   characterised by the analys...

G10L 25/51   for comparison or discrimin...

G10L 25/78   Detection of presence or ab...
