System and method for improving the accuracy of audio searching

US 7,725,318 B2
Filed: 08/01/2005
Issued: 05/25/2010
Est. Priority Date: 07/30/2004
Status: Active Grant

Abstract

A system and method for improving the accuracy of audio searching using multiple models to process an audio file or stream to obtain search tracks. The search tracks are processed to locate at least one search term and generate multiple search results. The number of search results is equivalent to the number of models used to process the audio stream. The search results are combined to generate a unified search result. The multiple models may represent different languages, dialects and accents.
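The relationship between models, search tracks and search results can be illustrated with a minimal sketch. It assumes each phonetic search track is a list of `(phoneme, start_time)` pairs and that a keyword has already been converted to a phoneme list; the `Hit` type, the function names, the model labels and the exact-match rule are all hypothetical simplifications, not the patented implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    model: str     # acoustic model whose track produced the hit
    offset: float  # time offset (seconds) at which the keyword starts


def search_track(model, track, keyword):
    """Return every exact occurrence of the keyword phonemes in one track."""
    if not keyword:
        return []
    phones = [phoneme for phoneme, _ in track]
    n = len(keyword)
    return [
        Hit(model, track[i][1])
        for i in range(len(phones) - n + 1)
        if phones[i:i + n] == keyword
    ]


def search_all(tracks, keyword):
    """Produce one search result (possibly an empty hit list) per model."""
    return {model: search_track(model, track, keyword)
            for model, track in tracks.items()}


# Hypothetical tracks: the same audio indexed with two acoustic models.
tracks = {
    "en-US": [("HH", 0.00), ("EH", 0.08), ("L", 0.15), ("OW", 0.22),
              ("K", 0.40), ("AE", 0.47), ("T", 0.55)],
    "en-GB": [("HH", 0.00), ("AH", 0.09), ("L", 0.15), ("OW", 0.23),
              ("K", 0.41), ("AE", 0.48), ("T", 0.56)],
}
results = search_all(tracks, ["K", "AE", "T"])
```

The number of entries in `results` always equals the number of models, even when a model's track yields no hit. A practical phonetic search would likely use approximate matching with a score rather than exact equality.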

16 Claims

1. A method for improving the searching of an audio stream with improved accuracy, the method comprising:
- gathering the audio stream carrying voice of an unknown speaker, by a call recording system;
  
  determining a plurality of acoustic models;
  
  indexing said audio stream using said plurality of acoustic models to generate a plurality of phonetic search tracks, at least one of the plurality of phonetic search tracks comprising a first sequence of phonemes;
  
  collecting at least one keyword;
  
  processing said plurality of phonetic search tracks and said at least one keyword to obtain a plurality of search results by matching a pattern of phonemes in the at least one keyword with a pattern of phonemes in each of said plurality of phonetic search tracks, such that each of said plurality of search results corresponds to one of said plurality of acoustic models, and each of said plurality of search results indicates whether the at least one keyword was found in one of said plurality of search tracks, wherein each of said plurality of search results includes at least one hit indicating detection of the at least one keyword within one of said plurality of phonetic search tracks, the at least one hit having a time offset; and
  
  combining said plurality of search results into a unified search result, said combining comprising:
  
  grouping at least two hits having time offsets which differ in at most a predetermined threshold into a cluster; and
  
  determining a single hit from the cluster as the unified search result, the single hit indicating that the at least one keyword appears in the audio stream; and
  
  wherein each of said plurality of acoustic models represents a language or dialect.
- 2. The method according to claim 1, wherein each of said plurality of acoustic models is a phonetic acoustic model.
- 3. The method according to claim 1, wherein the at least one hit comprises a confidence score.
- 4. The method according to claim 3, wherein the step of combining said search results further comprises determining a resultant confidence score for the cluster.
- 5. The method according to claim 4, wherein the step of determining a resultant confidence score includes computing a simple average.
- 6. The method according to claim 4, wherein the step of determining a resultant confidence score includes computing a weighted average.
- 7. The method according to claim 4, wherein the step of determining a resultant confidence score includes computing a maximal confidence.
- 8. The method according to claim 4, wherein the step of determining a resultant confidence score includes computing a confidence value with a non-linear rule.
- 9. The method according to claim 1, wherein each of the plurality of acoustic models represents phonemes.
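The combining step of claim 1 and the confidence rules of claims 4 to 8 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: a hit is taken to join a cluster when its offset is within the threshold of the cluster's earliest hit, a lone hit is kept as its own cluster, the single hit chosen from a cluster is the one with the highest confidence, and the "noisy_or" rule is only one possible non-linear rule. All names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    model: str         # acoustic model that produced the hit
    offset: float      # time offset in seconds
    confidence: float  # score in [0, 1]


def cluster_hits(hits, threshold):
    """Group hits whose offsets lie within `threshold` of the cluster's first hit."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h.offset):
        if clusters and hit.offset - clusters[-1][0].offset <= threshold:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def combine_confidence(cluster, rule="max", weights=None):
    """Resultant confidence for a cluster under one of four rules."""
    scores = [h.confidence for h in cluster]
    if rule == "average":
        return sum(scores) / len(scores)
    if rule == "weighted":  # `weights` maps model name to a weight
        w = [weights[h.model] for h in cluster]
        return sum(wi * s for wi, s in zip(w, scores)) / sum(w)
    if rule == "max":
        return max(scores)
    if rule == "noisy_or":  # assumed example of a non-linear rule
        miss = 1.0
        for s in scores:
            miss *= 1.0 - s
        return 1.0 - miss
    raise ValueError(f"unknown rule: {rule}")


def unify(results, threshold, rule="max", weights=None):
    """Merge per-model results into one hit per cluster."""
    hits = [h for model_hits in results.values() for h in model_hits]
    unified = []
    for cluster in cluster_hits(hits, threshold):
        best = max(cluster, key=lambda h: h.confidence)
        score = combine_confidence(cluster, rule, weights)
        unified.append(Hit("unified", best.offset, score))
    return unified


# Hypothetical per-model results for one keyword.
results = {
    "en-US": [Hit("en-US", 12.40, 0.80)],
    "en-GB": [Hit("en-GB", 12.46, 0.60), Hit("en-GB", 30.10, 0.30)],
}
unified = unify(results, threshold=0.25, rule="average")
```

With these inputs the two hits near 12.4 s fall into one cluster and the hit at 30.10 s forms a second one, so `unified` holds two hits. Anchoring the threshold test on the first hit of a cluster guarantees that no two hits in the same cluster are further apart than the threshold.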

10. A method for searching an audio stream with improved accuracy, the method comprising:
- gathering the audio stream carrying voice of an unknown speaker, by a call recording system;
  
  determining a plurality of acoustic models;
  
  reducing said plurality of acoustic models using a language determining module;
  
  indexing said audio stream using said plurality of acoustic models to generate a plurality of phonetic search tracks, at least one of the plurality of phonetic search tracks comprising a first sequence of phonemes;
  
  collecting at least one keyword;
  
  processing said plurality of phonetic search tracks and said at least one keyword to obtain a plurality of search results by matching a pattern of phonemes in the at least one keyword with a pattern of phonemes in each of said plurality of phonetic search tracks, such that each of said plurality of search results corresponds to one of said plurality of acoustic models, and wherein each of said plurality of search results indicates whether the at least one keyword was found in one of said plurality of phonetic search tracks, wherein each of said plurality of search results includes at least one hit indicating detection of the at least one keyword within one of said plurality of phonetic search tracks, the at least one hit having a time offset; and
  
  combining said plurality of search results into a unified search result, said combining comprising:
  
  grouping hits having time offsets which differ in at most a predetermined threshold into a cluster; and
  
  determining a single hit from the cluster, the single hit indicating that the at least one keyword appears in the audio stream; and
  
  wherein each of said plurality of acoustic models represents a language or dialect.
- 11. The method according to claim 10 further comprising:
  - training for estimating the plurality of acoustic models, wherein at least one of the plurality of models is in a target language; and
    
    testing for determining a probabilistic score that a speech utterance signal is in the target language.
- 12. The method according to claim 11, wherein said training further comprises:
  - inputting a plurality of speech utterance signals corresponding to a plurality of target languages;
    
    processing said plurality of speech utterance signals to extract feature vectors from said plurality of signals; and
    
    estimating the plurality of acoustic models from said feature vectors.
- 13. The method according to claim 11, wherein testing further comprises the steps of:
  - inputting the speech utterance signal;
    
    processing the speech utterance signal to extract feature vectors from said signal; and
    
    applying a pattern matching technique to the speech utterance signal to calculate a probabilistic score.
- 14. The method according to claim 13, wherein said pattern matching technique is performed by an algorithm.
- 15. The method according to claim 13, wherein said probabilistic score represents the likelihood that said speech utterance was spoken in the target language.
- 16. The method according to claim 10 wherein each of the plurality of acoustic models represents phonemes.
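The training and testing steps of claims 11 to 13, and the model-reducing step of claim 10, can be sketched as below. The claims do not name a model family or a pattern matching technique, so this sketch assumes a single diagonal Gaussian per language and a softmax over average log-likelihoods with a uniform prior. Feature extraction is not shown: the two-dimensional vectors are made-up stand-ins for real acoustic features, and every function name is hypothetical.

```python
import math


def train(features_by_language):
    """Estimate per-dimension mean and variance for each target language."""
    models = {}
    for lang, vectors in features_by_language.items():
        dims = list(zip(*vectors))
        means = [sum(d) / len(d) for d in dims]
        variances = [
            max(sum((x - m) ** 2 for x in d) / len(d), 1e-6)  # variance floor
            for d, m in zip(dims, means)
        ]
        models[lang] = (means, variances)
    return models


def log_likelihood(model, vectors):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    means, variances = model
    total = 0.0
    for vector in vectors:
        for x, m, var in zip(vector, means, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
    return total / len(vectors)


def language_scores(models, vectors):
    """Probabilistic score per language (softmax, uniform prior assumed)."""
    ll = {lang: log_likelihood(model, vectors) for lang, model in models.items()}
    top = max(ll.values())
    unnormalized = {lang: math.exp(v - top) for lang, v in ll.items()}
    z = sum(unnormalized.values())
    return {lang: u / z for lang, u in unnormalized.items()}


def reduce_models(scores, keep=2):
    """Keep only the `keep` most probable languages for indexing."""
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Made-up feature vectors for three hypothetical target languages.
training = {
    "en": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "es": [[3.0, 0.5], [3.2, 0.7], [2.8, 0.3]],
    "fr": [[-2.0, 4.0], [-1.8, 4.2], [-2.2, 3.8]],
}
models = train(training)
utterance = [[1.1, 1.9], [0.9, 2.1]]
scores = language_scores(models, utterance)
kept = reduce_models(scores, keep=2)
```

Here the utterance lies closest to the "en" training data, so "en" receives nearly all of the probability mass and the model set shrinks from three to two before indexing. A production language identifier would more plausibly use mixture models or neural classifiers over cepstral features.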

Current Assignee: Nice Systems Incorporated (Nice Ltd)
Original Assignee: Nice Systems Incorporated (Nice Ltd)
Inventors: Wasserblat, Moshe; Gavalda, Marsal
Primary Examiner(s): Smits, Talivaldis Ivars
Assistant Examiner(s): Borsetti, Greg
Application Number: US11/195,144
Publication Number: US 20060074898A1
Time in Patent Office: 1,758 Days
Field of Search: 704/246, 704/251, 704/243, 704/245, 704/253, 704/256, 704/256.2
US Class Current: 704/251
CPC Class Codes:
G06F 16/3344   using natural language anal...
G06F 16/685   using automatically derived...
G10L 15/32   Multiple recognisers used i...
Y10S 707/916   Audio
