System and method for improving the accuracy of audio searching
First Claim
1. A method for improving the searching of an audio stream with improved accuracy, the method comprising:
gathering the audio stream carrying voice of an unknown speaker, by a call recording system;
determining a plurality of acoustic models;
indexing said audio stream using said plurality of acoustic models to generate a plurality of phonetic search tracks, at least one of the plurality of phonetic search tracks comprising a first sequence of phonemes;
collecting at least one keyword;
processing said plurality of phonetic search tracks and said at least one keyword to obtain a plurality of search results by matching a pattern of phonemes in the at least one keyword with a pattern of phonemes in each of said plurality of phonetic search tracks, such that each of said plurality of search results corresponds to one of said plurality of acoustic models, and each of said plurality of search results indicates whether the at least one keyword was found in one of said plurality of search tracks, wherein each of said plurality of search results includes at least one hit indicating detection of the at least one keyword within one of said plurality of phonetic search tracks, the at least one hit having a time offset; and
combining said plurality of search results into a unified search result,
said combining comprising:
grouping at least two hits having time offsets which differ in at most a predetermined threshold into a cluster; and
determining a single hit from the cluster as the unified search result, the single hit indicating that the at least one keyword appears in the audio stream; and
wherein each of said plurality of acoustic models represents a language or dialect.
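The combining step of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Hit` record, its `score` field, the chained grouping rule (each hit is compared with the previous hit in time order), and the choice of the highest-scoring hit as the representative are all assumptions made for illustration. The claim itself only requires that hits whose time offsets differ by at most a predetermined threshold are clustered and that a single hit is determined from the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    model: str     # acoustic model (language or dialect) that produced the hit
    offset: float  # time offset of the hit in the audio stream, in seconds
    score: float   # hypothetical confidence score; not defined by the claim


def combine_hits(hits, threshold, min_hits=2):
    """Cluster hits by time offset and keep one hit per cluster.

    Assumption: a hit joins the current cluster when its offset is within
    `threshold` seconds of the previous hit in time order.
    """
    clusters = []
    for hit in sorted(hits, key=lambda h: h.offset):
        if clusters and hit.offset - clusters[-1][-1].offset <= threshold:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    # Assumption: the highest-scoring hit represents its cluster, and
    # clusters with fewer than `min_hits` hits are discarded.
    return [max(c, key=lambda h: h.score) for c in clusters if len(c) >= min_hits]


# Illustrative hits from three hypothetical acoustic models.
hits = [
    Hit("en-US", 12.30, 0.71),
    Hit("en-GB", 12.42, 0.88),
    Hit("es-ES", 47.00, 0.40),
]
unified = combine_hits(hits, threshold=0.5)
```

Here the two hits near 12 seconds fall into one cluster and are reduced to a single hit, while the isolated hit at 47 seconds is dropped because no second model confirms it. Setting `min_hits=1` would keep unconfirmed hits as well.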
2 Assignments
0 Petitions
Abstract
A system and method improve the accuracy of audio searching by processing an audio file or stream with multiple models to obtain search tracks. The search tracks are processed to locate at least one search term and generate multiple search results, one per model used to process the audio stream. The search results are combined into a unified search result. The models may represent different languages, dialects, and accents.
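The relationship between models, search tracks, and search results can be sketched as follows. This is a simplified illustration, not the system described in the patent: the track representation (a list of phoneme and time-offset pairs), the exact contiguous match, the phoneme symbols, and the function names `search_track` and `search_all` are assumptions. A production phonetic search would normally use approximate, scored matching.

```python
def search_track(track, keyword_phonemes):
    """Return the time offsets where the keyword's phoneme pattern occurs
    as a contiguous run in one phonetic search track."""
    n = len(keyword_phonemes)
    if n == 0:
        return []
    phonemes = [p for p, _ in track]
    return [track[i][1] for i in range(len(track) - n + 1)
            if phonemes[i:i + n] == keyword_phonemes]


def search_all(tracks, keyword_phonemes):
    """Produce exactly one search result (a list of hit offsets) per model."""
    return {model: search_track(track, keyword_phonemes)
            for model, track in tracks.items()}


# Hypothetical tracks: the same audio indexed by two acoustic models.
tracks = {
    "en-US": [("DH", 0.0), ("AH", 0.1), ("HH", 0.4), ("EH", 0.5), ("L", 0.6), ("OW", 0.7)],
    "en-GB": [("DH", 0.0), ("AH", 0.1), ("HH", 0.4), ("AH", 0.5), ("L", 0.6), ("OW", 0.7)],
}
results = search_all(tracks, ["HH", "EH", "L", "OW"])
```

The result dictionary has one entry per model, including models whose track yields no hit, which mirrors the statement that the number of search results equals the number of models.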
16 Claims
1. Independent method claim; full text appears under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for searching an audio stream with improved accuracy, the method comprising:
gathering the audio stream carrying voice of an unknown speaker, by a call recording system;
determining a plurality of acoustic models;
reducing said plurality of acoustic models using a language determining module;
indexing said audio stream using said plurality of acoustic models to generate a plurality of phonetic search tracks, at least one of the plurality of phonetic search tracks comprising a first sequence of phonemes;
collecting at least one keyword;
processing said plurality of phonetic search tracks and said at least one keyword to obtain a plurality of search results by matching a pattern of phonemes in the at least one keyword with a pattern of phonemes in each of said plurality of phonetic search tracks, such that each of said plurality of search results corresponds to one of said plurality of acoustic models, and wherein each of said plurality of search results indicates whether the at least one keyword was found in one of said plurality of phonetic search tracks, wherein each of said plurality of search results includes at least one hit indicating detection of the at least one keyword within one of said plurality of phonetic search tracks, the at least one hit having a time offset; and
combining said plurality of search results into a unified search result,
said combining comprising:
grouping hits having time offsets which differ in at most a predetermined threshold into a cluster; and
determining a single hit from the cluster, the single hit indicating that the at least one keyword appears in the audio stream; and
wherein each of said plurality of acoustic models represents a language or dialect.
Dependent claims: 11, 12, 13, 14, 15, 16
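Claim 10 adds a step that reduces the set of acoustic models before indexing. The claim does not say how the language determining module works, so the sketch below simply assumes it returns a plausibility score per model. The function name `reduce_models`, the `min_score` cutoff, and the fallback to the best-scoring model are hypothetical.

```python
def reduce_models(models, language_scores, min_score=0.2):
    """Keep only the models whose language the (hypothetical) language
    determining module considers plausible for this audio stream."""
    kept = [m for m in models if language_scores.get(m, 0.0) >= min_score]
    # Assumption: never return an empty set when models were supplied;
    # fall back to the single best-scoring model instead.
    if not kept and models:
        kept = [max(models, key=lambda m: language_scores.get(m, 0.0))]
    return kept


# Illustrative scores for an audio stream from an unknown speaker.
models = ["en-US", "en-GB", "es-ES", "fr-FR"]
scores = {"en-US": 0.55, "en-GB": 0.35, "es-ES": 0.07, "fr-FR": 0.03}
reduced = reduce_models(models, scores)
```

Only the reduced set would then be used for indexing, which lowers the number of phonetic search tracks that must be generated and searched.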
Specification