Method and system for automated speech recognition that rearranges an information data base to disrupt formation of recognition artifacts

US 9,058,809 B2
Filed: 06/08/2013
Issued: 06/16/2015
Est. Priority Date: 01/31/2011
Status: Expired due to Fees


Abstract

A system and a method perform information recognition. The method arranges data base information in a data base information structure. The method matches input information to the data base information using at least one matching algorithm and using a matching information structure. In accordance with the system and the method, the matching information structure differs from the data base information structure.

25 Citations

15 Claims

1. A method for automated speech recognition, wherein audio input information is matched to data base information stored in a data base by at least one matching algorithm, and wherein before the input information is matched to the data base information, the data base information is arranged in the data base in a data base information structure, the method comprising:
- inputting audio input information from an audio input device;
  
  performing a structural analysis of the content of the data base by comparing structural parameters of the data base content to predefined requirements and using the results of the structural analysis to decide whether a rearrangement procedure of the data base information from a data base information structure to a matching information structure is required;
  
  if a rearrangement procedure of the data base information is required;
  
  selecting one of multiple rearrangement procedures based on the result of the structural analysis, the selected rearrangement procedure rearranging the data base information from the data base information structure into a matching information structure which differs from the data base information structure, wherein the selected one of the multiple rearrangement procedures in the step of rearranging performs an algorithm that addresses the relationship between entries of the data base information, which are elements in a word list, by quantifying a degree of similarity between the entries by measuring a relevant phonetic distance between the entries and rearranging the entries in a way that is a function of the degree of similarity;
  
  redistributing entries corresponding to words whose phonetic distance is below a phonetic distance threshold into subdirectories so that the entries whose phonetic distance is below the phonetic distance threshold are separated in different subdirectories in order to disrupt the forming of recognition artifacts due to their similarity; and
  
  applying a speech recognition program that includes the at least one matching algorithm to the rearranged data base information and matching the audio input information to the rearranged data base information to recognize the speech content of the audio input information from the audio input device.
  - 2. The method according to claim 1, wherein the data base information structure is an order of entries of the data base information in the data base, the matching information structure being achieved by rearranging the entries in the data base.
  - 3. The method according to claim 1, wherein the data base information structure is an order of entries of the data base information in the data base, the matching information structure being achieved by accessing the entries, for the purpose of matching, in a matching order.
  - 4. The method according to claim 1, wherein:
    - the data base information is rearranged into a plurality of information subsets;
      
      the input information is matched with each subset; and
      
      each information subset match results in a candidate set of match candidates.
  - 5. The method according to claim 4, wherein the data base information is divided into the subsets, only the sum of all subsets comprising the complete data base information.
  - 6. The method according to claim 4, wherein the data base information is rearranged from the data base information structure into the subsets, each subset containing the complete data base information in the matching information structure.
  - 7. The method according to claim 4, wherein:
    - the input information is matched in a first match with each information subset, each subset match resulting in a candidate set of match candidates; and
      
      the input information is matched in a second match with some or all of the match candidates retrieved in the first match.
  - 8. The method according to claim 7, wherein the input information being matched in one or more further matches with the match candidates of the respective preceding step.
  - 9. The method according to claim 1, wherein after rearranging the data base information from the data base information structure into the matching information structure the rearranged data base information is distributed into a plurality of information subsets, the input information being matched with each subset, each information subset match resulting in a candidate set of match candidates.
  - 10. The method according to claim 1, wherein after the matching which results in a candidate set of match candidates, the method further comprises:
    - determining whether one of the candidates resulting from the matching is acceptable as a final result; and
      
      if no candidate is acceptable as a final result, repeating the step of matching the input information to the rearranged data base information with at least one of the found candidates used as new data base information.
  - 13. The system according to claim 1, wherein:
    - the measuring of the relevant phonetic distance between the entries is performed by a Metaphone algorithm.
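
The redistribution step of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: `phonetic_key` is a crude stand-in for a real phonetic encoding such as the Metaphone algorithm named in claim 13, the distance is a plain edit distance over those keys, and the greedy placement in `separate_similar`, all function names, and the threshold value are assumptions made for illustration.

```python
def phonetic_key(word: str) -> str:
    """Crude phonetic key (an assumption, NOT Metaphone): merge a few
    similar-sounding consonants, then drop vowels, h, w and repeated
    letters after the first character."""
    table = str.maketrans({"c": "k", "q": "k", "z": "s",
                           "d": "t", "b": "p", "v": "f"})
    w = word.lower().translate(table)
    key = w[:1]
    for ch in w[1:]:
        if ch in "aeiouyhw" or ch == key[-1]:
            continue
        key += ch
    return key


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def phonetic_distance(a: str, b: str) -> int:
    """Hypothetical 'relevant phonetic distance' between two entries."""
    return edit_distance(phonetic_key(a), phonetic_key(b))


def separate_similar(entries, threshold):
    """Greedily place each entry in the first subset that holds no entry
    closer than `threshold`; open a new subset when none qualifies."""
    subsets = []
    for entry in entries:
        for subset in subsets:
            if all(phonetic_distance(entry, other) >= threshold
                   for other in subset):
                subset.append(entry)
                break
        else:
            subsets.append([entry])
    return subsets


words = ["Miller", "Muller", "Mailer", "Smith", "Smyth", "Jones"]
for number, subset in enumerate(separate_similar(words, threshold=2)):
    print(number, subset)
```

In this example `Miller`, `Muller` and `Mailer` reduce to the same key, so each ends up in a different subset and no single matching pass has to discriminate between them.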

11. A system for automated speech recognition with a data base containing data base information being stored in the data base in a data base information structure, the system comprising:
- an audio input device that provides audio input information;
  
  at least one matching means containing at least one matching algorithm as computer program to match the audio input information to data base information;
  
  a rearranging means to rearrange the data base information into a matching information structure to be matched with the audio input information by the at least one matching means, the rearranging means performing an algorithm that addresses the relationship between entries of the data base information, which are elements in a word list, by quantifying a degree of similarity between the entries by measuring a relevant phonetic distance between the entries and rearranging the entries in a way that is a function of the degree of similarity;
  
  wherein;
  
  entries corresponding to words sounding too similar, the rearranging means redistribute the entries either within the data base or within or between subdirectories in order to disrupt the forming of recognition artifacts due to their similarity, by grouping words having a measured relevant phonetic distance below a selected threshold level into different subsets;
  
  and a plurality of speech recognition programs each with a matching algorithm applied to the data base information by matching the audio input information to each subset of the rearranged data base information, wherein the audio input information is matched to each subset by different speech recognition programs, and wherein each information subset match results in a candidate set of match candidates.
  - 12. The system according to claim 11, wherein:
    - the rearranging means is configured to restructure the data base information into information subsets and to feed the subsets to the at least one matching means for matching the input information with each information subset, each subset match resulting in a candidate set of match candidates.
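
The per-subset matching of claims 11 and 12, followed by a second match over the retrieved candidates as in claim 7, might be sketched as below. This is an illustration under stated assumptions only: the audio input is modelled as an already transcribed string `heard`, and the two scoring functions `ratio_score` and `bigram_score` are hypothetical stand-ins for "different speech recognition programs". None of these names come from the patent.

```python
from difflib import SequenceMatcher


def ratio_score(heard: str, entry: str) -> float:
    """Stand-in recognizer 1: difflib similarity ratio in [0, 1]."""
    return SequenceMatcher(None, heard.lower(), entry.lower()).ratio()


def bigram_score(heard: str, entry: str) -> float:
    """Stand-in recognizer 2: Jaccard overlap of character bigrams."""
    def bigrams(text):
        text = text.lower()
        return {text[i:i + 2] for i in range(len(text) - 1)}
    a, b = bigrams(heard), bigrams(entry)
    return len(a & b) / len(a | b) if a | b else 0.0


def match_subsets(heard, subsets, recognizers, top_n=2):
    """First match: each subset is scored by its own recognizer
    (assigned round-robin) and yields a candidate set of top_n entries."""
    candidate_sets = []
    for index, subset in enumerate(subsets):
        score = recognizers[index % len(recognizers)]
        ranked = sorted(subset, key=lambda e: score(heard, e), reverse=True)
        candidate_sets.append(ranked[:top_n])
    return candidate_sets


def second_match(heard, candidate_sets, score=ratio_score):
    """Second match: re-score only the candidates from the first match."""
    pool = [c for candidates in candidate_sets for c in candidates]
    return max(pool, key=lambda e: score(heard, e))


subsets = [["Miller", "Smith", "Jones"], ["Muller", "Smyth"], ["Mailer"]]
candidates = match_subsets("muller", subsets,
                           [ratio_score, bigram_score], top_n=1)
print(candidates)
print(second_match("muller", candidates))
```

Because the similar-sounding names sit in different subsets, each first-pass recognizer only has to pick the best entry among dissimilar alternatives. The final decision is made over the small pool of candidates.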

14. A method for automated speech recognition, wherein audio input information is matched to data base information stored in a data base by at least one matching algorithm, the data base information comprising a plurality of entries stored in the data base, wherein before the input information is matched to the data base information, the data base information is arranged in the data base in a data base information structure, the method comprising:
- inputting audio input information from an audio input device;
  
  rearranging the data base information into a plurality of information subsets, rearranging the data base information from the data base information structure into a matching information structure in the subsets which differs from the data base information structure by redistributing data base information corresponding to words whose phonetic distance is below a phonetic distance threshold into subdirectories so that the entries whose phonetic distance is below the phonetic distance threshold are separated in different subdirectories in order to disrupt the forming of recognition artifacts due to their similarity, and
  
  applying a plurality of speech recognition programs each with a matching algorithm to the data base information by matching the audio input information to each subset of the rearranged data base information, wherein the audio input information is matched to each subset by different speech recognition programs, and wherein each information subset match results in a candidate set of match candidates.

15. A method for automated speech recognition, wherein audio input information is matched to data base information stored in a data base by at least one matching algorithm, and wherein before the input information is matched to the data base information, the data base information is arranged in the data base in a data base information structure, the method comprising:
- inputting audio input information from an audio input device;
  
  performing a structural analysis of the content of the data base by comparing structural parameters of the data base content to predefined requirements;
  
  deciding, based on the result of the structural analysis, whether a rearrangement procedure of the data base information from a data base information structure to a matching information structure is required, and if a rearrangement procedure of the data base information is required;
  
  selecting one of multiple rearrangement procedures based on the result of the structural analysis procedure; and
  
  rearranging the data base information from the data base information structure into a matching information structure which differs from the data base information structure by redistributing data base information corresponding to words whose phonetic distance is below a phonetic distance threshold into subdirectories so that the entries whose phonetic distance is below the phonetic distance threshold are separated in different subdirectories in order to disrupt the forming of recognition artifacts due to their similarity; and
  
  applying a plurality of speech recognition programs each with a matching algorithm to the data base information by matching the audio input information to each subset of the rearranged data base information, wherein the audio input information is matched to each subset by different speech recognition programs, and wherein each information subset match results in a candidate set of match candidates.
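
The structural analysis and procedure selection recited in claims 1 and 15 could look roughly like the following sketch. The claims do not name the structural parameters, the predefined requirements, or the available rearrangement procedures, so everything here is hypothetical: the `Requirements` fields, the two statistics computed by `analyze`, the procedure labels `"separate_similar"` and `"split_by_size"`, and the use of a difflib text similarity in place of a phonetic distance.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from typing import Optional


@dataclass
class Requirements:
    """Hypothetical 'predefined requirements' for the data base content."""
    max_entries: int            # largest list handled in one matching pass
    max_close_fraction: float   # tolerated share of too-similar entry pairs
    distance_threshold: float   # pairs below this distance count as similar


def text_distance(a: str, b: str) -> float:
    """Stand-in for a phonetic distance: 1 minus difflib's similarity."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def analyze(entries, req, distance=text_distance):
    """Compute the structural parameters of the entry list."""
    pairs = list(combinations(entries, 2))
    close = sum(1 for a, b in pairs if distance(a, b) < req.distance_threshold)
    return {
        "entries": len(entries),
        "close_fraction": close / len(pairs) if pairs else 0.0,
    }


def select_procedure(stats, req) -> Optional[str]:
    """Return the name of a rearrangement procedure, or None when the
    data base already meets the requirements."""
    if stats["close_fraction"] > req.max_close_fraction:
        return "separate_similar"   # too many confusable pairs
    if stats["entries"] > req.max_entries:
        return "split_by_size"      # list too long for one pass
    return None


req = Requirements(max_entries=100, max_close_fraction=0.1,
                   distance_threshold=0.3)
names = ["Miller", "Muller", "Mailer", "Smith", "Smyth", "Jones"]
stats = analyze(names, req)
print(stats, select_procedure(stats, req))
```

Returning `None` models the branch in which no rearrangement is required. A fuller version would map each label to a concrete rearranging function.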


Current Assignee: Walter Steven Rosenbaum
Original Assignee: Walter Steven Rosenbaum
Inventors: Rosenbaum, Walter Steven; Bach, Joern
Primary Examiner(s): Serrou, Abdelali

Application Number: US13/913,434
Publication Number: US 20140156273A1
Time in Patent Office: 738 Days
Field of Search: 704/251, 704/231, 704/235, 704/243, 704/252, 704/270, 704/257, 704/249, 704/E15.04, 704/E15.009, 704/E15.049, 704/E15.014, 704/275, 704/244, 704/246, 704/247
US Class Current: 1/1

CPC Class Codes
G06F 16/211   Schema design and management
G06F 16/60   of audio data
G06F 16/61   Indexing; Data structures t...
G06F 16/901   Indexing; Data structures t...
G10L 15/06   Creation of reference templ...
G10L 15/26   Speech to text systems G10L...
G10L 15/32   Multiple recognisers used i...
G10L 2015/228   of application context
