Method, system, and computer readable medium for comparing phonetic similarity of return words to resolve ambiguities during voice recognition

US 8,494,855 B1
Filed: 10/06/2004
Issued: 07/23/2013
Est. Priority Date: 10/06/2004
Status: Active Grant

Abstract

In one embodiment, the invention provides a method for a speech recognition system to select a return value corresponding to a spoken input. The method comprises generating a dictionary comprising return values associated with data provisioned in the speech recognition system; generating a grammar for each return value in the dictionary; analyzing the grammar to determine a subset of return values from the dictionary that are likely alternatives for each return value in the dictionary, based on the grammar; selecting a return value corresponding to the spoken input based on the grammar; and if the selected return value is not confirmed by a user, then presenting the likely alternative for the selected return value to the user.
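
The runtime half of this flow (select a value, ask for confirmation, fall back to the likely alternatives) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function `resolve`, the `confirm` and `choose` callbacks, and the city names are all hypothetical, and the table of alternatives is assumed to have been computed beforehand.

```python
from typing import Callable, Dict, List, Optional


def resolve(
    first: str,
    alternatives: Dict[str, List[str]],
    confirm: Callable[[str], bool],
    choose: Callable[[List[str]], Optional[str]],
) -> Optional[str]:
    """Return the confirmed value, or the user's pick among its alternatives."""
    # Ask the user to confirm the recognizer's first choice.
    if confirm(first):
        return first
    # Not confirmed: look up the precomputed likely alternatives.
    options = alternatives.get(first, [])
    if not options:
        return None
    # Offer every alternative in a single prompt and let the user pick one.
    return choose(options)


# Hypothetical precomputed table: return value -> likely alternatives.
ALTERNATIVES = {"BOSTON": ["AUSTIN", "HOUSTON"]}

# Simulated dialog: the user rejects BOSTON and picks the first option offered.
picked = resolve("BOSTON", ALTERNATIVES, lambda value: False, lambda opts: opts[0])
```

In a real voice application the two callbacks would be prompts played to the caller; they are plain functions here so the control flow can be followed and tested in isolation.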

14 Claims

1. A method for a speech recognition system to select a return value corresponding to a spoken input, the method comprising:
- (a) generating a dictionary comprising return values preexisting in the speech recognition system;
  
  (b) generating a grammar for each return value in the dictionary;
  
  (c) for each return value in the dictionary, analyzing the grammar to determine a subset of return values from the dictionary that are likely alternatives for the return value, comprising for each string in the grammar for the return value, comparing the string with every other string in the dictionary that is not in the grammar for that return value; and
  
  if said comparison indicates that the strings are related based on one of a phonetic similarity threshold and a synonym relationship then adding the return value associated with the other string to the subset;
  
  (d) selecting a first return value corresponding to the spoken input based on the grammar;
  
  (e) if the first return value is not confirmed by a user, then presenting the return values in the subset for the first return value at once to the user for selection, wherein the user is notified of strings that have a high likelihood of being confused so that the user can make changes to the grammar.

2. The method of claim 1, wherein comparing the strings uses a dynamic programming algorithm.

3. The method of claim 2, wherein the dynamic programming algorithm is based on the Smith-Waterman/Needleman-Wunsch algorithm.

4. The method of claim 1, wherein steps (a) to (c) are performed at compile time as opposed to at runtime.
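
Step (c) of claim 1, combined with the dynamic programming comparison of claims 2 and 3, might look like the sketch below. It is an assumption-laden illustration rather than the claimed implementation: it uses a plain Needleman-Wunsch global alignment over the characters of each string, and the scoring values (+1 match, -1 mismatch, -1 gap), the normalization by the longer length, the 0.4 threshold, and the names `nw_score`, `similarity`, `likely_alternatives`, and `TOY_GRAMMAR` are all hypothetical. The synonym branch of step (c) is omitted here.

```python
from typing import Dict, List, Set


def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score, keeping only two table rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Alignment score divided by the longer length; 1.0 means identical."""
    longest = max(len(a), len(b))
    return nw_score(a, b) / longest if longest else 1.0


def likely_alternatives(
    grammar: Dict[str, List[str]], threshold: float = 0.4
) -> Dict[str, Set[str]]:
    """For each return value, collect the other return values it may be confused with."""
    subsets: Dict[str, Set[str]] = {rv: set() for rv in grammar}
    for rv, strings in grammar.items():
        for other_rv, other_strings in grammar.items():
            if other_rv == rv:
                continue  # strings in the value's own grammar are not compared
            for s in strings:
                for o in other_strings:
                    if similarity(s, o) >= threshold:
                        subsets[rv].add(other_rv)
    return subsets


# Hypothetical grammar: return value -> strings a caller might say.
TOY_GRAMMAR = {"BOSTON": ["boston"], "HOUSTON": ["houston"], "AUSTIN": ["austin"]}
SUBSETS = likely_alternatives(TOY_GRAMMAR)
```

With this toy grammar and the assumed threshold, BOSTON and HOUSTON land in each other's subsets while AUSTIN stays isolated. Comparing spellings is only a stand-in; a phonetic representation of each string would be the more faithful input to the same alignment.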

5. A speech recognition system, comprising:
- a memory configured to store logic instructions and a processor configured to execute the logic instructions that when executed cause the;
  
  logic instructions to generate a dictionary comprising return values preexisting in the speech recognition system;
  
  logic instructions to generate a grammar for each return value in the dictionary;
  
  logic instructions to select a return value corresponding to a spoken input;
  
  logic instructions to confirm the selected return value with a user;
  
  logic instructions to generate a subset of alternative return values for the spoken input, wherein each alternative return value is related to the selected return value based on one of a synonym relationship and a phonetic similarity threshold between grammars for the return value and the alternative return value; and
  
  logic instructions to present the alternative return values to the user for selection, wherein the user is notified of at least a string that has a high likelihood of being confused so that the user can make changes to the grammar.

6. The speech recognition system of claim 5, wherein the phonetic similarity threshold is calculated by comparing a phonetic representation of strings in the grammars for the return value and the alternative return value.

7. The speech recognition system of claim 6, wherein the phonetic similarity threshold is calculated using a dynamic programming algorithm.

8. The speech recognition system of claim 7, wherein the dynamic programming algorithm is based on the Smith-Waterman/Needleman-Wunsch algorithm.

9. The speech recognition system of claim 5, wherein the logic to generate a subset of alternative return values generates the return values at compile time as opposed to at runtime.
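
The relationship test in claims 5 and 6 (a synonym relationship, or a phonetic similarity above a threshold) could be sketched as below. Everything concrete here is an assumption made for illustration: the tiny `LEXICON` of ARPAbet-style pronunciations, the `SYNONYMS` table, the 0.8 threshold, and the use of edit distance over phoneme sequences as the dynamic programming comparison. The patent text does not prescribe any of these.

```python
from typing import Dict, List, Sequence

# Hypothetical ARPAbet-style pronunciations; a real system would consult a lexicon.
LEXICON: Dict[str, List[str]] = {
    "boston": ["B", "AO", "S", "T", "AH", "N"],
    "austin": ["AO", "S", "T", "AH", "N"],
    "houston": ["HH", "Y", "UW", "S", "T", "AH", "N"],
    "cell": ["S", "EH", "L"],
    "mobile": ["M", "OW", "B", "AH", "L"],
}

# Hypothetical synonym pairs, stored unordered.
SYNONYMS = {frozenset(("cell", "mobile"))}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def phonetic_similarity(s: str, t: str) -> float:
    """1.0 for identical pronunciations, falling toward 0.0 as they diverge."""
    a, b = LEXICON[s], LEXICON[t]
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def related(s: str, t: str, threshold: float = 0.8) -> bool:
    """True if the strings are synonyms or sound alike at the given threshold."""
    if frozenset((s, t)) in SYNONYMS:
        return True
    return phonetic_similarity(s, t) >= threshold
```

Under these assumed pronunciations, "boston" and "austin" differ by a single phoneme and count as related, "boston" and "houston" do not reach the threshold, and "cell" and "mobile" are related only through the synonym table.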

10. A non-transitory computer-readable medium, having stored thereon a sequence of instructions, which when executed by a computer processor, cause the computer processor to perform a speech recognition algorithm to select a preexisting return value corresponding to a spoken input, the computer processor being further configured to perform:
- generating a dictionary comprising return values preexisting in the speech recognition system;
  
  generating a grammar for each return value in the dictionary;
  
  for each return value in the dictionary, analyzing the grammar to determine a subset of return values from the dictionary that are likely alternatives for the return value, comprising for each string in the grammar for the return value, comparing the string with every other string in the dictionary that is not in the grammar for that return value; and
  
  if said comparison indicates that the strings are related based on one of a phonetic similarity threshold and a synonym relationship then adding the return value associated with the other string to the subset;
  
  selecting a first return value corresponding to the spoken input based on the grammar; and
  
  if the first return value is not confirmed by a user, then presenting the return values in the subset for the first return value at once to the user for selection, wherein the user is notified of strings that have a high likelihood of being confused so that the user can make changes to the grammar.

11. A speech recognition system, comprising:
- a processor; and
  
  a memory coupled to the processor, the memory storing instructions which when executed by the processor, cause the system to perform a method for selecting a preexisting return value corresponding to a spoken input, the method comprising;
  
  generating a dictionary comprising return values preexisting in the speech recognition system;
  
  generating a grammar for each return value in the dictionary;
  
  for each return value in the dictionary, analyzing the grammar to determine a subset of return values from the dictionary that are likely alternatives for the return value, comprising for each string in the grammar for the return value, comparing the string with every other string in the dictionary that is not in the grammar for that return value; and
  
  if said comparison indicates that the strings are related based on one of a phonetic similarity threshold and a synonym relationship then adding the return value associated with the other string to the subset;
  
  selecting a first return value corresponding to the spoken input based on the grammar;
  
  if the first return value is not confirmed by a user, then presenting the return values in the subset for the first return value at once to the user for selection, wherein the user is notified of strings that have a high likelihood of being confused so that the user can make changes to the grammar.

12. A non-transitory computer-readable storage medium, having stored thereon, a sequence of instructions which when executed by a computer processor, cause the computer processor to perform:
- generating a dictionary comprising return values preexisting in the speech recognition system;
  
  generating a grammar for each return value in the dictionary;
  
  selecting a return value corresponding to a spoken input;
  
  confirming the selected return value with a user;
  
  generating a subset of alternative return values for the spoken input, wherein each alternative return value is related to the selected return value based on one of a synonym relationship and a phonetic similarity threshold return value between the grammars for the return value and the alternative return value; and
  
  presenting the alternative return values to the user for selection, wherein the user is notified of at least a string that has a high likelihood of being confused so that the user can make changes to the grammar.

13. A method for a speech recognition system, comprising:
- generating a dictionary comprising return values preexisting in the speech recognition system;
  
  generating a grammar for each return value in the dictionary;
  
  selecting a return value corresponding to a spoken input;
  
  confirming the selected return value with a user;
  
  generating a subset of alternative return values for the spoken input, wherein each alternative return value is related to the selected return value based on one of a synonym relationship and a phonetic similarity threshold between the grammars for the return value and the alternative return value; and
  
  presenting the alternative return values to the user for selection, wherein the user is notified of at least a string that has a high likelihood of being confused so that the user can make changes to the grammar.

14. The method of claim 13, wherein the step of generating the subset of alternative return values is performed at compile time as opposed to at runtime.
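
One way to picture the compile-time variant of claim 14, together with the notification about confusable strings in claim 13, is the sketch below. It is a hypothetical arrangement, not the patented one: `compile_grammar`, `load_alternatives`, the JSON serialization, the warning format, and the fixed `CONFUSABLE` pair set standing in for a real similarity test are all assumptions, and the predicate is assumed to be symmetric.

```python
import json
from typing import Callable, Dict, List, Tuple

Grammar = Dict[str, List[str]]


def compile_grammar(
    grammar: Grammar, confusable: Callable[[str, str], bool]
) -> Tuple[str, List[str]]:
    """Build step: return the alternatives table as JSON plus author warnings."""
    table: Dict[str, List[str]] = {rv: [] for rv in grammar}
    warnings: List[str] = []
    for rv, strings in grammar.items():
        for other, other_strings in grammar.items():
            if other == rv:
                continue
            pairs = [(s, o) for s in strings for o in other_strings if confusable(s, o)]
            if pairs:
                table[rv].append(other)
                if rv < other:  # report each unordered pair of values once
                    warnings += [
                        f"'{s}' ({rv}) may be confused with '{o}' ({other})"
                        for s, o in pairs
                    ]
    return json.dumps(table, sort_keys=True), warnings


def load_alternatives(blob: str) -> Dict[str, List[str]]:
    """Runtime step: a plain table load, with no string comparison involved."""
    return json.loads(blob)


# Hypothetical stand-in for a phonetic or synonym test.
CONFUSABLE = {frozenset(("austin", "boston"))}
TOY_GRAMMAR = {"AUSTIN": ["austin"], "BOSTON": ["boston"], "DALLAS": ["dallas"]}

BLOB, WARNINGS = compile_grammar(
    TOY_GRAMMAR, lambda a, b: frozenset((a, b)) in CONFUSABLE
)
TABLE = load_alternatives(BLOB)
```

Doing the pairwise comparison once at build time keeps the quadratic cost out of the live call, and the warnings give the grammar author a chance to reword confusable strings before deployment.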

Current Assignee: Intrado Interactive Services Corp.
Original Assignee: West Interactive Corporation (West Corporation)
Inventors: Khosla, Ashok
Primary Examiner(s): PULLIAS, JESSE SCOTT
Application Number: US10/960,198
Time in Patent Office: 3,212 Days
Field of Search: 704/270, 704/275, 704/E15, 704/231-257, 704/270.1-274
US Class Current: 704/251
CPC Class Codes:
G10L 15/06   Creation of reference templ...
G10L 15/193   Formal grammars, e.g. finit...
G10L 2015/221   Announcement of recognition...
