Generation and selection of voice recognition grammars for conducting database searches

US 7,729,913 B1
Filed: 03/18/2003
Issued: 06/01/2010
Est. Priority Date: 03/18/2003
Status: Active Grant

Abstract

Various processes are disclosed for conducting database searches by telephone. A user is initially prompted to utter N characters of a search query, and/or to enter these characters on a telephone keypad. Based on the user's entry, a speech recognition grammar is either selected or is generated dynamically for processing the user's utterance of the complete search query. In one embodiment, the speech recognition grammar is selected based on a sequence of N telephone keys selected by the user, without requiring the user to uniquely specify the N characters to which these keys correspond. In another embodiment, the speech recognition grammar is selected based solely on utterances of the N characters by the user such that the correct grammar is selected even if an utterance of one character is misidentified as an utterance of a similar sounding character. Also disclosed are methods for generating speech recognition grammars from query logs.
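
As an illustration of why a key sequence does not uniquely specify the N characters, the following minimal Python sketch expands a sequence of keys into every letter prefix it could stand for. The letter layout is the standard telephone keypad; the function name `prefixes_for_keys` and the sample key sequence are assumptions made for this example and do not come from the patent.

```python
from itertools import product

# Standard letter layout of a telephone keypad (ITU-T E.161).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

def prefixes_for_keys(keys: str) -> list[str]:
    """Return every letter prefix that the given key sequence could stand for.

    Hypothetical helper: assumes every key in `keys` is one of 2-9.
    """
    return ["".join(letters) for letters in product(*(KEYPAD[k] for k in keys))]

candidates = prefixes_for_keys("426")
print(len(candidates))       # 3 * 3 * 3 = 27 possible prefixes
print("HAM" in candidates)   # True
print("ICO" in candidates)   # True
```

With N = 3 a single key sequence therefore covers between 27 and 64 letter prefixes, depending on how many of the keys carry four letters, and one grammar per key sequence has to cover all of them.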

44 Claims

1. A method for conducting database searches by telephone, the method comprising:
- subdividing a master grammar set into multiple speech recognition grammars, each of which corresponds to a different sequence of N telephone keys;
  
  prompting a telephone user to enter N characters of a search query on a telephone keypad with one key selection per character, and detecting a resulting sequence of N telephone keys selected by the user;
  
  prompting the telephone user to say the search query, and receiving a resulting utterance from the user; and
  
  selecting the speech recognition grammar that corresponds to the sequence of N telephone keys selected by the telephone user, and using said speech recognition grammar to interpret the utterance;
  
  wherein the speech recognition grammar is selected without requiring the user to uniquely specify said N characters of the search query;
  
  wherein N is less than the number of characters of the search query as uttered by the user, such that the user does not enter all of the characters of the search query on the telephone keypad;
  
  wherein the method is performed by a computerized system that comprises one or more computing devices.

2. The method of claim 1, wherein the step of subdividing the master grammar set comprises grouping together terms and phrases whose first N characters correspond to the same N telephone keys.

3. The method of claim 1, further comprising generating the master grammar set at least in part by extracting search terms and search phrases from a query log reflective of search queries submitted by a population of users.

4. The method of claim 1, wherein N=3 or 4.

5. The method of claim 1, further comprising executing a database search using a search query string generated by interpreting the utterance, said database search being a search for at least one of the following:
- creative works represented in a database, physical products, web pages indexed by a crawler.

6. The method of claim 1, wherein the step of subdividing the master grammar set into multiple speech recognition grammars is performed off-line, and comprises grouping together master grammar set entries based on the first N characters of said entries.

7. The method of claim 6, wherein the step of prompting the telephone user is performed after said step of subdividing the master grammar set into multiple speech recognition grammars.

8. The method of claim 1, wherein subdividing the master grammar set comprises storing the multiple speech recognition grammars in a data repository in association with the respective sequences of N telephone keys to which they correspond prior to said step of prompting the telephone user to enter the N characters of the search query, and wherein selecting the speech recognition grammar that corresponds to the sequence of N telephone keys comprises looking up the speech recognition grammar in the data repository.

9. The method of claim 1, wherein the method is performed at least partly by a system that hosts a web site, and the method further comprises executing the search query to identify content of said web site to audibly output to the telephone user.

10. A system that operates according to the method of claim 1.
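
The off-line subdivision and later lookup described in claims 2, 6 and 8 can be sketched roughly as follows. This is a simplified illustration, not the patented implementation: the sample master set, the choice to skip non-letter characters, and the names `key_sequence`, `subdivide` and `select_grammar` are all assumptions, and a plain dictionary stands in for the data repository.

```python
from collections import defaultdict

# Standard letter layout of a telephone keypad (ITU-T E.161).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_KEY = {letter: key for key, letters in KEYPAD.items() for letter in letters}

def key_sequence(entry: str, n: int) -> str:
    """Map the first n letters of an entry to telephone keys.

    Assumption: spaces and punctuation are skipped; the claims do not
    say how such characters are handled.
    """
    letters = [c for c in entry.upper() if c in LETTER_TO_KEY]
    return "".join(LETTER_TO_KEY[c] for c in letters[:n])

def subdivide(master: list[str], n: int) -> dict[str, list[str]]:
    """Off-line step: group master-set entries by the keys of their first n letters."""
    grammars = defaultdict(list)
    for entry in master:
        keys = key_sequence(entry, n)
        if len(keys) == n:  # entries with fewer than n letters are skipped (assumption)
            grammars[keys].append(entry)
    return dict(grammars)

# Hypothetical master grammar set, subdivided once before any call is taken.
MASTER = ["hamlet", "harry potter", "gandalf", "iceland", "moby dick"]
REPOSITORY = subdivide(MASTER, n=3)

def select_grammar(pressed_keys: str) -> list[str]:
    """Call-time step: look up the grammar stored for the detected key sequence."""
    return REPOSITORY.get(pressed_keys, [])

print(select_grammar("426"))  # ['hamlet', 'gandalf']
```

Note that "hamlet" and "gandalf" begin with different letters yet land in the same grammar, because H and G share key 4 and M and N share key 6.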

11. A method for conducting database searches by telephone, the method comprising:
- subdividing a master grammar set into multiple speech recognition grammars, each of which corresponds to a different sequence of N telephone keys;
  
  receiving from a user an indication of a sequence of N telephone keys corresponding to N characters of a search query;
  
  receiving from the user an utterance of the search query;
  
  selecting the speech recognition grammar that corresponds to the sequence of N telephone keys, said speech recognition grammar specifying valid utterances; and
  
  interpreting the utterance of the search query with the speech recognition grammar to convert the utterance into a textual representation of the search query;
  
  wherein the method is performed without requiring the user to uniquely specify said N characters;
  
  wherein N is less than the number of characters in the search query as uttered by the user, such that the user need not select a respective telephone key for each character of the search query;
  
  wherein the method is performed by a computerized system that comprises one or more computing devices.

12. The method of claim 11, wherein the indication of the sequence of N telephone keys comprises DTMF tones generated by telephone keypad depressions.

13. The method of claim 11, wherein the indication of the sequence of N telephone keys comprises utterances by the user of numerical telephone digits.

14. The method of claim 11, wherein the speech recognition grammar specifies substantially all valid utterances that start with an N-character sequence corresponding to the sequence of N telephone keys.

15. The method of claim 11, wherein the speech recognition grammar comprises search queries starting with multiple different sets of N characters.

16. The method of claim 11, wherein N=3 or 4.

17. The method of claim 11, wherein the method is performed at least partly by a system that hosts a web site, and the method further comprises executing the search query to identify content of said web site to audibly output to the user.

18. A system that embodies the method of claim 11.
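
The effect of interpreting the utterance against only the selected grammar (claim 11) can be illustrated with a toy stand-in for a speech recognizer. In the sketch below, a noisy text hypothesis plays the role of the audio and string similarity from Python's `difflib` plays the role of acoustic matching; both substitutions, the function name `interpret`, the cutoff value and the sample grammar are assumptions for illustration only.

```python
import difflib
from typing import Optional

def interpret(noisy_hypothesis: str, grammar: list[str]) -> Optional[str]:
    """Return the valid utterance in `grammar` closest to the hypothesis.

    Stand-in for a real recognizer: only entries of the selected grammar
    can ever be returned, which mirrors the role of the grammar as the
    list of valid utterances.
    """
    matches = difflib.get_close_matches(noisy_hypothesis, grammar, n=1, cutoff=0.6)
    return matches[0] if matches else None

# Hypothetical grammar selected for key sequence 4-2-6.
grammar_426 = ["hamlet", "gandalf", "hanging rock"]

print(interpret("hamlit", grammar_426))  # hamlet
print(interpret("zzzz", grammar_426))    # None
```

Because the candidate list is limited to entries that share a key prefix, the matcher has far fewer competitors than it would against the full master grammar set.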

19. A method for conducting database searches by telephone, the method comprising:
- receiving a sequence of N character utterances from a telephone user, said character utterances corresponding to respective characters of a search query;
  
  receiving an utterance of the search query from the user;
  
  interpreting the sequence of character utterances to generate a sequence of N characters;
  
  translating the sequence of N characters to a corresponding sequence of N telephone digits; and
  
  selecting a speech recognition grammar that corresponds to the sequence of N telephone digits, said speech recognition grammar corresponding to multiple different possible N-character sequences; and
  
  interpreting the utterance of the search query using the selected speech recognition grammar to generate a textual representation of the search query;
  
  wherein N is less than the number of characters in the search query as uttered by the user such that the user need not verbally spell the entire search query;
  
  wherein the method is performed by a system that comprises computer hardware;
  
  wherein the speech recognition grammar is selected from a repository of pre-generated speech recognition grammars in which different speech recognition grammars correspond to different sequences of N telephone digits, said repository of pre-generated speech recognition grammars generated by subdividing a master grammar set based on the first N characters of entries in said master grammar set.

20. The method of claim 19, wherein the speech recognition grammar specifies substantially all valid utterances that start with an N-character sequence corresponding to the sequence of N telephone digits.

21. The method of claim 19, wherein the speech recognition grammar comprises search queries starting with multiple different sets of N characters.

22. The method of claim 19, wherein N=3 or 4.

23. The method of claim 19, wherein the method is performed at least partly by a system that hosts a web site, and the method further comprises executing the search query to identify content of said web site to audibly output to the telephone user.

24. A system that embodies the method of claim 19.
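
The character-to-digit translation of claim 19 might look like the sketch below, which assumes the recognizer has already produced N uppercase or lowercase letters. The repository contents and the names `chars_to_digits` and `select_grammar` are hypothetical.

```python
# Standard letter layout of a telephone keypad (ITU-T E.161).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_KEY = {letter: key for key, letters in KEYPAD.items() for letter in letters}

def chars_to_digits(recognized_chars: str) -> str:
    """Translate recognized letters into telephone digits (input assumed to be A-Z)."""
    return "".join(LETTER_TO_KEY[c] for c in recognized_chars.upper())

# Hypothetical pre-generated repository keyed by digit sequence (N = 3).
REPOSITORY = {
    "426": ["hamlet", "gandalf"],
    "728": ["patton", "saturn"],
}

def select_grammar(recognized_chars: str) -> list[str]:
    """Select the grammar stored for the digits of the recognized characters."""
    return REPOSITORY.get(chars_to_digits(recognized_chars), [])

print(select_grammar("HAM"))  # ['hamlet', 'gandalf']
print(select_grammar("HAN"))  # ['hamlet', 'gandalf'], since M and N share key 6
print(select_grammar("BAT"))  # [], since B (key 2) and P (key 7) are on different keys
```

In this sketch a misrecognition is absorbed only when the two letters share a key, as with M and N or D and E. A confusion such as P heard as B changes the digit sequence, which is the case that the similar-sounding-character grouping of claims 25 and 33 addresses.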

25. A fault-tolerant method of capturing search queries submitted by voice, the method comprising:
- receiving a sequence of N character utterances from a user that specify a portion of a search query, wherein N is less than the number of characters in the search query;
  
  interpreting the sequence of N character utterances to generate a sequence of N characters;
  
  selecting a speech recognition grammar that corresponds to the sequence of N characters such that the same speech recognition grammar is selected regardless of whether an utterance of a first character by the user is misinterpreted as an utterance of a second, similar sounding character; and
  
  interpreting an utterance of the search query by the user with the selected speech recognition grammar;
  
  whereby the search query is captured without requiring the user to utter all of the characters of the search query;
  
  wherein the method is performed by a computerized system that comprises one or more computing devices;
  
  wherein the step of selecting a speech recognition grammar comprises selecting from a plurality of pre-generated speech recognition grammars, each pre-generated speech recognition grammar representing a different respective subset of a master grammar set, said subsets formed by grouping together entries based on the first N characters of each entry.

26. The method of claim 25, wherein the selected speech recognition grammar corresponds to multiple different possible combinations of N characters that may be uttered by users.

27. The method of claim 25, wherein the selected speech recognition grammar comprises textual representations of valid search query utterances starting with multiple different sequences of N characters.

28. The method of claim 25, wherein the selected speech recognition grammar corresponds uniquely to a sequence of N telephone keys.

29. The method of claim 25, wherein the first and second characters are alphabetic characters that do not appear on a common telephone key of a standard telephone.

30. The method of claim 25, wherein the first and second characters are B and P.

31. The method of claim 25, wherein the first and second characters are, respectively, one of the following:
- B and P, P and B, A and H, H and A, A and K, K and A.

32. The method of claim 25, wherein the method is performed at least partly by a system that hosts a web site, and the method further comprises executing the search query to identify content of said web site to audibly output to the user.
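
One way to obtain the fault tolerance recited in claim 25 is to collapse similar-sounding characters onto a single representative before both building and selecting grammars. The sketch below does this for the pairs named in claim 31. Merging A, H and K into one class, the sample master set, and the names `canonical_prefix`, `build_repository` and `select_grammar` are assumptions made for illustration.

```python
# Hypothetical confusion classes based on the pairs in claim 31 (B/P, A/H, A/K).
# Treating A, H and K as a single class is an assumption of this sketch.
SOUND_ALIKE_CLASSES = ["BP", "AHK"]
CANONICAL = {letter: group[0] for group in SOUND_ALIKE_CLASSES for letter in group}

def canonical_prefix(chars: str) -> str:
    """Replace each character with the representative of its sound-alike class."""
    return "".join(CANONICAL.get(c, c) for c in chars.upper())

def build_repository(master: list[str], n: int) -> dict[str, list[str]]:
    """Group entries by the canonical form of their first n letters."""
    repository: dict[str, list[str]] = {}
    for entry in master:
        letters = "".join(c for c in entry.upper() if c.isalpha())
        if len(letters) >= n:  # shorter entries are skipped (assumption)
            repository.setdefault(canonical_prefix(letters[:n]), []).append(entry)
    return repository

def select_grammar(repository: dict[str, list[str]], recognized_chars: str) -> list[str]:
    """Select a grammar from the recognized (possibly misheard) characters."""
    return repository.get(canonical_prefix(recognized_chars), [])

repo = build_repository(["batman", "patton", "kayak", "hamlet"], n=3)
print(select_grammar(repo, "BAT"))  # ['batman', 'patton']
print(select_grammar(repo, "PAT"))  # ['batman', 'patton'], same grammar if P is heard for B
print(select_grammar(repo, "KAM"))  # ['hamlet'], because H, A and K share a class
```

The grammar for a class-collapsed prefix is larger than one for an exact prefix, which is the price paid for selecting the same grammar regardless of the confusion.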

33. A method for generating and using fault-tolerant speech recognition grammars, the method comprising:
- generating a master grammar set that contains textual representations of valid search query utterances;
  
  subdividing the master grammar set according to the first N characters of the textual representations of valid search query utterances to generate multiple speech recognition grammars, such that different speech recognition grammars correspond to different N-character sequences;
  
  storing said multiple speech recognition grammars in a data repository in association with the N-character sequences to which they correspond; and
  
  subsequently, via execution of software by a computerized system, interpreting a sequence of N character utterances of a user to generate a sequence of N characters;
  
  selecting from said data repository the speech recognition grammar corresponding to said sequence of N characters; and
  
  using the selected speech recognition grammar to interpret an utterance by the user of a search query;
  
  wherein N is less than the number of characters in the uttered search query, and wherein subdividing the master grammar set comprises treating a set of two or more similar sounding characters as the same character so that multiple N-character sequences map to the same speech recognition grammar.

34. The method of claim 33, wherein the set of similar sounding characters comprises alphabetic characters that do not appear on a common key of a standard telephone keypad.

35. The method of claim 33, wherein the set of similar sounding characters comprises the letters B and P.

36. The method of claim 33, wherein the set of similar sounding characters comprises the letters A and H.

37. The method of claim 33, wherein the set of similar sounding characters comprises the letters A and K.

38. The method of claim 33, wherein the set of similar sounding characters comprises the letters D and E.

39. The method of claim 33, wherein:
- at least one character of said sequence of N character utterances is a member of said set of similar sounding characters;
  
  whereby misidentification of an uttered character in the set as a similar sounding character within the set does not cause the speech recognition grammar to be incorrectly selected.

40. The method of claim 33, wherein generating the master grammar set comprises extracting search terms and phrases from a query log reflective of search queries submitted by a population of users.

41. The method of claim 33, wherein generating the master grammar set comprises extracting terms and phrases from item records of a database.

42. A computer program that embodies the method of claim 33 represented on or within a computer-readable medium.

43. The method of claim 33, further comprising submitting the search query, as obtained by interpreting the utterance of the search query, to a search engine to search for at least one of the following:
- web pages, physical products, creative works.

44. The method of claim 33, wherein the method is performed at least partly by a system that hosts a web site, and the method further comprises executing the search query to identify content of said web site to audibly output to the user.
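
Claims 3 and 40 derive the master grammar set from a query log. A minimal sketch of that extraction is shown below. The normalization rules and the frequency threshold `min_count` are assumptions, since the claims say only that terms and phrases are extracted from a log of queries submitted by a population of users, and the function name and sample log are hypothetical.

```python
from collections import Counter

def build_master_grammar(query_log: list[str], min_count: int = 2) -> list[str]:
    """Return normalized queries that occur at least `min_count` times in the log.

    Normalization (lowercasing, collapsing whitespace) and the threshold
    are illustrative choices, not taken from the patent.
    """
    counts = Counter(" ".join(q.lower().split()) for q in query_log if q.strip())
    return sorted(query for query, count in counts.items() if count >= min_count)

# Hypothetical log of previously submitted text queries.
log = ["Harry Potter", "harry  potter", "hamlet", "Hamlet ", "asdfgh"]
print(build_master_grammar(log))  # ['hamlet', 'harry potter']
```

Filtering by frequency keeps one-off or mistyped queries out of the grammars, so that the recognizer is offered only phrases that users have actually searched for more than once.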

Current Assignee: A9.com Incorporated (Amazon.com, Inc.)
Original Assignee: A9.com Incorporated (Amazon.com, Inc.)
Inventors: Lee, Nicholas J.; Schoenbaum, Ronald J.; Frederick, Robert
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Neway, Samuel G
Application Number: US10/392,203
Time in Patent Office: 2,632 Days
Field of Search: None
US Class Current: 704/254
CPC Class Codes:
G06F 16/3329   Natural language query form...
G10L 15/19   Grammatical context, e.g. d...
G10L 15/26   Speech to text systems G10L...
