Enabling voice selection of user preferences

US 9,083,798 B2
Filed: 12/22/2004
Issued: 07/14/2015
Est. Priority Date: 12/22/2004
Status: Expired due to Fees

Abstract

A method, system and apparatus for voice enabling a user preference interface in a multimodal content browser. A method for voice enabling a user preference interface in a multimodal content browser can include matching voice input to a bound command in a speech grammar and invoking logic in the user preference interface consistent with the bound command in the speech grammar. The matching step can include comparing voice input to entries in a markup language specified speech grammar and locating the bound command in the specified speech grammar based upon the comparison. In this regard, the method further can include identifying a variable in the bound command, looking up the variable in a table, retrieving a corresponding parameter for the variable from the table, and replacing the variable with the corresponding parameter in the bound command.
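
The variable lookup and replacement described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `%name%` placeholder syntax, the phrases, the table contents, and the function name `resolve_grammar`. A real implementation would operate on a markup language specified grammar rather than a Python dictionary.

```python
import re

# Hypothetical first grammar: spoken phrase -> bound command holding a placeholder variable.
FIRST_GRAMMAR = {
    "show general preferences": "%general%",
    "start with home page": "%start_home%",
}

# Hypothetical table mapping each variable to its corresponding parameter.
PARAMETER_TABLE = {
    "general": "command.focus.general",
    "start_home": "command.set.startup.home",
}

# Assumed placeholder syntax: a variable name wrapped in percent signs.
PLACEHOLDER = re.compile(r"%(\w+)%")


def resolve_grammar(first_grammar, table):
    """Build a second grammar by replacing each variable with its table parameter."""

    def lookup(match):
        # Look up the variable and return the parameter that replaces it.
        # A variable missing from the table raises KeyError.
        return table[match.group(1)]

    return {
        phrase: PLACEHOLDER.sub(lookup, command)
        for phrase, command in first_grammar.items()
    }


SECOND_GRAMMAR = resolve_grammar(FIRST_GRAMMAR, PARAMETER_TABLE)
```

In this sketch the substitution runs once, before any recognition takes place, so the recognizer only ever sees the resolved commands.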

20 Claims

1. A method for voice enabling a user interface in a multimodal content browser, the method comprising acts of:
- accessing a first speech grammar, the first speech grammar having stored therein at least one voice command, the first speech grammar further storing a mapping of the at least one voice command to a corresponding placeholder identifier;
  
  prior to performing a voice recognition processing, obtaining a second speech grammar from the first speech grammar, the second speech grammar storing a mapping of the at least one voice command to a navigation action that can be triggered by a user through the user interface, wherein the act of obtaining the second speech grammar comprises substituting a string of characters indicative of the navigation action in place of the placeholder identifier in the first speech grammar to obtain the second speech grammar, the string of characters being different from the placeholder identifier;
  
  using the second speech grammar to perform the voice recognition processing, wherein the voice recognition processing comprises recognizing, from received voice input, the at least one voice command in the second speech grammar;
  
  identifying the navigation action specified by the second speech grammar as corresponding to the at least one voice command; and
  
  invoking logic in the user interface consistent with the navigation action.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of claim 1, wherein said act of recognizing further comprises acts of:
    - comparing the voice input to entries in the second speech grammar, wherein the second speech grammar is a markup language specified speech grammar; and
      
      locating said at least one voice command in said specified speech grammar based upon said comparison.
  - 3. The method of claim 2, wherein the placeholder identifier specified by the first speech grammar as corresponding to the at least one voice command is a variable, and wherein the act of obtaining the second speech grammar further comprises acts of:
    - looking up said variable in a table;
      
      retrieving a parameter corresponding to said variable from said table; and
      
      identifying the string of characters based at least in part on the retrieved parameter.
  - 4. The method of claim 1, wherein said act of invoking comprises acts of:
    - formulating an event utilizing said navigation action; and
      
      posting said event to an event handler in the user interface.
  - 5. The method of claim 1, wherein said act of invoking comprises an act of invoking logic programmed to bring a specified grouping of elements in the user interface into focus.
  - 6. The method of claim 1, wherein said act of invoking comprises an act of invoking logic programmed to set a preference in the user interface.
  - 7. The method of claim 5, wherein said act of invoking comprises acts of:
    - first invoking logic programmed to bring a specified grouping of elements in the user interface into focus; and
      
      second invoking logic programmed to set a preference in said specified grouping.
  - 8. The method of claim 1, wherein the string of characters indicative of the navigation action comprises an alphanumeric event string, and wherein the act of obtaining the second speech grammar further comprises:
    - identifying the alphanumeric event string based at least in part on the placeholder identifier, wherein substituting the string of characters comprises substituting the alphanumeric event string in place of the placeholder identifier in the first speech grammar.
  - 9. The method of claim 1, wherein the placeholder identifier mapped by the first speech grammar to the at least one voice command is different from the at least one voice command.
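
The recognition and invocation acts of claims 1 and 4 through 7 can be sketched as follows, under stated assumptions. The event string format (`focus:` and `set:` prefixes joined by `;`), the `PreferenceUI` class, and the functions `recognize` and `invoke` are all hypothetical. Voice input is represented by an already-transcribed text string, which stands in for an actual speech recognizer.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical second grammar: placeholders have already been replaced
# by event strings that describe navigation actions.
SECOND_GRAMMAR = {
    "show privacy settings": "focus:privacy",
    "block pop ups": "focus:privacy;set:privacy.block_popups=true",
}


@dataclass
class PreferenceUI:
    """Stand-in for a user interface that exposes an event handler."""

    focused_group: Optional[str] = None
    preferences: dict = field(default_factory=dict)

    def handle_event(self, event: str) -> None:
        kind, _, payload = event.partition(":")
        if kind == "focus":
            # Bring a specified grouping of elements into focus.
            self.focused_group = payload
        elif kind == "set":
            # Set a preference; the value is stored as a string.
            key, _, value = payload.partition("=")
            self.preferences[key] = value
        else:
            raise ValueError(f"unknown event: {event}")


def recognize(utterance: str, grammar: dict) -> Optional[str]:
    """Compare the input to grammar entries and return the mapped action, if any."""
    return grammar.get(utterance.strip().lower())


def invoke(utterance: str, grammar: dict, ui: PreferenceUI) -> bool:
    """Formulate events from the matched action and post each to the handler."""
    action = recognize(utterance, grammar)
    if action is None:
        return False
    for event in action.split(";"):
        ui.handle_event(event)
    return True
```

With these assumed entries, calling `invoke("Block pop ups", SECOND_GRAMMAR, ui)` first brings the `privacy` group into focus and then sets a preference within it, which mirrors the two-step ordering in claim 7.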

10. A system for voice enabling a user interface in a multimodal content browser, the system comprising:
- a first speech grammar having stored therein at least one voice command entry that stores a mapping of a voice command to a corresponding placeholder identifier; and
  
  at least one processor configured to:
  
  obtain a second speech grammar from the first speech grammar, the second speech grammar storing a mapping of the at least one voice command to a navigation action that can be triggered by a user through the user interface, wherein obtaining the second speech grammar comprises substituting a string of characters indicative of the navigation action in place of the placeholder identifier in the first speech grammar to obtain the second speech grammar, the string of characters being different from the placeholder identifier;
  
  use the second speech grammar to perform voice recognition processing, wherein the voice recognition processing comprises identifying, based on received voice input, the at least one voice command in at least one voice command entry in said second speech grammar;
  
  identify the navigation action specified by the second speech grammar as corresponding to the at least one voice command; and
  
  invoke logic in the user interface consistent with the navigation action.
- Dependent claims (11, 12)
  - 11. The system of claim 10, further comprising a table of command variables and corresponding command parameters, wherein the placeholder identifier specified by the first speech grammar as corresponding to the at least one voice command is a variable, and wherein said at least one processor is further configured to:
    - look up the variable in the table;
      
      retrieve a parameter corresponding to the variable from the table; and
      
      identify the string of characters based at least in part on the retrieved parameter.
  - 12. The system of claim 10, wherein the string of characters indicative of the navigation action comprises an alphanumeric event string, and wherein the at least one processor is programmed to obtain the second speech grammar at least in part by:
    - identifying the alphanumeric event string based at least in part on the placeholder identifier, wherein substituting the string of characters comprises substituting the alphanumeric event string in place of the placeholder identifier in the first speech grammar.

13. At least one non-transitory computer-readable medium having stored thereon computer instructions which, when executed, perform a method for voice enabling a user interface in a multimodal content browser, the method comprising acts of:
- accessing a first speech grammar, the first speech grammar having stored therein at least one voice command, the first speech grammar further storing a mapping of the at least one voice command to a corresponding placeholder identifier;
  
  prior to performing a voice recognition processing, obtaining a second speech grammar from the first speech grammar, the second speech grammar storing a mapping of the at least one voice command to a navigation action that can be triggered by a user through the user interface, wherein the act of obtaining the second speech grammar comprises substituting a string of characters indicative of the navigation action in place of the placeholder identifier in the first speech grammar to obtain the second speech grammar, the string of characters being different from the placeholder identifier;
  
  using the second speech grammar to perform the voice recognition processing, wherein the voice recognition processing comprises recognizing, from received voice input, the at least one voice command in the second speech grammar;
  
  identifying the navigation action specified by the second speech grammar as corresponding to the at least one voice command; and
  
  invoking logic in the user interface consistent with the navigation action.
- Dependent claims (14, 15, 16, 17, 18, 19, 20)
  - 14. The at least one non-transitory computer-readable medium of claim 13, wherein said act of recognizing further comprises acts of:
    - comparing the voice input to entries in the second speech grammar, wherein the second speech grammar is a markup language specified speech grammar; and
      
      locating said at least one voice command in said specified speech grammar based upon said comparison.
  - 15. The at least one non-transitory computer-readable medium of claim 14, wherein the placeholder identifier specified by the first speech grammar as corresponding to the at least one voice command is a variable, and wherein the act of obtaining the second speech grammar further comprises acts of:
    - looking up said variable in a table;
      
      retrieving a parameter corresponding to said variable from said table; and
      
      identifying the string of characters based at least in part on the retrieved parameter.
  - 16. The at least one non-transitory computer-readable medium of claim 13, wherein said act of invoking comprises acts of:
    - formulating an event utilizing said navigation action; and
      
      posting said event to an event handler in the user interface.
  - 17. The at least one non-transitory computer-readable medium of claim 13, wherein said act of invoking comprises an act of invoking logic programmed to bring a specified grouping of elements in the user interface into focus.
  - 18. The at least one non-transitory computer-readable medium of claim 17, wherein said act of invoking comprises acts of:
    - first invoking logic programmed to bring a specified grouping of elements in the user interface into focus; and
      
      second invoking logic programmed to set a preference in said specified grouping.
  - 19. The at least one non-transitory computer-readable medium of claim 13, wherein said act of invoking comprises an act of invoking logic programmed to set a preference in the user interface.
  - 20. The at least one non-transitory computer-readable medium of claim 13, wherein the string of characters indicative of the navigation action comprises an alphanumeric event string, and wherein the act of obtaining the second speech grammar further comprises:
    - identifying the alphanumeric event string based at least in part on the placeholder identifier, wherein substituting the string of characters comprises substituting the alphanumeric event string in place of the placeholder identifier in the first speech grammar.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Cross, Charles W.; Li, Yan
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): ORTIZ SANCHEZ, MICHAEL
Application Number: US11/022,464
Publication Number: US 20060136222A1
Time in Patent Office: 3,856 Days
Field of Search: 704/246, 704/270
US Class Current: 1/1
CPC Class Codes:
- G10L 15/19   Grammatical context, e.g. d...
- G10L 15/26   Speech to text systems G10L...
- H04M 2201/40   using speech recognition
- H04M 3/4938   comprising a voice browser ...
