AUTOMATIC SPEECH RECOGNITION WITH A SELECTION LIST
First Claim
1. A method of automatic speech recognition (‘ASR’), the method implemented with a speech recognition grammar of a multimodal application, with the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and a visual mode, the multimodal application operatively coupled to a grammar interpreter, the method comprising:
accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar, the speech recognition grammar including a semantic interpretation script capable of producing a semantic interpretation token having a value that indicates whether to select or deselect items in the selection list;
providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar;
receiving, by the multimodal application from the grammar interpreter, interpretation results, the interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and
determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
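
The determining step recited above can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `InterpretationResult`, `SelectionList`, `apply`, and `click`, and the token values "select" and "deselect", are hypothetical and chosen only for illustration. The sketch assumes the grammar interpreter has already returned the matched words and the semantic interpretation token, and it shows speech and visual input updating the same selection state.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class InterpretationResult:
    """Hypothetical shape of the results returned by a grammar interpreter."""
    matched_words: List[str]  # words from the grammar that name list items
    token: str                # assumed values: "select" or "deselect"


@dataclass
class SelectionList:
    """A selection list whose state is shared by the voice and visual modes."""
    items: List[str]
    selected: Set[str] = field(default_factory=set)

    def apply(self, result: InterpretationResult) -> None:
        """Voice mode: select or deselect the matched items per the token."""
        if result.token not in ("select", "deselect"):
            raise ValueError(f"unexpected token: {result.token!r}")
        for word in result.matched_words:
            if word not in self.items:
                continue  # ignore matched words that are not list items
            if result.token == "select":
                self.selected.add(word)
            else:
                self.selected.discard(word)

    def click(self, item: str) -> None:
        """Visual mode: toggle one item, as a checkbox click might."""
        if item not in self.items:
            raise KeyError(item)
        if item in self.selected:
            self.selected.remove(item)
        else:
            self.selected.add(item)


toppings = SelectionList(["pepperoni", "mushrooms", "olives"])
toppings.apply(InterpretationResult(["pepperoni", "olives"], "select"))
toppings.apply(InterpretationResult(["olives"], "deselect"))
toppings.click("mushrooms")
print(sorted(toppings.selected))
```

Keeping the token separate from the matched words means one grammar can serve both operations: the same item names are matched either way, and only the token decides what happens to them.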
Abstract
Methods, apparatus, and computer program products are described for automatic speech recognition (‘ASR’) that include accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar; providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar; receiving, by the multimodal application from the grammar interpreter, interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
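
As a rough illustration of the interpreter side of this exchange, the following Python sketch stands in for a grammar interpreter. It is a toy under stated assumptions, not the grammar format or interpreter the patent describes: `Grammar`, `Interpretation`, and `interpret` are hypothetical names, the utterance is assumed to be already transcribed to text, and a plain dictionary from command words to token values stands in for the semantic interpretation script.

```python
from typing import Dict, List, NamedTuple, Optional


class Grammar(NamedTuple):
    """Toy stand-in for a speech recognition grammar (hypothetical)."""
    commands: Dict[str, str]  # spoken command word -> token value
    items: List[str]          # words that name items in the selection list


class Interpretation(NamedTuple):
    """Toy interpretation results: matched words plus the token."""
    matched_words: List[str]
    token: str


def interpret(utterance: str, grammar: Grammar) -> Optional[Interpretation]:
    """Match an utterance of the form '<command> <item> [and <item> ...]'.

    Returns None when the utterance does not match the grammar.
    """
    words = utterance.lower().split()
    if not words or words[0] not in grammar.commands:
        return None
    # The command word determines the token, much as a semantic
    # interpretation script attached to a grammar rule might.
    token = grammar.commands[words[0]]
    # Keep only the words that name list items; filler such as "and" is dropped.
    matched = [w for w in words[1:] if w in grammar.items]
    if not matched:
        return None
    return Interpretation(matched, token)


grammar = Grammar(
    commands={"add": "select", "remove": "deselect"},
    items=["pepperoni", "mushrooms", "olives"],
)
print(interpret("add pepperoni and olives", grammar))
print(interpret("remove olives", grammar))
```

The application would hand the speech input and the grammar to such an interpreter and receive back both the matched item words and the token, which is the pair of results the abstract describes.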
20 Claims
1. A method of automatic speech recognition (‘ASR’), the method implemented with a speech recognition grammar of a multimodal application, with the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and a visual mode, the multimodal application operatively coupled to a grammar interpreter, the method comprising:
accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar, the speech recognition grammar including a semantic interpretation script capable of producing a semantic interpretation token having a value that indicates whether to select or deselect items in the selection list;
providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar;
receiving, by the multimodal application from the grammar interpreter, interpretation results, the interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and
determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
Dependent claims: 2, 3, 4, 5, 6
7. Apparatus for automatic speech recognition (‘ASR’), the apparatus implemented with a speech recognition grammar of a multimodal application, with the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and a visual mode, the multimodal application operatively coupled to a grammar interpreter, the apparatus comprising a computer processor and a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions capable of:
accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar, the speech recognition grammar including a semantic interpretation script capable of producing a semantic interpretation token having a value that indicates whether to select or deselect items in the selection list;
providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar;
receiving, by the multimodal application from the grammar interpreter, interpretation results, the interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and
determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
Dependent claims: 8, 9, 10, 11, 12
13. A computer program product for automatic speech recognition (‘ASR’), the computer program product comprising a multimodal application that includes a speech recognition grammar, the multimodal application capable of operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and a visual mode, the multimodal application operatively coupled to a grammar interpreter, the computer program product disposed upon a computer-readable, signal-bearing medium, the computer program product comprising computer program instructions capable of:
accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar, the speech recognition grammar including a semantic interpretation script capable of producing a semantic interpretation token having a value that indicates whether to select or deselect items in the selection list;
providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar;
receiving, by the multimodal application from the grammar interpreter, interpretation results, the interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and
determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification