Speech Capabilities Of A Multimodal Application
First Claim
1. A method of improving speech capabilities of a multimodal application, the method implemented with a multimodal browser and a speech engine operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, wherein the voice mode includes accepting speech input from a user, digitizing the speech, and providing digitized speech to a speech engine available to the multimodal browser for recognition, and wherein the non-voice mode includes accepting input from a user through physical user interaction with a user input device for the multimodal device;
wherein the multimodal browser comprises a module of automated computing machinery for executing the multimodal application and the multimodal browser supports execution of a media file player, a module of automated computing machinery for playing media files;
the method comprising:
receiving, by the multimodal browser, a media file having a metadata container;
retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser;
determining whether the speech artifact includes a grammar rule or a pronunciation rule;
if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and
if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
Abstract
Improving speech capabilities of a multimodal application including receiving, by the multimodal browser, a media file having a metadata container; retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser; determining whether the speech artifact includes a grammar rule or a pronunciation rule; if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
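The steps summarized above (receive a media file, pull a speech artifact out of its metadata container, classify it, then extend the grammar or the lexicon) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the implementation described by the patent: the class names `MediaFile`, `SpeechArtifact`, and `SpeechEngine`, the function `apply_speech_artifacts`, and the representation of the metadata container as a plain list are all hypothetical, as are the sample rule strings.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class SpeechArtifact:
    """Hypothetical speech artifact stored in a media file's metadata."""
    kind: Literal["grammar", "pronunciation"]
    rule: str


@dataclass
class MediaFile:
    """Hypothetical media file; the metadata container is modeled as a list."""
    content: bytes
    metadata: list[SpeechArtifact] = field(default_factory=list)


@dataclass
class SpeechEngine:
    """Hypothetical speech engine holding a grammar and a lexicon."""
    grammar: list[str] = field(default_factory=list)
    lexicon: list[str] = field(default_factory=list)


def apply_speech_artifacts(media: MediaFile, engine: SpeechEngine) -> None:
    """Retrieve each artifact from the metadata container and update the engine."""
    for artifact in media.metadata:
        if artifact.kind == "grammar":
            # Grammar rule: extend the engine's grammar.
            engine.grammar.append(artifact.rule)
        elif artifact.kind == "pronunciation":
            # Pronunciation rule: extend the engine's lexicon.
            engine.lexicon.append(artifact.rule)


# Usage: a media file carrying one artifact of each kind (sample values only).
engine = SpeechEngine()
media = MediaFile(
    content=b"",
    metadata=[
        SpeechArtifact("grammar", "<artist> = example artist"),
        SpeechArtifact("pronunciation", "example: IH G Z AE M P AH L"),
    ],
)
apply_speech_artifacts(media, engine)
```

In this sketch the two branches mirror the two conditional steps of the claim: a grammar rule only touches the grammar, and a pronunciation rule only touches the lexicon.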
15 Claims
1. A method of improving speech capabilities of a multimodal application, the method implemented with a multimodal browser and a speech engine operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, wherein the voice mode includes accepting speech input from a user, digitizing the speech, and providing digitized speech to a speech engine available to the multimodal browser for recognition, and wherein the non-voice mode includes accepting input from a user through physical user interaction with a user input device for the multimodal device;
wherein the multimodal browser comprises a module of automated computing machinery for executing the multimodal application and the multimodal browser supports execution of a media file player, a module of automated computing machinery for playing media files;
the method comprising:
receiving, by the multimodal browser, a media file having a metadata container;
retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser;
determining whether the speech artifact includes a grammar rule or a pronunciation rule;
if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and
if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
Dependent claims: 2, 3, 4, 5
6. An apparatus for improving speech capabilities of a multimodal application, the apparatus including a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the apparatus comprising a computer processor and a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions for:
receiving, by the multimodal browser, a media file having a metadata container;
retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser;
determining whether the speech artifact includes a grammar rule or a pronunciation rule;
if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and
if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
Dependent claims: 7, 8, 9, 10
11. A computer program product for improving speech capabilities of a multimodal application, the computer program product including a multimodal browser for operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the computer program product disposed upon a computer-readable, recording medium, the computer program product comprising computer program instructions capable of:
receiving, by the multimodal browser, a media file having a metadata container;
retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser;
determining whether the speech artifact includes a grammar rule or a pronunciation rule;
if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and
if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.
Dependent claims: 12, 13, 14, 15
Specification