
Improving speech capabilities of a multimodal application

  • US 8,380,513 B2
  • Filed: 05/19/2009
  • Issued: 02/19/2013
  • Est. Priority Date: 05/19/2009
  • Status: Expired due to Fees
First Claim

1. A method of improving speech capabilities of a multimodal application, the method implemented with a multimodal browser and a speech engine operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, wherein the voice mode includes accepting speech input from a user, digitizing the speech, and providing digitized speech to a speech engine available to the multimodal browser for recognition, and wherein the non-voice mode includes accepting input from a user through physical user interaction with a user input device for the multimodal device;

wherein the multimodal browser comprises a module of automated computing machinery for executing the multimodal application and the multimodal browser supports execution of a media file player, a module of automated computing machinery for playing media files;

    the method comprising:

    receiving, by the multimodal browser, a media file having a metadata container;

    retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser, wherein said retrieving, by the multimodal browser, from the metadata container the speech artifact for inclusion in a speech engine available to the multimodal browser comprises retrieving an XML document from the metadata container;

    determining whether the speech artifact includes a grammar rule or a pronunciation rule;

    if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule, wherein said modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule includes extracting from the XML document retrieved from the metadata container a grammar rule and including the grammar rule in an XML grammar document in the speech engine; and

    if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule, wherein said modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule includes extracting from the XML document retrieved from the metadata container a pronunciation rule and including the pronunciation rule in an XML lexicon document in the speech engine.
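The steps recited in the claim (retrieve an XML speech artifact from the media file's metadata container, check whether it carries a grammar rule or a pronunciation rule, then add each to the matching XML document in the speech engine) can be illustrated with a minimal sketch. This is not the patented implementation. The function name `apply_speech_artifact`, the `<speech-artifact>` wrapper element, and the un-namespaced `<rule>` and `<lexeme>` element names are assumptions made for illustration; the claim does not specify an XML schema, and the sketch assumes the XML has already been read out of the metadata container.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def apply_speech_artifact(
    artifact_xml: str, grammar: ET.Element, lexicon: ET.Element
) -> Tuple[int, int]:
    """Copy rules from a speech artifact into the engine's XML documents.

    Hypothetical layout: grammar rules are <rule> elements and
    pronunciation rules are <lexeme> elements, with no XML namespaces.
    Returns (grammar rules added, pronunciation rules added).
    """
    artifact = ET.fromstring(artifact_xml)

    # Determine whether the artifact includes grammar or pronunciation rules.
    grammar_rules = artifact.findall(".//rule")
    pronunciation_rules = artifact.findall(".//lexeme")

    # Grammar rules go into the engine's XML grammar document.
    for rule in grammar_rules:
        grammar.append(rule)

    # Pronunciation rules go into the engine's XML lexicon document.
    for lexeme in pronunciation_rules:
        lexicon.append(lexeme)

    return len(grammar_rules), len(pronunciation_rules)


# Illustrative artifact, as it might be stored in a media file's metadata.
ARTIFACT = """
<speech-artifact>
  <rule id="artist"><item>Beethoven</item></rule>
  <lexeme>
    <grapheme>Beethoven</grapheme>
    <phoneme>ˈbeɪtoʊvən</phoneme>
  </lexeme>
</speech-artifact>
"""

engine_grammar = ET.Element("grammar")  # stands in for the XML grammar document
engine_lexicon = ET.Element("lexicon")  # stands in for the XML lexicon document
added = apply_speech_artifact(ARTIFACT, engine_grammar, engine_lexicon)
```

After the call, `engine_grammar` holds the `artist` rule and `engine_lexicon` holds the pronunciation entry for "Beethoven". An artifact that contains only one kind of rule leaves the other document unchanged.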
