Configuring a speech engine for a multimodal application based on location
Abstract
Methods, apparatus, and products are disclosed for configuring a speech engine for a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application. The multimodal application is operatively coupled to a speech engine. Configuring a speech engine for a multimodal application based on location includes: receiving a location change notification in a location change monitor from a device location manager, the location change notification specifying a current location of the multimodal device; identifying, by the location change monitor, location-based configuration parameters for the speech engine in dependence upon the current location of the multimodal device, the location-based configuration parameters specifying a configuration for the speech engine at the current location; and updating, by the location change monitor, a current configuration for the speech engine according to the identified location-based configuration parameters.
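As a sketch only, the abstract's three steps (receive a location change notification, identify location-based parameters, update the engine's current configuration) might look like the following. The class and method names (`LocationChangeMonitor`, `SpeechEngine`, `update_config`) are hypothetical illustrations, not names taken from the patent's specification:

```python
# Hypothetical sketch of the abstract's three steps. Names are illustrative,
# not drawn from the patent's specification.

class SpeechEngine:
    """Minimal stand-in for a speech engine with a mutable configuration."""

    def __init__(self):
        self.config = {}

    def update_config(self, params):
        # Apply location-based parameters over the current configuration.
        self.config.update(params)


class LocationChangeMonitor:
    """Receives location change notifications and reconfigures the engine."""

    def __init__(self, repository, engine):
        self.repository = repository  # maps location -> configuration parameters
        self.engine = engine

    def on_location_change(self, notification):
        # Step 1: the notification specifies the device's current location.
        location = notification["current_location"]
        # Step 2: identify location-based configuration parameters.
        params = self.repository.get(location, {})
        # Step 3: update the engine's current configuration accordingly.
        self.engine.update_config(params)
        return params
```

A monitor built this way is passive: the device location manager (not shown) pushes notifications into `on_location_change` whenever the position detection component reports movement.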
11 Claims
1. A method of configuring a speech engine for a multimodal application based on location, the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine, the method comprising:
- receiving a location change notification in a location change monitor from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device;
- identifying, in a configuration parameter repository, location-based configuration parameters for the speech engine in dependence upon the current location of the multimodal device, the location-based configuration parameters specifying a configuration for the speech engine at the current location, wherein the location-based configuration parameters include an identifier for an acoustic model from among a plurality of acoustic models including a first acoustic model and a second acoustic model, an identifier for a lexicon from among a plurality of lexicons including a first lexicon and a second lexicon, speech transition times, silence detection times, speech timeouts, gain maps, and a configuration for use by a text-to-speech ('TTS') engine including a voice used in synthesizing speech from text, wherein each of the first acoustic model and the second acoustic model associates acoustic features with phonemes, wherein the first lexicon and the second lexicon specify a different phoneme representation for a same word; and
- updating, by the location change monitor, a current configuration for the speech engine according to the identified location-based configuration parameters.

Dependent claims: 2, 3, 4.
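Claim 1 enumerates the kinds of values the configuration parameter repository holds per location. The entries below are purely illustrative; the field names and values are hypothetical, and the patent does not prescribe any particular storage format:

```python
# Illustrative configuration parameter repository keyed by location.
# Field names are hypothetical; they mirror the parameter kinds listed in
# claim 1: acoustic model and lexicon identifiers, timing values, a gain
# map, and a TTS configuration including a voice.
CONFIG_REPOSITORY = {
    "home": {
        "acoustic_model_id": "am-quiet-room",
        "lexicon_id": "lex-general-us",
        "speech_transition_time_ms": 200,
        "silence_detection_time_ms": 500,
        "speech_timeout_ms": 5000,
        "gain_map": "indoor-low-gain",
        "tts": {"voice": "female-us-english"},
    },
    "car": {
        # A different acoustic model suited to road-noise audio.
        "acoustic_model_id": "am-road-noise",
        # A different lexicon may give the same word a different
        # phoneme representation (e.g., regional pronunciation).
        "lexicon_id": "lex-navigation",
        "speech_transition_time_ms": 300,
        "silence_detection_time_ms": 800,
        "speech_timeout_ms": 8000,
        "gain_map": "cabin-high-gain",
        "tts": {"voice": "male-us-english"},
    },
}
```

Keying the repository by location lets the location change monitor select the whole parameter set in one lookup when a notification arrives.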
5. An apparatus for configuring a speech engine for a multimodal application based on location, the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine, the apparatus comprising a computer processor and a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, perform a method of:
- receiving a location change notification in a location change monitor from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device;
- identifying, in a configuration parameter repository, location-based configuration parameters for the speech engine in dependence upon the current location of the multimodal device, the location-based configuration parameters specifying a configuration for the speech engine at the current location, wherein the location-based configuration parameters include an identifier for an acoustic model from among a plurality of acoustic models including a first acoustic model and a second acoustic model, an identifier for a lexicon from among a plurality of lexicons including a first lexicon and a second lexicon, speech transition times, silence detection times, speech timeouts, gain maps, and a configuration for use by a text-to-speech ('TTS') engine including a voice used in synthesizing speech from text, wherein each of the first acoustic model and the second acoustic model associates acoustic features with phonemes, wherein the first lexicon and the second lexicon specify a different phoneme representation for a same word; and
- updating, by the location change monitor, a current configuration for the speech engine according to the identified location-based configuration parameters.

Dependent claims: 6, 7, 8.
9. A non-transitory computer readable recordable medium encoded with a plurality of instructions that, when executed on a computer, perform a method of configuring a speech engine for a multimodal application based on location, the multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine, the method comprising:
- receiving a location change notification in a location change monitor from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device, wherein the location change notification includes a timestamp specifying a time at which the multimodal device arrived at the current location;
- identifying, in a configuration parameter repository, location-based configuration parameters for the speech engine in dependence upon the current location of the multimodal device, the location-based configuration parameters specifying a configuration for the speech engine at the current location, wherein the location-based configuration parameters include an identifier for an acoustic model from among a plurality of acoustic models including a first acoustic model and a second acoustic model, an identifier for a lexicon from among a plurality of lexicons including a first lexicon and a second lexicon, speech transition times, silence detection times, speech timeouts, gain maps, and a configuration for use by a text-to-speech ('TTS') engine including a voice used in synthesizing speech from text, wherein each of the first acoustic model and the second acoustic model associates acoustic features with phonemes, wherein the first lexicon and the second lexicon specify a different phoneme representation for a same word; and
- updating, by the location change monitor, a current configuration for the speech engine according to the identified location-based configuration parameters.

Dependent claims: 10, 11.
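Claim 9 differs from claim 1 in that the location change notification carries a timestamp of the device's arrival at the current location. A hypothetical notification payload, with field names invented for illustration, might be built like this:

```python
# Hypothetical notification for claim 9's variant: the device location
# manager stamps the time at which the device arrived at the location.
# Field names are illustrative, not taken from the patent.
import time


def make_location_change_notification(current_location, arrived_at=None):
    """Build a notification carrying the location and an arrival timestamp."""
    return {
        "current_location": current_location,
        # Time at which the multimodal device arrived at this location;
        # defaults to "now" if the position detection component gave none.
        "arrived_at": time.time() if arrived_at is None else arrived_at,
    }
```

A timestamp lets a location change monitor discard stale notifications or reason about how long the device has been at its current location before reconfiguring the engine.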
Specification