System and method for distributed speech recognition with a cache feature

US 20040254787A1
Filed: 06/12/2003
Published: 12/16/2004
Est. Priority Date: 06/12/2003
Status: Abandoned Application

Abstract

The invention equips a cellular telephone or other communications device with improved voice recognition and command capability. A cellular handset may be equipped with digital signal processing or other hardware to enhance speech detection and command decoding, but still be relatively constrained in terms of the amount of electronic memory or other storage available on the device, or the processing power or battery life offered by the device. In embodiments, the cellular handset or other device may perform a first-stage decoding of a voice or other command, for instance to perform a voice browsing function over the Internet or a directory. The handset may perform a look-up of the detected command or service against a local memory cache of stored commands, services and models and, if a match is found, proceed directly to performing the desired service. If a match is not found in the device memory, the voice signal may be communicated to a server or other resource in the cellular or other network, for remote or distributed decoding of the command or action. When that service is returned to the handset, the service along with the associated model may be stored into electronic memory or other storage for future access, in caching fashion. A user's most frequently used, or latest used, commands and services may be locally stored on the device, for instance, enabling prompt response times within those commands or services.
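
The cache-first flow described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the implementation disclosed in the application. It assumes the extracted features can be reduced to a hashable key and matched exactly, which is a strong simplification of real acoustic model matching. The names `handle_speech`, `decode_on_network` and `fake_network` are hypothetical.

```python
from typing import Callable, Dict, Hashable, Tuple

# Assumption: a "service" is modelled as a plain string, for example a URL.
Service = str


def handle_speech(
    features: Hashable,
    local_store: Dict[Hashable, Service],
    decode_on_network: Callable[[Hashable], Service],
) -> Tuple[Service, str]:
    """Return the service for the extracted features and where it came from."""
    # Stage 1: test the features against the local model store.
    if features in local_store:
        return local_store[features], "local"
    # Stage 2: no local match, so send the features to the network.
    service = decode_on_network(features)
    # Cache the returned service so the next request is served locally.
    local_store[features] = service
    return service, "network"


# Hypothetical stand-in for the network round trip, used only in this example.
def fake_network(features: Hashable) -> Service:
    return f"http://example.com/{features}"


store: Dict[Hashable, Service] = {}
first = handle_speech("weather", store, fake_network)
second = handle_speech("weather", store, fake_network)
```

In this sketch the first request is answered by the network and the second, identical request is answered from the local store, which is the caching behavior the abstract describes.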

27 Citations

59 Claims

1. A system for decoding speech to access services via a communications device, comprising:
- an input device for receiving speech input;
  
  a feature extraction engine, the feature extraction engine extracting at least one feature from the speech input;
  
  a local model store;
  
  a first interface to a network, the network comprising a network model store, the network model store being configured to generate at least one service depending on the at least one feature extracted from the speech input; and
  
  a processor, communicating with the input device, the feature extraction engine, the local model store and the first interface, the processor testing the at least one feature extracted from the speech input against the local model store to act upon a service request, the processor being configured to initiate a transmission of the at least one feature extracted from the speech input to the network via the first interface when no match is found between the local model store and the at least one feature extracted from the speech input.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
  - 2. A system according to claim 1, wherein the first interface comprises a wired interface.
  - 3. A system according to claim 1, wherein the first interface comprises a wireless interface.
  - 4. A system according to claim 1, wherein the first interface comprises an optical interface.
  - 5. A system according to claim 1, wherein the processor initiates a transmission of the at least one feature extracted from the speech input to the network when a match between the at least one feature extracted from the speech input and the local model store is not found.
  - 6. A system according to claim 5, wherein the network responds to the at least one feature extracted from the speech input to generate the at least one service and transmit the at least one service to the communications device.
  - 7. A system according to claim 6, wherein the processor stores the at least one service in the local model store.
  - 8. A system according to claim 7, wherein the processor deletes an obsolete service upon the storing of the at least one service in the local model store when the local model store is full.
  - 9. A system according to claim 8, wherein the deleting of the obsolete service is performed on a least-recently used basis.
  - 10. A system according to claim 8, wherein the deleting of the obsolete service is performed on a least-frequently used basis.
  - 11. A system according to claim 1, wherein the local model store comprises an initializable local model store downloadable from the network, programmed by a vendor, or trained by a user.
  - 12. A system according to claim 1, wherein the at least one service comprises at least one of voice browsing, voice-activated dialing and voice-activated directory service.
  - 13. A system according to claim 1, wherein the processor initiates a service based on the local model store when a match between the speech input and the local model store is found.
  - 14. A system according to claim 13, wherein the initiation comprises linking to a stored address.
  - 15. A system according to claim 14, wherein the linking to a stored address comprises accessing a URL.
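
Claims 7 to 9 describe storing a returned service in the local model store and, when the store is full, deleting an obsolete service on a least-recently used basis. One way this could look is sketched below in Python. The class name `LruModelStore`, the fixed `capacity` parameter and the string-valued services are assumptions made for illustration; the claims do not prescribe a data structure.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LruModelStore:
    """Fixed-capacity store that evicts the least-recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: "OrderedDict[Hashable, str]" = OrderedDict()

    def lookup(self, key: Hashable) -> Optional[str]:
        service = self._entries.get(key)
        if service is not None:
            # A hit makes this entry the most recently used.
            self._entries.move_to_end(key)
        return service

    def store(self, key: Hashable, service: str) -> None:
        if key in self._entries:
            self._entries.move_to_end(key)
        elif len(self._entries) >= self.capacity:
            # Store is full: drop the entry that was used longest ago.
            self._entries.popitem(last=False)
        self._entries[key] = service


lru = LruModelStore(capacity=2)
lru.store("dial home", "tel:home")
lru.store("weather", "http://example.com/weather")
lru.lookup("dial home")  # "dial home" becomes the most recently used entry
lru.store("news", "http://example.com/news")  # store is full, "weather" is evicted
```

An `OrderedDict` keeps entries in usage order, so both the hit path and the eviction path are constant-time operations, which suits a memory-constrained handset.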

16. A method for decoding speech to access services via a communications device, comprising:
- receiving speech input;
  
  extracting at least one feature from the speech input;
  
  testing the at least one feature extracted from the speech input against a local model store in a communication device to act upon a service request; and
  
  when no match is found between the local model store and the at least one feature extracted from the speech input, transmitting the at least one feature extracted from the speech input via a first interface to a network, and generating a link to at least one service depending on the at least one feature extracted from the speech input.
- Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27)
  - 17. A method according to claim 16, further comprising a step of transmitting the link to the communications device.
  - 18. A method according to claim 16, further comprising a step of storing the link in the local model store.
  - 19. A method according to claim 18, further comprising a step of deleting an obsolete service upon the storing of the at least one service in the local model store when the local model store is full.
  - 20. A method according to claim 19, wherein the deleting of the obsolete service is performed on a least-recently used basis.
  - 21. A method according to claim 19, wherein the deleting of the obsolete service is performed on a least-frequently used basis.
  - 22. A method according to claim 16, further comprising a step of initializing the local model store.
  - 23. A method according to claim 22, wherein the initializing comprises at least one of downloading an initializable local model store from the network to the communications device, programming by a vendor of the communications device, and training by a user of the communications device.
  - 24. A method according to claim 16, wherein the at least one service comprises at least one of voice browsing, voice-activated dialing and voice-activated directory service.
  - 25. A method according to claim 16, further comprising a step of initiating a service when a match between the at least one feature extracted from the speech input and the local model store is found.
  - 26. A method according to claim 25, wherein the step of initiating comprises linking to a stored address.
  - 27. A method according to claim 26, wherein the step of linking to a stored address comprises accessing a URL.
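
Claim 21 names least-frequently used deletion as an alternative eviction basis. A minimal Python sketch of that policy follows. It assumes a per-entry hit counter and breaks ties in favor of evicting the entry inserted earliest. Both choices, and the name `LfuModelStore`, are illustrative assumptions rather than details taken from the application.

```python
from typing import Dict, Hashable, Optional


class LfuModelStore:
    """Fixed-capacity store that evicts the least-frequently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._services: Dict[Hashable, str] = {}
        self._hits: Dict[Hashable, int] = {}

    def lookup(self, key: Hashable) -> Optional[str]:
        service = self._services.get(key)
        if service is not None:
            # Count every successful match against this entry.
            self._hits[key] += 1
        return service

    def store(self, key: Hashable, service: str) -> None:
        if key not in self._services and len(self._services) >= self.capacity:
            # Evict the entry with the fewest hits. Dicts keep insertion
            # order, so ties go to the entry inserted earliest.
            victim = min(self._hits, key=self._hits.get)
            del self._services[victim]
            del self._hits[victim]
        self._services[key] = service
        self._hits.setdefault(key, 0)


lfu = LfuModelStore(capacity=2)
lfu.store("dial home", "tel:home")
lfu.store("weather", "http://example.com/weather")
lfu.lookup("weather")
lfu.lookup("weather")
lfu.lookup("dial home")
lfu.store("news", "http://example.com/news")  # "dial home" has the fewest hits
```

Compared with least-recently used deletion, this policy favors commands a user issues often over commands issued lately, at the cost of a linear scan on eviction in this simple form.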

28. A communications system for decoding speech to access services via a communications device, comprising:
- an input device for receiving speech input;
  
  a feature extraction engine, the feature extraction engine extracting at least one feature from the speech input;
  
  a local model store;
  
  a first interface to a network;
  
  a network, the network comprising a network model store, the network model store being configured to generate at least one service depending on the at least one feature extracted from the speech input; and
  
  a processor, communicating with the input device, the feature extraction engine, the local model store and the first interface, the processor testing the at least one feature extracted from the speech input against the local model store to act upon a service request, the processor being configured to initiate a transmission of the at least one feature extracted from the speech input to the network via the first interface when no match is found between the local model store and the at least one feature extracted from the speech input.
- Dependent Claims (29, 30, 31, 32, 33, 34, 35, 36, 37, 38)
  - 29. A system according to claim 28, wherein the first interface comprises a wired interface.
  - 30. A system according to claim 28, wherein the first interface comprises a wireless interface.
  - 31. A system according to claim 28, wherein the first interface comprises an optical interface.
  - 32. A system according to claim 28, wherein the processor initiates a transmission of the at least one feature extracted from the speech input to the network when a match between the at least one feature extracted from the speech input and the local model store is not found.
  - 33. A system according to claim 32, wherein the network responds to the at least one feature extracted from the speech input to generate the at least one service and transmit the at least one service to the communications device.
  - 34. A system according to claim 33, wherein the processor stores the at least one service in the local model store.
  - 35. A system according to claim 34, wherein the processor deletes an obsolete service upon the storing of the at least one service in the local model store when the local model store is full.
  - 36. A system according to claim 28, wherein the at least one service comprises at least one of voice browsing, voice-activated dialing and voice-activated directory service.
  - 37. A system according to claim 28, wherein the processor initiates a service when a match between the speech input and the local model store is found.
  - 38. A system according to claim 37, wherein the initiation comprises linking to a stored address.

39. A network system for decoding speech to access services inputted via a communications device, comprising:
- a network model store, the network model store being configured to generate at least one service depending on at least one feature extracted from speech input to a communications device; and
  
  a first interface to the communications device, the communications device comprising an input device for receiving the speech input, a feature extraction engine, the feature extraction engine extracting the at least one feature from the speech input, a local model store, and a processor, communicating with the input device, the feature extraction engine, the local model store and the first interface; and
  
  a network processor, the network processor being configured to test the at least one feature extracted from the speech input against the network model store to act upon a service request, the network processor being configured to initiate a transmission of the at least one service to the communications device.
- Dependent Claims (40, 41, 42, 43, 44, 45, 46, 47, 48)
  - 40. A system according to claim 39, wherein the first interface comprises a wired interface.
  - 41. A system according to claim 39, wherein the first interface comprises a wireless interface.
  - 42. A system according to claim 39, wherein the first interface comprises an optical interface.
  - 43. A system according to claim 39, wherein the network processor responds to the at least one feature extracted from the speech input to generate the at least one service and transmit the at least one service to the communications device.
  - 44. A system according to claim 43, wherein the processor in the communications device stores the at least one service in the local model store.
  - 45. A system according to claim 44, wherein the processor in the communications device deletes an obsolete service upon the storing of the at least one service in the local model store when the local model store is full.
  - 46. A system according to claim 39, wherein the at least one service comprises at least one of voice browsing, voice-activated dialing and voice-activated directory service.
  - 47. A system according to claim 39, wherein the processor in the communications device initiates the at least one service upon receipt of the at least one service from the network.
  - 48. A system according to claim 47, wherein the initiation comprises linking to a stored address.
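
The network side recited in claim 39 can be illustrated in the same spirit: a network processor tests the received features against a network model store and, on a match, transmits the resulting service back to the communications device. The Python sketch below is hypothetical throughout. The store contents, the exact-key matching and the `send` callback are assumptions standing in for real model scoring and a real transport.

```python
from typing import Callable, Dict, Hashable, List, Optional

# Hypothetical network model store mapping feature keys to services.
NETWORK_MODEL_STORE: Dict[Hashable, str] = {
    "weather": "http://example.com/weather",
    "directory": "http://example.com/directory",
}


def network_decode(features: Hashable) -> Optional[str]:
    """Test the received features against the network model store."""
    return NETWORK_MODEL_STORE.get(features)


def serve_request(features: Hashable, send: Callable[[str], None]) -> bool:
    """Decode the features and transmit the service to the device if found."""
    service = network_decode(features)
    if service is None:
        # Assumption: an undecodable request is simply reported as a failure.
        return False
    send(service)
    return True


# Collect "transmitted" services in a list in place of a real interface.
sent: List[str] = []
found = serve_request("weather", sent.append)
missing = serve_request("stock quotes", sent.append)
```

On the device, the service received through `send` would then be stored in the local model store, as claims 44 and 45 describe.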

49. A system for decoding speech to access services via a communications device, comprising:
- input means for receiving speech input;
  
  feature extraction means, the feature extraction means extracting at least one feature from the speech input;
  
  local model store means;
  
  first interface means to a wireless network, the network comprising network model store means, the network model store means being configured to generate at least one service depending on the at least one feature extracted from the speech input; and
  
  processor means, communicating with the input means, the feature extraction means, the local model store means and the first interface means, the processor means testing the at least one feature extracted from the speech input against the local model store means to act upon a service request, the processor means being configured to initiate a transmission of the at least one feature extracted from the speech input to the network via the first interface means when no match is found between the local model store means and the at least one feature extracted from the speech input.
- Dependent Claims (50, 51, 52, 53, 54, 55, 56, 57, 58, 59)
  - 50. A system according to claim 49, wherein the first interface comprises a wired interface.
  - 51. A system according to claim 49, wherein the first interface comprises a wireless interface.
  - 52. A system according to claim 49, wherein the first interface comprises an optical interface.
  - 53. A system according to claim 49, wherein the processor means initiates a transmission of the at least one feature extracted from the speech input to the network when a match between the at least one feature extracted from the speech input and the local model store means is not found.
  - 54. A system according to claim 49, wherein the network responds to the at least one feature extracted from the speech input to generate the at least one service and transmit the at least one service to the communications device.
  - 55. A system according to claim 49, wherein the processor means stores the at least one service in the local model store means.
  - 56. A system according to claim 49, wherein the processor means deletes an obsolete service upon the storing of the at least one service in the local model store means when the local model store means is full.
  - 57. A system according to claim 49, wherein the at least one service comprises at least one of voice browsing, voice-activated dialing and voice-activated directory service.
  - 58. A system according to claim 49, wherein the processor means initiates a service when a match between the speech input and the local model store means is found.
  - 59. A system according to claim 58, wherein the initiation comprises linking to a stored address.

Current Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Schentrup, Philip A.; Desai, Pratik; Shah, Sheetal R.
Application Number: US10/460,141
Publication Number: US 20040254787A1
US Class Current: 704/219
CPC Class Codes: G10L 15/30 Distributed recognition, e....
