Robust voice browser system and voice activated device controller

US 10,629,206 B1
Filed: 02/17/2017
Issued: 04/21/2020
Est. Priority Date: 02/04/2000
Status: Expired due to Term

Abstract

The present invention relates to an extended-function device for selectively retrieving information in response to naturally spoken commands provided via an electronic-communication device of a user. An identified command is used to access a corresponding descriptor file that identifies a web-accessible information source, and responsive data specified by select data identified by the accessed descriptor file is fetched from that source. Audio response data containing indicia of a message for the user, which message is responsive to the identified naturally spoken command and based on the responsive data, is directed to the electronic-communication device of the user.
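
The flow the abstract describes (identified command, descriptor file lookup, fetch, spoken reply) can be pictured with a minimal Python sketch. Everything named here is a hypothetical illustration rather than something taken from the patent: the `Descriptor` fields, the `example.com` URL, the regular-expression form of the "select data", and the `fake_fetch` stub that stands in for a real HTTP request. Speech recognition and synthesis are left out; the sketch starts from an already recognized command and stops at the message text a synthesizer would receive.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical descriptor file: where to look and what to pull out."""
    source_url: str        # the web-accessible information source
    select_pattern: str    # "select data": regex with one capture group
    message_template: str  # text that would be handed to a synthesizer


# Hypothetical descriptor store, keyed by recognized command.
DESCRIPTORS: Dict[str, Descriptor] = {
    "weather": Descriptor(
        source_url="https://example.com/weather",
        select_pattern=r'<span id="temp">(.*?)</span>',
        message_template="The current temperature is {value}.",
    ),
}


def respond(command: str, fetch: Callable[[str], str]) -> str:
    """Map a recognized command to the text of a spoken response."""
    descriptor = DESCRIPTORS.get(command)
    if descriptor is None:
        return "Sorry, I did not understand that command."
    page = fetch(descriptor.source_url)
    match = re.search(descriptor.select_pattern, page)
    if match is None:
        return "Sorry, that information is not available right now."
    return descriptor.message_template.format(value=match.group(1))


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP GET so the sketch runs offline."""
    return '<html><span id="temp">72 degrees</span></html>'


print(respond("weather", fake_fetch))  # The current temperature is 72 degrees.
```

Passing the fetch function in as a parameter keeps the lookup logic testable without network access; a real system would supply a function that performs the request.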

26 Claims

1. A method of operating an extended-function computer system by selectively retrieving information in response to spoken commands received by the extended-function computer system, the method comprising:
- (a) identifying, as one of a plurality of data characterizing speech commands of a speech-recognition lexicon, audio data indicative of words naturally spoken into a microphone of an electronic-communication device of a user;
  
  (b) using identified data characterizing the speech commands to access a corresponding descriptor file from a plurality of descriptor files, wherein each of the descriptor files identify (i) a web-accessible information source, and (ii) select data of the web-accessible information source;
  
  (c) fetching, from the web-accessible information source identified by an accessed descriptor file, responsive data specified by select data identified by the accessed descriptor file;
  
  (d) generating audio response data containing indicia of a message for the user, which message is responsive to the identified data characterizing the speech commands, and which message is based on the responsive data;
  
  (e) directing the audio response data to the electronic-communication device of the user; and
  
  (f) improving functionality of a voice-responsive system to allow selective retrieval of different kinds of information in response to commands spoken via the electronic-communication device of a user in communication with the voice-responsive system, further comprising;
  
  storing, in a storage device accessible by the voice-responsive system, a first speech recognition grammar that is associated with a first function, and a second speech recognition grammar, different from the first speech recognition grammar, that is associated with a second function, different from the first function; and
  
  storing, in the storage device, for each of the first function and the second function, respective function definitions, different from one another, each configured to be executed by a web browsing server of the voice-responsive system upon recognizing that a command, spoken by the user of an electronic-communication device, corresponds to the respective speech recognition grammar;
  
  wherein each function definition identifies;
  
  (i) a URL of an information source;
  
  (ii) select responsive information to be retrieved from the information source; and
  
  (iii) a responsive message, in a format required by the voice-responsive system so that the voice-responsive system can synthesize an audio response message to be played on a speaker of the electronic-communication device of the user.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19):
  - 2. The method of claim 1 further comprising:
    - automatically identifying, within the audio data indicative of words naturally spoken into a microphone of an electronic-communication device of a user, a parameter; and
      
      wherein part (c) comprises using the parameter and the accessed descriptor file to identify the responsive data.
  - 3. The method of claim 2 wherein the parameter is indicative of words naturally spoken into the microphone in response to an automatically generated follow-up question to the user seeking a limitation on a speech command identified in part (a).
  - 4. The method of claim 1 further comprising using Internet Protocol to communicate with the electronic-communication device of the user.
  - 5. The method of claim 1 further comprising using a telecommunication network to communicate with the electronic-communication device of the user.
  - 6. The method of claim 1 wherein the electronic-communication device of the user is a voice-enabled wireless unit that is not a telephone.
  - 7. The method of claim 1 wherein the web-accessible information source identified by the accessed descriptor file is a web page, specified by a URL of the web page, and the select data identified by the accessed descriptor file is specified by a location on the web page.
  - 8. The method of claim 1 wherein part (c) comprises fetching the responsive data from a database stored on at least one of a Local Area Network (LAN) and a Wide Area Network (WAN) specified by the corresponding descriptor file.
  - 9. The method of claim 1, further comprising:
    - storing, in the storage device, a third speech recognition grammar, which is also associated with the first function.
  - 10. The method of claim 1 wherein the first speech recognition grammar includes a text representation having at least one optional word and at least one required word.
  - 11. The method of claim 10 wherein the at least one optional word includes a set of alternative words.
  - 12. The method of claim 10 wherein the at least one required word includes a set of alternative words, each word of the set of alternative words corresponding to a different alternative within a category.
  - 13. The method of claim 12 wherein the category is a set of names of cities.
  - 14. The method of claim 12 wherein the first function definition contains instructions for generating the URL in a form that depends on alternative required words of the first speech recognition grammar.
  - 15. The method of claim 12 wherein a first function definition contains instructions for generating the URL as a numeric parameter that depends on alternative required words of the first speech recognition grammar.
  - 16. The method of claim 1 further comprising:
    - storing, in a storage device of the information source identified by a first function definition, data defining an action performed by equipment associated with that information source upon receipt of a message denoting that a command, spoken by the user of an electronic-communication device, has been recognized as corresponding to the first speech recognition grammar.
  - 17. The method of claim 16 wherein the action comprises at least one of activating and deactivating a physical device.
  - 18. The method of claim 16 wherein the action comprises adjusting a physical device.
  - 19. The method of claim 16 wherein the action comprises reporting a status of a physical device.
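
Claims 10 to 15 describe a grammar with optional words, required words drawn from a category such as city names, and a URL whose form depends on which alternative was spoken. A minimal Python sketch of that idea follows. The slot-list grammar representation, the city codes, and the `example.com` URL are all hypothetical; the claim text does not specify a concrete grammar syntax, and a real recognizer would work on audio rather than on a text transcript.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Slot:
    """One grammar position: alternative words, either optional or required."""
    alternatives: Tuple[str, ...]
    required: bool
    name: Optional[str] = None  # set when the matched word is a parameter


# Hypothetical grammar: [what is | tell me] [the] weather [in] <city>
WEATHER_GRAMMAR: List[Slot] = [
    Slot(("what is", "tell me"), required=False),
    Slot(("the",), required=False),
    Slot(("weather",), required=True),
    Slot(("in",), required=False),
    Slot(("boston", "chicago", "denver"), required=True, name="city"),
]

# Hypothetical numeric codes used when generating the URL.
CITY_CODES: Dict[str, int] = {"boston": 101, "chicago": 102, "denver": 103}


def match(grammar: List[Slot], utterance: str) -> Optional[Dict[str, str]]:
    """Return captured parameters if the utterance fits the grammar, else None."""
    rest = utterance.lower().strip()
    params: Dict[str, str] = {}
    for slot in grammar:
        # Take the first alternative that is a whole-word prefix of the rest.
        hit = next(
            (alt for alt in slot.alternatives
             if rest == alt or rest.startswith(alt + " ")),
            None,
        )
        if hit is None:
            if slot.required:
                return None
            continue  # optional slot was simply not spoken
        if slot.name:
            params[slot.name] = hit
        rest = rest[len(hit):].strip()
    # Leftover words mean the utterance is not covered by this grammar.
    return params if not rest else None


def build_url(params: Dict[str, str]) -> str:
    """Generate a URL whose numeric parameter depends on the spoken city."""
    return f"https://example.com/weather?city={CITY_CODES[params['city']]}"


params = match(WEATHER_GRAMMAR, "What is the weather in Chicago")
print(params)             # {'city': 'chicago'}
print(build_url(params))  # https://example.com/weather?city=102
```

The matcher is greedy and left to right, which is enough for this illustration but would not handle grammars where an optional word is also a valid start of a later slot.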

20. An apparatus having an extended capability of selectively retrieving information in response to naturally spoken commands, the apparatus comprising:
- (a) a transceiver coupled to a network and capable of sending to and receiving information via the network from an electronic-communication device of a user, which device has a microphone;
  
  (b) a database containing a plurality of descriptor files, each of the descriptor files identifying (i) a web-accessible information source, and (ii) select data of the web-accessible information source;
  
  (c) a speech-recognition engine, coupled to the transceiver and having access to the database, programmed to automatically identify, as one of a plurality of speech commands of a speech-recognition lexicon, audio data indicative of words spoken into the microphone of the electronic-communication device of a user;
  
  (d) a media server, coupled to the speech-recognition engine and having access to the database, programmed to access a descriptor file from the plurality of descriptor files in the database based on the identified speech command;
  
  (e) a content fetcher, coupled to the media server, programmed to retrieve, from the web-accessible information source identified by the accessed descriptor file, responsive data specified by the select data identified by the accessed descriptor file, further comprising;
  
  means for improving functionality of a voice-responsive system to allow selective retrieval of different kinds of information in response to commands spoken via the electronic-communication device in communication with the voice-responsive system, further comprising;
  
  a first speech recognition grammar stored in a storage device accessible by the voice-responsive system, that is associated with a first function, and a second speech recognition grammar, different from the first speech recognition grammar, that is associated with a second function, different from the first function; and
  
  stored respective function definitions in the storage device, for each of the first function and the second function, different from one another, each configured to be executed by a web browsing server of the voice-responsive system upon recognizing that a command, spoken by the user of an electronic-communication device, corresponds to the respective speech recognition grammar, wherein each function definition identifies a URL of an information source and a select responsive information to be retrieved from the information source and a responsive message, in a format required by the voice-responsive system; and
  
  (f) a synthesizer coupled to the content fetcher and programmed to automatically generate audio response data containing indicia of a message for the user, which message is responsive to the identified speech command, and which message is based on the responsive data; and
  
  (g) wherein the apparatus is programmed to automatically direct the audio response data to the electronic-communication device of the user.
- Dependent claims (21, 22):
  - 21. The apparatus of claim 20 further comprising a content extractor, coupled to the media server and the content fetcher, programmed to use the accessed content descriptor file to format a request for the content fetcher.
  - 22. The apparatus of claim 20 wherein the speech-recognition engine and the synthesizer are within the media server.
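
Claim 21 adds a content extractor that uses the accessed descriptor file to format a request for the content fetcher. One way to picture that hand-off is the Python sketch below. The `ContentDescriptor` fields, the `FetchRequest` shape, and the `example.com` URL are assumptions made for illustration; the claims do not define a request format.

```python
from dataclasses import dataclass
from typing import Dict
from urllib.parse import urlencode


@dataclass(frozen=True)
class ContentDescriptor:
    """Hypothetical content descriptor file."""
    base_url: str
    # Maps a URL query field to the name of a recognized spoken parameter.
    query_fields: Dict[str, str]


@dataclass(frozen=True)
class FetchRequest:
    """What a content fetcher would need in order to retrieve a page."""
    method: str
    url: str


class ContentExtractor:
    """Formats a fetch request from a descriptor and recognized parameters."""

    def format_request(
        self, descriptor: ContentDescriptor, params: Dict[str, str]
    ) -> FetchRequest:
        # A missing spoken parameter raises KeyError; a fuller system might
        # instead ask the user a follow-up question.
        query = {
            field: params[source]
            for field, source in descriptor.query_fields.items()
        }
        url = descriptor.base_url
        if query:
            url = f"{url}?{urlencode(query)}"
        return FetchRequest(method="GET", url=url)


descriptor = ContentDescriptor(
    base_url="https://example.com/quote",
    query_fields={"symbol": "ticker"},
)
request = ContentExtractor().format_request(descriptor, {"ticker": "ACME"})
print(request.url)  # https://example.com/quote?symbol=ACME
```

Keeping request formatting separate from fetching mirrors the claim's split between the content extractor and the content fetcher, so either part could be replaced independently.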

23. An electronic-communication device having a capability of selectively retrieving information in response to naturally spoken commands, comprising:
- (i) a microphone;
  
  (ii) wherein the electronic-communication device is in communication with a remote computer system via a network;
  
  (iii) wherein the remote computer system comprises;
  
  (a) a transceiver coupled to the network and capable of sending to and receiving information via the network from the electronic-communication device;
  
  (b) a database containing a plurality of descriptor files, each of the descriptor files identifying (i) a web-accessible information source, and (ii) select data of the web-accessible information source;
  
  (c) a speech-recognition engine, coupled to the transceiver and having access to the database, programmed to automatically identify, as one of a plurality of speech commands of a speech-recognition lexicon, audio data indicative of words spoken into the microphone;
  
  (d) a media server, coupled to the speech-recognition engine and having access to the database, programmed to access a descriptor file from the plurality of descriptor files in the database based on the identified speech command;
  
  (e) a content fetcher, coupled to the media server, programmed to retrieve, from the web-accessible information source identified by the accessed descriptor file, responsive data specified by the select data identified by the accessed descriptor file, further comprising;
  
  means for improving functionality of a voice-responsive system to allow selective retrieval of different kinds of information in response to commands spoken via the electronic-communication device in communication with the voice-responsive system, further comprising;
  
  a first speech recognition grammar stored in a storage device accessible by the voice-responsive system, that is associated with a first function, and a second speech recognition grammar, different from the first speech recognition grammar, that is associated with a second function, different from the first function; and
  
  stored respective function definitions in the storage device, for each of the first function and the second function, different from one another, each configured to be executed by a web browsing server of the voice-responsive system upon recognizing that a command, spoken by the user of an electronic-communication device, corresponds to the respective speech recognition grammar, wherein each function definition identifies a URL of an information source and a select responsive information to be retrieved from the information source and a responsive message, in a format required by the voice-responsive system;
  
  (f) a synthesizer coupled to the content extraction agent and programmed to automatically generate audio response data containing indicia of a message for the user, which message is responsive to the identified speech command, and which message is based on the responsive data; and
  
  (g) wherein the remote computer system is programmed to automatically direct the audio response data to the electronic-communication device; and
  
  (iv) a speaker adapted to convert the audio response data to an audible sound.
- Dependent claims (24, 25, 26):
  - 24. The electronic-communication device of claim 23 wherein the network is Internet.
  - 25. The electronic-communication device of claim 23 wherein the network is a telecommunication network.
  - 26. The electronic-communication device of claim 23 wherein the electronic-communication device is a voice-enabled wireless unit that is not a telephone.

Current Assignee: Parus Holdings, Inc.
Original Assignee: Parus Holdings, Inc.
Inventors: Kurganov, Alexander; Zhukoff, Valery
Primary Examiner(s): McFadden, Susan I

Application Number: US15/436,377
Time in Patent Office: 1,159 Days
Field of Search: 704/275

CPC Class Codes

G06F 3/16   Sound input; Sound output s...

G06F 3/167   Audio in a user interface, ...

G10L 15/187   Phonemic context, e.g. pron...

G10L 15/193   Formal grammars, e.g. finit...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 2015/223   Execution procedure of a sp...

G10L 2015/228   of application context

H04L 67/02   based on web technology, e....

H04M 2201/39   using speech synthesis

H04M 2201/40   using speech recognition

H04M 3/4938   comprising a voice browser ...

Y10S 707/99942   Manipulating data structure...

Y10S 707/99943   Generating database or data...
