SYSTEM AND METHOD OF PROVIDING SPEECH PROCESSING IN USER INTERFACE

US 20160049151A1
Filed: 10/30/2015
Published: 02/18/2016
Est. Priority Date: 01/22/2008
Status: Active Grant

Abstract

Disclosed are systems, methods and computer-readable media for enabling speech processing in a user interface of a device. The method includes: receiving an indication of a field in a user interface of a device, the indication also signaling that speech will follow; receiving the speech from the user at the device, the speech being associated with the field; transmitting the speech as a request to a public, common network node that receives and processes speech; processing the transmitted speech and returning text associated with the speech to the device; and inserting the text into the field. Upon a second indication from the user, the system processes the text in the field as programmed by the user interface. The present disclosure provides a speech mashup application for a user interface of a mobile or desktop device that does not require expensive speech processing technologies.
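
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `SpeechRequest`, the parameter names `appId`, `lat`, `lon` and `grammar`, and the stand-in `fake_recognizer` are all hypothetical, chosen only to show how a device might bundle an application identifier, its current location and a grammar parameter with the audio, and then insert the returned text into the selected field.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple
from urllib.parse import urlencode


@dataclass
class SpeechRequest:
    """Parameters sent alongside the audio. All field names are hypothetical."""
    app_id: str                    # identifies which speech recognizer to use
    location: Tuple[float, float]  # current (latitude, longitude) of the device
    grammar: str                   # grammar hint, e.g. tied to the speaker's home location

    def to_query(self) -> str:
        """Encode the request as a URL query string (one possible web-based encoding)."""
        lat, lon = self.location
        return urlencode(
            {"appId": self.app_id, "lat": lat, "lon": lon, "grammar": self.grammar}
        )


def fill_field_from_speech(
    form: Dict[str, str],
    field_name: str,
    audio: bytes,
    request: SpeechRequest,
    recognize: Callable[[bytes, str], str],
) -> str:
    """Send audio plus request to a recognizer and insert the returned text into the field."""
    transcription = recognize(audio, request.to_query())
    form[field_name] = transcription
    return transcription


def fake_recognizer(audio: bytes, query: str) -> str:
    """Stand-in for the network node; a real one would decode the audio."""
    return "pizza near me"


if __name__ == "__main__":
    form = {"search": ""}
    req = SpeechRequest("demo-asr", (40.7, -74.0), "nj-local")
    fill_field_from_speech(form, "search", b"\x00\x01", req, fake_recognizer)
    print(form)
```

Passing the recognizer in as a callable keeps the sketch independent of any particular transport; in practice it would wrap an HTTP call to the network node.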

20 Claims

1. A method comprising:
- receiving, via touch provided on a touch screen of a device, an indication associated with a specific field displayed in a user interface on the touch screen, the indication signaling that speech, which is associated with the specific field, will follow;
  
  receiving the speech via the device;
  
  generating speech data based on the speech;
  
  generating a request for speech recognition, wherein the request comprises:
  
  (1) an application identifier identifying a speech recognizer;
  
  (2) a current location of the device; and
  
  (3) a grammar parameter associated with a home location of a speaker of the speech;
  
  transmitting the speech data and the request to a network node for speech recognition using the speech recognizer;
  
  receiving, at the device, a transcription of the speech from the speech recognizer; and
  
  inserting the transcription into the specific field.
  - 2. The method of claim 1, further comprising, upon a second indication from a user, processing the transcription in the specific field.
  - 3. The method of claim 2, wherein the processing of the transcription comprises initiation of a search using the transcription.
  - 4. The method of claim 3, wherein the search is conducted using a search engine.
  - 5. The method of claim 1, further comprising, after receiving the speech, receiving a second touch indication from a user that the speech intended for the specific field has ceased.
  - 6. The method of claim 2, wherein processing the text in the specific field is performed as though the user typed the text in the specific field.
  - 7. The method of claim 1, wherein transmitting the speech data and the request to the network node is performed using one of a representational state transfer protocol, a simple object access protocol, and a web-based protocol.
  - 8. The method of claim 1, wherein the application identifies an application which converts the speech data to the transcription, wherein the application is executed on the network node.
  - 10. The method of claim 1, further comprising presenting an action button associated with the transcription in the specific field only when a confidence level from the speech recognizer is below a threshold.
  - 11. The method of claim 1, wherein when the speech recognizer returns multiple possible interpretations of the speech data, inserting each possible interpretation into a separate text field with an indication instructing a user to select which text field to process.

9. The method of claim 9, wherein the grammar parameter controls a compilation of a plurality of grammars.

12. A system comprising:
- a processor; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  receiving, via touch provided on a touch screen of a device, an indication associated with a specific field displayed in a user interface on the touch screen, the indication signaling that speech, which is associated with the specific field, will follow;
  
  receiving the speech via the device;
  
  generating speech data based on the speech;
  
  generating a request for speech recognition, wherein the request comprises:
  
  (1) an application identifier identifying a speech recognizer;
  
  (2) a current location of the device; and
  
  (3) a grammar parameter associated with a home location of a speaker of the speech;
  
  transmitting the speech data and the request to a network node for speech recognition using the speech recognizer;
  
  receiving, at the device, a transcription of the speech from the speech recognizer; and
  
  inserting the transcription into the specific field.
  - 13. The system of claim 12, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising, upon a second indication from a user, processing the transcription in the specific field.
  - 14. The system of claim 13, wherein the processing of the transcription comprises initiation of a search using the transcription.
  - 15. The system of claim 14, wherein the search is conducted using a search engine.
  - 16. The system of claim 12, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising, after receiving the speech, receiving a second touch indication from a user that the speech intended for the specific field has ceased.
  - 17. The system of claim 13, wherein processing the text in the specific field is performed as though the user typed the text in the specific field.
  - 18. The system of claim 12, wherein transmitting the speech data and the request to the network node is performed using one of a representational state transfer protocol, a simple object access protocol, and a web-based protocol.
  - 19. The system of claim 12, wherein the application identifies an application which converts the speech data to the transcription, wherein the application is executed on the network node.

20. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- receiving, via touch provided on a touch screen of a device, an indication associated with a specific field displayed in a user interface on the touch screen, the indication signaling that speech, which is associated with the specific field, will follow;
  
  receiving the speech via the device;
  
  generating speech data based on the speech;
  
  generating a request for speech recognition, wherein the request comprises:
  
  (1) an application identifier identifying a speech recognizer;
  
  (2) a current location of the device; and
  
  (3) a grammar parameter associated with a home location of a speaker of the speech;
  
  transmitting the speech data and the request to a network node for speech recognition using the speech recognizer;
  
  receiving, at the device, a transcription of the speech from the speech recognizer; and
  
  inserting the transcription into the specific field.
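
Claims 10 and 11 describe what the interface does with uncertain or ambiguous recognizer output. The following Python sketch is one possible reading, not the claimed implementation: the `Hypothesis` type, the `render_result` function, the 0.0 to 1.0 confidence scale, the default threshold of 0.6 and the choice to suppress the action button when several interpretations are shown are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hypothesis:
    """One possible interpretation returned by a recognizer (hypothetical type)."""
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def render_result(
    hypotheses: List[Hypothesis], field_name: str, threshold: float = 0.6
) -> Dict[str, object]:
    """Decide which fields, button and prompt the UI shows for a recognition result."""
    prompt: Optional[str] = None
    if not hypotheses:
        return {"fields": {}, "show_action_button": False, "prompt": prompt}

    if len(hypotheses) == 1:
        best = hypotheses[0]
        # Claim 10 reading: show the action button only when confidence is below the threshold.
        return {
            "fields": {field_name: best.text},
            "show_action_button": best.confidence < threshold,
            "prompt": prompt,
        }

    # Claim 11 reading: one separate text field per interpretation, plus a prompt
    # asking the user to pick one. Hiding the button here is an assumption.
    fields = {f"{field_name}_{i}": h.text for i, h in enumerate(hypotheses, start=1)}
    return {
        "fields": fields,
        "show_action_button": False,
        "prompt": "Select which field to process",
    }


if __name__ == "__main__":
    print(render_result([Hypothesis("pizza", 0.3)], "search"))
    print(render_result([Hypothesis("pizza", 0.5), Hypothesis("pisa", 0.4)], "search"))
```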

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors
WILPON, Jay; STERN, Benjamin J.; DI FABBRIZIO, Giuseppe

Granted Patent

US 9,530,415 B2
US Class Current

1/1
CPC Class Codes

G06F 3/0416   Control or interface arrang...

G06F 3/162   Interface to dedicated audi...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....
