Service oriented speech recognition for in-vehicle automated interaction and in-vehicle user interfaces requiring minimal cognitive driver processing for same

US 9,558,745 B2
Filed: 11/13/2015
Issued: 01/31/2017
Est. Priority Date: 03/24/2009
Status: Active Grant

1 Assignment

0 Petitions

Abstract

A system and method for implementing a server-based speech recognition system for multimodal automated interaction in a vehicle includes receiving, by a vehicle driver, audio prompts by an on-board human-to-machine interface and a response with speech to complete tasks such as creating and sending text messages, web browsing, navigation, etc. This service-oriented architecture is utilized to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound.

38 Citations

20 Claims

1. A method for implementing an interactive automated system, comprising:
- using a telematics processing system located in proximity to a person;
- receiving in a first interaction an indication of intent from the person;
- thereafter, in a second interaction that is separate from the first interaction, receiving from the person a spoken utterance associated with the indicated intent, wherein the spoken utterance associated with the indicated intent matches intended text of a type of intent available to the person; and
- processing the spoken utterance of the second interaction using the processing system;
- transmitting the processed speech information to a remote data center using a wireless link;
- analyzing the transmitted speech information;
- based upon the indicated intent, selecting at least one optimal speech recognition engine from a set of speech recognition engines;
- converting the analyzed speech information into packet data format to produce packet speech information;
- using an internet-protocol transport network, transporting the packet speech information to the selected at least one optimal speech recognition engine and recognizing the converted speech information with the selected at least one optimal speech recognition engine;
- retrieving recognition results and an associated confidence score from the selected at least one optimal speech recognition engine;
- if the confidence score meets or exceeds a predetermined threshold for a best match, processing the recognition results to:
  - perform a search;
  - generate search results;
  - transport the search results to the processing system; and
  - present the search results to the person; and
- if the confidence score is below the predetermined threshold, selecting at least one alternative optimal speech recognition engine to carry out recognition of the converted speech information.

2. The method according to claim 1, wherein the processing system located in proximity to the person is a telematics processing system.

3. The method according to claim 1, wherein the generated search results are in the form of a list of the search results.

4. The method according to claim 3, wherein the list of search results is transported to the processing system and presented to the person.

5. The method according to claim 1, wherein the at least one alternative optimal speech recognition engine is agent-assisted.

6. The method according to claim 1, wherein the selected at least one optimal speech recognition engine is not local.

7. The method according to claim 1, wherein the presentation of the search results is continued with the person prior to, or subsequent to, receiving the recognition results in an asynchronous manner.

8. The method according to claim 1, wherein the presentation of the search results is continued with the person subsequent to receiving the recognition results in a synchronous manner.

9. The method according to claim 1, further comprising logging packet data of the packet speech information and the recognition results for subsequent analysis.

10. The method according to claim 1, wherein the processing system is located on-board a vehicle.

11. The method according to claim 10, further comprising transporting vehicle location information along with the packet speech information to the selected at least one optimal speech recognition engine.

12. The method according to claim 11, further comprising logging the vehicle location information for subsequent analysis.

13. The method according to claim 1, wherein the indicated intent pertains to at least one of:
- internet browsing; and
- navigational information.
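
The engine selection and fallback steps recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Engine` type, the `recognize_with_fallback` function, the intent keys, the stub engines, and the 0.8 threshold are all hypothetical, and real engines would be remote services reached over an IP network rather than local callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical recognizer signature: audio bytes in, (text, confidence in [0, 1]) out.
Recognizer = Callable[[bytes], Tuple[str, float]]


@dataclass
class Engine:
    name: str
    recognize: Recognizer


def recognize_with_fallback(
    audio: bytes,
    intent: str,
    engines_by_intent: Dict[str, List[Engine]],
    threshold: float = 0.8,  # assumed value; the claim only says "predetermined"
) -> Optional[Tuple[str, str, float]]:
    """Try the engines registered for the intent in order of preference.

    Returns (engine name, text, confidence) for the first result whose
    confidence meets or exceeds the threshold, or None if none qualifies.
    """
    for engine in engines_by_intent.get(intent, []):
        text, confidence = engine.recognize(audio)
        if confidence >= threshold:
            return engine.name, text, confidence
        # Below threshold: fall through to the next (alternative) engine.
    return None


def _stub(text: str, confidence: float) -> Recognizer:
    """Build a fake recognizer that always returns the same result."""
    return lambda audio: (text, confidence)


# Hypothetical registry: a grammar-based engine first, an agent-assisted one as fallback.
ENGINES = {
    "navigation": [
        Engine("address-grammar", _stub("123 main street", 0.55)),
        Engine("agent-assisted", _stub("123 Main Street, Dallas", 0.97)),
    ],
}

result = recognize_with_fallback(b"\x00\x01", "navigation", ENGINES)
print(result)
```

In this sketch the first engine scores below the threshold, so the result comes from the second one. Keeping the per-intent engine list ordered is one simple way to express "optimal" and "alternative" engines; other selection policies are equally compatible with the claim language.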

14. A method for implementing an interactive automated system, comprising:
- using a telematics processing system located on-board a vehicle;
- receiving in a first interaction an indication of intent from a driver of the vehicle;
- thereafter, in a second interaction that is separate from the first interaction, receiving from the driver a spoken utterance associated with the indicated intent, wherein the spoken utterance associated with the indicated intent matches intended text of a type of intent available to the driver; and
- processing the spoken utterance of the second interaction using the processing system;
- transmitting the processed speech information to a remote data center using a wireless link;
- analyzing the transmitted speech information;
- based upon the indicated intent, selecting at least one optimal speech recognition engine from a set of speech recognition engines;
- converting the analyzed speech information into packet data format to produce packet speech information;
- using an internet-protocol transport network, transporting the packet speech information and vehicle location information to the selected at least one optimal speech recognition engine and recognizing the converted speech information with the selected at least one optimal speech recognition engine;
- retrieving recognition results and an associated confidence score from the selected at least one optimal speech recognition engine;
- if the confidence score meets or exceeds a predetermined threshold for a best match, processing the recognition results to:
  - perform a search;
  - generate search results;
  - transport the search results to the processing system; and
  - present the search results to the vehicle driver; and
- if the confidence score is below the predetermined threshold, selecting at least one alternative optimal speech recognition engine to carry out recognition of the converted speech information.

15. The method according to claim 14, wherein the at least one alternative optimal speech recognition engine is agent-assisted.

16. The method according to claim 14, wherein the selected at least one optimal speech recognition engine is not local.

17. The method according to claim 14, wherein the presentation of the search results is continued with the vehicle driver prior to, or subsequent to, receiving the recognition results in an asynchronous manner.

18. The method according to claim 14, wherein the presentation of the search results is continued with the vehicle driver subsequent to receiving the recognition results in a synchronous manner.

19. The method according to claim 14, wherein the indicated intent pertains to at least one of:
- internet browsing; and
- navigational information.
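
Claim 14 transports packet speech information together with vehicle location information. One way to picture that step is the sketch below. The claims do not specify a packet layout, so everything here is an assumption made for illustration: the `SpeechPacket` fields, the `packetize` and `reassemble` helpers, the 320-byte chunk size, and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class SpeechPacket:
    session_id: str
    seq: int                                  # position of this chunk in the utterance
    total: int                                # number of packets in the utterance
    intent: str                               # intent indicated in the first interaction
    location: Optional[Tuple[float, float]]   # (latitude, longitude), if available
    payload: bytes                            # slice of the processed speech data


def packetize(
    audio: bytes,
    session_id: str,
    intent: str,
    location: Optional[Tuple[float, float]] = None,
    chunk_size: int = 320,  # assumed size, chosen only for the example
) -> List[SpeechPacket]:
    """Split processed speech into fixed-size packets tagged with intent and location."""
    chunks = [audio[i:i + chunk_size] for i in range(0, len(audio), chunk_size)] or [b""]
    return [
        SpeechPacket(session_id, seq, len(chunks), intent, location, chunk)
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(packets: Iterable[SpeechPacket]) -> bytes:
    """Rebuild the utterance on the receiving side, tolerating out-of-order arrival."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


audio = bytes(range(256)) * 4  # 1024 bytes of placeholder "speech" data
packets = packetize(audio, "session-1", "navigation", location=(32.78, -96.80))
print(len(packets), reassemble(reversed(packets)) == audio)
```

Carrying the location on every packet is wasteful but keeps each packet self-describing; sending it once in a session header would be an equally reasonable design.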

20. An interactive automated speech recognition system, comprising:
- a telematics processing system located in proximity to a person;
- a remote data center;
- a wireless link that transmits processed speech information from the processing system to the remote data center, wherein the processing system:
  - receives in a first interaction an indication of intent from the person;
  - thereafter, in a second interaction that is separate from the first interaction, receives from the person a spoken utterance associated with the indicated intent, wherein the spoken utterance associated with the indicated intent matches intended text of a type of intent available to the person; and
  - processes the spoken utterance of the second interaction and transmits the processed speech information to the remote data center using the wireless link, wherein the remote data center analyzes the transmitted processed speech information and converts the analyzed speech information into packet data format;
- at least one optimal speech recognition engine selected to recognize the converted speech information, the at least one optimal speech recognition engine being selected from a set of speech recognition engines based upon the indicated intent;
- an internet protocol transport network that transports the converted speech information to the selected at least one optimal speech recognition engine; and
- wherein the selected at least one optimal speech recognition engine produces recognition results and an associated confidence score, whereby:
  - if the confidence score meets or exceeds a predetermined threshold for a best match, the recognition results are processed to:
    - perform a search;
    - generate search results;
    - transport the search results to the processing system; and
    - present the search results to the person; and
  - if the confidence score is below the predetermined threshold, at least one alternative optimal speech recognition engine is selected to carry out recognition of the converted speech information.
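
The independent claims all separate the exchange into a first interaction (an indication of intent) and a later, distinct second interaction (the spoken utterance). A small state machine is one way to illustrate that ordering on the processing system side. This is only a sketch: the `InteractionSession` class, its method names, and the intent labels are hypothetical and do not come from the patent.

```python
from enum import Enum, auto
from typing import Dict, Optional, Union


class State(Enum):
    AWAITING_INTENT = auto()
    AWAITING_UTTERANCE = auto()
    DONE = auto()


class InteractionSession:
    # Assumed intent labels, loosely based on the tasks named in the abstract and claims.
    SUPPORTED_INTENTS = {"navigation", "browsing", "text_message"}

    def __init__(self) -> None:
        self.state = State.AWAITING_INTENT
        self.intent: Optional[str] = None

    def indicate_intent(self, intent: str) -> None:
        """First interaction: record what the person wants to do."""
        if self.state is not State.AWAITING_INTENT:
            raise RuntimeError("intent has already been indicated")
        if intent not in self.SUPPORTED_INTENTS:
            raise ValueError(f"unsupported intent: {intent}")
        self.intent = intent
        self.state = State.AWAITING_UTTERANCE

    def receive_utterance(self, audio: bytes) -> Dict[str, Union[str, bytes]]:
        """Second interaction: accept speech only after an intent is known."""
        if self.state is not State.AWAITING_UTTERANCE:
            raise RuntimeError("an intent must be indicated before speaking")
        self.state = State.DONE
        # The returned request is what would be sent on to the remote data center.
        return {"intent": self.intent, "audio": audio}


session = InteractionSession()
session.indicate_intent("navigation")
request = session.receive_utterance(b"\x01\x02\x03")
print(request["intent"], session.state.name)
```

Because the intent is fixed before any speech arrives, the downstream engine selection can be keyed on it without first having to classify the utterance.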

Current Assignee: Sirius XM Connected Vehicle Services Inc. (Liberty Media Corporation)
Original Assignee: Sirius XM Connected Vehicle Services Inc. (Liberty Media Corporation)
Inventors: Schalk, Thomas Barton; Saenz, Leonel; Burch, Barry
Primary Examiner(s): ADESANYA, OLUJIMI A
Application Number: US14/940,525
Publication Number: US 20160071518A1
Time in Patent Office: 445 Days
Field of Search: 704/235, 704/270-275
US Class Current: 1/1
CPC Class Codes:
- G06F 16/9537   Spatial or temporal depende...
- G06F 3/167   Audio in a user interface, ...
- G10L 15/08   Speech classification or se...
- G10L 15/22   Procedures used during a sp...
- G10L 15/30   Distributed recognition, e....
