System and method for speech personalization by need

US 9,837,071 B2
Filed: 04/06/2015
Issued: 12/05/2017
Est. Priority Date: 06/09/2009
Status: Active Grant

Abstract

Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions. The method can further store a speaker personalization profile having information for the modified set of allocated resources and recognize speech associated with the speaker based on the speaker personalization profile.
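
As a rough illustration of the loop the abstract describes (recognize, record metrics, modify resources, recognize again), the following Python sketch scales a set of allocated resources by a difficulty score derived from recorded metrics. The class names, the units, the scoring weights, and the 0.5x to 2x scaling range are all assumptions made for illustration; the patent text does not specify any of them.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Allocation:
    """Resources named in the abstract; the units are hypothetical."""
    bandwidth_kbps: float
    processor_ms: float
    memory_mb: float
    storage_mb: float


@dataclass
class Metrics:
    """Counters for dialog events the abstract lists as metrics."""
    turns: int = 0
    repeat_requests: int = 0
    negative_confirmations: int = 0
    task_completions: int = 0
    confidence_sum: float = 0.0

    def record(self, confidence, asked_repeat=False, said_no=False, completed=False):
        self.turns += 1
        self.confidence_sum += confidence
        self.repeat_requests += int(asked_repeat)
        self.negative_confirmations += int(said_no)
        self.task_completions += int(completed)


def difficulty(m: Metrics) -> float:
    """Return a score in [0, 1]; higher means the speaker is struggling.

    The weights below are illustrative assumptions, not patent values.
    """
    if m.turns == 0:
        return 0.5  # no evidence yet, so stay neutral
    trouble = min(1.0, (m.repeat_requests + m.negative_confirmations) / m.turns)
    low_confidence = 1.0 - m.confidence_sum / m.turns
    success = m.task_completions / m.turns
    score = 0.6 * trouble + 0.4 * low_confidence - 0.2 * success
    return max(0.0, min(1.0, score))


def modify(alloc: Allocation, m: Metrics) -> Allocation:
    """Scale every resource by a factor between 0.5x and 2x."""
    factor = 2.0 ** (2.0 * difficulty(m) - 1.0)  # score 0.5 maps to 1.0
    return Allocation(*(getattr(alloc, f.name) * factor for f in fields(alloc)))
```

A speaker who keeps asking for repeats ends up with a larger allocation, while a speaker who completes tasks with high confidence ends up with a smaller one, matching the abstract's "commensurate with the recorded metrics".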

20 Claims

1. A method comprising:
- recognizing speech of each speaker of a plurality of speakers on a conference call, to yield recognized speech for each of the plurality of speakers, wherein the speech of each speaker from the plurality of speakers is received via a speech interface implemented on a computing device;
  
  recording metrics associated with the recognized speech for each of the plurality of speakers, wherein the metrics comprise a request for repetition, a negative response to confirmation, and a task completion;
  
  after recording the metrics, while recording further speech from the each speaker from the plurality of speakers, modifying, via a processor, an allocation of resources of the speech interface based on the metrics, to yield a modified speech interface; and
  
  recognizing additional speech during the conference call from an identified speaker in the plurality of speakers using the modified speech interface.
  - 2. The method of claim 1, wherein the speech interface utilizes at least one of bandwidth and processor time to allocate resources for the recognizing of the speech.
  - 3. The method of claim 1, wherein the metrics further comprise a speech recognition confidence score, a processing speed, and a dialog behavior.
  - 4. The method of claim 1, wherein the identified speaker was determined to be frustrated and have great difficulty in a prior session.
  - 5. The method of claim 1, further comprising storing a speaker personalization profile having information for the modified speech interface.
  - 6. The method of claim 5, further comprising recognizing speech associated with the identified speaker based on the speaker personalization profile.
  - 7. The method of claim 5, further comprising storing the speaker personalization profile on a personalization server storing multiple speaker personalization profiles.
  - 8. The method of claim 5, wherein multiple speakers are associated with the speaker personalization profile.
  - 9. The method of claim 1, wherein the modified speech interface is associated with a class of similar speakers.
  - 10. The method of claim 1, wherein modifying the allocation of resources is based on a difficulty threshold associated with how well the speaker interacts with the speech interface.
  - 11. The method of claim 1, further comprising progressively applying the modified speech interface.
  - 12. The method of claim 1, wherein the resources each comprise at least one of memory and storage.
  - 13. The method of claim 1, wherein an allocation of resources in the modified speech interface is greater than its corresponding allocation in a set of allocated resources prior to the modifying.
  - 14. The method of claim 1, wherein an allocation of resources in the modified speech interface is less than its corresponding allocation in a set of allocated resources prior to the modifying.
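
Dependent claims 5 through 11 mention a stored speaker personalization profile, a difficulty threshold, and progressive application of the modified interface. A minimal sketch of how those pieces could fit together follows. The in-memory `ProfileStore`, the threshold of 0.6, the 2x boost, and the 25 percent step size are hypothetical choices, not details taken from the claims.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical personalization profile for a speaker or speaker class."""
    baseline: dict  # resource name -> default amount
    current: dict   # resource name -> amount in use now
    target: dict    # resource name -> amount to move toward


@dataclass
class ProfileStore:
    """Stand-in for a personalization server holding many profiles."""
    profiles: dict = field(default_factory=dict)  # profile id -> Profile
    members: dict = field(default_factory=dict)   # speaker id -> profile id

    def attach(self, speaker_id, profile_id, baseline):
        # Several speakers may share one profile (compare claims 8 and 9).
        if profile_id not in self.profiles:
            self.profiles[profile_id] = Profile(
                dict(baseline), dict(baseline), dict(baseline)
            )
        self.members[speaker_id] = profile_id

    def lookup(self, speaker_id):
        return self.profiles[self.members[speaker_id]]


def update_target(profile, difficulty, threshold=0.6, boost=2.0):
    """Raise the target only when difficulty reaches the threshold."""
    scale = boost if difficulty >= threshold else 1.0
    profile.target = {name: amount * scale for name, amount in profile.baseline.items()}


def apply_step(profile, step=0.25):
    """Progressively move each resource a fraction of the way to its target."""
    for name, goal in profile.target.items():
        now = profile.current[name]
        profile.current[name] = now + step * (goal - now)
```

Calling `apply_step` once per utterance would phase the change in gradually rather than switching the interface all at once, which is one possible reading of "progressively applying".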

15. A system comprising:
- a processor; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, result in the processor performing operations comprising:
  
  recognizing speech of each speaker of a plurality of speakers on a conference call, to yield recognized speech for each of the plurality of speakers, wherein the speech of each speaker from the plurality of speakers is received via a speech interface;
  
  recording metrics associated with the recognized speech for each of the plurality of speakers, wherein the metrics comprise a request for repetition, a negative response to confirmation, and a task completion;
  
  after recording the metrics, while recording further speech from the each speaker from the plurality of speakers, modifying an allocation of resources of the speech interface based on the metrics, to yield a modified speech interface; and
  
  recognizing additional speech during the conference call from an identified speaker in the plurality of speakers using the modified speech interface.
  - 16. The system of claim 15, wherein the speech interface utilizes at least one of bandwidth and processor time to allocate resources for the recognizing of the speech.
  - 17. The system of claim 15, wherein the metrics further comprise a speech recognition confidence score, a processing speed, and a dialog behavior.
  - 18. The system of claim 15, wherein the identified speaker was determined to be frustrated and have great difficulty in a prior session.
  - 19. The system of claim 15, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising storing a speaker personalization profile having information for the modified speech interface.

20. A computer-readable storage device having instructions stored which, when executed by a computing device, result in the computing device performing operations comprising:
- recognizing speech of each speaker of a plurality of speakers on a conference call, to yield recognized speech for each of the plurality of speakers, wherein the speech of each speaker from the plurality of speakers is received via a speech interface;
  
  recording metrics associated with the recognized speech for each of the plurality of speakers, wherein the metrics comprise a request for repetition, a negative response to confirmation, and a task completion;
  
  after recording the metrics, while recording further speech from the each speaker from the plurality of speakers, modifying an allocation of resources of the speech interface based on the metrics, to yield a modified speech interface; and
  
  recognizing additional speech during the conference call from an identified speaker in the plurality of speakers using the modified speech interface.
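
The independent claims apply the recognize, record, modify loop to a conference call with several speakers. One way to picture "modifying an allocation of resources" in that setting is to split a fixed budget among the speakers in proportion to how much trouble each one is having. The sketch below assumes such a proportional split plus a hypothetical per-speaker floor; the claims do not prescribe any particular allocation rule.

```python
def reallocate(budget, trouble_by_speaker, floor_share=0.1):
    """Split `budget` among speakers, weighting by recorded trouble.

    trouble_by_speaker maps a speaker id to a non-negative count, for
    example repeat requests plus negative confirmations. Each speaker
    keeps at least floor_share of an equal split, an assumed safeguard
    so that no participant is starved of recognition resources.
    """
    n = len(trouble_by_speaker)
    if n == 0:
        return {}
    floor = floor_share * budget / n
    remaining = budget - floor * n
    total = sum(trouble_by_speaker.values())
    shares = {}
    for speaker, trouble in trouble_by_speaker.items():
        # With no recorded trouble at all, fall back to an equal split.
        weight = trouble / total if total else 1.0 / n
        shares[speaker] = floor + remaining * weight
    return shares
```

For example, `reallocate(100.0, {"a": 0, "b": 1, "c": 3})` gives speaker `c` the largest share and speaker `a` only the floor.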

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Ljolje, Andrej; Conkie, Alistair D.; Syrdal, Ann K.
Primary Examiner(s): Shah, Paras D
Application Number: US14/679,508
Publication Number: US 20150213794A1
Time in Patent Office: 974 Days
Field of Search: 704/270, 704/270.1, 704/275, 704/231, 704/246, 379/88.01
CPC Class Codes:
G10L 15/07   to the speaker
G10L 15/10   using distance or distortio...
G10L 15/26   Speech to text systems G10L...
