SYSTEM AND METHOD FOR USING PROSODY FOR VOICE-ENABLED SEARCH

US 20120072217A1
Filed: 09/17/2010
Published: 03/22/2012
Est. Priority Date: 09/17/2010
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating relevant responses to a user query with voice-enabled search. A system practicing the method receives a word lattice generated by an automatic speech recognizer based on a user speech and a prosodic analysis of the user speech, generates a reweighted word lattice based on the word lattice and the prosodic analysis, approximates based on the reweighted word lattice one or more relevant responses to the query, and presents to a user the responses to the query. The prosodic analysis examines metalinguistic information of the user speech and can identify the most salient subject matter of the speech, assess how confident a speaker is in the content of his or her speech, and identify the attitude, mood, emotion, sentiment, etc. of the speaker. Other information not described in the content of the speech can also be used.
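
As an illustration of the reweighting step described in the abstract, the following minimal Python sketch lowers the cost of lattice arcs that cover prosodically prominent stretches of speech. The `Arc` type, the `reweight` function, the `prominence` map, and the `alpha` scaling factor are hypothetical names introduced here. This listing does not specify a data layout or a reweighting formula, so the subtraction rule below is an assumption chosen only for clarity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Arc:
    src: int     # start state of the arc
    dst: int     # end state of the arc
    word: str    # word hypothesis carried by the arc
    cost: float  # negative log probability (lower is better)


def reweight(lattice, prominence, alpha=1.0):
    """Lower the cost of arcs that span prosodically prominent regions.

    prominence maps a (src, dst) state pair to a score in [0, 1];
    spans missing from the map are treated as not prominent.
    The subtraction rule is an illustrative assumption, not the patent's formula.
    """
    return [
        replace(arc, cost=arc.cost - alpha * prominence.get((arc.src, arc.dst), 0.0))
        for arc in lattice
    ]


# Toy lattice: the recognizer is unsure whether the last word is "austin" or "boston".
lattice = [
    Arc(0, 1, "cheap", 0.4),
    Arc(1, 2, "flights", 0.2),
    Arc(2, 3, "to", 0.1),
    Arc(3, 4, "austin", 0.9),
    Arc(3, 4, "boston", 1.1),
]

# Hypothetical prosodic analysis: the speaker stressed the final word.
prominence = {(3, 4): 0.8}

reweighted = reweight(lattice, prominence, alpha=0.5)
```

Both competing hypotheses for the stressed final word receive the same bonus, so this step alone does not choose between them. It marks that span as carrying more weight when the lattice is later matched against documents.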

20 Claims

1. A method of processing speech, the method comprising:
- receiving a word lattice generated by an automatic speech recognizer based on a user speech representing a query;
  
  receiving a prosodic analysis of the user speech;
  
  generating a reweighted word lattice based on the word lattice and the prosodic analysis;
  
  approximating based on the reweighted word lattice at least one relevant response to the query; and
  
  presenting to a user one of the at least one relevant response to the query.
  - 2. The method of claim 1, wherein the prosodic analysis examines metalinguistic information of the user speech based on at least one of tonality, rhythm, volume, stress, intonation, and speed of the user speech.
  - 3. The method of claim 2, wherein the prosodic analysis identifies a most salient subject matter of the user speech.
  - 4. The method of claim 2, wherein the prosodic analysis assesses how confident a speaker who originated the user speech is in a content of the user speech.
  - 5. The method of claim 2, wherein the prosodic analysis identifies at least one of attitude, mood, emotion, and sentiment of a speaker who originated the user speech.
  - 6. The method of claim 1, wherein the reweighted word lattice is further based on additional information not described in content of the user speech.
  - 7. The method of claim 6, wherein the additional information comprises at least one of time of day, time of year, location of a speaker who originated the user speech and known behavioral history of the speaker.
  - 8. The method of claim 1, wherein approximating the at least one relevant response is performed by composing the reweighted word lattice with a search finite state transducer based on a plurality of pre-indexed documents.
  - 9. The method of claim 8, further comprising generating a finite state transducer based on composing the reweighted word lattice.
  - 10. The method of claim 9, further comprising reranking, based on the finite state transducer, the word lattice generated by the automatic speech recognizer.
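
The composition recited in claim 8 can be illustrated with a deliberately simplified sketch. It assumes the search transducer has a single state with one self-loop per (word, document) pair, in which case composing it with the lattice reduces to pairing each lattice arc with every index entry for its word and adding the costs. The function names, the tuple layout, the `index` contents, and the `exp(-cost)` scoring rule are all hypothetical. A production system would more likely rely on a general weighted finite state transducer library.

```python
import math
from collections import defaultdict


def compose_with_index(lattice, index):
    """Compose a word lattice with a one-state word-to-document transducer.

    lattice: list of (src, dst, word, cost) arcs, cost = negative log weight.
    index:   dict word -> list of (doc_id, cost) pairs built from pre-indexed documents.
    Words absent from the index produce no composed arcs.
    """
    composed = []
    for src, dst, word, cost in lattice:
        for doc_id, doc_cost in index.get(word, []):
            composed.append((src, dst, word, doc_id, cost + doc_cost))
    return composed


def rank_documents(composed):
    """Sum exp(-cost) per document and return document ids, best first."""
    scores = defaultdict(float)
    for _src, _dst, _word, doc_id, cost in composed:
        scores[doc_id] += math.exp(-cost)
    return sorted(scores, key=scores.get, reverse=True)


# Toy reweighted lattice and toy index (all values are made up for illustration).
reweighted = [
    (0, 1, "cheap", 0.4),
    (1, 2, "flights", 0.2),
    (2, 3, "to", 0.1),
    (3, 4, "austin", 0.5),
    (3, 4, "boston", 0.7),
]
index = {
    "flights": [("doc_austin", 1.0), ("doc_boston", 1.0)],
    "austin": [("doc_austin", 0.3)],
    "boston": [("doc_boston", 0.3)],
}

composed = compose_with_index(reweighted, index)
ranking = rank_documents(composed)
```

In this toy data the document about Austin ranks first because the "austin" arc carries a lower cost than the "boston" arc, so its contribution to the document score is larger.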

11. A system comprising:
- a processor;
  
  a first module configured to control the processor to receive a word lattice generated by an automatic speech recognizer based on a user speech;
  
  a second module configured to control the processor to receive a prosodic analysis of the user speech;
  
  a third module configured to control the processor to generate a reweighted word lattice based on the word lattice and the prosodic analysis;
  
  a fourth module configured to control the processor to approximate based on the reweighted word lattice at least one relevant response to the query; and
  
  a fifth module configured to control the processor to present to a user one of the at least one relevant response to the query.
  - 12. The system of claim 11, wherein the prosodic analysis within the second module examines metalinguistic information of the user speech based on at least one of tonality, rhythm, volume, stress, intonation, and speed of the user speech.
  - 13. The system of claim 12, wherein the prosodic analysis within the second module identifies a most salient subject matter of the user speech.
  - 14. The system of claim 12, wherein the prosodic analysis within the second module assesses how confident a speaker who originated the user speech is in a content of the user speech.
  - 15. The system of claim 12, wherein the prosodic analysis within the second module identifies at least one of attitude, mood, emotion, and sentiment of a speaker who originated the user speech.
  - 16. The system of claim 11, wherein the third module generates a reweighted word lattice based further on at least one of time of day, time of year, location of a speaker who originated the user speech and known behavioral history of the speaker.
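
Claims 7 and 16 allow the reweighting to draw on context outside the spoken content, such as time of day or speaker location. The sketch below shows one hypothetical way to fold such signals in. The `boosts` table, the feature names, and the additive cost reduction are assumptions made for illustration and are not taken from the patent.

```python
def context_boost(word, context, boosts):
    """Sum the cost reductions that the active context features grant to a word."""
    return sum(
        boosts.get((feature, value), {}).get(word, 0.0)
        for feature, value in context.items()
    )


def reweight_with_context(lattice, context, boosts):
    """Return a new lattice with each arc cost lowered by its context boost.

    lattice: list of (src, dst, word, cost) arcs, cost = negative log probability.
    context: dict of feature name -> observed value, e.g. {"time_of_day": "morning"}.
    boosts:  dict (feature, value) -> {word: cost reduction}; hypothetical prior knowledge.
    """
    return [
        (src, dst, word, cost - context_boost(word, context, boosts))
        for src, dst, word, cost in lattice
    ]


# The recognizer slightly prefers "toffee", but the query arrives in the morning.
lattice = [(0, 1, "coffee", 0.6), (0, 1, "toffee", 0.5)]
context = {"time_of_day": "morning", "location": "Seattle"}
boosts = {("time_of_day", "morning"): {"coffee": 0.3}}

reweighted = reweight_with_context(lattice, context, boosts)
best_word = min(reweighted, key=lambda arc: arc[3])[2]
```

Here the morning boost moves "coffee" ahead of "toffee". Context features without an entry in `boosts`, such as the location in this example, leave the costs unchanged.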

17. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to process speech, the instructions comprising:
- receiving a word lattice generated by an automatic speech recognizer based on a user speech;
  
  receiving a prosodic analysis of the user speech;
  
  generating a reweighted word lattice based on the word lattice and the prosodic analysis;
  
  approximating based on the reweighted word lattice at least one relevant response to the query; and
  
  presenting to a user one of the at least one relevant response to the query.
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein approximating the at least one relevant response is performed by composing the reweighted word lattice with a search finite state transducer based on a plurality of pre-indexed documents.
  - 19. The non-transitory computer-readable storage medium of claim 18, the instructions further comprising generating a finite state transducer based on composing the reweighted word lattice.
  - 20. The non-transitory computer-readable storage medium of claim 19, the instructions further comprising reranking, based on the finite state transducer, the word lattice generated by the automatic speech recognizer.
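
Claims 10 and 20 recite reranking the recognizer output based on the transducer produced by composition. One possible reading, sketched below under stated assumptions, is that hypotheses whose words are well supported by the indexed documents move up the list. The `rerank` function, the `doc_evidence` table, and the `weight` parameter are hypothetical, and an n-best list stands in for the full word lattice to keep the example short.

```python
def rerank(hypotheses, doc_evidence, weight=1.0):
    """Sort hypotheses by recognizer cost minus weighted search evidence.

    hypotheses:   list of (words, cost) pairs, cost = negative log probability.
    doc_evidence: dict word -> non-negative support derived from the document index.
    weight:       how strongly search evidence may override the recognizer.
    """
    def combined(hyp):
        words, cost = hyp
        return cost - weight * sum(doc_evidence.get(w, 0.0) for w in words)

    return sorted(hypotheses, key=combined)


# The recognizer prefers "boston", but the index supports "austin" more strongly.
hypotheses = [
    (("flights", "to", "boston"), 1.0),
    (("flights", "to", "austin"), 1.2),
]
doc_evidence = {"flights": 0.1, "austin": 0.5}

reranked = rerank(hypotheses, doc_evidence)
unchanged = rerank(hypotheses, doc_evidence, weight=0.0)
```

With `weight=0.0` the original recognizer order is preserved, which makes the parameter a convenient switch for comparing results with and without search feedback.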

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Bangalore, Srinivas; Feng, Junlan; Johnston, Michael; Mishra, Taniya

Granted Patent: US 10,002,608 B2
Field of Search
US Class Current: 704/243
CPC Class Codes:
- G10L 15/1807: using prosody or stress
- G10L 2015/226: using non-speech characteri...
- G10L 2015/227: of the speaker; Human-fact...
- G10L 25/54: for retrieval
- G10L 25/63: for estimating an emotional...
