System and method for using prosody for voice-enabled search
Abstract
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating relevant responses to a user query with voice-enabled search. A system practicing the method receives a word lattice generated by an automatic speech recognizer from user speech, together with a prosodic analysis of that speech; generates a reweighted word lattice based on the word lattice and the prosodic analysis; approximates one or more relevant responses to the query based on the reweighted word lattice; and presents the responses to the user. The prosodic analysis examines metalinguistic information in the user speech and can identify the most salient subject matter of the speech, assess how confident the speaker is in the content of his or her speech, and identify the attitude, mood, emotion, sentiment, etc. of the speaker. Other information not conveyed by the content of the speech can also be used.
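As an illustration of the reweighting step in the abstract, here is a minimal sketch. It assumes the lattice is a list of edges scored with log probabilities and that the prosodic analysis yields a per-word salience value between 0 and 1. The `Edge` type, the `boost` constant, and the additive formula are all hypothetical; the source does not specify how the weights are combined.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    start: int       # source node id
    end: int         # destination node id (assumed greater than start)
    word: str
    log_prob: float  # recognizer score as a natural-log probability

def reweight(lattice, emphasis, boost=1.0):
    """Return new edges whose scores are raised for prosodically emphasized words.

    emphasis maps word -> salience in [0, 1]; boost is an assumed tuning constant.
    """
    return [
        Edge(e.start, e.end, e.word, e.log_prob + boost * emphasis.get(e.word, 0.0))
        for e in lattice
    ]

def best_path(lattice, start, goal):
    """Highest-scoring word sequence in a DAG whose node ids are topologically ordered."""
    best = {start: (0.0, [])}
    for e in sorted(lattice, key=lambda edge: edge.start):
        if e.start in best:
            score, words = best[e.start]
            candidate = (score + e.log_prob, words + [e.word])
            if e.end not in best or candidate[0] > best[e.end][0]:
                best[e.end] = candidate
    return best[goal][1]

# Toy lattice: the recognizer slightly prefers "lights" over "flights".
lattice = [
    Edge(0, 1, "find", -0.1),
    Edge(1, 2, "lights", -0.7),
    Edge(1, 2, "flights", -1.0),
    Edge(2, 3, "now", -0.1),
]
emphasis = {"flights": 0.8}  # hypothetical output of a prosodic analysis

print(best_path(lattice, 0, 3))
print(best_path(reweight(lattice, emphasis), 0, 3))
```

In this toy case the emphasis on "flights" is enough to change the best path, which is the effect the abstract attributes to the reweighted lattice.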
17 Claims
1. A method comprising:
receiving a word lattice generated by an automatic speech recognizer processing a query, wherein the word lattice is weighted according to the query;
identifying a policy which allows use of a user emotional state in responding to a user who produced the query;
performing, via a processor of the automatic speech recognizer, a prosodic analysis of the query, wherein the prosodic analysis identifies an audible gesture in the query and a rhythm of words spoken in the query;
identifying, according to the prosodic analysis, the user emotional state;
reweighting, via the processor, the word lattice according to the prosodic analysis, the user emotional state and one of a time of day, a time of year, and a behavioral history of the user, to yield a reweighted word lattice;
determining, via the processor and according to the reweighted word lattice, a response to the query, the response addressing the audible gesture; and
presenting to the user the response to the query.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.

9. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
receiving a word lattice generated by an automatic speech recognizer processing a query, wherein the word lattice is weighted according to the query;
identifying a policy which allows use of a user emotional state in responding to a user who produced the query;
performing a prosodic analysis of the query, wherein the prosodic analysis identifies an audible gesture in the query and a rhythm of words spoken in the query;
identifying, according to the prosodic analysis, the user emotional state;
reweighting the word lattice according to the prosodic analysis, the user emotional state and one of a time of day, a time of year, and a behavioral history of the user, to yield a reweighted word lattice;
determining, according to the reweighted word lattice, a response to the query, the response addressing the audible gesture; and
presenting to the user the response to the query.
Dependent claims: 10, 11, 12, 13.

14. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving a word lattice generated by an automatic speech recognizer processing a query, wherein the word lattice is weighted according to the query;
identifying a policy which allows use of a user emotional state in responding to a user who produced the query;
performing a prosodic analysis of the query, wherein the prosodic analysis identifies an audible gesture in the query and a rhythm of words spoken in the query;
identifying, according to the prosodic analysis, the user emotional state;
reweighting the word lattice according to the prosodic analysis, the user emotional state and one of a time of day, a time of year, and a behavioral history of the user, to yield a reweighted word lattice;
determining, according to the reweighted word lattice, a response to the query, the response addressing the audible gesture; and
presenting to the user the response to the query.
Dependent claims: 15, 16, 17.

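The steps shared by the three independent claims (policy check, emotional-state inference, context-aware reweighting, and a response that addresses the audible gesture) can be sketched roughly as follows. This is a toy illustration, not the patented implementation: the lattice is flattened to a dictionary of hypotheses, and every name, feature, threshold, and multiplier here (`Policy`, `Prosody`, `emotional_state`, the 2.0 and 1.5 factors) is an assumption introduced for the example.

```python
from dataclasses import dataclass
from datetime import time

@dataclass
class Policy:
    allow_emotion: bool       # whether emotional state may be used for this user

@dataclass
class Prosody:
    words_per_second: float   # assumed proxy for the rhythm of spoken words
    has_sigh: bool            # assumed example of an audible gesture

def emotional_state(prosody):
    """Toy classifier; the thresholds are illustrative assumptions."""
    if prosody.has_sigh:
        return "frustrated"
    if prosody.words_per_second > 4.0:
        return "excited"
    return "neutral"

def reweight_hypotheses(hyps, prosody, policy, now):
    """Scale hypothesis weights by emotional state and time of day, then normalize.

    hyps maps a candidate transcription to a positive weight.
    """
    # Emotional state is consulted only when the policy permits it.
    state = emotional_state(prosody) if policy.allow_emotion else None
    scaled = {}
    for text, weight in hyps.items():
        words = text.split()
        factor = 1.0
        if state == "frustrated" and "agent" in words:
            factor *= 2.0     # assumed prior: frustrated users ask for a person
        if now.hour < 11 and "breakfast" in words:
            factor *= 1.5     # assumed time-of-day prior
        scaled[text] = weight * factor
    total = sum(scaled.values())
    return {text: weight / total for text, weight in scaled.items()}

def respond(best, prosody, policy):
    """Build a response that acknowledges the audible gesture when permitted."""
    prefix = "Sorry for the trouble. " if policy.allow_emotion and prosody.has_sigh else ""
    return prefix + "Handling request: " + best

hyps = {"call agent": 0.4, "call urgent": 0.6}
sigh = Prosody(words_per_second=2.5, has_sigh=True)
afternoon = time(15, 0)

allowed = reweight_hypotheses(hyps, sigh, Policy(True), afternoon)
blocked = reweight_hypotheses(hyps, sigh, Policy(False), afternoon)
print(respond(max(allowed, key=allowed.get), sigh, Policy(True)))
print(respond(max(blocked, key=blocked.get), sigh, Policy(False)))
```

Gating the emotional-state lookup on the policy means that, when the policy forbids it, the weights keep their original ratio and the response carries no acknowledgement of the gesture.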
Specification