Computer generated emulation of a subject
First Claim
1. A system for creating a response to an inputted user query, said system comprising:
a user interface configured to emulate a subject by displaying a talking head including a face of the subject, and output speech from a mouth of the face with a voice of the subject, the user interface further including a receiver to receive a query from a user, the emulated subject being configured to respond to the query received from the user;
a personality file memory storing a plurality of documents in an unstructured form and storing model parameters, the model parameters describing probability distributions that relate an acoustic unit to an image vector and a speech vector, the image vector including a plurality of parameters that define the subject's face and the speech vector including a plurality of parameters that define the subject's voice; and
processing circuitry configured to convert said query into a word vector;
compare said word vector generated from said query with word vectors generated from the documents in said personality file memory and output identified documents;
compare said word vector generated from said query with passages from said identified documents and rank said passages, said ranking being based on a number of matches between said passage and said query;
concatenate selected passages together using sentence connectors to produce the response, wherein said sentence connectors are chosen from a plurality of sentence connectors, said sentence connectors being chosen based on a language model;
convert the response into a sequence of acoustic units using a statistical model, the statistical model including a plurality of model parameters, the model parameters being retrieved from the personality file memory;
output a sequence of speech vectors and image vectors that are synchronized such that the head appears to talk;
output an expressive response such that the face and voice demonstrate expression; and
determine the expression with which to output the generated response,
wherein the model parameters stored in the personality file memory describe probability distributions that relate the acoustic unit to the image vector and the speech vector for an associated expression.
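The retrieval steps recited in the claim (query to word vector, document identification, passage ranking by match count, concatenation with sentence connectors) can be sketched as follows. This is purely illustrative: bag-of-words counts stand in for the claim's word vectors, sentences stand in for passages, and a fixed connector rotation stands in for the language-model connector choice.

```python
from collections import Counter

def word_vector(text):
    """Bag-of-words count vector, a simple stand-in for the claim's word vectors."""
    return Counter(text.lower().split())

def match_count(query_vec, passage):
    """Number of distinct query terms that also appear in the passage."""
    pv = word_vector(passage)
    return sum(1 for term in query_vec if term in pv)

def build_response(query, documents, connectors=("Also,", "In addition,")):
    qv = word_vector(query)
    # Step 1: identify documents sharing at least one term with the query.
    identified = [d for d in documents if match_count(qv, d) > 0]
    # Step 2: split identified documents into passages (here, sentences)
    # and rank them by the number of query-term matches.
    passages = [p.strip() for d in identified for p in d.split(".") if p.strip()]
    ranked = sorted(passages, key=lambda p: match_count(qv, p), reverse=True)
    if not ranked:
        return ""
    # Step 3: concatenate the top-ranked passages with sentence connectors.
    # (The claim chooses connectors with a language model; here we simply cycle.)
    response = ranked[0]
    for i, passage in enumerate(ranked[1:2]):
        response += ". " + connectors[i % len(connectors)] + " " + passage
    return response + "."
```

For example, `build_response("where was the subject born", ["The subject was born in London. The subject likes tea.", "Unrelated weather notes."])` leads the response with the passage sharing the most query terms.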
Abstract
A system for emulating a subject, to allow a user to interact with a computer generated talking head with the subject's face and voice;
- said system comprising a processor, a user interface and a personality storage section,
- the user interface being configured to emulate the subject, by displaying a talking head which comprises the subject's face and output speech from the mouth of the face with the subject's voice, the user interface further comprising a receiver for receiving a query from the user, the emulated subject being configured to respond to the query received from the user,
- the processor comprising a dialogue section and a talking head generation section,
- wherein said dialogue section is configured to generate a response to a query inputted by a user from the user interface and generate a response to be outputted by the talking head, the response being generated by retrieving information from said personality storage section, said personality storage section comprising content created by or about the subject,
- and said talking head generation section is configured to:
- convert said response into a sequence of acoustic units, the talking head generation section further comprising a statistical model, said statistical model comprising a plurality of model parameters, said model parameters being derived from said personality storage section, the model parameters describing probability distributions which relate an acoustic unit to an image vector and speech vector, said image vector comprising a plurality of parameters which define the subject's face and said speech vector comprising a plurality of parameters which define the subject's voice, the talking head generation section being further configured to output a sequence of speech vectors and image vectors which are synchronised such that the head appears to talk.
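A toy reading of the statistical model described above: assume each acoustic unit (e.g. a phone) indexes Gaussian distributions over a speech vector and an image vector, playing the role of the personality storage's model parameters. The unit names, vector dimensions, and parameter values below are invented for illustration. Sampling both streams from the same unit sequence is what keeps the voice and the face parameters synchronised in this sketch.

```python
import random

random.seed(0)

# Hypothetical personality-file entries: per acoustic unit, Gaussian
# parameters over a 3-dim speech vector and a 4-dim image vector.
UNITS = ["sil", "h", "eh", "l", "ow"]
personality = {
    u: {
        "speech_mean": [random.uniform(-1, 1) for _ in range(3)],
        "image_mean": [random.uniform(-1, 1) for _ in range(4)],
        "std": 0.05,
    }
    for u in UNITS
}

def synthesise(acoustic_units, personality):
    """Emit one speech vector and one image vector per acoustic unit.

    Both vectors for a frame are drawn from distributions indexed by the
    same unit, so the two output sequences stay in lockstep.
    """
    speech, image = [], []
    for unit in acoustic_units:
        p = personality[unit]
        speech.append([random.gauss(m, p["std"]) for m in p["speech_mean"]])
        image.append([random.gauss(m, p["std"]) for m in p["image_mean"]])
    return speech, image

speech_vecs, image_vecs = synthesise(UNITS, personality)
```

A production system would instead train these distributions (per expression, as the claim notes) from recordings of the subject; the sketch only shows why indexing both streams by one unit sequence yields synchronised output.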
36 Citations
3 Claims