Conversational interface agent

US 7,391,421 B2
Filed: 08/04/2005
Issued: 06/24/2008
Est. Priority Date: 12/28/2001
Status: Expired due to Term

Abstract

A video rewrite technique for rendering a talking head or agent completely simulates a conversation by including a waiting or listening state. Smooth transitions are provided to and from a talking state.
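The abstract names two states (waiting/listening and talking) with smooth transitions between them. A minimal sketch of how such a state machine might be organized follows. The two intermediate transition states, the event names (`speech_start`, `speech_end`, `transition_done`), and every identifier here are assumptions made for illustration; none of them are taken from the patent text.

```python
from enum import Enum, auto


class AgentState(Enum):
    LISTENING = auto()      # waiting/listening state named in the abstract
    TO_TALKING = auto()     # hypothetical blend from listening into talking
    TALKING = auto()        # talking state named in the abstract
    TO_LISTENING = auto()   # hypothetical blend from talking back to listening


# Hypothetical event names; the abstract does not define any events.
TRANSITIONS = {
    (AgentState.LISTENING, "speech_start"): AgentState.TO_TALKING,
    (AgentState.TO_TALKING, "transition_done"): AgentState.TALKING,
    (AgentState.TALKING, "speech_end"): AgentState.TO_LISTENING,
    (AgentState.TO_LISTENING, "transition_done"): AgentState.LISTENING,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=AgentState.LISTENING):
    """Feed a sequence of events through the machine and return every state visited."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace
```

Routing every change through an explicit transition state is one way to guarantee that the renderer never cuts directly between listening and talking footage.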

35 Citations

7 Claims

1. A computer readable storage medium having instructions, which when executed on a computer provide a user interface, the instructions comprising:
- a speech synthesizer receiving input for synthesis and providing an audio output signal; and
  
  a video rendering module receiving information related to the audio output signal, the video rendering module rendering a representation of a head having a talking mouth portion with movements in accordance with the audio output signal, and wherein the video rendering module renders, as part of the representation a sequence of video frames having the head, and wherein the video rendering module selectively adds, to each frame, a mouth position for the mouth portion based in part on tracked movements of the head, wherein the video rendering module tracks movements of the head in the sequence of video frames, and wherein the tracked movements include translations and rotations of the head.
- Dependent claims (2, 3):
  - 2. The computer readable medium of claim 1, wherein the positions of the mouth portion are added to each frame based upon interpolated physical movements of the head.
  - 3. The computer readable medium of claim 1, wherein for each of a plurality of frames in the sequence, movements of the head are calculated as a function of a corresponding preceding frame and a corresponding succeeding frame.

4. A computer readable storage medium having instructions, which when executed on a computer provide a user interface, the instructions comprising:
- a speech synthesizer receiving input for synthesis and providing an audio output signal; and
  
  a video rendering module receiving information related to the audio output signal, the video rendering module rendering a representation of a head having a talking mouth portion with movements in accordance with the audio output signal, the video rendering module accessing a store having a sequence of frames of the head and rendering at least a portion of each of the frames in the sequence while selectively adding a corresponding mouth position based at least in part on tracked movements of the head, wherein the tracked movements of the head include translations and rotations.
- Dependent claims (5, 6):
  - 5. The computer readable medium of claim 4 wherein the mouth positions are added based upon interpolated physical movements of the head.
  - 6. The computer readable medium of claim 5 wherein for each of a plurality of frames, said interpolated physical movements are calculated as a function of a corresponding preceding frame and a corresponding succeeding frame.

7. A computer-implemented method for generating a talking head on a computer display to simulate a conversation, the method comprising:
- rendering a sequence of video frames of a head;
  
  tracking movements of the head throughout the sequence;
  
  selectively adding a corresponding mouth position to frames in the sequence as a function of the tracked movements of the head;
  
  wherein the tracked movements of the head include translations and rotations;
  
  wherein tracking comprises calculating interpolated physical movements of the head based on frames of the sequence; and
  
  wherein calculating interpolated physical movements includes calculating interpolated physical movements as a function of a corresponding preceding frame and a corresponding succeeding frame for each of a plurality of frames in the sequence.
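Claim 7 recites tracking head translations and rotations, interpolating the movement for a frame from its preceding and succeeding frames, and adding a mouth position as a function of the tracked movement. The sketch below illustrates one possible reading. It assumes a simplified 2D pose (pixel translation plus a single in-plane rotation angle) and a plain midpoint of the two neighboring poses; the claims specify neither the pose representation nor the interpolation formula, and `HeadPose`, `interpolate_pose`, `smooth_poses`, and `place_mouth` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class HeadPose:
    tx: float     # horizontal translation of the head, in pixels (assumed unit)
    ty: float     # vertical translation of the head, in pixels (assumed unit)
    angle: float  # in-plane rotation of the head, in radians (assumed 2D model)


def interpolate_pose(prev, nxt):
    """Estimate a frame's pose as the midpoint of its two neighboring poses."""
    # Use the shortest signed angular difference so wrap-around at +/-pi is handled.
    delta = math.atan2(math.sin(nxt.angle - prev.angle),
                       math.cos(nxt.angle - prev.angle))
    return HeadPose((prev.tx + nxt.tx) / 2.0,
                    (prev.ty + nxt.ty) / 2.0,
                    prev.angle + delta / 2.0)


def smooth_poses(poses):
    """Replace each interior pose with the interpolation of its neighbors."""
    out = list(poses)
    for i in range(1, len(poses) - 1):
        out[i] = interpolate_pose(poses[i - 1], poses[i + 1])
    return out


def place_mouth(pose, mouth_offset):
    """Map a mouth anchor from head-local coordinates into frame coordinates."""
    ox, oy = mouth_offset
    c, s = math.cos(pose.angle), math.sin(pose.angle)
    # Rotate the offset by the head angle, then translate by the head position.
    return (pose.tx + c * ox - s * oy, pose.ty + s * ox + c * oy)
```

For example, `place_mouth(smooth_poses(tracked)[i], (0.0, 40.0))` would give the frame position of a mouth anchor located 40 pixels along the head's local y axis, where `tracked` is a list of per-frame `HeadPose` values from some tracker that is not shown here.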

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Zhang, Bo; Shum, Heung-Yeung; Guo, Baining
Primary Examiner(s): Nguyen; Phu K
Application Number: US11/196,893
Publication Number: US 20050270293A1
Time in Patent Office: 1,055 Days
Field of Search: 345/473-475, 704/231, 704/235, 704/260, 704/280
US Class Current: 345/473
CPC Class Codes:
G06T 13/40 of characters, e.g. humans,...
G10L 2021/105 Synthesis of the lips movem...
