Animated Digital Assistant

US 20090044112A1
Filed: 08/09/2007
Published: 02/12/2009
Est. Priority Date: 08/09/2007
Status: Abandoned Application

Abstract

A method for interacting with a user comprising: receiving an input on a device; determining a text-based response based on the input using a logic engine; generating an audio stream of a voice-synthesized response based on the text-based response; rendering a video stream using a morphing of predetermined shapes based on phonemes in the voice-synthesized response, the video stream comprising an animated head speaking the voice-synthesized response; synchronizing the video stream and the audio stream; transmitting the video stream and the audio stream over the network; and presenting the video stream and the audio stream on the device.
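
The abstract names "a morphing of predetermined shapes based on phonemes" without saying how the morph is computed. The following is a minimal sketch of one possible reading, assuming each predetermined shape is a short list of 2D mouth control points and that morphing is a plain linear blend between them. The `SHAPES` table, its ARPAbet-style phoneme labels, and the functions `morph` and `frames_for_phonemes` are hypothetical names chosen for illustration and do not come from the application.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Shape = List[Point]

# Hypothetical predetermined mouth shapes: four control points each
# (left corner, top lip, right corner, bottom lip), arbitrary units.
SHAPES: Dict[str, Shape] = {
    "rest": [(-1.0, 0.0), (0.0, 0.1), (1.0, 0.0), (0.0, -0.1)],
    "AA":   [(-1.0, 0.0), (0.0, 0.6), (1.0, 0.0), (0.0, -0.8)],
    "OW":   [(-0.5, 0.0), (0.0, 0.5), (0.5, 0.0), (0.0, -0.5)],
    "M":    [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0)],
}


def morph(a: Shape, b: Shape, t: float) -> Shape:
    """Linearly blend shape a into shape b; t=0 gives a, t=1 gives b."""
    if len(a) != len(b):
        raise ValueError("shapes must have the same number of control points")
    t = max(0.0, min(1.0, t))  # clamp so callers cannot overshoot
    return [((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
            for (ax, ay), (bx, by) in zip(a, b)]


def frames_for_phonemes(phonemes: List[str], steps: int = 4) -> List[Shape]:
    """Emit `steps` in-between shapes per phoneme, starting from the rest shape."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames: List[Shape] = []
    current = SHAPES["rest"]
    for phoneme in phonemes:
        # Unknown phonemes fall back to the rest shape (an assumption).
        target = SHAPES.get(phoneme, SHAPES["rest"])
        for i in range(1, steps + 1):
            frames.append(morph(current, target, i / steps))
        current = target
    return frames
```

Calling `frames_for_phonemes(["M", "AA"], steps=2)` yields four mouth shapes: two moving from rest to the closed "M" shape, then two opening toward "AA". A real renderer would draw the animated head from these shapes; that step is outside this sketch.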

20 Claims

1. A method for interacting with a user comprising:
- receiving an input from a device;
- determining a text-based response based on the input using a logic engine;
- generating an audio stream of a voice-synthesized response based on the text-based response, the voice-synthesized response having a plurality of phonemes;
- rendering a video stream based on the plurality of phonemes, the video stream comprising an animated head speaking the voice-synthesized response;
- synchronizing the video and the audio;
- transmitting the video stream and the audio stream over the network; and
- presenting the video stream and the audio stream on the device.

Dependent claims (2, 3, 4, 5, 6):
- 2. The method of claim 1 wherein the step of rendering the video stream comprises morphing a plurality of predetermined shapes based on the plurality of phonemes.
- 3. The method of claim 1 wherein the input comprises a user identity.
- 4. The method of claim 1 wherein the input comprises a universal resource locator identifying the page displayed in a browser on the device.
- 5. The method of claim 1 wherein the user input comprises session data from a session process on the device.
- 6. The method of claim 1 further comprising:
  - transmitting a menu to the device, the menu comprising a plurality of choices; and
  - displaying the menu on the device;
  - the input comprising a selection of at least one of the plurality of choices.
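
Claims 1 and 3 to 6 describe the input as possibly carrying a user identity, a page URL, session data, or a menu selection, and say only that a "logic engine" turns it into a text-based response. The sketch below illustrates that step under the assumption of a simple ordered rule list. `UserInput`, `MENU`, `logic_engine`, the example menu entries, and the response strings are all hypothetical; the claims do not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserInput:
    """Hypothetical container for the kinds of input named in claims 3 to 6."""
    user_id: Optional[str] = None                          # claim 3: user identity
    page_url: Optional[str] = None                         # claim 4: URL of the displayed page
    session: Dict[str, str] = field(default_factory=dict)  # claim 5: session data
    menu_choice: Optional[str] = None                      # claim 6: selected menu entry


# Hypothetical menu that would be transmitted to and displayed on the device.
MENU: List[str] = ["billing", "support"]


def logic_engine(inp: UserInput) -> str:
    """Return a text-based response; rules are checked from most to least specific."""
    if inp.menu_choice is not None:
        if inp.menu_choice not in MENU:
            return "Sorry, that option is not available."
        return f"Opening {inp.menu_choice} help."
    if inp.page_url and "checkout" in inp.page_url:
        return "Do you need help completing your order?"
    if inp.session.get("cart_items", "0") != "0":
        return "You still have items in your cart."
    if inp.user_id:
        return f"Welcome back, {inp.user_id}."
    return "Hello, how can I help you?"
```

For example, `logic_engine(UserInput(user_id="anna"))` returns `"Welcome back, anna."`. The returned text is what the later steps of claim 1 would pass to voice synthesis.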

7. A machine-readable medium that provides instructions for a processor, which when executed by the processor cause the processor to perform a method for interacting with a user comprising:
- receiving an input from a device;
- determining a text-based response based on the input using a logic engine;
- generating an audio stream of a voice-synthesized response based on the text-based response, the voice-synthesized response having a plurality of phonemes;
- rendering a video stream based on the plurality of phonemes, the video stream comprising an animated head speaking the voice-synthesized response;
- synchronizing the video stream and the audio stream;
- transmitting the video stream and the audio stream over the network; and
- presenting the video stream and the audio stream on the device.

Dependent claims (8, 9, 10, 11, 12):
- 8. The machine-readable medium of claim 7 wherein the step of rendering the video stream comprises morphing a plurality of predetermined shapes based on the plurality of phonemes.
- 9. The machine-readable medium of claim 7 wherein the input comprises a user identity.
- 10. The machine-readable medium of claim 7 wherein the input comprises a universal resource locator identifying the page displayed in a browser on the device.
- 11. The machine-readable medium of claim 7 wherein the user input comprises session data from a session process on the device.
- 12. The machine-readable medium of claim 7 further comprising:
  - transmitting a menu to the device, the menu comprising a plurality of choices; and
  - displaying the menu on the device;
  - the input comprising a selection of at least one of the plurality of choices.

13. A system for interacting with a user comprising:
- a device configured to receive an input and present a video stream and an audio stream;
- a server coupled to the device, the server being configured to receive the input and transmit the video stream and the audio stream to the device;
- a logic process coupled to receive the input, the logic process generating a text-based response based on the input;
- a text-to-speech process configured to receive the text-based response and generate an audio stream of a voice-synthesized response based on the text-based response, the voice-synthesized response having a plurality of phonemes;
- a video rendering process for generating a video stream based on the plurality of phonemes, the video stream comprising an animated head speaking the voice-synthesized response; and
- a synchronization process for synchronizing the audio stream and the video stream.

Dependent claims (14, 15, 16, 17, 18, 19, 20):
- 14. The system of claim 13 wherein the video rendering process comprises morphing a plurality of predetermined shapes based on the plurality of phonemes.
- 15. The system of claim 13 wherein the logic process uses a rules-based system.
- 16. The system of claim 13 wherein the logic process uses a neural network.
- 17. The system of claim 13 wherein the logic process uses a natural language processor.
- 18. The system of claim 13 wherein the input comprises a user identity.
- 19. The system of claim 13 wherein the user input comprises session data from a session process on the device.
- 20. The system of claim 13 wherein the logic process generates a menu comprising a plurality of choices;
  - the server transmitting the menu to the device, the device being configured to display the menu;
  - the input comprising a selection of at least one of the plurality of choices.
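
Claim 13 recites "a synchronization process for synchronizing the audio stream and the video stream" but does not state a mechanism. One plausible approach, sketched here as an assumption rather than the applicants' method, is to take per-phoneme durations from the text-to-speech process and assign each fixed-rate video frame the phoneme that is sounding at that frame's timestamp. The function name `frame_schedule`, the millisecond durations, and the default of 25 frames per second are illustrative choices.

```python
from typing import List, Tuple


def frame_schedule(phonemes: List[Tuple[str, int]], fps: int = 25) -> List[Tuple[int, str]]:
    """Return (timestamp_ms, phoneme) for every video frame that overlaps the audio.

    `phonemes` holds (label, duration_ms) pairs in spoken order, as a
    text-to-speech process might report them (an assumption of this sketch).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    total_ms = sum(duration for _, duration in phonemes)
    schedule: List[Tuple[int, str]] = []
    frame = 0
    while True:
        timestamp = frame * 1000 // fps  # integer milliseconds, avoids float drift
        if timestamp >= total_ms:
            break
        # Find the phoneme whose time span contains this frame's timestamp.
        elapsed = 0
        for label, duration in phonemes:
            elapsed += duration
            if timestamp < elapsed:
                schedule.append((timestamp, label))
                break
        frame += 1
    return schedule
```

With `frame_schedule([("HH", 80), ("AH", 120)])` the first two frames (0 ms and 40 ms) show "HH" and the next three (80, 120, 160 ms) show "AH". Because every frame carries the audio timestamp it belongs to, a player on the device can present both streams against a single clock.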

Current Assignee: H-Care SRL (Pat SRL)
Original Assignee: H-Care SRL (Pat SRL)
Inventors: SALVADORI, Fabio; BASSO, Umberto

Application Number: US11/836,750
Publication Number: US 20090044112A1
US Class Current: 715/706
CPC Class Codes:
G06T 13/205   driven by audio data
G06T 13/40   of characters, e.g. humans,...
G06T 13/80   2D [Two Dimensional] animat...
G06T 2210/44   Morphing
G10L 13/08   Text analysis or generation...
G10L 2021/105   Synthesis of the lips movem...
