Method and system for aligning natural and synthetic video to speech synthesis

US 7,844,463 B2
Filed: 08/18/2008
Issued: 11/30/2010
Est. Priority Date: 08/05/1997
Status: Expired due to Fees

Abstract

According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously: text and Facial Animation Parameters. A Text-to-Speech converter drives the mouth shapes of the face. An encoder sends Facial Animation Parameters to the face. The text input can include codes, or bookmarks, transmitted to the Text-to-Speech converter, which are placed between and inside words. The bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. The Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp and a real-time time stamp. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
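
The time stamp association described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Bookmark` and `Fap` classes, the `synthesize` and `schedule` functions, and the fixed speaking rate of 80 ms per character are all hypothetical stand-ins. The toy converter reports, for each bookmark it reaches, the encoder time stamp together with the playback time at that point, and the animation side uses the encoder time stamp only as a lookup key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    ets: int  # encoder time stamp: a counter, not a real-world time


@dataclass(frozen=True)
class Fap:
    ets: int   # same counter value as the matching bookmark in the text
    name: str  # hypothetical stand-in for real Facial Animation Parameters


def synthesize(tokens, ms_per_char=80):
    """Toy text-to-speech pass (assumed constant speaking rate).

    Returns one (encoder time stamp, real-time stamp in ms) pair
    for each bookmark encountered in the text.
    """
    clock_ms = 0
    marks = []
    for tok in tokens:
        if isinstance(tok, Bookmark):
            marks.append((tok.ets, clock_ms))
        else:
            clock_ms += len(tok) * ms_per_char
    return marks


def schedule(faps, marks):
    """Attach each parameter to the real-time stamp of its bookmark."""
    rts_by_ets = dict(marks)
    return [(rts_by_ets[f.ets], f.name) for f in faps if f.ets in rts_by_ets]


tokens = ["Hello", Bookmark(1), "world", Bookmark(2)]
faps = [Fap(1, "smile"), Fap(2, "raise_eyebrows")]
timeline = schedule(faps, synthesize(tokens))
print(timeline)
```

In this sketch the encoder never needs to know how long the synthesized speech takes; only the converter, which knows the actual playback timing, supplies the real-time stamps.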

15 Claims

1. A method of aligning video with audio, the method comprising:
- identifying a predetermined code associated with an animation mimic in a first stream, wherein the predetermined code comprises an escape sequence followed by a plurality of bits, which define one of a set of possible animation mimics; and
- transmitting the predetermined code within a second stream to thereby synchronize the second stream with the first stream.

2. The method of claim 1, wherein the first stream is an animation mimics stream and the second stream is a text stream.

3. The method of claim 2, further comprising encoding the first stream containing the animation mimic and the text stream containing the predetermined code.

4. The method of claim 1, wherein the animation mimic is a facial mimic.

5. The method of claim 1, further comprising placing the predetermined code in between words in the second stream.
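
As a rough illustration of the code format recited in claims 1 to 5, the Python sketch below builds a code from an escape sequence followed by a fixed number of bits and places it between words of a text stream. The claims do not fix any of the specifics used here: the escape string `\!M`, the 4-bit width, the `MIMICS` table, and the function names are assumptions made only for this example.

```python
import re

ESCAPE = "\\!M"   # assumed escape sequence; the claims do not specify one
CODE_BITS = 4     # assumed width of the bit field after the escape
MIMICS = ["neutral", "smile", "frown", "surprise"]  # hypothetical mimic set


def make_code(mimic):
    """Escape sequence followed by bits selecting one mimic from the set."""
    index = MIMICS.index(mimic)
    return ESCAPE + format(index, f"0{CODE_BITS}b")


def insert_between_words(words, position, mimic):
    """Place the code between words of the text stream (as in claim 5)."""
    out = list(words)
    out.insert(position, make_code(mimic))
    return " ".join(out)


# Matches the escape sequence followed by exactly CODE_BITS binary digits.
CODE_RE = re.compile(re.escape(ESCAPE) + rf"([01]{{{CODE_BITS}}})")


def find_mimics(text):
    """Recover the mimics named by every code found in the text stream."""
    return [MIMICS[int(bits, 2)] for bits in CODE_RE.findall(text)]


text = insert_between_words(["Hello", "world"], 1, "smile")
print(text)
print(find_mimics(text))
```

A receiver that scans the text stream for the escape sequence can find the same code that accompanies the mimic in the animation stream, which is what lets the two streams be lined up.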

6. A system for aligning video with audio, the system comprising:
- a processor;
- a module configured to control the processor to identify a predetermined code associated with an animation mimic in a first stream, wherein the predetermined code comprises an escape sequence followed by a plurality of bits, which define one of a set of possible animation mimics; and
- a module configured to control the processor to transmit the predetermined code within a second stream to thereby synchronize the second stream with the first stream.

7. The system for claim 6, wherein the first stream is an animation mimics stream and the second stream is a text stream.

8. The system for claim 7, further comprising a module configured to control the processor to encode the first stream containing the animation mimic and the text stream containing the predetermined code.

9. The system for claim 6, wherein the animation mimic is a facial mimic.

10. The system for claim 6, further comprising a module configured to control the processor to place the predetermined code in between words in the second stream.

11. A computer-readable medium storing instructions for controlling a computing device to align a video with audio, the instructions comprising:
- identifying a predetermined code associated with an animation mimic in a first stream, wherein the predetermined code comprises an escape sequence followed by a plurality of bits, which define one of a set of possible animation mimics; and
- transmitting the predetermined code within a second stream to thereby synchronize the second stream with the first stream.

12. The computer-readable medium of claim 11, wherein the first stream is an animation mimics stream and the second stream is a text stream.

13. The computer-readable medium of claim 12, further comprising encoding the first stream containing the animation mimic and the text stream containing the predetermined code.

14. The computer-readable medium of claim 11, wherein the animation mimic is a facial mimic.

15. The computer-readable medium of claim 11, further comprising placing the predetermined code in between words in the second stream.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Ostermann, Joern; Basso, Andrea; Beutnagel, Mark Charles
Primary Examiner(s): Lerner, Martin
Application Number: US12/193,397
Publication Number: US 20080312930A1
Time in Patent Office: 834 Days
Field of Search: 704/258, 704/260, 704/276, 704/270, 345/473, 715/706
US Class Current: 704/260
CPC Class Codes:
G06T 9/001   Model-based coding, e.g. wi...
G10L 13/00   Speech synthesis; Text to s...
G10L 21/06   Transformation of speech in...
H04N 21/2368   Multiplexing of audio and v...
H04N 21/4341   Demultiplexing of audio and...
