System and method for triphone-based unit selection for visual speech synthesis

  • US 7,209,882 B1
  • Filed: 05/10/2002
  • Issued: 04/24/2007
  • Est. Priority Date: 05/10/2002
  • Status: Active Grant
First Claim

1. A method of generating a video sequence having mouth movements synchronized with speech sounds, the method utilizing a database of n-phones as a smallest selectable unit, where n is larger than 1, the method comprising:

  • calculating a target cost for each candidate n-phone for a target sequence using a phonetic distance, coarticulation parameter, and speech rate;

  • for each target frame in the target sequence, searching for candidate n-phones that are phonetically and/or visually similar according to the target cost;

  • sampling each candidate n-phone to get a same number of candidate phone frames as in the target sequence;

  • building a video frame lattice of candidate video frames based on the candidate n-phones;

  • assigning a joint cost to each pair of adjacent video frames; and

  • constructing the video sequence according to a Viterbi search on the video frame lattice by finding the optimal path through the lattice according to the minimum of the sum of the target cost and the joint cost over the sequence.
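
The sampling, lattice, and Viterbi steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `resample`, `viterbi`, `target_cost`, and `joint_cost` are hypothetical, frames are stood in for by plain numbers, and the toy cost functions do not model phonetic distance, coarticulation, or speech rate. It only shows how a minimum-cost path through a frame lattice could be found when each candidate frame carries a target cost and each adjacent pair carries a joint cost.

```python
from typing import Callable, List, Sequence, Tuple


def resample(frames: Sequence, n: int) -> List:
    """Pick n frames spread evenly over a candidate unit (nearest-index sampling).

    Assumption: nearest-index sampling is one simple way to give every
    candidate the same number of frames as the target; the claim does not
    say how the sampling is done.
    """
    if n == 1:
        return [frames[len(frames) // 2]]
    last = len(frames) - 1
    return [frames[round(i * last / (n - 1))] for i in range(n)]


def viterbi(
    lattice: Sequence[Sequence],
    target_cost: Callable[[int, object], float],
    joint_cost: Callable[[object, object], float],
) -> Tuple[float, List]:
    """Return (total_cost, path) minimizing the summed target and joint costs.

    lattice[t] holds the candidate frames for target frame t.
    """
    # best[j]: cost of the cheapest path ending at candidate j of the current column
    best = [target_cost(0, c) for c in lattice[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor index in column t-1

    for t in range(1, len(lattice)):
        new_best, pointers = [], []
        for c in lattice[t]:
            costs = [best[k] + joint_cost(p, c) for k, p in enumerate(lattice[t - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(t, c))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    indices = [j]
    for pointers in reversed(back):
        j = pointers[j]
        indices.append(j)
    indices.reverse()
    return total, [lattice[t][j] for t, j in enumerate(indices)]


# Toy usage: numbers stand in for video frames, and the costs are made up.
lattice = [[0, 10], [4, 9], [8, 3]]
total, path = viterbi(
    lattice,
    target_cost=lambda t, c: 0.0,           # every candidate equally good
    joint_cost=lambda a, b: abs(a - b),     # smooth transitions are cheap
)
# total == 2.0 and path == [10, 9, 8]
```

With the target cost held at zero, the search simply prefers the smoothest sequence of frames; a real target cost would trade that smoothness off against how well each candidate matches the target phonetic context.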

  • 11 Assignments