Low latency real-time vocal tract length normalization

  • US 9,165,555 B2
  • Filed: 11/26/2014
  • Issued: 10/20/2015
  • Est. Priority Date: 01/12/2005
  • Status: Expired due to Fees
First Claim

1. A method comprising:

  • performing, for a speaker specific segment of training data:

    generating spectral data representative of the speaker specific segment, the spectral data comprising a plurality of warping factors;

    selecting a first warping factor as a best warping factor from the plurality of warping factors based on a determination made during speech recognition of the speaker specific segment; and

    generating a warped spectral data representation of the spectral data using the first warping factor;

    iteratively carrying out, until a comparison indicates a warping factor difference below a threshold, the operations of:

    generating another warped spectral data representation using another warping factor;

    comparing the other warped spectral data representation to the warped spectral data representation, to yield the comparison; and

    when the other warping factor produces a closer match to the warped spectral data representation, saving the other warping factor as a best warping factor for the speaker specific segment; and

    training a new acoustic model using a warped spectral data representation of all the training data that is generated using the best warping factor for each of the speaker specific segments.
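
The per-segment search in the claim can be illustrated with a minimal Python sketch. It is one possible reading of the control flow, not the patented implementation. Several pieces are assumptions that the claim does not specify: the linear frequency warp in `warp_spectrum`, the Euclidean `distance` used as the "closer match" test, the caller-supplied `reference` spectrum that stands in for the determination made during speech recognition, and the step-halving schedule used as the stopping rule. All function and parameter names are hypothetical, and the final model-training step is not shown.

```python
from typing import List, Sequence


def warp_spectrum(spectrum: Sequence[float], alpha: float) -> List[float]:
    """Linearly warp the frequency axis (assumed warp function).

    Output bin k reads the input at position k * alpha, using linear
    interpolation and clamping to the last bin.
    """
    n = len(spectrum)
    warped = []
    for k in range(n):
        pos = min(k * alpha, n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        warped.append((1.0 - frac) * spectrum[lo] + frac * spectrum[hi])
    return warped


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two spectra (assumed match criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def refine_warping_factor(
    spectrum: Sequence[float],
    reference: Sequence[float],
    candidates: Sequence[float],
    initial_step: float = 0.02,
    threshold: float = 0.005,
) -> float:
    """Pick a first warping factor from a grid, then refine it iteratively.

    `reference` is a hypothetical stand-in for whatever the recognizer
    would score against; the claim does not define it.
    """
    # Select the first "best" factor from the candidate grid.
    best = min(
        candidates,
        key=lambda a: distance(warp_spectrum(spectrum, a), reference),
    )
    best_dist = distance(warp_spectrum(spectrum, best), reference)

    # Try other factors near the current best; stop once the difference
    # between tried factors drops below the threshold.
    step = initial_step
    while step >= threshold:
        for other in (best - step, best + step):
            d = distance(warp_spectrum(spectrum, other), reference)
            if d < best_dist:
                # The other factor gives a closer match, so save it.
                best, best_dist = other, d
        step /= 2.0
    return best
```

In a full pipeline this routine would run once per speaker specific segment, and the resulting factors would then be used to warp all of the training data before the new acoustic model is trained.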
