System and method for time aligning speech
Abstract
A method and system are provided for time aligning speech. Speech data is input representing speech signals from a speaker. An orthographic transcription is input including a plurality of words transcribed from the speech signals. A sentence model is generated indicating a selected order of the words in response to the orthographic transcription. In response to the orthographic transcription, word models are generated associated with respective ones of the words. The orthographic transcription is aligned with the speech data in response to the sentence model, to the word models and to the speech data.
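The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the text above does not say how the aligning step is carried out, so the sketch substitutes a simple monotonic dynamic-programming alignment. The names `LEXICON`, `PHONE_MEANS`, `build_sentence_model`, `build_word_models` and `align` are hypothetical, and the speech data is reduced to one made-up number per frame purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical pronunciation lexicon, formed independently of the speech data.
LEXICON: Dict[str, List[str]] = {"hi": ["h", "ay"], "you": ["y", "uw"]}

# Hypothetical one-dimensional "acoustic" prototype per phone (toy values).
PHONE_MEANS: Dict[str, float] = {"h": 1.0, "ay": 2.0, "y": 3.0, "uw": 4.0}


def build_sentence_model(transcription: str) -> List[str]:
    """Sentence model: the transcribed words in their given order."""
    return transcription.lower().split()


def build_word_models(words: List[str]) -> List[Tuple[int, str]]:
    """Word models: each word expanded into phone states tagged with its word index."""
    return [(i, phone) for i, word in enumerate(words) for phone in LEXICON[word]]


def align(transcription: str, frames: List[float]) -> List[Tuple[str, int, int]]:
    """Return (word, first_frame, last_frame) for each transcribed word."""
    words = build_sentence_model(transcription)
    states = build_word_models(words)
    n_frames, n_states = len(frames), len(states)
    if n_frames < n_states:
        raise ValueError("need at least one frame per phone state")

    inf = float("inf")
    # cost[t][s]: lowest total cost of any in-order path that puts frame t in state s.
    cost = [[inf] * n_states for _ in range(n_frames)]
    # advanced[t][s]: True if the best path entered state s exactly at frame t.
    advanced = [[False] * n_states for _ in range(n_frames)]

    for t in range(n_frames):
        for s in range(n_states):
            local = (frames[t] - PHONE_MEANS[states[s][1]]) ** 2
            if t == 0:
                if s == 0:
                    cost[0][0] = local  # every path starts in the first state
                continue
            stay = cost[t - 1][s]
            advance = cost[t - 1][s - 1] if s > 0 else inf
            best = min(stay, advance)
            if best < inf:
                cost[t][s] = best + local
                advanced[t][s] = advance < stay

    # Backtrace from the final state at the final frame.
    state_of_frame = [0] * n_frames
    s = n_states - 1
    for t in range(n_frames - 1, -1, -1):
        state_of_frame[t] = s
        if advanced[t][s]:
            s -= 1

    # Collapse per-frame phone states into per-word frame spans.
    spans: List[List[int]] = []
    for t, state in enumerate(state_of_frame):
        word_index = states[state][0]
        if spans and spans[-1][0] == word_index:
            spans[-1][2] = t
        else:
            spans.append([word_index, t, t])
    return [(words[w], start, end) for w, start, end in spans]
```

For example, calling `align("hi you", frames)` with a list of toy frame values returns one `(word, first_frame, last_frame)` tuple per transcribed word. Because the word order is fixed by the transcription, the search only has to decide where each word boundary falls in time.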
46 Claims
1. A system for time aligning speech, comprising:
a data interface for inputting speech data representing speech signals from a speaker;
circuitry for inputting an orthographic transcription including a plurality of words transcribed from said speech signals;
circuitry coupled to said inputting circuitry for generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
circuitry coupled to said inputting circuitry for generating word models in response to said orthographic transcription, said word models being associated with respective ones of said words and being generated from pronunciation representations formed independent of said speech data; and
circuitry coupled to said sentence model generating circuitry, to said word model generating circuitry and to said inputting circuitry, for aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for time aligning speech, comprising:
a data interface for inputting speech data representing speech signals from multiple interlocutors;
circuitry for inputting an orthographic transcription including a plurality of words transcribed from said speech signals;
circuitry coupled to said inputting circuitry for generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
circuitry coupled to said inputting circuitry for generating word models in response to said orthographic transcription, said word models being associated with respective ones of said words; and
circuitry coupled to said sentence model generating circuitry, to said word model generating circuitry and to said inputting circuitry, for aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A system for time aligning speech, comprising:
a data interface for inputting speech data representing unscripted speech signals from a speaker;
circuitry for inputting an orthographic transcription including a plurality of words transcribed from said unscripted speech signals;
circuitry coupled to said inputting circuitry for generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
circuitry coupled to said inputting circuitry for generating word models in response to said orthographic transcription, said word models being associated with respective ones of said words; and
circuitry coupled to said sentence model generating circuitry, to said word model generating circuitry and to said inputting circuitry, for aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 18
19. A method of time aligning speech using process circuitry performing the following steps comprising:
inputting speech data representing speech signals from a speaker;
inputting an orthographic transcription including a plurality of words transcribed from said speech signals;
generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
in response to said orthographic transcription, generating word models from pronunciation representations formed independent of said speech data, said word models being associated with respective ones of said words; and
aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28
29. A method of time aligning speech using process circuitry performing the following steps comprising:
inputting speech data representing speech signals from multiple interlocutors;
inputting an orthographic transcription including a plurality of words transcribed from said speech signals;
generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
in response to said orthographic transcription, generating word models associated with respective ones of said words; and
aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 30, 31, 32, 33, 34, 35, 36
37. A method of time aligning speech using process circuitry performing the following steps comprising:
inputting speech data representing unscripted speech signals from a speaker;
inputting an orthographic transcription including a plurality of words transcribed from said unscripted speech signals;
generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
in response to said orthographic transcription, generating word models associated with respective ones of said words; and
aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 38
39. A system for time aligning speech, comprising:
a data interface for inputting speech data representing unscripted speech signals from multiple interlocutors;
circuitry for inputting an orthographic transcription including a plurality of words transcribed from said unscripted speech signals;
circuitry coupled to said inputting circuitry for generating a sentence model indicating a selected order of said words in response to said orthographic transcription;
circuitry coupled to said inputting circuitry for generating word models in response to said orthographic transcription, said word models being associated with respective ones of said words and being generated from pronunciation representations formed independent of said speech data; and
circuitry coupled to said sentence model generating circuitry, to said word model generating circuitry and to said inputting circuitry, for aligning said orthographic transcription with said speech data in response to said sentence model, to said word models and to said speech data.
Dependent claims: 40, 41, 42, 43, 44, 45, 46
Specification