DIALOGUE SPEECH RECOGNITION SYSTEM, DIALOGUE SPEECH RECOGNITION METHOD, AND RECORDING MEDIUM FOR STORING DIALOGUE SPEECH RECOGNITION PROGRAM
Abstract
Disclosed is a dialogue speech recognition system that can be applied more widely because it uses a universal dialogue structure as the condition for recognizing dialogue speech between persons. An acoustic likelihood computation means (701) provides the likelihood that an input speech signal occurs given a phoneme sequence. A linguistic likelihood computation means (702) provides the likelihood that a given word sequence occurs. A maximum likelihood candidate search means (703) uses the likelihoods provided by the acoustic likelihood computation means and the linguistic likelihood computation means to provide, for a speech signal, the word sequence with the maximum likelihood of occurrence. Further, the linguistic likelihood computation means (702) provides different linguistic likelihoods depending on whether or not the speaker who generated the acoustic signal input to the speech recognition means has the turn to speak.
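The interplay of the three means described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the disclosed implementation: the candidate phrases, the probability tables, the `lm_weight` parameter, and all function names are hypothetical, and a real recognizer would search over a lattice rather than a fixed candidate list.

```python
import math

# Hypothetical toy scores; every word and number here is invented for illustration.
# Log-likelihood of the input signal given each candidate's phoneme sequence.
ACOUSTIC_LOGLIK = {"uh-huh": -4.0, "I have": -3.8}

# Two separate word-sequence probability tables, selected by turn information.
LM_WITH_TURN = {"uh-huh": 0.05, "I have": 0.30}     # speaker has the turn to speak
LM_WITHOUT_TURN = {"uh-huh": 0.40, "I have": 0.02}  # speaker does not have the turn


def acoustic_log_likelihood(candidate):
    """Stand-in for the acoustic likelihood computation (701)."""
    return ACOUSTIC_LOGLIK[candidate]


def linguistic_log_likelihood(candidate, has_turn):
    """Stand-in for the linguistic likelihood computation (702).

    Returns a different likelihood depending on whether the speaker has the turn.
    """
    table = LM_WITH_TURN if has_turn else LM_WITHOUT_TURN
    return math.log(table[candidate])


def search_max_likelihood(candidates, has_turn, lm_weight=1.0):
    """Stand-in for the maximum likelihood candidate search (703)."""
    def total(candidate):
        return (acoustic_log_likelihood(candidate)
                + lm_weight * linguistic_log_likelihood(candidate, has_turn))
    return max(candidates, key=total)


candidates = ["uh-huh", "I have"]
print(search_max_likelihood(candidates, has_turn=True))
print(search_max_likelihood(candidates, has_turn=False))
```

With these invented numbers the acoustic scores are nearly tied, so the turn-dependent linguistic likelihood decides the result: a content phrase wins when the speaker has the turn, and a backchannel wins when the speaker does not.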
13 Claims
1. A dialogue speech recognition system comprising:
a speech recognition unit that receives a speech signal of each speaker in a dialog among a plurality of speakers and turn information indicating whether a speaker having generated the speech signal has turn to speak or indicating a probability that the speaker has turn to speak and performs speech recognition for the speech signal, wherein
the speech recognition unit at least includes:
an acoustic likelihood computation unit that provides a likelihood of occurrence of an input speech signal from a given phoneme sequence;
a linguistic likelihood computation unit that provides a likelihood of occurrence of a given word sequence; and
a maximum likelihood candidate search unit that provides a word sequence with a maximum likelihood of occurrence from a speech signal by using the likelihoods provided by the acoustic likelihood computation unit and the linguistic likelihood computation unit, and
the linguistic likelihood computation unit provides different linguistic likelihoods when a speaker having generated a speech signal input to the speech recognition unit has the turn to speak and when not.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
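One way to read claim 1 is as the familiar maximum-likelihood decoding rule with the linguistic term conditioned on the turn information. The notation below is an assumption introduced for illustration and does not appear in the claims: X is the input speech signal, W a candidate word sequence (scored acoustically through its phoneme sequence), and t the turn information.

```latex
\hat{W} = \operatorname*{arg\,max}_{W} \; P(X \mid W)\, P(W \mid t),
\qquad
P(W \mid t = \text{has turn}) \neq P(W \mid t = \text{no turn}) \ \text{in general}
```

Here P(X | W) corresponds to the acoustic likelihood computation unit, P(W | t) to the linguistic likelihood computation unit, and the arg max to the maximum likelihood candidate search unit.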
10. A dialogue speech recognition method comprising:
upon receiving a speech signal of each speaker in a dialog among a plurality of speakers and turn information indicating whether a speaker having generated the speech signal has turn to speak or indicating a probability that the speaker has turn to speak, performing speech recognition for the speech signal;
at time of the speech recognition, performing acoustic likelihood computation that provides a likelihood of occurrence of an input speech signal from a given phoneme sequence;
performing linguistic likelihood computation that provides a likelihood of occurrence of a given word sequence;
performing maximum likelihood candidate search that provides a word sequence with a maximum likelihood of occurrence from a speech signal by using the likelihoods provided by the acoustic likelihood computation and the linguistic likelihood computation; and
at time of the linguistic likelihood computation, providing different linguistic likelihoods when a speaker having generated an input speech signal has the turn to speak and when not.
Dependent claims: 11
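The method claim also allows the turn information to be a probability rather than a yes/no flag. The claim does not state how such a probability enters the linguistic likelihood. The sketch below assumes, purely for illustration, a linear interpolation between a "has turn" model and a "no turn" model; the function name and parameters are hypothetical.

```python
import math


def mixed_linguistic_log_likelihood(p_with_turn, p_without_turn, turn_prob):
    """Blend two word-sequence probabilities by the probability of having the turn.

    Assumption: linear interpolation. This is one plausible choice for
    illustration, not a formula taken from the claims.
    """
    if not 0.0 <= turn_prob <= 1.0:
        raise ValueError("turn_prob must lie in [0, 1]")
    mixed = turn_prob * p_with_turn + (1.0 - turn_prob) * p_without_turn
    return math.log(mixed)


# Hypothetical probabilities of one word sequence under the two models.
print(mixed_linguistic_log_likelihood(0.30, 0.10, turn_prob=0.5))
```

At `turn_prob` equal to 1 or 0 this reduces to the hard-flag case, where exactly one of the two models supplies the likelihood.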
12. A storage medium for storing a dialogue speech recognition program that causes a computer to execute speech recognition processing that, upon receiving a speech signal of each speaker in a dialog among a plurality of speakers and turn information indicating whether a speaker having generated the speech signal has turn to speak or indicating a probability that the speaker has turn to speak, performs speech recognition for the speech signal, wherein
the speech recognition processing at least includes:
acoustic likelihood computation processing that provides a likelihood of occurrence of an input speech signal from a given phoneme sequence;
linguistic likelihood computation processing that provides a likelihood of occurrence of a given word sequence; and
maximum likelihood candidate search processing that provides a word sequence with a maximum likelihood of occurrence from a speech signal by using the likelihoods provided by the acoustic likelihood computation processing and the linguistic likelihood computation processing, and
the linguistic likelihood computation processing provides different linguistic likelihoods when a speaker having generated the speech signal input to the speech recognition unit has the turn to speak and when not.
Dependent claims: 13
Specification