Method for segmenting utterances by using partner's response
Abstract
An apparatus, method and program for dividing a conversational dialog into utterances. The apparatus includes: a computer processor; a word database for storing spellings and pronunciations of words; a grammar database for storing syntactic rules on words; a pause detecting section which detects a pause location in a channel making a main speech among conversational dialogs input in at least two channels; an acknowledgement detecting section which detects an acknowledgement location in a channel not making the main speech; a boundary-candidate extracting section which extracts boundary candidates in the main speech, by extracting pauses existing within a predetermined range before and after a base point that is the acknowledgement location; and a recognizing unit which outputs a word string of the main speech segmented by one of the extracted boundary candidates after dividing the segmented speech into optimal utterances with reference to the word database and grammar database.
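The boundary-candidate extraction described in the abstract can be illustrated with a minimal sketch. It assumes pauses and acknowledgements have already been detected and reduced to timestamps in seconds. The `Pause` type, the `extract_boundary_candidates` function, the use of a pause's start time as its location, and the one-second window on each side are all hypothetical choices made for illustration; the source only says "a predetermined range".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pause:
    """A silent stretch in the main speaker's channel (times in seconds)."""
    start: float
    end: float


def extract_boundary_candidates(pauses, ack_times, before=1.0, after=1.0):
    """Keep pauses whose start lies within [ack - before, ack + after]
    of at least one acknowledgement on the other channel.

    The window sizes are illustrative assumptions, not values from the source.
    """
    candidates = []
    for pause in pauses:
        for ack in ack_times:
            if ack - before <= pause.start <= ack + after:
                candidates.append(pause)
                break  # one matching acknowledgement is enough
    return candidates


# Hypothetical detections: three pauses by the first speaker,
# two acknowledgements ("uh-huh", "yes") by the second speaker.
pauses = [Pause(1.2, 1.5), Pause(3.0, 3.4), Pause(7.8, 8.1)]
acks = [3.2, 8.5]
candidates = extract_boundary_candidates(pauses, acks)
```

In this example the pause at 1.2 s has no acknowledgement nearby and is dropped, while the pauses at 3.0 s and 7.8 s are kept as boundary candidates.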
13 Claims
1. An apparatus for dividing a main speech of a first speaker in a conversational dialog comprising the first speaker and a second speaker into at least one utterance, the apparatus comprising:
a computer processor configured to execute;
a pause detecting section for detecting pauses in the main speech of the first speaker received from a first channel among at least two channels;
an acknowledgement detecting section for detecting acknowledgements in a speech of the second speaker received from a second channel of the at least two channels, wherein the second channel is separate from the first channel;
a boundary-candidate extracting section for extracting boundary candidates in the main speech of the first speaker received from the first channel based, at least in part, on identifying pauses detected by the pause detecting section that are located within a predetermined range before and/or after respective locations of the acknowledgements detected by the acknowledgement detecting section in the speech of the second speaker received from the second channel; and
a recognizing unit for outputting a word string associated with at least one utterance formed by segmenting the main speech of the first speaker received from the first channel according to at least one of the extracted boundary candidates.
Dependent claims: 2, 3
4. A method for dividing a main speech of a first speaker in a conversational dialog comprising the first speaker and a second speaker into at least one utterance, the method comprising the steps of:
detecting pauses in the main speech of the first speaker received from a first channel of a plurality of channels;
detecting acknowledgements in a speech of the second speaker received from a second channel of the plurality of channels, wherein the second channel is separate from the first channel;
extracting boundary candidates from the main speech of the first speaker received from the first channel at least in part by identifying detected pauses that are located within a predetermined range before and after respective locations of the detected acknowledgements detected in the speech of the second speaker received from the second channel; and
outputting a word string associated with at least one utterance formed by segmenting the main speech of the first speaker received from the first channel according to at least one of the extracted boundary candidates.
Dependent claims: 5, 6, 7, 8
9. A computer-readable storage device storing computer-executable instructions that, when executed by at least one processor, perform a method for dividing a main speech of a first speaker in a conversational dialog comprising the first speaker and a second speaker into at least one utterance, the method comprising:
detecting pauses in the main speech of the first speaker received from a first channel of a plurality of channels;
detecting acknowledgements in a speech of the second speaker received from a second channel of the plurality of channels, wherein the second channel is separate from the first channel;
extracting boundary candidates from the main speech of the first speaker received from the first channel at least in part by identifying detected pauses that are located within a predetermined range before and after respective locations of the detected acknowledgements detected in the speech of the second speaker received from the second channel; and
outputting a word string associated with at least one utterance formed by segmenting the main speech of the first speaker received from the first channel according to at least one of the extracted boundary candidates.
Dependent claims: 10, 11, 12, 13
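The final step of the independent claims, outputting a word string per utterance once boundaries have been chosen, might look roughly like the sketch below. It assumes each recognized word carries a start time and that the boundary times have already been selected from the candidates. The `segment_words` function and the sample data are hypothetical, and the sketch deliberately omits the selection of an optimal division using the word and grammar databases that the abstract mentions.

```python
def segment_words(words, boundaries):
    """Split timestamped words into utterances at the given boundary times.

    words: list of (text, start_time_in_seconds), in time order.
    boundaries: boundary times in seconds (assumed already selected).
    Returns one space-joined word string per utterance.
    """
    cuts = sorted(boundaries)
    utterances, current, i = [], [], 0
    for text, start in words:
        # Close the current utterance at every boundary this word has passed.
        while i < len(cuts) and start >= cuts[i]:
            if current:
                utterances.append(" ".join(current))
                current = []
            i += 1
        current.append(text)
    if current:
        utterances.append(" ".join(current))
    return utterances


# Hypothetical recognizer output for the first speaker's channel.
words = [("my", 0.0), ("number", 0.4), ("is", 0.9),
         ("five", 1.6), ("five", 2.0), ("one", 3.1), ("two", 3.5)]
utterances = segment_words(words, [1.3, 2.6])
```

Here the two boundaries split the main speech into three word strings: "my number is", "five five" and "one two".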
Specification