Speech retrieval method, speech retrieval apparatus, and program for speech retrieval apparatus
First Claim
1. A speech retrieval apparatus, comprising:
a keyword acquisition unit configured to acquire a keyword designated by a character string, and a phoneme string or a syllable string, the keyword being stored in a non-transitory computer readable storage medium;
a segment detection unit configured to detect one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword;
an evaluation value calculation unit configured to calculate an evaluation value of each of the one or more coinciding segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more coinciding segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, wherein the phoneme string or the syllable string associated with each of the segments is a phoneme string or a syllable string associated with a segment in which a start and an end of the segment is expanded by a predetermined time; and
a segment output unit configured to output a segment in which the calculated evaluation value exceeds a predetermined threshold.
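The expansion of a coinciding segment by a predetermined time, recited in the evaluation value calculation unit, can be illustrated with a minimal sketch. The names `TimedPhoneme` and `phonemes_in_expanded_segment`, the use of integer milliseconds, and the 100 ms margin are assumptions made for illustration only; the claim does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimedPhoneme:
    """One unit of a phoneme recognition result (hypothetical structure)."""
    symbol: str
    start_ms: int
    end_ms: int


def phonemes_in_expanded_segment(
    phonemes: List[TimedPhoneme],
    seg_start_ms: int,
    seg_end_ms: int,
    margin_ms: int = 100,  # assumed value for the "predetermined time"
) -> List[str]:
    """Collect the phonemes overlapping the segment after both its start
    and its end have been moved outward by margin_ms."""
    lo = max(0, seg_start_ms - margin_ms)
    hi = seg_end_ms + margin_ms
    return [p.symbol for p in phonemes if p.end_ms > lo and p.start_ms < hi]


# Phoneme recognition result: six phonemes of 100 ms each.
phonemes = [
    TimedPhoneme(s, i * 100, (i + 1) * 100)
    for i, s in enumerate(["a", "k", "o", "t", "o", "s"])
]

# A word-level segment detected at 300 to 400 ms.
unexpanded = phonemes_in_expanded_segment(phonemes, 300, 400, margin_ms=0)
expanded = phonemes_in_expanded_segment(phonemes, 300, 400, margin_ms=100)
print(unexpanded)  # ['t']
print(expanded)    # ['o', 't', 'o']
```

Widening the segment in this way lets the evaluation tolerate a mismatch between the word boundaries reported by word speech recognition and the phoneme boundaries reported by phoneme speech recognition.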
Abstract
A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
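The steps of the abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the data structures `Word` and `Phone`, the single-word keyword match, and the evaluation value (one minus a length-normalized edit distance) are assumptions chosen to make the example concrete.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Word:
    """One unit of the word speech recognition result (hypothetical)."""
    text: str
    start_ms: int
    end_ms: int


@dataclass
class Phone:
    """One unit of the phoneme speech recognition result (hypothetical)."""
    symbol: str
    start_ms: int
    end_ms: int


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def retrieve(keyword_text: str, keyword_phones: List[str],
             words: List[Word], phones: List[Phone],
             threshold: float) -> List[Tuple[int, int, float]]:
    """Return (start_ms, end_ms, score) for segments whose score exceeds threshold."""
    hits = []
    for w in words:
        # Detect coinciding segments by comparing character strings.
        if w.text != keyword_text:
            continue
        # Phoneme string recognized inside the coinciding segment.
        recognized = [p.symbol for p in phones
                      if p.start_ms >= w.start_ms and p.end_ms <= w.end_ms]
        # Assumed evaluation value: 1 - normalized edit distance.
        dist = edit_distance(keyword_phones, recognized)
        score = 1.0 - dist / max(len(keyword_phones), len(recognized), 1)
        if score > threshold:
            hits.append((w.start_ms, w.end_ms, score))
    return hits


def phones_from(symbols: str, start_ms: int) -> List[Phone]:
    """Helper: lay out one phone per 100 ms starting at start_ms."""
    return [Phone(s, start_ms + i * 100, start_ms + (i + 1) * 100)
            for i, s in enumerate(symbols)]


# The word recognizer reports "kyoto" twice; the phoneme recognizer
# confirms only the first occurrence.
words = [Word("kyoto", 0, 500), Word("is", 500, 700), Word("kyoto", 700, 1200)]
phones = phones_from("kyoto", 0) + phones_from("iz", 500) + phones_from("tokyo", 700)

hits = retrieve("kyoto", list("kyoto"), words, phones, threshold=0.5)
print(hits)  # [(0, 500, 1.0)]
```

In this sketch the second "kyoto" segment is rejected because its phoneme string differs too much from the keyword's phoneme string, which shows how the phoneme-level evaluation filters out false detections from the word-level comparison.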
16 Claims
1. A speech retrieval apparatus, comprising:
a keyword acquisition unit configured to acquire a keyword designated by a character string, and a phoneme string or a syllable string, the keyword being stored in a non-transitory computer readable storage medium;
a segment detection unit configured to detect one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword;
an evaluation value calculation unit configured to calculate an evaluation value of each of the one or more coinciding segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more coinciding segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, wherein the phoneme string or the syllable string associated with each of the segments is a phoneme string or a syllable string associated with a segment in which a start and an end of the segment is expanded by a predetermined time; and
a segment output unit configured to output a segment in which the calculated evaluation value exceeds a predetermined threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A non-transitory computer readable storage medium comprising a computer readable program for a speech retrieval apparatus, wherein the program causes the speech retrieval apparatus to:
acquire a keyword designated by a character string, and a phoneme string or a syllable string;
detect one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword;
calculate an evaluation value of each of the one or more coinciding segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more coinciding segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, wherein the phoneme string or the syllable string associated with each of the segments is a phoneme string or a syllable string associated with a segment in which a start and an end of the segment is expanded by a predetermined time; and
output a segment in which the calculated evaluation value exceeds a predetermined threshold.
Dependent claims: 13, 14, 15, 16
Specification