Spoken term detection apparatus, method, program, and storage medium
Abstract
A spoken term detection apparatus includes a processor that performs: a feature extraction process that extracts an acoustic feature from speech data accumulated in an accumulation part and stores the extracted acoustic feature in an acoustic feature storage; a first calculation process that calculates a standard score from a similarity between an acoustic feature stored in the acoustic feature storage and an acoustic model stored in an acoustic model storage part; a second calculation process that compares an acoustic model corresponding to an input keyword with the acoustic feature stored in the acoustic feature storage to calculate a score of the keyword; and a retrieval process that retrieves speech data including the keyword from the speech data accumulated in the accumulation part, based on the score of the keyword calculated by the second calculation process and the standard score stored in a standard score storage part.
11 Claims
1. A spoken term detection apparatus, comprising:
a storage unit and a processor, wherein the storage unit includes an accumulation part to accumulate speech data of a retrieval target, an acoustic model storage section to store an acoustic model retaining a characteristic in an acoustic feature space for each unit of speech recognition, an acoustic feature storage to store an acoustic feature extracted from the speech data, and a standard score storage part to store a standard score calculated from a similarity between the acoustic feature and the acoustic model, wherein processing performed by the processor includes a feature extraction process to extract an acoustic feature from speech data accumulated in the accumulation part and store an extracted acoustic feature in the acoustic feature storage, a first calculation process to calculate the standard score from a similarity between an acoustic feature stored in the acoustic feature storage and an acoustic model stored in the acoustic model storage part, an acceptance process to accept an input keyword, a second calculation process to compare an acoustic model corresponding to an accepted keyword with the acoustic feature stored in the acoustic feature storage part to calculate a score of the keyword, and a retrieval process to retrieve speech data including the keyword from speech data accumulated in the accumulation part based on the score of the keyword calculated by the second calculation process and the standard score stored in the standard score storage part, wherein the standard score equates to the highest-likelihood phoneme series.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
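The storage layout and the offline part of claim 1 (feature extraction and the first calculation process) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `StorageUnit` class and its field names, the toy log-energy feature standing in for a real acoustic front end, the one-Gaussian-per-phoneme acoustic model, and the simplification that, with no transition constraints between phonemes, the highest-likelihood phoneme series is scored by taking the best phoneme independently in each frame.

```python
import math
from dataclasses import dataclass, field


@dataclass
class StorageUnit:
    """Hypothetical container mirroring the four storage parts of claim 1."""
    speech_data: dict = field(default_factory=dict)      # accumulation part: id -> samples
    acoustic_model: dict = field(default_factory=dict)   # phoneme -> (mean, variance)
    features: dict = field(default_factory=dict)         # acoustic feature storage
    standard_scores: dict = field(default_factory=dict)  # standard score storage part


def extract_features(samples, frame_len=4):
    """Toy feature (assumption): log energy of each non-overlapping frame."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    frames = [samples[i:i + frame_len] for i in starts]
    return [[math.log(sum(s * s for s in f) / frame_len + 1e-10)] for f in frames]


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def index(store):
    """Run feature extraction and the first calculation once, before any query."""
    for utt_id, samples in store.speech_data.items():
        feats = extract_features(samples)
        store.features[utt_id] = feats
        # Standard score per frame: best log-likelihood over all phoneme models.
        store.standard_scores[utt_id] = [
            max(log_gaussian(f, m, v) for m, v in store.acoustic_model.values())
            for f in feats
        ]


# Illustrative data only: one short utterance and a two-unit model.
STORE = StorageUnit(
    speech_data={"u1": [0.0, 0.1, -0.1, 0.0, 1.0, -1.0, 1.0, -1.0]},
    acoustic_model={"sil": ([-5.0], [4.0]), "a": ([0.0], [1.0])},
)
index(STORE)
```

Because the standard scores depend only on the accumulated speech and the acoustic model, they can be computed and stored before any keyword is accepted, which is what lets the query stage reuse them.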
10. A spoken term detection method of retrieving speech data including an accepted keyword using an acoustic model holding a characteristic in an acoustic feature space for each unit of speech recognition, comprising:
extracting an acoustic feature from accumulated speech data; storing an extracted acoustic feature in an acoustic features storing device; calculating a standard score from a similarity between a stored acoustic feature and an acoustic feature defined by a stored acoustic model; storing the calculated standard score; accepting a keyword; calculating a score of a keyword by comparing an acoustic model corresponding to the keyword with the acoustic feature stored in the acoustic features storing device; and executing a process for retrieving speech data including the keyword from the accumulated speech data, based on a calculated score of the keyword and the standard score, wherein the standard score equates to the highest-likelihood phoneme series.
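The query-time steps of the method (calculating a keyword score and retrieving against the stored standard score) might look like the sketch below. It rests on several assumptions that the claim does not state: the comparison between acoustic model and stored features is represented by a hypothetical precomputed table of per-frame log-likelihoods, the keyword is already expanded into a phoneme sequence, the keyword score is the best left-to-right alignment of that sequence to a window of frames, and retrieval uses the per-frame difference between keyword score and standard score against a threshold. The names `keyword_score`, `search`, `threshold`, and `max_len` are illustrative.

```python
NEG_INF = float("-inf")


def keyword_score(loglik, phonemes):
    """Best left-to-right alignment score; each phoneme covers at least one frame.

    loglik is a list with one dict per frame, mapping phoneme -> log-likelihood.
    """
    if not phonemes or len(loglik) < len(phonemes):
        return NEG_INF
    dp = [NEG_INF] * len(phonemes)
    dp[0] = loglik[0][phonemes[0]]
    for frame in loglik[1:]:
        prev = dp
        dp = [NEG_INF] * len(phonemes)
        for j, ph in enumerate(phonemes):
            # Either stay in phoneme j or advance from phoneme j - 1.
            best = max(prev[j], prev[j - 1] if j > 0 else NEG_INF)
            if best > NEG_INF:
                dp[j] = best + frame[ph]
    return dp[-1]


def search(phonemes, loglik_store, standard_store, threshold, max_len):
    """Return (utterance id, start, end, normalized score) for accepted windows."""
    hits = []
    if not phonemes:
        return hits
    for utt_id, loglik in loglik_store.items():
        std = standard_store[utt_id]
        for start in range(len(loglik)):
            last_end = min(len(loglik), start + max_len)
            for end in range(start + len(phonemes), last_end + 1):
                kw = keyword_score(loglik[start:end], phonemes)
                # Normalize by the stored standard score over the same frames.
                normalized = (kw - sum(std[start:end])) / (end - start)
                if normalized >= threshold:
                    hits.append((utt_id, start, end, normalized))
    return hits


# Illustrative table only: four frames, two phonemes.
LOGLIK_STORE = {"u1": [
    {"a": -1.0, "k": -5.0},
    {"a": -5.0, "k": -1.0},
    {"a": -6.0, "k": -1.0},
    {"a": -1.0, "k": -5.0},
]}
# Standard score per frame: the best phoneme in that frame (computed before the query).
STANDARD_STORE = {u: [max(f.values()) for f in frames]
                  for u, frames in LOGLIK_STORE.items()}
HITS = search(["k", "a"], LOGLIK_STORE, STANDARD_STORE, threshold=-0.5, max_len=4)
```

Since the standard score is an upper bound on any keyword alignment over the same frames, the normalized score in this sketch is never positive, so one threshold can be applied across utterances whose raw likelihoods differ in scale.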
11. A computer-readable storage medium storing a program to be executed by a computer, wherein
the program is a program to be executed by a computer in which speech data is accumulated by an accumulation device and an acoustic model retaining a characteristic in an acoustic feature space for each unit of speech recognition is stored in an acoustic features storing device, and the program allows the computer to execute:
an extraction process for extracting an acoustic feature from the accumulated speech data; a first calculation process for calculating a standard score from a similarity between the extracted acoustic feature and an acoustic feature defined by the stored acoustic model; a second calculation process for comparing an acoustic model corresponding to an accepted keyword with the acoustic feature stored in the acoustic features storing device to calculate a score of the keyword; and a retrieval process for retrieving speech data including the keyword from speech data accumulated in the accumulation device based on the score of the keyword calculated by the second calculation process and the calculated standard score, wherein the standard score equates to the highest-likelihood phoneme series.
Specification