Voice recognition device and voice recognition method
First Claim
1. A non-transitory computer-readable recording medium having recorded thereon a computer program for voice recognition that causes a computer to execute a process comprising:
extracting, from a first voice signal of a user, a first string of phonemes included in the first voice signal;
determining whether or not any keyword among a plurality of registered keywords stored in a memory is detected in the first string;
when any keyword is detected in the first string, outputting information representing the detected keyword;
when any keyword is not detected, storing the first string;
extracting, from a second voice signal of the user, a second string of phonemes included in the second voice signal;
determining whether or not any keyword among the plurality of registered keywords is detected in the second string;
storing the second string when any keyword is not detected in the second string;
extracting a string of common phonemes from the first string and the second string;
calculating, for each of the plurality of registered keywords, a first degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes; and
selecting, among the plurality of keywords, a prescribed number of keywords based on the first degree of similarity for each keyword,
wherein determination of whether or not any keyword is detected in the first string includes:
calculating, for each of the plurality of registered keywords, a second degree of similarity between a string of phonemes corresponding to the keyword and the first string of phonemes based on a number of coincident phonemes between the first string of phonemes and the string of phonemes corresponding to the keyword, a number of phonemes that are included in the string of phonemes corresponding to the keyword but not included in the first string of phonemes, and a number of phonemes that are included in the string of phonemes corresponding to the keyword and are different from phonemes at corresponding positions in the first string of phonemes; and
determining that, when a maximum value among the second degrees of similarity is larger than a predetermined threshold value, the keyword corresponding to the maximum value is detected in the first string.
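The detection step of the claim can be illustrated with a short sketch. The claim names three counts (coincident phonemes, keyword phonemes missing from the spoken string, and keyword phonemes that differ at corresponding positions) but the text here gives no formula, so the ratio `100 * C / (C + D + S)`, the minimum-edit alignment used to obtain the counts, and all function names and phoneme symbols below are assumptions made for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple


def align_counts(keyword: Sequence[str], spoken: Sequence[str]) -> Tuple[int, int, int]:
    """Return (C, D, S) from a minimum-edit alignment of keyword against spoken.

    C: keyword phonemes that coincide with the spoken string
    D: keyword phonemes missing from the spoken string
    S: keyword phonemes replaced by a different phoneme at the aligned position
    """
    n, m = len(keyword), len(spoken)
    # cost[i][j] = minimum number of edits aligning keyword[:i] with spoken[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + int(keyword[i - 1] != spoken[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back through the table and classify every keyword phoneme.
    c = d = s = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + int(keyword[i - 1] != spoken[j - 1]):
            if keyword[i - 1] == spoken[j - 1]:
                c += 1
            else:
                s += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            d += 1  # keyword phoneme has no counterpart in the spoken string
            i -= 1
        else:
            j -= 1  # extra spoken phoneme; not counted in the assumed score
    return c, d, s


def similarity(keyword: Sequence[str], spoken: Sequence[str]) -> float:
    """Assumed score in percent: 100 * C / (C + D + S)."""
    c, d, s = align_counts(keyword, spoken)
    total = c + d + s
    return 100.0 * c / total if total else 0.0


def detect_keyword(spoken: Sequence[str],
                   keywords: Dict[str, Sequence[str]],
                   threshold: float) -> Optional[str]:
    """Return the best-scoring keyword if its score exceeds threshold, else None."""
    if not keywords:
        return None
    best = max(keywords, key=lambda name: similarity(keywords[name], spoken))
    return best if similarity(keywords[best], spoken) > threshold else None
```

With a hypothetical keyword `["j", "i", "t", "a", "k", "u"]` and a spoken string in which `t` was recognized as `d`, the counts are C = 5, D = 0, S = 1, so the assumed score is 5/6 of 100. The sketch aligns each keyword against the whole spoken string; a practical detector would presumably also have to cope with surrounding words.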
Abstract
A voice recognition device extracts, from a first voice signal of a user, a first string of phonemes included in the first voice signal, extracts, from a second voice signal of the user, a second string of phonemes included in the second voice signal, extracts a string of common phonemes from the first string and the second string, calculates, for each of a plurality of registered keywords, a degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes, and selects, among the plurality of keywords, a prescribed number of keywords based on the degree of similarity for each keyword.
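The fallback path summarized in the abstract (common phonemes of two utterances, a similarity per keyword, then a prescribed number of keywords) might look roughly like the following. This is a sketch under stated assumptions: the abstract does not say how the common string is extracted or how the degree of similarity is computed, so a longest common subsequence is used for both, and the names `common_phonemes` and `select_candidates` are hypothetical.

```python
from typing import Dict, List, Sequence


def common_phonemes(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Longest common subsequence of two phoneme strings (assumed extraction rule)."""
    n, m = len(a), len(b)
    # table[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Rebuild one longest common subsequence from the table.
    out: List[str] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def select_candidates(first: Sequence[str],
                      second: Sequence[str],
                      keywords: Dict[str, Sequence[str]],
                      count: int) -> List[str]:
    """Rank keywords against the common string and keep the top `count`."""
    common = common_phonemes(first, second)
    scores = {
        # Assumed score: share of the keyword's phonemes found, in order, in the common string.
        name: (len(common_phonemes(phonemes, common)) / len(phonemes)) if phonemes else 0.0
        for name, phonemes in keywords.items()
    }
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:count]
```

The intuition is that phonemes which survive in both utterances are the ones most likely to belong to the word the user intended, so ranking keywords against that shared string gives a short list to present back to the user.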
23 Claims
1. A non-transitory computer-readable recording medium having recorded thereon a computer program for voice recognition that causes a computer to execute a process comprising:
extracting, from a first voice signal of a user, a first string of phonemes included in the first voice signal;
determining whether or not any keyword among a plurality of registered keywords stored in a memory is detected in the first string;
when any keyword is detected in the first string, outputting information representing the detected keyword;
when any keyword is not detected, storing the first string;
extracting, from a second voice signal of the user, a second string of phonemes included in the second voice signal;
determining whether or not any keyword among the plurality of registered keywords is detected in the second string;
storing the second string when any keyword is not detected in the second string;
extracting a string of common phonemes from the first string and the second string;
calculating, for each of the plurality of registered keywords, a first degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes; and
selecting, among the plurality of keywords, a prescribed number of keywords based on the first degree of similarity for each keyword,
wherein determination of whether or not any keyword is detected in the first string includes:
calculating, for each of the plurality of registered keywords, a second degree of similarity between a string of phonemes corresponding to the keyword and the first string of phonemes based on a number of coincident phonemes between the first string of phonemes and the string of phonemes corresponding to the keyword, a number of phonemes that are included in the string of phonemes corresponding to the keyword but not included in the first string of phonemes, and a number of phonemes that are included in the string of phonemes corresponding to the keyword and are different from phonemes at corresponding positions in the first string of phonemes; and
determining that, when a maximum value among the second degrees of similarity is larger than a predetermined threshold value, the keyword corresponding to the maximum value is detected in the first string.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A voice recognition device comprising:
a memory configured to store a plurality of registered keywords; and
a processor configured to:
extract, from a first voice signal of a user, a first string of phonemes included in the first voice signal;
determine whether or not any keyword among a plurality of registered keywords is detected in the first string;
when any keyword is detected in the first string, output information representing the detected keyword;
when any keyword is not detected in the first string, store the first string;
extract, from a second voice signal of the user, a second string of phonemes included in the second voice signal;
determine whether or not any keyword among a plurality of registered keywords is detected in the second string;
store the second string when any keyword is not detected in the second string;
extract a string of common phonemes from the first string and the second string;
calculate, for each of the plurality of registered keywords, a first degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes; and
select, among the plurality of keywords, a prescribed number of keywords based on the first degree of similarity for each keyword,
wherein the processor for the determination of whether or not any keyword is detected executes to:
calculate, for each of the plurality of registered keywords, a second degree of similarity between a string of phonemes corresponding to the keyword and the first string of phonemes based on a number of coincident phonemes between the first string of phonemes and the string of phonemes corresponding to the keyword, a number of phonemes that are included in the string of phonemes corresponding to the keyword but not included in the first string of phonemes, and a number of phonemes that are included in the string of phonemes corresponding to the keyword and are different from phonemes at corresponding positions in the first string of phonemes; and
determine that, when a maximum value among the second degrees of similarity is larger than a predetermined threshold value, the keyword corresponding to the maximum value is detected in the first string.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A voice recognition method comprising:
extracting, from a first voice signal of a user, a first string of phonemes included in the first voice signal;
determining whether or not any keyword among a plurality of registered keywords stored in a memory is detected in the first string;
when any keyword is detected in the first string, outputting information representing the detected keyword;
when any keyword is not detected, storing the first string;
extracting, from a second voice signal of the user, a second string of phonemes included in the second voice signal;
determining whether or not any keyword among a plurality of registered keywords is detected in the second string;
storing the second string when any keyword is not detected in the second string;
extracting a string of common phonemes from the first string and the second string; and
calculating, with respect to each of the plurality of registered keywords, a first degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes and, among the plurality of keywords, selecting a prescribed number of keywords based on the first degree of similarity for each keyword,
wherein determination of whether or not any keyword is detected in the first string includes:
calculating, for each of the plurality of registered keywords, a second degree of similarity between a string of phonemes corresponding to the keyword and the first string of phonemes based on a number of coincident phonemes between the first string of phonemes and the string of phonemes corresponding to the keyword, a number of phonemes that are included in the string of phonemes corresponding to the keyword but not included in the first string of phonemes, and a number of phonemes that are included in the string of phonemes corresponding to the keyword and are different from phonemes at corresponding positions in the first string of phonemes; and
determining that, when a maximum value among the second degrees of similarity is larger than a predetermined threshold value, the keyword corresponding to the maximum value is detected in the first string.
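The order of operations in the method claim (detect, otherwise store, then fall back to candidates once two undetected strings are available) can be sketched as a small stateful object. The class name `RecognitionFlow`, the injected `detect` and `suggest` callables, the status labels, and the choice to clear stored strings after a successful detection are all assumptions for illustration; the claim itself only fixes the sequence of steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Phonemes = Sequence[str]


@dataclass
class RecognitionFlow:
    # detect: returns the detected keyword for a phoneme string, or None.
    detect: Callable[[Phonemes], Optional[str]]
    # suggest: returns candidate keywords for two undetected phoneme strings.
    suggest: Callable[[Phonemes, Phonemes], List[str]]
    # Phoneme strings in which no keyword was detected.
    stored: List[List[str]] = field(default_factory=list)

    def handle(self, phonemes: Phonemes) -> Tuple[str, List[str]]:
        """Process one utterance and return (status, keywords)."""
        keyword = self.detect(phonemes)
        if keyword is not None:
            # Assumption: earlier failed attempts are discarded once detection succeeds.
            self.stored.clear()
            return ("detected", [keyword])
        self.stored.append(list(phonemes))
        if len(self.stored) < 2:
            return ("stored", [])
        # Two undetected strings are available: fall back to candidate selection.
        first, second = self.stored[-2], self.stored[-1]
        return ("candidates", self.suggest(first, second))
```

Passing the detection and candidate-selection steps in as callables keeps the control flow separate from whichever similarity measures are actually used.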
Specification