Apparatus, method, and medium for generating grammar network for use in speech recognition and dialogue speech recognition
Abstract
A method, apparatus, and medium for generating a grammar network for speech recognition, and a method, apparatus, and medium for dialogue speech recognition employing the same, are provided. The apparatus for generating a grammar network for speech recognition includes: a dialogue history storage unit storing a dialogue history between a system and a user; a semantic map formed by clustering words forming each dialogue sentence included in a dialogue sentence corpus depending on semantic correlation, and generating a first candidate group formed of a plurality of words having the semantic correlation extracted for each word forming a dialogue sentence provided from the dialogue history storage unit; an acoustic map formed by clustering words forming each dialogue sentence included in the dialogue sentence corpus depending on acoustic similarity, and generating a second candidate group formed of a plurality of words having an acoustic similarity extracted for each word forming the dialogue sentence provided from the dialogue history storage unit and each word of the first candidate group; and a grammar network construction unit constructing a grammar network by combining the first candidate group and the second candidate group.
24 Claims
1. An apparatus for generating a grammar network for speech recognition comprising:
a dialogue history storage unit storing a dialogue history between a system and a user;
a semantic map formed by clustering words forming each dialogue sentence included in a dialogue sentence corpus depending on semantic correlation, and generating a first candidate group formed of a plurality of words having the semantic correlation extracted for each word forming a dialogue sentence provided from the dialogue history storage unit;
an acoustic map formed by clustering words forming each dialogue sentence included in the dialogue sentence corpus depending on acoustic similarity, and generating a second candidate group formed of a plurality of words having an acoustic similarity extracted for each word forming the dialogue sentence provided from the dialogue history storage unit and each word of the first candidate group; and
a grammar network construction unit constructing a grammar network by combining words included in the first candidate group and the words included in the second candidate group.
6. A method of generating a grammar network for speech recognition comprising:
forming a semantic map by clustering words forming each dialogue sentence included in a dialogue sentence corpus depending on semantic correlation;
forming an acoustic map by clustering words forming each dialogue sentence included in the dialogue sentence corpus depending on acoustic similarity;
activating the semantic map and generating a first candidate group formed of a plurality of words having the semantic correlation extracted for each word forming a dialogue sentence included in a dialogue history performed between a system and a user;
activating the acoustic map and generating a second candidate group formed of a plurality of words having an acoustic similarity extracted for each word forming the dialogue sentence included in the dialogue history and each word of the first candidate group; and
generating a grammar network by combining the first candidate group and the second candidate group.
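For illustration only, the steps recited in claim 6 can be sketched in Python. This is a minimal sketch under loud assumptions, not the patented implementation: sentence-level co-occurrence stands in for semantic correlation, spelling similarity from `difflib` stands in for acoustic similarity, and the grammar network is reduced to a flat set of words. All function names and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def build_semantic_map(corpus):
    """Map each word to the words it shares a corpus sentence with.

    Assumption: co-occurrence is only a stand-in for semantic correlation.
    """
    semantic_map = defaultdict(set)
    for sentence in corpus:
        words = set(sentence.lower().split())
        for a, b in combinations(words, 2):
            semantic_map[a].add(b)
            semantic_map[b].add(a)
    return semantic_map


def build_acoustic_map(corpus, threshold=0.7):
    """Map each word to similarly spelled words in the corpus vocabulary.

    Assumption: spelling similarity is only a stand-in for acoustic
    similarity; a real system would compare pronunciations.
    """
    vocab = sorted({w for s in corpus for w in s.lower().split()})
    acoustic_map = defaultdict(set)
    for a, b in combinations(vocab, 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            acoustic_map[a].add(b)
            acoustic_map[b].add(a)
    return acoustic_map


def generate_grammar_network(history_sentence, semantic_map, acoustic_map):
    """Return (first group, second group, combined word set)."""
    history_words = set(history_sentence.lower().split())

    # First candidate group: semantic neighbours of each history word.
    first = set()
    for word in history_words:
        first |= semantic_map.get(word, set())

    # Second candidate group: acoustic neighbours of each history word
    # and of each word in the first candidate group.
    second = set()
    for word in history_words | first:
        second |= acoustic_map.get(word, set())

    # Simplification: the "network" here is just the union of both groups.
    return first, second, first | second


corpus = ["book flight to boston", "cook a light dinner"]
first, second, network = generate_grammar_network(
    "book flight", build_semantic_map(corpus), build_acoustic_map(corpus)
)
```

The second group is looked up from both the history words and the first group, mirroring the claim language "each word forming the dialogue sentence ... and each word of the first candidate group".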
10. An apparatus for speech recognition comprising:
a feature extraction unit extracting features from a user's voice and generating a feature vector string;
a grammar network generation unit generating a grammar network by activating a semantic map and an acoustic map by using contents of a dialogue most recently spoken, whenever the user speaks;
a loading unit loading the grammar network generated by the grammar network generation unit; and
a searching unit searching the grammar network loaded in the loading unit, by using the feature vector string, and generating a candidate recognition sentence formed of a word string matching the feature vector string.
16. A method of speech recognition comprising:
extracting features from a user's voice and generating a feature vector string;
generating a grammar network by activating a semantic map and an acoustic map by using contents of a dialogue most recently spoken, whenever the user speaks;
loading the grammar network; and
searching the loaded grammar network, by using the feature vector string, and generating a candidate recognition sentence formed of a word string matching the feature vector string.
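As a rough illustration of the ordering recited in claim 16, the following Python sketch regenerates and loads a grammar network on every user turn before searching it. It is a skeleton under stated assumptions, not the claimed apparatus: the class `DialogueRecognizer` and its three injected callables are hypothetical, the network is modelled as a plain set of words, and "the dialogue most recently spoken" is approximated by the last recognized sentence.

```python
from typing import Callable, List, Sequence, Set

FeatureVectors = List[List[float]]


class DialogueRecognizer:
    """Hypothetical per-turn recognition loop following the claim order."""

    def __init__(
        self,
        extract_features: Callable[[Sequence[float]], FeatureVectors],
        generate_network: Callable[[str], Set[str]],
        search: Callable[[Set[str], FeatureVectors], str],
    ) -> None:
        self.extract_features = extract_features
        self.generate_network = generate_network
        self.search = search
        self.history: List[str] = []          # recognized sentences so far
        self.loaded_network: Set[str] = set()  # network currently loaded

    def recognize(self, audio: Sequence[float]) -> str:
        # Step 1: extract features and build the feature vector string.
        features = self.extract_features(audio)

        # Step 2: generate a grammar network from the most recent dialogue
        # (an empty string on the first turn, by assumption).
        most_recent = self.history[-1] if self.history else ""
        network = self.generate_network(most_recent)

        # Step 3: load the freshly generated network.
        self.loaded_network = network

        # Step 4: search the loaded network with the feature vector string.
        sentence = self.search(self.loaded_network, features)
        self.history.append(sentence)
        return sentence
```

Passing the three stages in as callables keeps the sketch independent of any particular feature extractor, map implementation, or decoder, none of which are specified at this point in the text.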
20. At least one computer readable medium storing instructions that control at least one processor for executing a method of generating a grammar network for speech recognition, wherein the method comprises:
forming a semantic map by clustering words forming each dialogue sentence included in a dialogue sentence corpus depending on semantic correlation;
forming an acoustic map by clustering words forming each dialogue sentence included in the dialogue sentence corpus depending on acoustic similarity;
activating the semantic map and generating a first candidate group formed of a plurality of words having the semantic correlation extracted for each word forming a dialogue sentence included in a dialogue history performed between a system and a user;
activating the acoustic map and generating a second candidate group formed of a plurality of words having an acoustic similarity extracted for each word forming the dialogue sentence included in the dialogue history and each word of the first candidate group; and
generating a grammar network by combining the first candidate group and the second candidate group.
21. At least one computer readable medium storing instructions that control at least one processor for executing a method of speech recognition, wherein the method comprises:
extracting features from a user's voice and generating a feature vector string;
generating a grammar network by activating a semantic map and an acoustic map by using contents of a dialogue most recently spoken, whenever the user speaks;
loading the grammar network; and
searching the loaded grammar network, by using the feature vector string, and generating a candidate recognition sentence formed of a word string matching the feature vector string.
22. A method of speech recognition comprising:
extracting features from a user's voice and generating a feature vector string;
generating a grammar network by activating a semantic map and an acoustic map by using contents of a dialogue spoken by a user; and
searching the grammar network, by using the feature vector string, and generating a candidate recognition sentence formed of a word string matching the feature vector string.
23. The method of claim 22, wherein the generation of a grammar network comprises combining a first candidate group formed by activation of the semantic map and a second candidate group formed by activation of the acoustic map.
24. At least one computer readable medium storing instructions that control at least one processor for executing a method of speech recognition, wherein the method comprises:
extracting features from a user's voice and generating a feature vector string;
generating a grammar network by activating a semantic map and an acoustic map by using contents of a dialogue spoken by a user; and
searching the grammar network, by using the feature vector string, and generating a candidate recognition sentence formed of a word string matching the feature vector string.
Specification