Apparatus, method, and medium for dialogue speech recognition using topic domain detection
First Claim
1. An apparatus for dialogue speech recognition using topic domain detection, comprising:
a forward search module to perform a forward search to create a word lattice based on a feature vector, which is extracted from an input voice signal, with reference to a global language model database, a pronunciation dictionary database and an acoustic model database, which have been previously established;
a topic-domain-detection module to detect a topic domain during run-time of a speech recognition procedure from among one or more candidate topic domains, by inferring a topic based on meanings of vocabularies contained in the word lattice using information of the word lattice created as a result of the forward search;
a backward-decoding module to perform a backward decoding relative to the detected topic domain with reference to a specific topic domain language model database, which has been previously established, thereby outputting a speech recognition result for an input voice signal in the form of a text; and
a text-information-management module to store and manage information including information related to the topic domain of the output text which is output by the backward-decoding module, and history information which includes a previous topic domain detected relative to a previous output text obtained as a result of a previous backward decoding of a previous dialogue, and
wherein, the topic-domain-detection module further detects the topic domain by determining whether one of the one or more candidate topic domains is the same as a topic domain which is the previous topic domain detected during run-time, using the history information which includes the previous topic domain detected,
wherein the topic-domain-detection module includes;
a stop-word-removal module to remove stop words, which are not concerned with the topic, among vocabularies forming the word lattice;
a topic domain distance calculation module, which receives the word lattice, in which the stop words have been removed, to calculate a distance for each of the one or more candidate topic domains based on the vocabularies contained in the word lattice, and receives history information including the previous output text from the text-information-management module to calculate the distance for each of the one or more candidate topic domains, and calculates the distance for each of the one or more candidate topic domains according to a plurality of probability factors,
wherein for a first factor, a higher probability weight is given to a candidate topic domain if it is the same as the previous topic domain detected, and a lower probability weight is given to a candidate topic domain if it is different from the previous topic domain detected,
wherein for a second factor, a higher probability weight is given to a candidate topic domain in accordance with an increase in a frequency of topic words supporting the candidate topic domain among vocabularies forming the word lattice, and
the first factor and second factor are obtained during run-time of the speech recognition procedure.
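The stop-word removal and two-factor distance calculation recited in the claim can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the stop-word set, the topic word lists, the weights 0.7 and 0.3, the add-one smoothing, and the choice to express the distance as a negative log of the product of the two factors. The function names `remove_stop_words`, `topic_distances`, and `detect_topic_domain` are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical data: these word lists and weights are illustrative assumptions.
STOP_WORDS = {"the", "a", "is", "to", "i", "what", "will", "be", "in"}
TOPIC_WORDS = {
    "weather": {"rain", "sunny", "temperature", "forecast", "umbrella"},
    "travel": {"flight", "hotel", "ticket", "airport", "forecast"},
}
SAME_TOPIC_WEIGHT = 0.7   # assumed first-factor weight: candidate equals previous domain
OTHER_TOPIC_WEIGHT = 0.3  # assumed first-factor weight: candidate differs from it
NO_HISTORY_WEIGHT = 0.5   # assumed neutral weight when no previous domain exists


def remove_stop_words(lattice_words):
    """Drop words that carry no topic information."""
    return [w for w in lattice_words if w.lower() not in STOP_WORDS]


def topic_distances(lattice_words, previous_domain=None):
    """Return a distance per candidate domain (smaller means closer)."""
    content = remove_stop_words(lattice_words)
    counts = Counter(w.lower() for w in content)
    distances = {}
    for domain, vocab in TOPIC_WORDS.items():
        # First factor: favor the domain detected for the previous utterance.
        if previous_domain is None:
            history = NO_HISTORY_WEIGHT
        elif domain == previous_domain:
            history = SAME_TOPIC_WEIGHT
        else:
            history = OTHER_TOPIC_WEIGHT
        # Second factor: grows with the number of lattice words supporting the domain
        # (add-one smoothing keeps it strictly positive).
        support = sum(c for w, c in counts.items() if w in vocab)
        frequency = (support + 1) / (len(content) + len(TOPIC_WORDS))
        distances[domain] = -math.log(history * frequency)
    return distances


def detect_topic_domain(lattice_words, previous_domain=None):
    """Pick the candidate domain with the smallest distance."""
    distances = topic_distances(lattice_words, previous_domain)
    return min(distances, key=distances.get)


words = ["what", "is", "the", "forecast", "rain", "umbrella"]
print(detect_topic_domain(words))            # no history available
print(detect_topic_domain(words, "travel"))  # previous utterance was about travel
```

With these assumed weights, the history factor can outweigh a moderate advantage in topic-word frequency, which is the trade-off the two factors are meant to express; a real system would tune or learn such weights.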
1 Assignment
0 Petitions
Abstract
An apparatus, method, and medium for dialogue speech recognition using topic domain detection are disclosed. An apparatus includes a forward search module performing a forward search in order to create a word lattice similar to a feature vector, which is extracted from an input voice signal, with reference to a global language model database, a pronunciation dictionary database and an acoustic model database, which have been previously established, a topic-domain-detection module detecting a topic domain by inferring a topic based on meanings of vocabularies contained in the word lattice using information of the word lattice created as a result of the forward search, and a backward-decoding module performing a backward decoding of the detected topic domain with reference to a specific topic domain language model database, which has been previously established, thereby outputting a speech recognition result for an input voice signal in text form. Accuracy and efficiency for a dialogue sentence are improved.
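The flow described in the abstract (forward search, topic domain detection, backward decoding, then storing the result as history) can be outlined with a toy Python sketch. It rests on strong simplifying assumptions that are not in the patent: the word lattice is reduced to a list of time slots mapping candidate words to first-pass log scores, the topic-specific language model is a unigram table, the unseen-word penalty of -10.0 is arbitrary, and the second pass is a simple per-slot rescoring walked from the last slot to the first rather than a real backward decoder. The names `TextInfoManager` and `recognize` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TextInfoManager:
    """Stores (topic domain, output text) pairs for earlier utterances."""
    history: list = field(default_factory=list)

    def previous_domain(self):
        return self.history[-1][0] if self.history else None

    def store(self, domain, text):
        self.history.append((domain, text))


def recognize(lattice, detect_domain, topic_lms, manager, unseen_penalty=-10.0):
    """Second pass over an already-built lattice.

    lattice:       list of slots, each a dict {word: first-pass log score}
    detect_domain: callable(words, previous_domain) -> domain name
    topic_lms:     dict {domain: {word: log probability}} (toy unigram models)
    """
    # Topic detection uses the lattice vocabulary plus the stored history.
    words = [w for slot in lattice for w in slot]
    domain = detect_domain(words, manager.previous_domain())
    lm = topic_lms[domain]

    # Toy stand-in for backward decoding: rescore each slot with the
    # topic-specific model, visiting slots from the end of the utterance.
    chosen = []
    for slot in reversed(lattice):
        best = max(slot, key=lambda w: slot[w] + lm.get(w, unseen_penalty))
        chosen.append(best)
    text = " ".join(reversed(chosen))

    # Keep the result so the next utterance can use it as history.
    manager.store(domain, text)
    return domain, text


lattice = [
    {"book": -1.0, "look": -0.9},
    {"a": -0.1},
    {"flight": -1.2, "light": -1.0},
]
topic_lms = {"travel": {"book": -0.5, "a": -0.2, "flight": -0.5}}
manager = TextInfoManager()
result = recognize(lattice, lambda words, prev: "travel", topic_lms, manager)
print(result)
```

In this toy example the first-pass scores slightly prefer "look" and "light", and the assumed travel-domain model shifts the choice to "book" and "flight", which is the kind of correction the topic-specific second pass is intended to provide.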
39 Citations
18 Claims
1. An apparatus for dialogue speech recognition using topic domain detection, comprising:
a forward search module to perform a forward search to create a word lattice based on a feature vector, which is extracted from an input voice signal, with reference to a global language model database, a pronunciation dictionary database and an acoustic model database, which have been previously established;
a topic-domain-detection module to detect a topic domain during run-time of a speech recognition procedure from among one or more candidate topic domains, by inferring a topic based on meanings of vocabularies contained in the word lattice using information of the word lattice created as a result of the forward search;
a backward-decoding module to perform a backward decoding relative to the detected topic domain with reference to a specific topic domain language model database, which has been previously established, thereby outputting a speech recognition result for an input voice signal in the form of a text; and
a text-information-management module to store and manage information including information related to the topic domain of the output text which is output by the backward-decoding module, and history information which includes a previous topic domain detected relative to a previous output text obtained as a result of a previous backward decoding of a previous dialogue, and
wherein, the topic-domain-detection module further detects the topic domain by determining whether one of the one or more candidate topic domains is the same as a topic domain which is the previous topic domain detected during run-time, using the history information which includes the previous topic domain detected,
wherein the topic-domain-detection module includes;
a stop-word-removal module to remove stop words, which are not concerned with the topic, among vocabularies forming the word lattice;
a topic domain distance calculation module, which receives the word lattice, in which the stop words have been removed, to calculate a distance for each of the one or more candidate topic domains based on the vocabularies contained in the word lattice, and receives history information including the previous output text from the text-information-management module to calculate the distance for each of the one or more candidate topic domains, and calculates the distance for each of the one or more candidate topic domains according to a plurality of probability factors,
wherein for a first factor, a higher probability weight is given to a candidate topic domain if it is the same as the previous topic domain detected, and a lower probability weight is given to a candidate topic domain if it is different from the previous topic domain detected,
wherein for a second factor, a higher probability weight is given to a candidate topic domain in accordance with an increase in a frequency of topic words supporting the candidate topic domain among vocabularies forming the word lattice, and
the first factor and second factor are obtained during run-time of the speech recognition procedure.
Dependent claims: 2, 3, 4, 5, 6
7. A method of dialogue speech recognition using topic domain detection, comprising:
performing a forward search to create a word lattice based on a feature vector, which is extracted from an input voice signal, with reference to a global language model database, a pronunciation dictionary database and an acoustic model database, which have been previously established;
detecting a topic domain during run-time of a speech recognition procedure from among one or more candidate topic domains, by inferring a topic based on meanings of vocabularies contained in the word lattice using information of the word lattice created as a result of the forward search; and
performing a backward decoding relative to the detected topic domain with reference to a specific topic domain language model database, which has been previously established, thereby outputting a speech recognition result for an input voice signal in the form of a text,
wherein, the detecting a topic domain further comprises determining whether one of the one or more candidate topic domains is the same as a topic domain which is the previous topic domain detected during run-time, relative to a previous output text obtained as a result of a previous backward decoding of a previous dialogue, using history information which includes the previous topic domain detected,
wherein detecting the topic domain includes;
removing stop words, which have no concern with the topic, among vocabularies forming the word lattice;
calculating a distance for each of the one or more candidate topic domains based on the vocabularies contained in the word lattice by receiving the word lattice, in which the stop words have been removed,
wherein the calculating the distance for each of the one or more candidate topic domains comprises receiving history information including the previous output text, to calculate the distance for each of the one or more candidate topic domains, and calculating the distance for each of the one or more candidate topic domains according to a plurality of probability factors,
wherein for a first factor, a higher probability weight is given to a candidate topic domain if it is the same as the previous topic domain detected, and a lower probability weight is given to a candidate topic domain if it is different from the previous topic domain detected,
for a second factor, a higher probability weight is given to a candidate topic domain in accordance with an increase in a frequency of topic words supporting the candidate topic domain among vocabularies forming the word lattice, and
the first factor and second factor are obtained during run-time of the speech recognition procedure.
Dependent claims: 8, 9, 10, 11, 12, 13, 14
15. A method of dialogue speech recognition using topic domain detection, comprising:
performing a forward search to create a word lattice based on a feature vector, which is extracted from an input voice signal, with reference to at least one previously established database;
detecting a topic domain during run-time of a speech recognition procedure from among one or more candidate topic domains, by inferring a topic based on meanings of vocabularies contained in the word lattice using information of the word lattice created as a result of the forward search; and
performing a backward decoding relative to the detected topic domain with reference to a specific topic domain language model database, which has been previously established, thereby outputting a speech recognition result for an input voice signal in the form of a text,
wherein, the detecting a topic domain further comprises determining whether one of the one or more candidate topic domains is the same as a topic domain which is the previous topic domain detected during run time, relative to a previous output text obtained as a result of a previous backward decoding of a previous dialogue, using history information which includes the previous topic domain detected,
wherein detecting the topic domain includes;
removing stop words, which have no concern with the topic, among vocabularies forming the word lattice;
calculating a distance for each of the one or more candidate topic domains based on the vocabularies contained in the word lattice by receiving the word lattice, in which the stop words have been removed,
wherein the calculating the distance for each of the one or more candidate topic domains comprises receiving history information including the previous output text, to calculate the distance for each of the one or more candidate topic domains, and calculating the distance for each of the one or more candidate topic domains according to a plurality of probability factors,
wherein for a first factor, a higher probability weight is given to a candidate topic domain if it is the same as the previous topic domain detected, and a lower probability weight is given to a candidate topic domain if it is different from the previous topic domain detected,
for a second factor, a higher probability weight is given to a candidate topic domain in accordance with an increase in a frequency of topic words supporting the candidate topic domain among vocabularies forming the word lattice, and
the first factor and second factor are obtained during run-time of the speech recognition procedure.
Dependent claims: 16, 17, 18
Specification