Speech data retrieval apparatus, speech data retrieval method, speech data retrieval program and computer usable medium having computer readable speech data retrieval program embodied therein
Abstract
A speech data retrieval apparatus (10) includes a speech database (1), a speech recognition unit (2), a confusion network creation unit (3), an inverted index table creation unit (4), a query input unit (6), a query conversion unit (7) and a label string check unit (8). The speech recognition unit (2) reads speech data from the speech database (1), carries out a speech recognition process with respect to the read speech data, and outputs a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit. The confusion network creation unit (3) creates a confusion network based on the output lattice and outputs the result of speech recognition process as the confusion network. The inverted index table creation unit (4) creates an inverted index table based on the output confusion network. The query input unit (6) receives a query input by a user, carries out a speech recognition process with respect to the received query, and outputs a result of speech recognition process as a character string. The query conversion unit (7) converts the output character string into a label string in which a phoneme, a syllable, or a word is a base unit. The label string check unit (8) checks the label string against the inverted index table and retrieves speech data which is included in both of the label string and the speech database (1).
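The indexing and checking steps described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the patent: a confusion network is modeled as a list of slots, each slot holding alternative `(label, posterior)` arcs; a query matches when its labels occur in consecutive slots of one utterance; skip arcs and posterior-based scoring are ignored. The names `build_inverted_index` and `check_label_string` and the utterance identifiers are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(networks):
    """Map each label to the (utterance id, slot position, posterior) triples where it occurs.

    networks: {utterance_id: [slot, ...]}, where slot = [(label, posterior), ...].
    """
    index = defaultdict(list)
    for utt_id, slots in networks.items():
        for position, slot in enumerate(slots):
            for label, posterior in slot:
                index[label].append((utt_id, position, posterior))
    return index

def check_label_string(index, labels):
    """Return sorted utterance ids in which the labels occur in consecutive slots."""
    if not labels:
        return []
    # Positions at which the first label was recognized as a candidate.
    current = {(utt, pos) for utt, pos, _ in index.get(labels[0], [])}
    for label in labels[1:]:
        postings = {(utt, pos) for utt, pos, _ in index.get(label, [])}
        # Keep only matches that continue into the next slot of the same utterance.
        current = {(utt, pos + 1) for utt, pos in current if (utt, pos + 1) in postings}
    return sorted({utt for utt, _ in current})

# Toy phoneme-level confusion networks (illustrative data only).
networks = {
    "utt1": [[("k", 0.9), ("g", 0.1)], [("a", 0.8), ("o", 0.2)], [("t", 1.0)]],
    "utt2": [[("t", 0.7), ("d", 0.3)], [("a", 1.0)]],
}
index = build_inverted_index(networks)
hits = check_label_string(index, ["g", "a", "t"])
```

In this toy data the query `g a t` still reaches `utt1` although `g` is only the lower-ranked alternative in its slot, which is the kind of match a single best recognition result would miss.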
6 Claims
1. A speech data retrieval apparatus comprising:
a speech database including plural pieces of speech data therein;
a speech recognition unit configured to read speech data from the speech database, carry out a speech recognition process with respect to the read speech data, and output a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit;
a confusion network creation unit configured to create a confusion network based on the lattice from the speech recognition unit and output the result of speech recognition process as the confusion network;
an inverted index table creation unit configured to create an inverted index table based on the confusion network from the confusion network creation unit;
a query input unit configured to receive a query input by a user, carry out a speech recognition process with respect to the received query, and output a result of speech recognition process as a character string;
a query conversion unit configured to convert the character string from the query input unit into a label string in which a phoneme, a syllable, or a word is a base unit; and
a label string check unit configured to check the label string from the query conversion unit against the inverted index table from the inverted index table creation unit, retrieve speech data which is included in both of the label string and the speech database, and output a list of pointer which indicates an address in the speech database in which the retrieved speech data is stored.
3. A speech data retrieval apparatus comprising:
a speech database including plural pieces of speech data therein;
two or more speech recognition units each of which is configured to read speech data from the speech database, carry out a speech recognition process with respect to the read speech data, and output a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit, wherein a base unit of lattice output from one speech recognition unit differs from one output from another speech recognition unit;
two or more confusion network creation units respectively connected to the two or more speech recognition units, wherein each confusion network creation unit is configured to create a confusion network based on the lattice from the corresponding speech recognition unit and output the result of speech recognition process as the confusion network;
two or more inverted index table creation units respectively connected to the two or more confusion network creation units, wherein each inverted index table creation unit is configured to create an inverted index table based on the confusion network from the corresponding confusion network creation unit;
a query input unit configured to receive a query input by a user, carry out a speech recognition process with respect to the received query, and output a result of speech recognition process as a character string;
two or more query conversion units each of which is configured to convert the character string from the query input unit into a label string in which a phoneme, a syllable, or a word is a base unit, wherein a base unit of label string converted in one query conversion unit differs from one converted in another query conversion unit;
two or more label string check units respectively connected to the two or more inverted index table creation units and the two or more query conversion units, wherein each label string check unit is configured to check the label string from the corresponding query conversion unit against the inverted index table from the corresponding inverted index table creation unit, and retrieve speech data which is included in both of the label string and the speech database; and
a retrieval result integration unit configured to read retrieval results from the two or more label string check units, integrate the read retrieval results to create a retrieval result list, and output a list of pointer which indicates an address in the speech database in which speech data included in the retrieval result list is stored.
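The claim does not say how the retrieval result integration unit merges the per-unit results. As one possible reading, the sketch below takes the union of the result lists and ranks each pointer by how many label string check units returned it. The voting rule, the function name `integrate_results`, and the sample identifiers are assumptions made for illustration only.

```python
def integrate_results(result_lists):
    """Merge per-unit retrieval results into one list.

    Pointers found by more check units come first; ties are broken
    alphabetically so the output order is deterministic.
    """
    votes = {}
    for results in result_lists:
        # set() so a pointer repeated within one unit's results counts once.
        for pointer in set(results):
            votes[pointer] = votes.get(pointer, 0) + 1
    return sorted(votes, key=lambda p: (-votes[p], p))

# Hypothetical results from a word-based and a phoneme-based check unit.
word_hits = ["utt3", "utt1"]
phoneme_hits = ["utt1", "utt2"]
merged = integrate_results([word_hits, phoneme_hits])
```

Other integration rules, such as intersecting the lists or weighting each unit differently, would fit the same interface.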
4. A speech data retrieval apparatus comprising:
a speech database including plural pieces of speech data therein;
two or more speech recognition units each of which is configured to read speech data from the speech database, carry out a speech recognition process with respect to the read speech data, and output a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit, wherein a base unit of lattice output from one speech recognition unit differs from one output from another speech recognition unit;
two or more confusion network creation units respectively connected to the two or more speech recognition units, wherein each confusion network creation unit is configured to create a confusion network based on the lattice from the corresponding speech recognition unit and output the result of speech recognition process as the confusion network;
a confusion network combination unit configured to combine confusion networks from the two or more confusion network creation units to create a combination network, and output the combination network;
an inverted index table creation unit configured to create an inverted index table based on the combination network from the confusion network combination unit;
a query input unit configured to receive a query input by a user, carry out a speech recognition process with respect to the received query, and output a result of speech recognition process as a character string;
a query conversion unit configured to convert the character string from the query input unit into a label string in which two or more of a phoneme, a syllable, and a word are a base unit; and
a label string check unit configured to check the label string from the query conversion unit against the inverted index table from the inverted index table creation unit, retrieve speech data which is included in both of the label string and the speech database, and output a list of pointer which indicates an address in the speech database in which the retrieved speech data is stored.
5. A speech data retrieval method comprising:
reading speech data from a speech database which includes plural pieces of speech data therein;
carrying out a speech recognition process with respect to the read speech data;
outputting a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit;
creating a confusion network based on the output lattice;
outputting the result of speech recognition process as the confusion network;
creating an inverted index table based on the output confusion network;
receiving a query input by a user;
carrying out a speech recognition process with respect to the received query;
outputting a result of speech recognition process as a character string;
converting the output character string into a label string in which a phoneme, a syllable, or a word is a base unit;
checking the label string against the inverted index table;
retrieving speech data which is included in both of the label string and the speech database; and
outputting a list of pointer which indicates an address in the speech database in which the retrieved speech data is stored.
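The step of converting the character string into a label string can be pictured as follows. This is only a sketch under stated assumptions: the query is whitespace-separated text, word labels are the lowercased words themselves, and phoneme labels come from a small pronunciation lexicon. The lexicon contents, the phoneme symbols, and the name `to_label_string` are hypothetical and do not come from the patent.

```python
# Hypothetical pronunciation lexicon; a real system would use a full dictionary.
PRONUNCIATIONS = {
    "speech": ["s", "p", "iy", "ch"],
    "data": ["d", "ey", "t", "ax"],
}

def to_label_string(text, unit="word", lexicon=PRONUNCIATIONS):
    """Convert a recognized character string into a list of labels.

    unit="word" keeps one label per word; unit="phoneme" expands each
    word through the lexicon and fails loudly on unknown words.
    """
    words = text.lower().split()
    if unit == "word":
        return words
    if unit == "phoneme":
        labels = []
        for word in words:
            if word not in lexicon:
                raise KeyError(f"no pronunciation for {word!r}")
            labels.extend(lexicon[word])
        return labels
    raise ValueError(f"unsupported base unit: {unit!r}")

word_labels = to_label_string("Speech data")
phoneme_labels = to_label_string("Speech data", unit="phoneme")
```

The resulting label list is what would then be checked against an inverted index table built on the same base unit.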
6. A non-transitory computer usable medium having a computer readable speech data retrieval program embodied therein, the computer readable speech data retrieval program comprising:
a first speech data retrieval program code for causing a computer to read speech data from a speech database which includes plural pieces of speech data therein;
a second speech data retrieval program code for causing the computer to carry out a speech recognition process with respect to the read speech data;
a third speech data retrieval program code for causing the computer to output a result of speech recognition process as a lattice in which a phoneme, a syllable, or a word is a base unit;
a fourth speech data retrieval program code for causing the computer to create a confusion network based on the output lattice;
a fifth speech data retrieval program code for causing the computer to output the result of speech recognition process as the confusion network;
a sixth speech data retrieval program code for causing the computer to create an inverted index table based on the output confusion network;
a seventh speech data retrieval program code for causing the computer to receive a query input by a user;
an eighth speech data retrieval program code for causing the computer to carry out a speech recognition process with respect to the received query;
a ninth speech data retrieval program code for causing the computer to output a result of speech recognition process as a character string;
a tenth speech data retrieval program code for causing the computer to convert the output character string into a label string in which a phoneme, a syllable, or a word is a base unit;
an eleventh speech data retrieval program code for causing the computer to check the label string against the inverted index table;
a twelfth speech data retrieval program code for causing the computer to retrieve speech data which is included in both of the label string and the speech database; and
a thirteenth speech data retrieval program code for causing the computer to output a list of pointer which indicates an address in the speech database in which the retrieved speech data is stored.
Specification