Voice recognition system and voice processing system
Abstract
A voice recognition system and a voice processing system are provided in which, when a user makes a self-repair utterance, the utterance can be inputted and recognized accurately, as in a conversation between humans. The system includes a signal processing unit for converting speech voice data into a feature, a voice section detecting unit for detecting voice sections in the speech voice data, a priority determining unit for selecting a voice section to be given priority from among the voice sections detected by the voice section detecting unit according to a predetermined priority criterion, and a decoder for calculating a degree of matching with a recognition vocabulary using the feature of the voice section selected by the priority determining unit and an acoustic model. The priority determining unit uses as the predetermined priority criterion at least one selected from the group consisting of (1) a length of the voice section, (2) a power or an S/N ratio of the voice section, and (3) a chronological order of the voice section.
17 Claims
1. A voice recognition system comprising:
a signal processing unit for converting inputted speech voice data into a feature;
an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance;
a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance;
a voice section detecting unit for detecting voice sections in the speech voice data according to a predetermined voice section criterion;
a priority determining unit for selecting a voice section to be given priority from among the voice sections detected by the voice section detecting unit according to a predetermined priority criterion;
a decoder for calculating a degree of matching with the recognition vocabulary using the feature of the voice section selected by the priority determining unit and the acoustic model; and
a result output unit for outputting a word sequence having the best score in the matching by the decoder as a recognition result;
wherein the priority determining unit uses as the predetermined priority criterion at least one selected from the group consisting of (1) a length of the voice section, (2) a power or an S/N ratio of the voice section, and (3) a chronological order of the voice section.
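As an illustration only, the priority determining unit of claim 1 can be sketched as a function that picks one detected voice section by one of the three listed criteria. The claim does not say in which direction each criterion is applied, so the choices below (longest, highest power, latest) are assumptions, as are the names `VoiceSection` and `select_priority_section` and the use of seconds and a single power value per section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceSection:
    """One detected voice section (hypothetical structure, not from the claims)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    power: float  # mean power of the section, in an arbitrary unit

    @property
    def length(self) -> float:
        return self.end - self.start


def select_priority_section(
    sections: List[VoiceSection], criterion: str = "length"
) -> Optional[VoiceSection]:
    """Pick the section to hand to the decoder.

    Assumed directions: the longest, the most powerful, or the
    chronologically last section wins. The claim only names the criteria.
    """
    if not sections:
        return None
    if criterion == "length":
        return max(sections, key=lambda s: s.length)
    if criterion == "power":
        return max(sections, key=lambda s: s.power)
    if criterion == "latest":
        return max(sections, key=lambda s: s.start)
    raise ValueError(f"unknown criterion: {criterion}")


sections = [
    VoiceSection(0.0, 0.4, 55.0),
    VoiceSection(0.6, 1.8, 62.0),
    VoiceSection(2.0, 2.5, 70.0),
]
chosen = select_priority_section(sections, "latest")
```

Preferring the latest section is one plausible reading for self-repair utterances, where the corrected wording usually follows the mistaken one.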
2. A voice recognition system comprising:
a signal processing unit for converting inputted speech voice data into a feature;
an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance;
a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance;
a decoder for calculating a degree of matching with the recognition vocabulary using the feature and the acoustic model;
a voice section detecting unit for detecting sections corresponding to a word detected by the decoder to be voice sections;
a priority determining unit for selecting a voice section containing a recognition vocabulary to be used preferentially as a recognition result from among the voice sections detected by the voice section detecting unit according to a predetermined priority criterion; and
a result output unit for outputting a recognition word sequence having the best score in the matching by the decoder as the recognition result;
wherein the priority determining unit uses as the predetermined priority criterion at least one selected from the group consisting of (1) a chronological order with respect to a voice section in which a pre-registered specific vocabulary is detected by the decoder, (2) a chronological order with respect to a voice section in which a pre-registered long vowel is detected by the decoder, and (3) a chronological order with respect to a voice section in which an amount of change in the feature obtained by the signal processing unit continues within a predetermined range.
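Criterion (1) of claim 2 can be illustrated with a minimal sketch, assuming that the pre-registered specific vocabulary consists of correction words and that sections following the last such word are the ones given priority. Both assumptions, along with the names `WordSection`, `SPECIFIC_VOCABULARY`, and `sections_after_marker`, are hypothetical and not taken from the claims.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class WordSection:
    """A section in which the decoder detected a word (hypothetical structure)."""
    word: str
    start: float  # seconds
    end: float    # seconds


# Hypothetical pre-registered specific vocabulary (correction words).
SPECIFIC_VOCABULARY: FrozenSet[str] = frozenset({"no", "sorry"})


def sections_after_marker(
    sections: List[WordSection],
    markers: FrozenSet[str] = SPECIFIC_VOCABULARY,
) -> List[WordSection]:
    """Return the sections to prioritize, in chronological order.

    Assumption: sections that come after the last detected marker word
    are preferred. If no marker is detected, every section is kept.
    """
    ordered = sorted(sections, key=lambda s: s.start)
    last_marker = -1
    for index, section in enumerate(ordered):
        if section.word in markers:
            last_marker = index
    # With no marker, last_marker stays -1 and the slice covers everything.
    return ordered[last_marker + 1:]


detected = [
    WordSection("osaka", 0.0, 0.5),
    WordSection("no", 0.6, 0.8),
    WordSection("kyoto", 0.9, 1.4),
]
preferred = [s.word for s in sections_after_marker(detected)]
```

Criteria (2) and (3) could reuse the same selection step with a different marker test, for example a detected long vowel or a stretch of near-constant features.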
4. A voice processing system comprising:
a voice recognition unit for recognizing a speech vocabulary sequence from inputted speech voice data; and
a voice input unit for performing an input from a user using a recognition result of the speech voice data generated by the voice recognition unit;
wherein the voice recognition unit comprises a signal processing unit for converting the speech voice data into a feature, an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance, a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance, a voice cut-out unit for detecting speech sections in the speech voice data according to a predetermined speech section criterion, a decoder for matching the feature and the acoustic model and calculating a degree of matching between the result of matching and the recognition vocabulary so as to determine a recognition result candidate based on the calculated degree of matching and generate positional information indicating a position of the recognition result candidate within the speech section, and a result output unit for outputting the recognition result candidate determined by the decoder and the positional information to the voice input unit, and the voice input unit comprises a specific vocabulary dictionary storing unit in which information of a specific vocabulary is stored in advance, a specific vocabulary determining unit for determining whether or not the recognition result candidate corresponds to the specific vocabulary by referring to the specific vocabulary dictionary storing unit, and a recognition result selecting unit for selecting a recognition result candidate to be adopted as the recognition result based on the positional information using as a criterion a chronological order with respect to the recognition result candidate corresponding to the specific vocabulary.
5. A voice processing system comprising:
a voice recognition unit for recognizing a speech vocabulary sequence from inputted speech voice data; and
a voice input unit for performing an input from a user using a recognition result of the speech voice data generated by the voice recognition unit;
wherein the voice recognition unit comprises a signal processing unit for converting the speech voice data into a feature, an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance, a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance, a voice cut-out unit for detecting speech sections in the speech voice data, a decoder for matching the feature and the acoustic model and calculating a degree of matching between the result of matching and the recognition vocabulary so as to determine a recognition result candidate based on the calculated degree of matching and generate positional information indicating a position of the recognition result candidate within the speech section, and a result output unit for outputting the recognition result candidate determined by the decoder and the positional information to the voice input unit, and the voice input unit comprises a speech speed calculating unit for calculating a speech speed of the recognition result candidate based on the positional information, and a recognition result selecting unit for selecting a recognition result candidate to be adopted as the recognition result using the speech speed as a criterion.
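The speech speed calculating unit and recognition result selecting unit of claim 5 might be sketched as follows. The claim states only that speech speed is derived from the positional information and used as a criterion, so measuring speed in characters per second and preferring the slowest candidate are assumptions made here for illustration, and the names `Candidate`, `speech_speed`, and `select_slowest` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A recognition result candidate with positional information (hypothetical)."""
    text: str
    start: float  # position of the candidate within the speech section, seconds
    end: float


def speech_speed(candidate: Candidate) -> float:
    """Speech speed in characters per second (assumed unit)."""
    duration = candidate.end - candidate.start
    if duration <= 0:
        raise ValueError("candidate must have a positive duration")
    characters = len(candidate.text.replace(" ", ""))
    return characters / duration


def select_slowest(candidates: List[Candidate]) -> Candidate:
    """Adopt the candidate spoken most slowly (assumed selection rule)."""
    if not candidates:
        raise ValueError("no candidates to select from")
    return min(candidates, key=speech_speed)


candidates = [
    Candidate("osaka", 0.0, 0.5),
    Candidate("kyoto", 1.0, 2.0),
]
adopted = select_slowest(candidates)
```

A real system would more likely count syllables or morae than characters; the character count keeps the sketch self-contained.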
13. A recording medium storing a program allowing a computer to execute
a signal processing operation of converting inputted speech voice data into a feature;
a voice section detecting operation of detecting voice sections in the speech voice data according to a predetermined voice section criterion;
a priority determining operation of selecting a voice section to be given priority from among the voice sections detected in the voice section detecting operation according to a predetermined priority criterion;
a matching operation of referring to an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance and a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance and using the feature of the voice section selected in the priority determining operation and the acoustic model, thus calculating a degree of matching with the recognition vocabulary; and
a result output operation of outputting a word sequence having the best score in the matching operation as a recognition result;
wherein in the priority determining operation, the program uses as the predetermined priority criterion at least one selected from the group consisting of (1) a length of the voice section, (2) a power or an S/N ratio of the voice section, and (3) a chronological order of the voice section.
14. A recording medium storing a program allowing a computer to execute
a signal processing operation of converting inputted speech voice data into a feature;
a matching operation of referring to an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance and a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance and using the feature and the acoustic model, thus calculating a degree of matching with the recognition vocabulary;
a voice section detecting operation of detecting voice sections from the speech voice data based on the degree of matching calculated in the matching operation;
a priority determining operation of selecting a voice section containing a recognition vocabulary to be used preferentially as a recognition result from among the voice sections detected in the voice section detecting operation according to a predetermined priority criterion; and
a result output operation of outputting a word sequence having the best score in the matching operation as the recognition result;
wherein in the priority determining operation, at least one selected from the group consisting of (1) a chronological order with respect to a voice section in which a pre-registered specific vocabulary is detected in the matching operation, (2) a chronological order with respect to a voice section in which a pre-registered long vowel is detected in the matching operation, and (3) a chronological order with respect to a voice section in which an amount of change in the feature obtained in the signal processing operation continues within a predetermined range is used as the predetermined priority criterion.
16. A recording medium storing a program allowing a computer to realize a function of a voice input unit for performing an input from a user using a recognition result generated by a voice recognition unit for recognizing a speech vocabulary sequence from inputted speech voice data,
wherein the voice recognition unit comprises a signal processing unit for converting the speech voice data into a feature, an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance, a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance, a voice cutout unit for detecting speech sections in the speech voice data according to a predetermined speech section criterion, a decoder for matching the feature and the acoustic model and calculating a degree of matching between the result of matching and the recognition vocabulary so as to determine a recognition result candidate based on the calculated degree of matching and generate positional information indicating a position of the recognition result candidate within the speech section, and a result output unit for outputting the recognition result candidate determined by the decoder and the positional information as the recognition result, and the program allows a computer to execute a specific vocabulary determining operation of determining whether or not the recognition result candidate corresponds to a specific vocabulary by referring to a specific vocabulary dictionary storing unit in which information of the specific vocabulary is stored in advance, and a recognition result selecting operation of selecting a recognition result candidate to be adopted as the recognition result based on the positional information using as a criterion a chronological order with respect to the recognition result candidate corresponding to the specific vocabulary.
17. A recording medium storing a program allowing a computer to realize a function of a voice input unit for performing an input from a user using a recognition result generated by a voice recognition unit for recognizing a speech vocabulary sequence from inputted speech voice data,
wherein the voice recognition unit comprises a signal processing unit for converting the speech voice data into a feature, an acoustic model storing unit in which an acoustic model obtained by modeling what kind of feature a voice tends to become is stored in advance, a vocabulary dictionary storing unit in which information of a recognition vocabulary is stored in advance, a voice cut-out unit for detecting speech sections in the speech voice data according to a predetermined speech section criterion, a decoder for matching the feature and the acoustic model and calculating a degree of matching between the result of matching and the recognition vocabulary so as to determine a recognition result candidate based on the calculated degree of matching and generate positional information indicating a position of the recognition result candidate within the speech section, and a result output unit for outputting the recognition result candidate determined by the decoder and the positional information as the recognition result, and the program allows a computer to execute a speech speed calculating operation of calculating a speech speed of the recognition result candidate based on the positional information, and a recognition result selecting operation of selecting a recognition result candidate to be adopted as the recognition result using the speech speed as a criterion.
Specification