Pronunciation accuracy in speech recognition
First Claim
1. A method for improving reading accuracy in speech recognition using processing by a computer, the method comprising computer-executed steps of:
obtaining a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
determining a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
determining a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes computer-executed steps of;
determining two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
selecting a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of
(a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
(b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
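The reading-score steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that readings are already available as romanized strings and that the conversion table maps a tolerably different reading to the reading it should be treated as. The names `CONVERSION_TABLE`, `normalize_reading`, and `reading_scores`, and the sample candidates and scores, are hypothetical.

```python
from collections import defaultdict

# Hypothetical conversion table: each key is a reading variant that is
# treated as identical to the reading given as its value.
CONVERSION_TABLE = {"jippun": "juppun"}


def normalize_reading(reading, table=CONVERSION_TABLE):
    """Map a tolerably different reading onto the reading it is treated as."""
    return table.get(reading, reading)


def reading_scores(candidates, table=CONVERSION_TABLE):
    """Return one reading score per candidate, in input order.

    candidates: list of (word_string, reading, recognition_score).
    A candidate's reading score is the total of the recognition scores
    of all candidates that share its normalized reading.
    """
    totals = defaultdict(float)
    for _, reading, score in candidates:
        totals[normalize_reading(reading, table)] += score
    return [totals[normalize_reading(reading, table)]
            for _, reading, _ in candidates]


# Illustrative N-best list; the scores are made up for this sketch.
example = [
    ("十分", "juppun", 0.25),
    ("10分", "jippun", 0.25),   # variant reading, merged via the table
    ("充分", "juubun", 0.375),
]

for (word, _, score), total in zip(example, reading_scores(example)):
    print(word, score, total)
```

In this example the two candidates read "juppun" and "jippun" each receive the combined total of their recognition scores, which exceeds the score of the single candidate that would otherwise rank first.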
Abstract
A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
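One way to write the two scores down, offered as a reading aid only: the symbols below are assumed notation and do not appear in the source. Candidates are numbered 1 to N, s_i is the speech recognition score of candidate i, r_i is its reading, and the weights alpha and beta are left unspecified.

```latex
% Assumed notation (not from the source): r_j \sim r_i means that
% candidates j and i have the same reading.
R_i = \sum_{j \,:\, r_j \sim r_i} s_j
\qquad
T_i = \alpha\, s_i + \beta\, R_i
```

Here R_i is the reading score of candidate i and T_i is a weighted combination of the two scores. For instance, two homophones with recognition scores of 0.25 each have R = 0.5, which is larger than the 0.375 of a lone rival candidate with a different reading.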
10 Claims
1. A method for improving reading accuracy in speech recognition using processing by a computer, the method comprising computer-executed steps of:
obtaining a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
determining a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
determining a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes computer-executed steps of;
determining two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
selecting a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of
(a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
(b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
Dependent claims: 2, 3, 4
5. A reading accuracy-improving computer system, the computer system comprising:
a processor;
at least one computer-readable memory;
a computer-readable storage device; and
a program stored on the computer-readable storage device for execution by the processor via the at least one computer-readable memory, the program comprises program instructions for;
obtaining a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
determining a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
determining a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes computer-executed steps of;
determining two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
selecting a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of
(a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
(b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
Dependent claims: 6, 7
8. A reading-accuracy improving non-transitory computer program product, comprising a computer-readable storage medium having program code embodied therewith, the program code executable by a processor of a computer to perform a method comprising:
obtaining, by the processor, a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
determining, by the processor, a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
determining, by the processor, a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes;
determining, by the processor, two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
selecting, by the processor, a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of
(a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
(b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
Dependent claims: 9, 10
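The two selection alternatives (a) and (b) recited in the independent claims can be compared with a short Python sketch. It assumes the reading scores have already been computed, and it models the weighting in (a) as a single weight `weight` on the recognition score with `1 - weight` on the reading score; the claims do not specify the weights, so this is only one possible choice. The function names and the sample scores are hypothetical.

```python
def select_weighted(candidates, weight=0.5):
    """Alternative (a): pick the candidate with the highest weighted sum.

    candidates: list of (word_string, recognition_score, reading_score).
    The convex weighting used here is an assumption of this sketch.
    """
    return max(
        candidates,
        key=lambda c: weight * c[1] + (1.0 - weight) * c[2],
    )[0]


def select_by_reading_first(candidates):
    """Alternative (b): keep the candidates with the highest reading score,
    then pick the one with the highest recognition score among them."""
    best_reading = max(c[2] for c in candidates)
    # Exact comparison is safe here because candidates that share a reading
    # are assumed to carry the very same precomputed total.
    top = [c for c in candidates if c[2] == best_reading]
    return max(top, key=lambda c: c[1])[0]


# Made-up scores: Y1 and Y2 share a reading, X stands alone.
scored = [
    ("X", 0.375, 0.375),
    ("Y1", 0.3125, 0.5625),
    ("Y2", 0.25, 0.5625),
]

print(select_weighted(scored))              # alternative (a)
print(select_by_reading_first(scored))      # alternative (b)
print(select_weighted(scored, weight=1.0))  # recognition score only
```

With these sample values both alternatives choose Y1, whereas ranking by recognition score alone (`weight=1.0`) chooses X. This is the situation the claims address: homophones split the recognition score among themselves, so the correct reading can lose to a lone rival.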
Specification