Pronunciation accuracy in speech recognition

US 9,384,730 B2
Filed: 04/14/2014
Issued: 07/05/2016
Est. Priority Date: 05/30/2013
Status: Active Grant

Abstract

A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
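The three units named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Candidate` type, the `READINGS` lookup table, the function names, and the example scores are all hypothetical, and scores are assumed to be values where higher is better. Selection here follows the rule of taking the highest recognition score among the candidates with the highest reading score.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str     # candidate word string from the recognizer
    score: float  # speech recognition score (assumed: higher is better)


# Hypothetical reading lookup; a real system would consult a pronunciation dictionary.
READINGS = {"their": "DH EH R", "there": "DH EH R", "the air": "DH IY EH R"}


def reading_scores(candidates):
    """Reading score calculation: sum recognition scores over candidates sharing a reading."""
    totals = defaultdict(float)
    for c in candidates:
        totals[READINGS[c.text]] += c.score
    return {c.text: totals[READINGS[c.text]] for c in candidates}


def select_by_reading(candidates):
    """Candidate selection: best recognition score among the top reading score."""
    rs = reading_scores(candidates)
    best_reading_score = max(rs.values())
    top = [c for c in candidates if rs[c.text] == best_reading_score]
    return max(top, key=lambda c: c.score)


n_best = [Candidate("the air", 0.40), Candidate("their", 0.35), Candidate("there", 0.25)]
chosen = select_by_reading(n_best)
```

In this made-up list, "the air" has the highest individual recognition score, but the two homophones together carry more total score for their shared reading, so the selection falls on "their".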

10 Claims

1. A method for improving reading accuracy in speech recognition using processing by a computer, the method comprising computer-executed steps of:
- obtaining a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
  
  determining a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
  
  determining a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes computer-executed steps of:
  
  determining two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
  
  selecting a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of:
  
  (a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
  
  (b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
- Dependent claims (2, 3, 4):
  - 2. The method for improving reading accuracy according to claim 1, wherein the plurality of candidate word strings in the speech recognition results are integrated with N-best lists from a plurality of speech recognition systems.
  - 3. The method for improving reading accuracy according to claim 1, wherein selecting the candidate includes a computer-executed step of selecting all of a plurality of candidate word strings as candidates to be outputted, and rescoring the candidate word strings on the basis of the speech recognition score of each candidate word string and the corresponding reading score.
  - 4. The method for improving reading accuracy according to claim 1, wherein the homophones are homophones with different notations.
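The conversion table and option (a) of claim 1, together with the rescoring of all candidates described in claim 3, can be illustrated with a short sketch. Everything named here is an assumption made for illustration: the `CONVERSION_TABLE` entries, the romanized readings, the placeholder word strings, and the equal weights `w_rec` and `w_read` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical conversion table: a reading variant whose difference is tolerable
# maps to one canonical reading (here, a lengthened final vowel and its short form).
CONVERSION_TABLE = {"ko-hi-": "ko-hi"}


def canonical(reading):
    """Treat tolerably different readings as the same reading."""
    return CONVERSION_TABLE.get(reading, reading)


def rescore(candidates, w_rec=0.5, w_read=0.5):
    """Rescore (word_string, reading, recognition_score) tuples.

    New score = w_rec * recognition score + w_read * reading score, where the
    reading score is the total recognition score of all candidates whose
    readings are the same after applying the conversion table.
    The weights are assumed values.
    """
    totals = defaultdict(float)
    for _, reading, score in candidates:
        totals[canonical(reading)] += score
    rescored = [
        (text, w_rec * score + w_read * totals[canonical(reading)])
        for text, reading, score in candidates
    ]
    # Highest new score first; the first entry is the selected candidate.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


candidates = [
    ("word_a", "kyo-to", 0.40),
    ("word_b", "ko-hi-", 0.35),
    ("word_c", "ko-hi", 0.25),
]
ranked = rescore(candidates)
best = ranked[0][0]
```

Without the conversion table, `word_a` would rank first on its own score. Because the table merges the two near-identical readings, `word_b` and `word_c` pool their scores and `word_b` is selected.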

5. A reading accuracy-improving computer system, the computer system comprising:
- a processor;
  
  at least one computer-readable memory;
  
  a computer-readable storage device; and
  
  a program stored on the computer-readable storage device for execution by the processor via the at least one computer-readable memory, the program comprises program instructions for:
  
  obtaining a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
  
  determining a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
  
  determining a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes computer-executed steps of:
  
  determining two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
  
  selecting a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of:
  
  (a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
  
  (b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
- Dependent claims (6, 7):
  - 6. The system according to claim 5, wherein the plurality of candidate word strings in the speech recognition results are integrated with N-best lists from a plurality of speech recognition systems.
  - 7. The system according to claim 5, wherein selecting the candidate includes a computer-executed step of selecting all of a plurality of candidate word strings as candidates to be outputted, and rescoring the candidate word strings on the basis of the speech recognition score of each candidate word string and the corresponding reading score.
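Claim 6 refers to candidate word strings integrated with N-best lists from several speech recognition systems. One possible way to build such a combined list is sketched below. The claims do not say how duplicates across systems are handled, so summing their scores is purely an assumption of this sketch, as are the function name `merge_n_best` and the example lists.

```python
from collections import defaultdict


def merge_n_best(*n_best_lists):
    """Merge N-best lists of (word_string, score) pairs from several recognizers.

    Assumption: a word string proposed by more than one recognizer keeps the
    sum of its scores. Other policies (maximum, average) would fit equally well.
    """
    merged = defaultdict(float)
    for n_best in n_best_lists:
        for text, score in n_best:
            merged[text] += score
    # Return a single candidate list, highest combined score first.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


system_a = [("their", 0.5), ("there", 0.3)]
system_b = [("there", 0.4), ("the air", 0.2)]
merged = merge_n_best(system_a, system_b)
```

The merged list can then be passed to the reading-score and selection steps in the same way as a single recognizer's output.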

8. A reading-accuracy improving non-transitory computer program product, comprising a computer-readable storage medium having program code embodied therewith, the program code executable by a processor of a computer to perform a method comprising:
- obtaining, by the processor, a plurality of candidate word strings from speech recognition results, wherein the speech recognition results contain a speech recognition score for each of the plurality of candidate word strings;
  
  determining, by the processor, a reading of each of the plurality of candidate word strings, wherein two or more candidate word strings have the same reading, and wherein the two or more candidate word strings having the same reading are homophones;
  
  determining, by the processor, a reading score for each candidate word string, wherein the reading score for each of the two or more candidate word strings with the same reading is based on a total value of the speech recognition scores for the two or more candidate word strings with the same reading, and wherein determining the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes:
  
  determining, by the processor, two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  calculating the total value of the speech recognition scores for the two or more candidate word strings with the same reading includes speech recognition scores for the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading; and
  
  providing a conversion table containing word strings with partial tolerable different readings to be treated as having the same reading, and wherein determining the two or more candidate word strings with partial tolerable different readings to be treated as having the same reading is based on the conversion table; and
  
  selecting, by the processor, a candidate among the plurality of candidate word strings to output on the basis of the reading score and the speech recognition score corresponding to each word string, wherein selecting the candidate includes a computer-executed step selected from the group consisting of:
  
  (a) weighting and adding together the speech recognition score and the corresponding reading score for each candidate word string to obtain a new score, and selecting the candidate word string with the highest new score; and
  
  (b) selecting a candidate word string with the highest speech recognition score from among the one or more candidate word strings with the highest reading score.
- Dependent claims (9, 10):
  - 9. The non-transitory computer program product of claim 8, wherein the plurality of candidate word strings in the speech recognition results, obtained by the processor, are integrated with N-best lists from a plurality of speech recognition systems.
  - 10. The non-transitory computer program product of claim 8, wherein selecting the candidate includes selecting, by the processor, all of a plurality of candidate word strings as candidates to be outputted, and rescoring the candidate word strings, by the processor, on the basis of the speech recognition score of each candidate word string and the corresponding reading score.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Kurata, Gakuto; Nishimura, Masafumi; Tachibana, Ryuki
Primary Examiner(s): Poon, King
Assistant Examiner(s): Jackson, Daryl
Application Number: US14/251,786
Publication Number: US 20140358533A1
Time in Patent Office: 813 Days
US Class Current: 1/1
CPC Class Codes:
G10L 15/01   Assessment or evaluation of...
G10L 15/08   Speech classification or se...
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/197   Probabilistic grammars, e.g...
