Speech Recognition By Post Processing Using Phonetic and Semantic Information
First Claim
1. A method for improving speech recognition of an Automatic Speech Recognition System (ASR) comprising:
providing, on a non-transitory computer readable storage medium, a vocabulary comprising words from a specified language and their corresponding phonemes;
obtaining at least one sequence of phonemes generated by the ASR from at least one sentence spoken by a human user in a specified language into the ASR, the at least one sentence spoken by a human user comprising words occurring in the vocabulary;
comparing the at least one sequence of phonemes obtained from the ASR for each sentence with the phonemes for at least one spoken word in the vocabulary;
determining whether at least one error is present in the sequence of phonemes obtained from the ASR;
assigning contiguous phonemes obtained from the ASR for each sentence to words in the vocabulary;
producing at least one sequence of words from the assigned words in the vocabulary; and
correcting the at least one error, if present, in the sequence of phonemes obtained from the ASR where the ASR is executed on a computer system with one or more processors.
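The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the toy vocabulary, the ARPAbet-style phoneme symbols, the use of Levenshtein distance as the comparison, and the names `assign_words` and `corrected_phonemes` are all assumptions made for illustration. The sketch assigns contiguous phoneme spans to vocabulary words so that the total phoneme edit distance is smallest, reports a nonzero cost as a detected error, and rebuilds the phoneme sequence from the assigned words.

```python
# Hypothetical toy vocabulary: word -> phonemes (ARPAbet-style symbols, assumed).
VOCAB = {
    "call": ("K", "AO", "L"),
    "my": ("M", "AY"),
    "mom": ("M", "AA", "M"),
}


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (pa != pb)))
        prev = cur
    return prev[-1]


def assign_words(phonemes, vocab=VOCAB, max_span=6):
    """Split phonemes into contiguous spans, one vocabulary word per span.

    Returns (cost, words); cost > 0 signals at least one phoneme error.
    """
    phonemes = tuple(phonemes)
    n = len(phonemes)
    # best[i] holds the cheapest (cost, words) covering phonemes[:i].
    best = [(0, [])] + [(float("inf"), [])] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_span), end):
            span = phonemes[start:end]
            for word, pron in vocab.items():
                cost = best[start][0] + edit_distance(span, pron)
                if cost < best[end][0]:
                    best[end] = (cost, best[start][1] + [word])
    return best[n]


def corrected_phonemes(words, vocab=VOCAB):
    """Rebuild the phoneme sequence from the assigned words' stored phonemes."""
    return [p for w in words for p in vocab[w]]


# The final phoneme is misrecognized as "N" instead of "M".
heard = ["K", "AO", "L", "M", "AY", "M", "AA", "N"]
cost, words = assign_words(heard)
print(cost, words)                # 1 ['call', 'my', 'mom']
print(corrected_phonemes(words))  # ['K', 'AO', 'L', 'M', 'AY', 'M', 'AA', 'M']
```

Minimizing the summed distance over the whole sentence is one way to consider long combinations of phonemes rather than short windows; other cost functions or search strategies would fit the claim language equally well.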
Abstract
A system is described for improving the results of Automatic Speech Recognition (ASR) systems. ASRs typically match patterns of incoming sounds to phonemes associated with sounds in a specified language, then associate phonemes with words. ASRs typically consider combinations of up to three phonemes and up to three words. The limitation to small combinations of phonemes and words is one source of errors in ASRs. The invention described here post-processes the output from ASRs. In one embodiment, the method forms long combinations of phonemes and words to improve ASR results. In another embodiment, the method detects errors by finding inconsistencies in the ASR's output and then corrects these errors. Other embodiments correct errors that are phonetically close to the correct words, determine the right list of words from a large expected list of sentences, and further improve recognition where word errors are phonetically close to the correct words.
9 Citations
18 Claims
7. A method for improving speech recognition of an Automatic Speech Recognition System (ASR) comprising:
providing, on a non-transitory computer readable storage medium, a vocabulary comprising words from a specified language and a collection of sentences of words;
obtaining at least one sequence of words generated by the ASR from at least one sentence spoken by a human user in a specified language, the at least one sentence spoken by a human user comprising words occurring in the vocabulary;
comparing the at least one sequence of words obtained from the ASR for each sentence with sequences of words that occur together in the collection of sentences;
determining whether at least one error is present in the sequence of words obtained from the ASR;
producing at least one sequence of words from the assigned words in the vocabulary; and
correcting at least one error, if present, in the sequence of words obtained from the ASR where the ASR is executed on a computer system with one or more processors.
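As a rough illustration of claim 7, the sketch below compares the ASR's word sequence with adjacent word pairs that occur together in a collection of sentences. It is only one possible reading, under stated assumptions: the sample sentences, the pair-based notion of "occur together", the rule that a word is suspect when neither of its neighboring pairs was seen, and the names `build_pairs`, `find_errors`, and `correct` are all hypothetical.

```python
# Hypothetical collection of expected sentences (assumed for illustration).
SENTENCES = [
    "turn on the kitchen light",
    "turn off the kitchen light",
    "play the next song",
]


def build_pairs(sentences):
    """Collect every adjacent word pair, with sentence boundary markers."""
    pairs = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        pairs.update(zip(words, words[1:]))
    return pairs


def find_errors(words, pairs):
    """Indices of words whose left and right pairs were both never seen."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return [
        i
        for i in range(len(words))
        if (padded[i], padded[i + 1]) not in pairs
        and (padded[i + 1], padded[i + 2]) not in pairs
    ]


def correct(words, pairs, vocab):
    """Replace each suspect word with a vocabulary word that fits both neighbors."""
    padded = ["<s>"] + list(words) + ["</s>"]
    for i in find_errors(words, pairs):
        left, right = padded[i], padded[i + 2]
        for candidate in sorted(vocab):  # alphabetical tie-break, an assumption
            if (left, candidate) in pairs and (candidate, right) in pairs:
                padded[i + 1] = candidate
                break
    return padded[1:-1]


pairs = build_pairs(SENTENCES)
vocab = {w for s in SENTENCES for w in s.split()}
heard = "turn on the chicken light".split()
print(find_errors(heard, pairs))               # [3]
print(" ".join(correct(heard, pairs, vocab)))  # turn on the kitchen light
```

A fuller version would likely rank candidates by phonetic closeness to the misrecognized word, as the abstract suggests, instead of taking the first candidate that fits.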
14. A method for improving speech recognition of an Automatic Speech Recognition System (ASR) comprising:
providing, on a non-transitory computer readable storage medium, a vocabulary comprising words from a specified language and a collection of sentences of words;
obtaining at least one sequence of words generated by the ASR from at least one sentence spoken by a human user in a specified language, the at least one sentence spoken by a human user occurring in the collection of sentences;
comparing the at least one sequence of words obtained from the ASR for each sentence with sequences of words that occur together in the collection of sentences;
determining a distance of at least one sequence of words obtained from the ASR with the sequence of words occurring in each sentence in the collection of sentences; and
obtaining from the vocabulary at least one sentence closest in distance to at least one sequence of words obtained from the ASR where the ASR is executed on a computer system with one or more processors.
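The distance step of claim 14 can be sketched as a nearest-sentence lookup. The claim does not fix a particular distance, so the word-level Levenshtein distance used here is an assumption, as are the sample sentences and the names `word_distance` and `closest_sentence`.

```python
# Hypothetical collection of expected sentences (assumed for illustration).
SENTENCES = [
    "turn on the kitchen light",
    "turn off the kitchen light",
    "play the next song",
]


def word_distance(a, b):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (wa != wb)))
        prev = cur
    return prev[-1]


def closest_sentence(heard, sentences=SENTENCES):
    """Return (distance, sentence) for the expected sentence nearest to heard.

    Ties go to the sentence listed first in the collection.
    """
    heard_words = heard.split()
    scored = ((word_distance(heard_words, s.split()), s) for s in sentences)
    return min(scored, key=lambda pair: pair[0])


print(closest_sentence("play next song"))           # (1, 'play the next song')
print(closest_sentence("turn on a kitchen light"))  # (1, 'turn on the kitchen light')
```

Scanning every sentence is linear in the size of the collection; for a large expected list, an index or pruning step would be needed, which this sketch leaves out.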
Specification