Method of and system for recognizing a spoken text
Abstract
A system for recognizing spoken text includes a microphone for converting spoken text uttered by a speaker into analog electrical signals, and an analog-to-digital converter for converting the analog signals into digital electronic signals. A speech recognition device uses a lexicon data device, a language model data device, and a reference data device to convert the digital spoken text into recognized text data. The system also includes a keyboard for entering error correction data and an error correction device which generates corrected text depending on the error correction data. Adaptation apparatus of the lexicon data device and adaptation apparatus of the language model data device adapt the lexicon data and the language model data, respectively, to the speaker depending on the corrected text. The speech recognition device then carries out a second speech recognition process depending on the original spoken text, the adapted lexicon data, and the adapted language model data to generate newly recognized text, which is transmitted to the reference data device. Adaptation apparatus of the reference data device or of the speech recognition device adapts the reference data to the speaker depending on the newly recognized text.
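The two-pass flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the patented implementation: the `ConversionData` container, the toy `recognize` function (which simply fails on words missing from the lexicon), and the use of bigram counts and per-word counters as stand-ins for the language model data and the phoneme reference data.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ConversionData:
    """Hypothetical container for the three kinds of conversion data."""
    lexicon: set = field(default_factory=set)
    bigrams: Counter = field(default_factory=Counter)    # toy language model
    reference: Counter = field(default_factory=Counter)  # toy stand-in for phoneme reference data


def recognize(segments, data):
    # Toy recognizer (assumption): a segment is recognized only if its word is in the lexicon.
    return [s if s in data.lexicon else "<unk>" for s in segments]


def apply_corrections(recognized, corrections):
    # corrections maps a word position to the word the user typed in its place.
    corrected = list(recognized)
    for index, word in corrections.items():
        corrected[index] = word
    return corrected


def adapt_lexicon_and_language_model(data, corrected):
    # First adaptation: driven by the corrected text only.
    data.lexicon.update(corrected)
    data.bigrams.update(zip(corrected, corrected[1:]))


def adapt_reference(data, segments, re_recognized):
    # Second adaptation: pairs the original segments with the re-recognized words.
    for _segment, word in zip(segments, re_recognized):
        if word != "<unk>":
            data.reference[word] += 1


def two_pass(segments, data, corrections):
    first = recognize(segments, data)                  # recognized text
    corrected = apply_corrections(first, corrections)  # corrected text
    adapt_lexicon_and_language_model(data, corrected)
    second = recognize(segments, data)                 # newly recognized text
    adapt_reference(data, segments, second)
    return first, corrected, second
```

In this toy setting the second pass recovers words that were unknown in the first pass, because the lexicon was extended from the corrected text before the same input was recognized again.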
10 Claims
1. A method of recognizing a spoken text, comprising the steps of:
converting spoken text uttered by a speaker into first digital data;
converting the first digital data which represent the spoken text, into second digital data which represent recognized text in a speech recognition process depending on conversion data including;
  available lexicon data which represent a lexicon, available language model data which represent a language model, and available reference data which represent phonemes;
communicating the recognized text;
obtaining third digital data which represent corrections to the recognized text, depending on the communication of recognized text;
correcting the second digital data using the third digital data to generate fourth digital data which represent corrected text;
adapting the speech recognition process to the speaker depending on the fourth digital data;
converting the first digital data into fifth digital data which represent additionally recognized text using the adapted speech recognition process; and
adapting the available reference data to the speaker depending on the fifth digital data and the first digital data.
Dependent claims: 2
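The final step of claim 1 adapts the reference data from two inputs: the audio (first digital data) and the re-recognized text (fifth digital data). A minimal sketch of one possible update is given here, assuming that each phoneme is represented by a mean feature vector and that a frame-by-frame alignment between audio feature frames and phoneme labels is already available. The function name `adapt_phoneme_means`, the `weight` parameter, and the interpolation rule are hypothetical; the claim does not prescribe any particular update rule.

```python
def adapt_phoneme_means(reference, frames, labels, weight=0.1):
    """Return a copy of `reference` with each phoneme mean moved toward its aligned frames.

    reference: dict mapping phoneme -> list of floats (mean feature vector)
    frames:    list of feature vectors taken from the audio
    labels:    phoneme label for each frame, derived from the re-recognized text
    weight:    interpolation step size (assumed value, for illustration only)
    """
    adapted = {phoneme: list(mean) for phoneme, mean in reference.items()}
    for frame, phoneme in zip(frames, labels):
        # A phoneme seen for the first time starts at the frame itself.
        mean = adapted.setdefault(phoneme, list(frame))
        for i, value in enumerate(frame):
            mean[i] += weight * (value - mean[i])
    return adapted
```

The input dictionary is left unchanged, so the speaker-independent reference data can be kept alongside the speaker-adapted copy.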
3. A system for recognizing a spoken text, comprising:
conversion means for converting the spoken text uttered by a speaker into first digital text means data which represent the spoken text;
a speech recognition unit, including;
  lexicon data means for storing lexicon data which represent a lexicon;
  language model data means for storing language model data which represent a language model;
  reference data means for storing reference data which represent phonemes; and
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on conversion data including;
    the first digital text data, the lexicon data, the language model data, and the reference data;
means for obtaining third digital text data representing error correction data;
error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text; and
adaptation means for adapting the speech recognition unit based on digital text data, including;
  means for adapting the lexicon data to the speaker depending on the fourth digital text data and storing the adapted lexicon data in the lexicon data means;
  means for adapting the language model data to the speaker depending on the fourth digital text data and storing the adapted language model data in the language model data means;
  means for adapting the reference data to the speaker depending on the first digital text data and the fourth digital text data and storing the adapted reference data in the reference data means;
re-adaption means, including;
  means for converting the first digital data into fifth digital data which represent newly recognized text, using the speech recognition unit after the adaptation means has adapted the speech recognition data depending on the fourth digital data; and
  the adaptation means being for adapting the available reference data to the speaker of the spoken text depending on the fifth digital data and the first digital data.
Dependent claims: 4
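The error correction element of claim 3 generates the corrected text "by changing a part" of the recognized text. As a rough illustration, the sketch below assumes that the error correction data arrive as word-span replacements of the form `(start, end, replacement)`, with non-overlapping spans. The function name and this edit format are hypothetical and are not taken from the patent.

```python
def correct_text(recognized, edits):
    """Apply word-span replacements to recognized text and return the corrected text.

    recognized: recognized text as a single string
    edits:      iterable of (start, end, replacement) tuples; the words in
                positions start..end-1 are replaced by the replacement words.
                Spans are assumed not to overlap.
    """
    words = recognized.split()
    # Apply edits from right to left so earlier word positions stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        words[start:end] = replacement.split()
    return " ".join(words)
```

Only the edited spans change; the rest of the recognized text is carried over unchanged into the corrected text.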
5. Apparatus for generating signals for operating a first computer system to recognize spoken text, comprising:
conversion means for converting the spoken text uttered by a speaker into first digital text data which represent the spoken text;
a speech recognition unit, including;
  lexicon data means for storing lexicon data which represent a lexicon stored in the lexicon data device, and;
  language model data means for storing language model data which represent a language model;
  reference data means for storing reference data which represent phonemes;
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on the first digital text data, the lexicon data, the language model data, and the reference data;
means for obtaining third digital text data representing error correction data; and
error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text;
adaptation means for adapting the speech recognition unit based on digital text data, including;
  means for adapting the lexicon data to the speaker depending on digital text data and storing the adapted lexicon data in the lexicon data means;
  means for adapting the language model data to the speaker depending on digital text data and storing the adapted language model data in the language model data means; and
  means for adapting the reference data to the speaker depending on digital text data and storing the adapted reference data in the reference data means; and
re-adaption means, including;
  means for converting the first digital data into fifth digital data which represent newly recognized text, using the speech recognition unit after the adaptation means had adapted the speech recognition data depending on the fourth digital data; and
  the adaptation means being for adapting the available reference data to the speaker of the spoken text depending on the fifth digital data and the first digital data.
Dependent claims: 6, 7
8. A method of recognizing spoken text uttered by a speaker, comprising the steps of:
converting spoken text uttered by a speaker into first digital data;
converting the first digital data which represent the spoken text, into second digital data which represent recognized text in a speech recognition process depending on conversion data including available lexicon data which represent a lexicon, available language model data which represent a language model, and available reference data which represent phonemes;
communicating the recognized text;
obtaining third digital data which represent corrections to the recognized text, depending on the communication of recognized text;
correcting the second digital data using the third digital data to generate fourth digital data which represent corrected text;
adapting the conversion data depending on the fourth digital data;
converting the first digital data, into fifth digital data which represent recognized text depending on the adapted speech data; and
re-adapting the conversion data depending on the fifth digital data.
9. A system for recognizing spoken text, comprising:
conversion means for converting the spoken text uttered by a speaker into first digital text means data which represent the spoken text;
a speech recognition unit, including;
  lexicon data means for storing lexicon data which represent a lexicon stored in the lexicon data device, and;
  language model data means for storing language model data which represent a language model;
  reference data means for storing reference data which represent phonemes;
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on the first digital text data the lexicon data, the language model data, and the reference data;
means for obtaining third digital text data representing error correction data;
error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text;
first adaption means for adapting the speech recognition unit to the speaker depending on the fourth digital data; and
second adaption means for adapting the available reference data to the speaker, using the adapted speech recognition unit and depending on the first digital data.
Dependent claims: 10
Specification