Method of and system for recognizing a spoken text

US 6,101,467 A
Filed: 09/29/1997
Issued: 08/08/2000
Est. Priority Date: 09/27/1996
Status: Expired due to Term

Abstract

A system for recognizing spoken text includes a microphone that converts text spoken by a speaker into analog electrical signals, and an analog-to-digital converter that converts those analog signals into digital data. A speech recognition device uses a lexicon data device, a language model data device, and a reference data device to convert the digital spoken text into recognized text data. The system also includes a keyboard for entering error correction data and an error correction device that generates corrected text depending on the error correction data. Adaptation apparatus of the lexicon data device and of the language model data device adapt the lexicon data and the language model data, respectively, to the speaker depending on the corrected text. The speech recognition device then carries out a second speech recognition process, depending on the original spoken text, the adapted lexicon data, and the adapted language model data, to generate newly recognized text, which is transmitted to the reference data device. Adaptation apparatus of the reference data device or of the speech recognition device adapts the reference data to the speaker depending on the newly recognized text.
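The order of operations in the abstract (recognize, correct, adapt lexicon and language model, recognize the same input again, adapt reference data) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in and not the patented implementation: `skeleton`, `ConversionData`, `recognize`, and `two_pass_adaptation` are invented names, the "audio" is modelled as consonant skeletons of words, and the reference data is reduced to simple (sound, word) counts.

```python
from collections import Counter
from dataclasses import dataclass, field

VOWELS = set("aeiou")
UNKNOWN = "<unk>"

def skeleton(word):
    """Toy stand-in for acoustics (assumption): a word's consonant skeleton."""
    return "".join(ch for ch in word.lower() if ch not in VOWELS)

@dataclass
class ConversionData:
    lexicon: set = field(default_factory=set)
    language_model: Counter = field(default_factory=Counter)   # bigram counts
    reference_data: Counter = field(default_factory=Counter)   # (sound, word) counts

def recognize(sounds, data):
    """Map each sound to the lexicon word with the same skeleton, else UNKNOWN."""
    by_sound = {skeleton(word): word for word in sorted(data.lexicon)}
    return [by_sound.get(sound, UNKNOWN) for sound in sounds]

def two_pass_adaptation(sounds, data, corrections):
    """Run both recognition passes; `corrections` maps word index -> corrected word."""
    recognized = recognize(sounds, data)                 # first pass: recognized text
    corrected = list(recognized)
    for index, word in corrections.items():              # user-entered corrections
        corrected[index] = word                          # yields the corrected text
    known = [w for w in corrected if w != UNKNOWN]
    data.lexicon.update(known)                           # adapt lexicon data
    data.language_model.update(zip(known, known[1:]))    # adapt language model data
    re_recognized = recognize(sounds, data)              # second pass, same input
    pairs = [(s, w) for s, w in zip(sounds, re_recognized) if w != UNKNOWN]
    data.reference_data.update(pairs)                    # adapt reference data
    return recognized, corrected, re_recognized

data = ConversionData(lexicon={"please", "recognize"})
sounds = ["pls", "rcgnz", "phlps"]
first_pass, corrected, second_pass = two_pass_adaptation(sounds, data, {2: "philips"})
print(first_pass)    # the unknown word is not recognized on the first pass
print(second_pass)   # after adaptation the same input is recognized in full
```

The point of the sketch is the ordering: the reference data is updated only from the second pass, which has already benefited from the adapted lexicon and language model.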

10 Claims

1. A method of recognizing a spoken text, comprising the steps of:
- converting spoken text uttered by a speaker into first digital data;
  
  converting the first digital data which represent the spoken text, into second digital data which represent recognized text in a speech recognition process depending on conversion data including;
  
  available lexicon data which represent a lexicon, available language model data which represent a language model, and available reference data which represent phonemes;
  
  communicating the recognized text;
  
  obtaining third digital data which represent corrections to the recognized text, depending on the communication of recognized text;
  
  correcting the second digital data using the third digital data to generate fourth digital data which represent corrected text;
  
  adapting the speech recognition process to the speaker depending on the fourth digital data;
  
  converting the first digital data into fifth digital data which represent additionally recognized text using the adapted speech recognition process; and
  
  adapting the available reference data to the speaker depending on the fifth digital data and the first digital data.
- 2. The method of claim 1 in which:
    - the step of communicating includes displaying the recognized text to the speaker;
      
      the step of communicating includes communicating by sound;
      
      the step of correcting includes replacing portions of the second digital data with portions of the third digital data;
      
      the step of adapting the speech recognition process includes the steps of;
      
      adapting the available lexicon data to the speaker depending on the fourth digital data; and
      
      adapting the available language model data to the speaker depending on the fourth digital data; and
      
      generating fifth digital data in the speech recognition process depending on the adapted lexicon data, adapted language model data, and the first digital data; and
      
      the step of adapting the available reference data depends on the fifth digital data which depends on the first digital data.
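Claim 2 describes correction as replacing portions of the second digital data with portions of the third digital data. One way to picture this is the following Python sketch, under the assumption (not stated in the patent) that the recognized text is a list of words and each correction is a hypothetical `(start, end, replacement_words)` span over that list.

```python
def apply_corrections(recognized, corrections):
    """Return corrected text; each correction is (start, end, replacement_words).

    Spans are half-open word ranges [start, end) into `recognized`. They are
    applied from right to left so earlier indices stay valid after each edit.
    """
    corrected = list(recognized)          # leave the recognized text untouched
    limit = len(corrected)
    for start, end, replacement in sorted(corrections, key=lambda c: c[0], reverse=True):
        if not (0 <= start <= end <= limit):
            raise ValueError("corrections must be in range and must not overlap")
        corrected[start:end] = replacement
        limit = start                     # the next span must end before this one
    return corrected

recognized = "this is a speech wreck ignition test".split()
corrections = [(4, 6, ["recognition"]), (0, 1, ["This"])]
print(" ".join(apply_corrections(recognized, corrections)))
```

Keeping the recognized text unmodified and returning a new list mirrors the claim's distinction between the second digital data (recognized text) and the fourth digital data (corrected text).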

3. A system for recognizing a spoken text, comprising:
- conversion means for converting the spoken text uttered by a speaker into first digital text means data which represent the spoken text;
  
  a speech recognition unit, including;
  
  lexicon data means for storing lexicon data which represent a lexicon;
  
  language model data means for storing language model data which represent a language model;
  
  reference data means for storing reference data which represent phonemes; and
  
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on conversion data including;
  
  the first digital text data, the lexicon data, the language model data, and the reference data;
  
  means for obtaining third digital text data representing error correction data;
  
  error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text; and
  
  adaptation means for adapting the speech recognition unit based on digital text data, including;
  
  means for adapting the lexicon data to the speaker depending on the fourth digital text data and storing the adapted lexicon data in the lexicon data means;
  
  means for adapting the language model data to the speaker depending on the fourth digital text data and storing the adapted language model data in the language model data means;
  
  means for adapting the reference data to the speaker depending on the first digital text data and the fourth digital text data and storing the adapted reference data in the reference data means;
  
  re-adaption means, including;
  
  means for converting the first digital data into fifth digital data which represent newly recognized text, using the speech recognition unit after the adaptation means has adapted the speech recognition data depending on the fourth digital data; and
  
  the adaptation means being for adapting the available reference data to the speaker of the spoken text depending on the fifth digital data and the first digital data.
- 4. A system as claimed in claim 3, in which:
    - the system is implemented by means of a personal computer; and
      
      the means for obtaining include;
      
      an analog to digital converter and a speaker, a display to display the recognized text, and a keyboard for entry of corrections to the recognized speech.

5. Apparatus for generating signals for operating a first computer system to recognize spoken text, comprising:
- conversion means for converting the spoken text uttered by a speaker into first digital text data which represent the spoken text;
  
  a speech recognition unit, including;
  
  lexicon data means for storing lexicon data which represent a lexicon stored in the lexicon data device, and;
  
  language model data means for storing language model data which represent a language model;
  
  reference data means for storing reference data which represent phonemes;
  
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on the first digital text data, the lexicon data, the language model data, and the reference data;
  
  means for obtaining third digital text data representing error correction data; and
  
  error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text;
  
  adaptation means for adapting the speech recognition unit based on digital text data, including;
  
  means for adapting the lexicon data to the speaker depending on digital text data and storing the adapted lexicon data in the lexicon data means;
  
  means for adapting the language model data to the speaker depending on digital text data and storing the adapted language model data in the language model data means; and
  
  means for adapting the reference data to the speaker depending on digital text data and storing the adapted reference data in the reference data means; and
  
  re-adaption means, including;
  
  means for converting the first digital data into fifth digital data which represent newly recognized text, using the speech recognition unit after the adaptation means had adapted the speech recognition data depending on the fourth digital data; and
  
  the adaptation means being for adapting the available reference data to the speaker of the spoken text depending on the fifth digital data and the first digital data.
- 6. The apparatus of claim 5, comprising computer media.
- 7. The apparatus of claim 5, comprising a network including a second computer system and means for communication in between the first computer system and second computer system.

8. A method of recognizing spoken text uttered by a speaker, comprising the steps of:
- converting spoken text uttered by a speaker into first digital data;
  
  converting the first digital data which represent the spoken text, into second digital data which represent recognized text in a speech recognition process depending on conversion data including available lexicon data which represent a lexicon, available language model data which represent a language model, and available reference data which represent phonemes;
  
  communicating the recognized text;
  
  obtaining third digital data which represent corrections to the recognized text, depending on the communication of recognized text;
  
  correcting the second digital data using the third digital data to generate fourth digital data which represent corrected text;
  
  adapting the conversion data depending on the fourth digital data;
  
  converting the first digital data, into fifth digital data which represent recognized text depending on the adapted speech data; and
  
  re-adapting the conversion data depending on the fifth digital data.

9. A system for recognizing spoken text, comprising:
- conversion means for converting the spoken text uttered by a speaker into first digital text means data which represent the spoken text;
  
  a speech recognition unit, including;
  
  lexicon data means for storing lexicon data which represent a lexicon stored in the lexicon data device, and;
  
  language model data means for storing language model data which represent a language model;
  
  reference data means for storing reference data which represent phonemes;
  
  speech recognition means to generate second digital text data which represent recognized text, in a speech recognition process depending on the first digital text data the lexicon data, the language model data, and the reference data;
  
  means for obtaining third digital text data representing error correction data;
  
  error correction means for correcting the recognized text represented by the second digital text data depending on the third digital text data, by changing a part of the second digital text data depending on the third digital text data, and to generate fourth digital text data which represent corrected text;
  
  first adaption means for adapting the speech recognition unit to the speaker depending on the fourth digital data; and
  
  second adaption means for adapting the available reference data to the speaker, using the adapted speech recognition unit and depending on the first digital data.
- 10. The system of claim 9 in which:
    - the first adaption means includes means for adapting available lexicon data depending on the fourth digital data; and
      
      the first adaption means includes means for adapting available language model data depending on the fourth digital data.
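The claims do not say how the lexicon data and language model data are adapted from the corrected text. As a rough illustration only, the Python sketch below assumes the lexicon is a set of words and the language model is a table of bigram counts; `adapt_lexicon` and `adapt_language_model` are hypothetical helpers, not functions described in the patent.

```python
from collections import Counter

def adapt_lexicon(lexicon, corrected_words):
    """Add words from the corrected text that the lexicon lacks; return them in order."""
    new_words = [w for w in dict.fromkeys(corrected_words) if w not in lexicon]
    lexicon.update(new_words)
    return new_words

def adapt_language_model(bigram_counts, corrected_words):
    """Count word pairs of the corrected text, padded with sentence boundary markers."""
    padded = ["<s>", *corrected_words, "</s>"]
    bigram_counts.update(zip(padded, padded[1:]))

lexicon = {"please", "recognize"}
bigram_counts = Counter()
corrected = ["please", "recognize", "philips", "please"]

print(adapt_lexicon(lexicon, corrected))          # words newly added to the lexicon
adapt_language_model(bigram_counts, corrected)
print(bigram_counts[("recognize", "philips")])    # how often this word pair was seen
```

Both helpers depend only on the corrected text, matching the claim wording that the first adaption means operates "depending on the fourth digital data".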

Current Assignee: Nuance Communications Austria Gmbh (Microsoft Corporation)
Original Assignee: US Philips Corporation (Koninklijke Philips N.V.)
Inventors: Bartosik, Heinrich
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Lerner, Martin
Application Number: US08/939,548
Time in Patent Office: 1,044 Days
Field of Search: 704/231, 704/235, 704/243, 704/244, 704/270, 704/276, 704/250, 704/255, 704/257
US Class Current: 704/235
CPC Class Codes

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/26   Speech to text systems G10L...
