Speech recognizer using speaker categorization for automatic reevaluation of previously-recognized speech data

US 6,122,615 A
Filed: 03/24/1998
Issued: 09/19/2000
Est. Priority Date: 11/19/1997
Status: Expired due to Term

Abstract

A speech recognizer includes storing means for storing speech data, and reevaluating means for reevaluating the speech data stored in the storing means in response to a request from a data processor utilizing the results of speaker categorization (the results of categorization with speaker categorization means such as a gender-dependent speech model or speaker identification). Thus, the present invention makes it possible to correct the speech data that has been wrongly recognized before.
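As a rough illustration of the idea in the abstract, the following Python sketch keeps each piece of stored speech data linked to its recognition result and speaker category, then re-runs recognition on designated entries using the dictionary that matches a later categorization. Every name here (`Entry`, `Recognizer`, `recognize`, `DICTIONARIES`) is hypothetical and not taken from the patent, and the table lookup merely stands in for real acoustic recognition.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical per-category dictionaries mapping a toy "acoustic key" to a word.
DICTIONARIES: Dict[str, Dict[str, str]] = {
    "male": {"a1": "right", "a2": "light"},
    "female": {"a1": "write", "a2": "night"},
}

@dataclass
class Entry:
    speech_data: str  # stand-in for stored audio
    category: str     # speaker categorization result
    result: str       # speech recognition result

def recognize(speech_data: str, category: str) -> str:
    """Toy recognizer: look the data up in the dictionary for the category."""
    return DICTIONARIES[category].get(speech_data, "<unknown>")

class Recognizer:
    def __init__(self) -> None:
        # Each entry links speech data, its result, and its category.
        self.entries: List[Entry] = []

    def feed(self, speech_data: str, category: str) -> int:
        """Recognize and store; return an index that designates the result."""
        result = recognize(speech_data, category)
        self.entries.append(Entry(speech_data, category, result))
        return len(self.entries) - 1

    def reevaluate(self, indices: List[int], category: str) -> None:
        """Re-recognize designated entries with the dictionary for `category`."""
        for i in indices:
            entry = self.entries[i]
            entry.category = category
            entry.result = recognize(entry.speech_data, category)

rec = Recognizer()
first = rec.feed("a1", "male")    # early guess at the speaker category
second = rec.feed("a2", "male")
rec.reevaluate([first, second], "female")  # a later categorization disagrees
results = [e.result for e in rec.entries]
```

After the reevaluation call, both stored entries carry results drawn from the "female" dictionary rather than the one used at first recognition.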

46 Citations

20 Claims

1. A speech recognizer comprising:
- means for storing speech data;
  
  a speech recognition means for producing one or more speech recognizing results;
  
  means for storing the one or more speech recognition results produced by the speech recognition means;
  
  one or more speaker categorization means for producing one or more speaker categorization results;
  
  means for storing the one or more speaker categorization results produced by the speaker categorization means;
  
  means for linking the speech data to the speech recognition results and the speaker categorization results; and
  
  means for designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the speech data is reevaluated based on at least one of the speaker categorization results at a point of time by switching to a recognition dictionary corresponding to the speaker categorization results, and the speech data corresponds to the speech recognition results that have been obtained at that point of time, at that point of time and before, or before that point of time.
  - 2. The speech recognizer according to claim 1, further comprising speech segment data generating means for segmenting the speech data into speech segment data, wherein the storing means stores the speech segment data generated by the speech segment data generating means.
  - 3. The speech recognizer according to claim 2, wherein the storing means includes a plurality of data slots for storing a plurality of sets of speech segment data generated by the speech segment data generating means.
  - 4. The speech recognizer according to claim 3, further comprising a plurality of speech segment data generating means, wherein the storing means includes a plurality of data slots for storing a plurality of sets of speech segment data generated for each speech segment data generating means.
  - 5. The speech recognizer according to claim 4, further comprising sound power calculating means for calculating a sound power from the speech segment data stored in the storing means, wherein the reevaluating means reevaluates only the speech segment data whose sound power calculated by the power calculating means is within a predetermined range.
  - 6. The speech recognizer according to claim 5, further comprising as one of the speaker categorization means gender identifying means for identifying the gender of a speaker based on the speech segment data;
    - and phoneme-recognition dictionary switching means for switching dictionaries so as to be suitably used for phoneme recognition, based on the results of identification of the gender identifying means.
  - 7. The speech recognizer according to claim 6, further comprising word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of identification of the gender identifying means.
  - 8. The speech recognizer according to claim 7, further comprising as one of the speaker categorization means speaker identifying means for identifying a speaker based on the speech data.
  - 9. The speech recognizer according to claim 5, further comprising as one of the speaker categorization means gender identifying means for identifying the gender of a speaker based on the speech segment data;
    - and word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of identification of the gender identifying means.
  - 10. The speech recognizer according to claim 2, further comprising sound power calculating means for calculating a sound power from the speech segment data stored in the storing means, wherein the reevaluating means reevaluates speech segment data whose sound power calculated by the power calculating means is within a predetermined range.
  - 11. The speech recognizer according to claim 10, further comprising as one of the speaker categorization means identifying means for identifying a speaker based on the speech segment data.
  - 12. The speech recognizer according to claim 1, further comprising as one of the speaker categorization means gender identifying means for identifying the gender of a speaker based on the speech data;
    - and phoneme-recognition dictionary switching means for switching dictionaries so as to be suitably used for phoneme recognition, based on the results of identification of the gender identifying means.
  - 13. The speech recognizer according to claim 12, further comprising word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of identification of the gender identifying means.
  - 14. The speech recognizer according to claim 1, further comprising as one of the speaker categorization means gender identifying means for identifying the gender of a speaker based on the speech data;
    - and word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of identification of the gender identifying means.
  - 15. The speech recognizer according to claim 1, further comprising as one of the speaker categorization means speaker identifying means for identifying a speaker based on the speech data.
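
Claims 5 and 10 restrict reevaluation to speech segments whose sound power lies within a predetermined range. A minimal sketch of such a filter follows. The claims do not define how sound power is computed, so mean squared amplitude is used here purely as an assumption, and the function names and threshold values are hypothetical.

```python
from typing import List, Sequence

def sound_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one segment (0.0 for an empty segment)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)

def segments_to_reevaluate(
    segments: Sequence[Sequence[float]], low: float, high: float
) -> List[int]:
    """Return indices of segments whose power lies within [low, high]."""
    return [
        i for i, seg in enumerate(segments)
        if low <= sound_power(seg) <= high
    ]

segments = [
    [0.0, 0.0, 0.0, 0.0],    # silence, power 0.0
    [0.5, -0.5, 0.5, -0.5],  # moderate level, power 0.25
    [2.0, -2.0, 2.0, -2.0],  # very loud, power 4.0
]
selected = segments_to_reevaluate(segments, low=0.01, high=1.0)
```

With these illustrative thresholds only the middle segment is selected, so silence and clipped or overly loud input would be skipped during reevaluation.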

16. A data processor including input receiving means for receiving an input, the data processor utilizing as the input receiving means a speech recognizer comprising:
- means for storing speech data;
  
  a speech recognition means for producing one or more speech recognition results;
  
  means for storing the one or more speech recognition results produced by the speech recognition means;
  
  one or more speaker categorization means for producing one or more speaker categorization results;
  
  means for storing the one or more speaker categorization results produced by the speaker categorization means;
  
  means for linking the speech data to the speech recognition results and the speaker categorization results; and
  
  means for designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the speech data is reevaluated based on at least one of the speaker categorization results at a point of time by switching to a recognition dictionary corresponding to the speaker categorization results, and the speech data corresponds to the speech recognition results that have been obtained at that point of time, at that point of time and before, or before that point of time.

17. A data processor including input receiving means for receiving an input, the data processor utilizing as the input receiving means a speech recognizer comprising:
- means for storing speech data;
  
  gender identifying means for identifying the gender of a speaker based on the speech data;
  
  phoneme-recognition dictionary switching means for switching dictionaries so as to be suitably used for phoneme recognition, based on the results of the gender identifying means;
  
  a speech recognition means for producing one or more speech recognition results;
  
  means for storing the one or more speech recognition results produced by the speech recognition means;
  
  means for linking the speech data to the speech recognition results and the results of the gender identifying means; and
  
  means for designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the data processor requests reevaluation of the speech data stored in the storing means when the phoneme-recognition dictionaries are switched by the phoneme-recognition dictionary switching means, and wherein the speech data is reevaluated based on the results of the gender identifying means at a point of time by switching to a recognition dictionary corresponding to the results of the gender identifying means that have been obtained at that point of time, at that point of time and before, or before that point of time.

18. A data processor including input receiving means for receiving an input, the data processor utilizing as the input receiving means a speech recognizer comprising:
- means for storing speech data;
  
  gender identifying means for identifying the gender of a speaker based on the speech data;
  
  word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of identification of the gender identifying means;
  
  a speech recognition means for producing one or more speech recognition results;
  
  means for storing the one or more speech recognition results produced by the speech recognition means;
  
  means for linking the speech data to the speech recognition results and the results of the gender identifying means; and
  
  means for designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the data processor requests reevaluation of the speech data stored in the storing means when the word-recognition dictionaries are switched by the word-recognition dictionary switching means, and wherein the speech data is reevaluated based on the results of the gender identifying means at a point of time by switching to a recognition dictionary corresponding to the results of the gender identifying means that have been obtained at that point of time, at that point of time and before, or before that point of time.

19. A data processor including input receiving means for receiving an input, the data processor utilizing as the input receiving means a speech recognizer comprising:
- means for storing speech data;
  
  gender identifying means for identifying the gender of a speaker based on the speech data;
  
  phoneme-recognition dictionary switching means for switching dictionaries so as to be suitably used for phoneme recognition, based on the results of the gender identifying means;
  
  word-recognition dictionary switching means for switching dictionaries so as to be suitably used for word recognition, based on the results of the gender identifying means;
  
  a speech recognition means for producing one or more speech recognition results;
  
  means for storing the one or more speech recognition results produced by the speech recognition means;
  
  means for linking the speech data to the speech recognition results and the results of the gender identifying means; and
  
  means for designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the data processor requests reevaluation of the speech data stored in the storing means when the phoneme-recognition dictionaries are switched by the phoneme-recognition dictionary switching means, or the word-recognition dictionaries are switched by the word-recognition dictionary switching means, and wherein the speech data is reevaluated based on the results of the gender identifying means at a point of time by switching to a recognition dictionary corresponding to the results of the gender identifying means that have been obtained at that point of time, at that point of time and before, or before that point of time.
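
Claims 17 to 19 tie the reevaluation request to the moment a dictionary is switched. The sketch below shows one way that trigger could look: when the identified gender changes, the active dictionary changes and everything stored so far is re-recognized. The class `DictionarySwitcher`, the function `process`, and the lookup-table "dictionaries" are all hypothetical stand-ins, and a single dictionary per gender is assumed rather than separate phoneme and word dictionaries.

```python
from typing import Dict, Iterable, List, Tuple

class DictionarySwitcher:
    """Tracks the active dictionary and reports when a switch happens."""

    def __init__(self, dictionaries: Dict[str, Dict[str, str]], initial: str) -> None:
        self.dictionaries = dictionaries
        self.active = initial

    def update(self, gender: str) -> bool:
        """Switch to the dictionary for `gender`; return True if it changed."""
        if gender == self.active:
            return False
        self.active = gender
        return True

    def lookup(self, speech_data: str) -> str:
        """Toy recognition with the currently active dictionary."""
        return self.dictionaries[self.active].get(speech_data, "<unknown>")

def process(stream: Iterable[Tuple[str, str]], switcher: DictionarySwitcher) -> List[str]:
    """Consume (speech_data, identified_gender) pairs and return the results."""
    stored: List[str] = []
    results: List[str] = []
    for speech_data, gender in stream:
        stored.append(speech_data)
        if switcher.update(gender):
            # A dictionary switch triggers reevaluation of all stored data.
            results = [switcher.lookup(d) for d in stored]
        else:
            results.append(switcher.lookup(speech_data))
    return results

dictionaries = {
    "male": {"k1": "M1", "k2": "M2"},
    "female": {"k1": "F1", "k2": "F2"},
}
stream = [("k1", "male"), ("k2", "male"), ("k1", "female")]
final = process(stream, DictionarySwitcher(dictionaries, initial="male"))
```

Here the third input flips the identified gender, so the two earlier results are replaced along with the new one.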

20. A recording medium readable by a computer storing a program, the program allowing the computer to perform a process comprising:
- segmenting speech data into speech segment data;
  
  storing the speech segment data segmented in the segmenting step in a plurality of data slots sequentially;
  
  receiving a request of reevaluation of the speech segment data from a higher system utilizing results of speaker categorization;
  
  generating one or more speech recognition results by speech recognition;
  
  generating one or more speaker categorization results by speaker categorization;
  
  storing the one or more speech recognition results;
  
  storing the one or more speaker categorization results;
  
  linking the speech data to the speech recognition results and the speaker categorization results; and
  
  designating the speech recognition results and reevaluating the speech data corresponding to the designated speech recognition results, wherein the speech data is reevaluated based on at least one of the speaker categorization results at a point of time by switching to a recognition dictionary corresponding to the speaker categorization results, and the speech data corresponds to the speech recognition results that have been obtained at that point of time, at that point of time and before, or before that point of time.
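
The first two steps of claim 20, segmenting speech data and storing the segments sequentially in a plurality of data slots, can be sketched as follows. Fixed-length segmentation and a bounded buffer that discards the oldest slot when full are assumptions made for this illustration; the claim specifies neither, and `segment` and `SlotStore` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Sequence

def segment(samples: Sequence[float], size: int) -> List[List[float]]:
    """Split samples into consecutive segments of at most `size` samples."""
    return [list(samples[i:i + size]) for i in range(0, len(samples), size)]

class SlotStore:
    """Holds the most recent segments in a fixed number of data slots."""

    def __init__(self, num_slots: int) -> None:
        # Assumption: once every slot is full, the oldest segment is dropped.
        self.slots: Deque[List[float]] = deque(maxlen=num_slots)

    def store(self, seg: List[float]) -> None:
        self.slots.append(seg)

    def latest(self, n: int) -> List[List[float]]:
        """Return up to the n most recently stored segments, oldest first."""
        return list(self.slots)[-n:] if n > 0 else []

samples = [float(x) for x in range(10)]
store = SlotStore(num_slots=2)
for seg in segment(samples, size=4):
    store.store(seg)
kept = store.latest(2)
```

A reevaluation request from the higher system would then operate on whatever segments are still held in the slots.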

Current Assignee: Fujitsu Limited
Original Assignee: Fujitsu Limited
Inventors: Yamamoto, Kenji
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Smits, Talivaldis Ivars
Application Number: US09/046,568
Time in Patent Office: 910 Days
Field of Search: 704/231, 704/235, 704/246, 704/243, 704/252
US Class Current: 704/252
CPC Class Codes:
G10L 15/26 Speech to text systems G10L...
G10L 17/00 Speaker identification or v...
