Speech recognition models combining gender-dependent and gender-independent phone states and using phonetic-context-dependence

US 5,953,701 A
Filed: 01/22/1998
Issued: 09/14/1999
Est. Priority Date: 01/22/1998
Status: Expired due to Fees

Abstract

A method of gender dependent speech recognition includes the steps of identifying phone state models common to both genders, identifying gender specific phone state models, identifying a gender of a speaker and recognizing acoustic data from the speaker. A method of constructing a gender-dependent speech recognition model includes the steps of providing training data of a known gender, aligning the training data, tagging the training data with a gender to create gender-tagged data, determining a gender question at a node to determine gender dependence of the gender-tagged data, determining a phonetic context question at the node to determine phonetic context dependence of the gender-tagged data, determining a highest value of an evaluation function between the gender dependence and the phonetic context dependence to determine which dependence is a dominant dependence, splitting the data of the dominant dependence into child nodes according to likelihood criteria, comparing the highest value with a threshold value to determine if additional splitting is necessary, repeating these steps for each child node until the highest value is below the threshold value and counting the nodes having gender dependence to determine an overall gender dependence level. A gender-dependent speech recognition system includes an input device for inputting speech to a preprocessor. The preprocessor converts the speech into acoustic data, and a processor identifies gender-dependent phone state models and phone state models common to both genders. The phone state models are stored in a memory device, and the processor recognizes the speech in accordance with the phone state models.

18 Claims

1. A method of gender dependent speech recognition comprising the steps of:
- identifying phone state models common to both genders;
  
  identifying gender specific phone state models;
  
  identifying a gender of a speaker; and
  
  recognizing acoustic data from the speaker based on the phone state models.
- - 2. The method of gender dependent speech recognition as recited in claim 1, wherein the step of identifying gender further comprises the steps of:
    - setting the gender to a first gender;
      
      calculating a confidence based on the first gender;
      
      setting the gender to a second gender;
      
      calculating a confidence of the second gender; and
      
      determining gender by selecting the confidence with a higher value.
  - 3. The method of gender dependent speech recognition as recited in claim 2, wherein a likelihood for each gender is used to determine gender.
  - 4. The method of gender dependent speech recognition as recited in claim 2, wherein the steps of calculating the confidences include calculating the confidence by taking the product of word confidences for every word in a hypothesis.
  - 5. The method of gender dependent speech recognition as recited in claim 1, wherein the step of identifying gender includes performing a maximum a posteriori adaptation.
  - 6. The method of gender dependent speech recognition as recited in claim 1, wherein the step of identifying gender includes comparing Gaussian prototypes to a codebook of Gaussian prototypes to determine gender.
  - 7. The method of gender dependent speech recognition as recited in claim 6, wherein the step of comparing Gaussian prototypes includes clustering Gaussian prototypes to create the codebook of Gaussian prototypes.
  - 8. The method of gender dependent speech recognition as recited in claim 1, wherein the step of identifying gender specific phone state models further comprises the step of asking a gender question at a node to determine gender dependence of the acoustic data.
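
Claims 2 and 4 describe choosing the gender whose decoding yields the higher confidence, where the confidence of a hypothesis is the product of its word confidences. A minimal sketch of that comparison follows. The function names and the confidence values are hypothetical, and the decoder that would produce per-word confidences under each gender's models is assumed to exist elsewhere.

```python
from math import prod

def hypothesis_confidence(word_confidences):
    """Product of the per-word confidences of one hypothesis (claim 4)."""
    return prod(word_confidences)

def identify_gender(confidences_by_gender):
    """Select the gender whose hypothesis has the higher confidence (claim 2).

    confidences_by_gender maps a gender label to the list of word
    confidences obtained when decoding with that gender's models.
    """
    scores = {
        gender: hypothesis_confidence(confs)
        for gender, confs in confidences_by_gender.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Illustrative values only, not taken from the patent.
gender, scores = identify_gender({
    "female": [0.9, 0.8, 0.95],
    "male": [0.6, 0.7, 0.5],
})
```

For long hypotheses, summing log confidences instead of multiplying raw values gives the same ranking and avoids numerical underflow.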

9. A method of constructing a gender-dependent speech recognition model comprising the steps of:
- a) aligning acoustic data with a gender independent system;
  
  b) asking a gender question at a node to determine gender dependence of the acoustic data;
  
  c) asking a phonetic context question at the node to determine phonetic context dependence of the acoustic data;
  
  d) determining a highest value of an evaluation function between the gender dependence and the phonetic context dependence to determine which dependence is a dominant dependence;
  
  e) splitting the data of the dominant dependence into child nodes according to the question of dominant dependence; and
  
  f) repeating steps b-e for each child node until a threshold criterion is met.
- - 10. The method of constructing a gender-dependent speech recognition model as recited in claim 9, further comprising the step of counting the nodes having gender dependence to determine an overall gender dependence level.
  - 11. The method of constructing a gender-dependent speech recognition model as recited in claim 9, wherein the step of repeating steps until a threshold criterion is met includes:
    - comparing the highest value with a threshold value to determine if additional splitting is necessary.
  - 12. The method of constructing a gender-dependent speech recognition model as recited in claim 9, wherein the step of asking the phonetic context question includes the step of asking the phonetic context question at each position between -5 to +5, inclusive from the node.
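
The tree-growing loop of claim 9 (steps b through f) can be sketched as below. This is an illustration under stated assumptions, not the patent's implementation: the evaluation function is taken to be the log-likelihood gain of a split under single one-dimensional Gaussians, each sample is a hypothetical dict with `gender`, `context` (offset to phone) and a scalar feature `x`, and the threshold value is arbitrary.

```python
import math

def gaussian_loglik(values, var_floor=1e-4):
    """Log-likelihood of 1-D values under a Gaussian fit to those values."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    var = max(ss / n, var_floor)  # floor avoids log(0) on tiny nodes
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)

def split_gain(samples, predicate):
    """Assumed evaluation function: likelihood gain from splitting on predicate."""
    yes = [s["x"] for s in samples if predicate(s)]
    no = [s["x"] for s in samples if not predicate(s)]
    if not yes or not no:
        return float("-inf")  # question does not separate the data
    parent = gaussian_loglik([s["x"] for s in samples])
    return gaussian_loglik(yes) + gaussian_loglik(no) - parent

def grow(samples, questions, threshold):
    """questions maps a name to (kind, predicate); kind is 'gender' or 'context'."""
    best_name, best_gain = None, float("-inf")
    for name, (_, predicate) in questions.items():
        g = split_gain(samples, predicate)
        if g > best_gain:
            best_name, best_gain = name, g
    if best_name is None or best_gain < threshold:
        return {"leaf": True, "count": len(samples)}
    kind, predicate = questions[best_name]
    return {
        "leaf": False, "question": best_name, "kind": kind, "gain": best_gain,
        "yes": grow([s for s in samples if predicate(s)], questions, threshold),
        "no": grow([s for s in samples if not predicate(s)], questions, threshold),
    }

# Hypothetical questions: one gender question, one context question at offset -1.
questions = {
    "is_female": ("gender", lambda s: s["gender"] == "F"),
    "prev_is_AA": ("context", lambda s: s["context"][-1] == "AA"),
}
# Made-up data in which the feature depends mostly on gender.
samples = [
    {"gender": "F", "context": {-1: "AA"}, "x": 2.0},
    {"gender": "F", "context": {-1: "T"}, "x": 2.2},
    {"gender": "F", "context": {-1: "AA"}, "x": 1.9},
    {"gender": "F", "context": {-1: "T"}, "x": 2.1},
    {"gender": "M", "context": {-1: "AA"}, "x": 0.1},
    {"gender": "M", "context": {-1: "T"}, "x": -0.1},
    {"gender": "M", "context": {-1: "AA"}, "x": 0.0},
    {"gender": "M", "context": {-1: "T"}, "x": 0.2},
]
tree = grow(samples, questions, threshold=5.0)
```

With this toy data the gender question yields the larger gain at the root, so the root becomes a gender-dependent node; a real system would ask context questions at every offset the model considers and would use multidimensional acoustic features.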

13. A method of constructing a gender-dependent speech recognition model comprising the steps of:
- a) providing training data of a known gender;
  
  b) aligning the training data;
  
  c) tagging the training data with a gender to create gender-tagged data;
  
  d) asking a gender question at a node to determine gender dependence of the gender-tagged data;
  
  e) asking a phonetic context question at the node to determine phonetic context dependence of the gender-tagged data;
  
  f) determining a highest value of an evaluation function between the gender dependence and the phonetic context dependence to determine which dependence is a dominant dependence;
  
  g) splitting the data of the dominant dependence into child nodes according to a likelihood criterion;
  
  h) comparing the highest value with a threshold value to determine if additional splitting is necessary; and
  
  i) repeating steps d-h for each child node until the highest value is below the threshold value.
- - 14. The method of constructing a gender-dependent speech recognition model as recited in claim 13, further comprising the step of counting the nodes having gender dependence to determine an overall gender dependence level.
  - 15. The method of constructing a gender-dependent speech recognition model as recited in claim 13, wherein the step of asking the phonetic context question includes the step of asking the phonetic context question at each position between -5 to +5, inclusive from the node.
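
Claim 14 counts the nodes that split on gender to obtain an overall gender dependence level. A minimal sketch of such a count follows. The nested-dict tree layout is hypothetical, and expressing the level as the fraction of internal nodes that split on gender is an assumption; the claim itself only recites counting.

```python
def count_nodes(node):
    """Return (gender_split_nodes, internal_nodes) for a nested-dict tree."""
    if node["leaf"]:
        return 0, 0
    g_yes, n_yes = count_nodes(node["yes"])
    g_no, n_no = count_nodes(node["no"])
    is_gender = 1 if node["kind"] == "gender" else 0
    return is_gender + g_yes + g_no, 1 + n_yes + n_no

def gender_dependence_level(tree):
    """Assumed summary: share of internal nodes that ask the gender question."""
    gender_nodes, internal_nodes = count_nodes(tree)
    return gender_nodes / internal_nodes if internal_nodes else 0.0

# Hand-written example tree: a context split whose 'yes' branch splits on gender.
example_tree = {
    "leaf": False, "kind": "context",
    "yes": {
        "leaf": False, "kind": "gender",
        "yes": {"leaf": True},
        "no": {"leaf": True},
    },
    "no": {"leaf": True},
}
level = gender_dependence_level(example_tree)
```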

16. A gender-dependent speech recognition system comprising:
- an input device for inputting speech to a preprocessor, the preprocessor converting speech into acoustic data; and
  
  a processor for identifying gender-dependent phone state models and phone state models common to both genders, the phone state models being stored in a memory device wherein the processor recognizes the speech in accordance with the phone state models.
- - 17. The gender-dependent speech recognition system as recited in claim 16, wherein the gender-dependent phone state models reduce an amount of memory storage space needed to store the phone state models.
  - 18. The gender-dependent speech recognition system as recited in claim 16, wherein the processor includes a computer.
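
The storage saving recited in claim 17 can be illustrated with a small inventory that keeps a single model for each state common to both genders and separate models only for gender-dependent states. The class, method names, state labels and model placeholders below are all hypothetical.

```python
class PhoneStateInventory:
    """One shared model per gender-independent state, one model per gender
    for each gender-dependent state (hypothetical layout)."""

    def __init__(self):
        self.shared = {}           # state -> model
        self.gender_specific = {}  # (state, gender) -> model

    def add_shared(self, state, model):
        self.shared[state] = model

    def add_gender_specific(self, state, gender, model):
        self.gender_specific[(state, gender)] = model

    def lookup(self, state, gender):
        """Prefer the gender-specific model, else fall back to the shared one."""
        key = (state, gender)
        if key in self.gender_specific:
            return self.gender_specific[key]
        return self.shared[state]

    def stored_models(self):
        return len(self.shared) + len(self.gender_specific)

inventory = PhoneStateInventory()
inventory.add_shared("S-b-1", "model_S")
inventory.add_shared("T-b-1", "model_T")
inventory.add_gender_specific("AA-b-1", "female", "model_AA_f")
inventory.add_gender_specific("AA-b-1", "male", "model_AA_m")
```

Here three states need four stored models, whereas two fully separate gender-dependent systems would store six.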

Current Assignee
International Business Machines Corporation
Original Assignee
International Business Machines Corporation
Inventors
Roukos, Salim Estephan; Neti, Chalapathy Venkata
Primary Examiner(s)
Hudspeth, David R.
Assistant Examiner(s)
Smits, Talivaldis Ivars

Application Number

US09/010,466
Time in Patent Office

600 Days
Field of Search

704/240, 704/242, 704/252, 704/254
US Class Current

704/254
CPC Class Codes

G10L 15/07 Adaptation to the speaker

G10L 15/142 Hidden Markov Models [HMMs]
