New language context dependent data labeling

US 20020152068A1
Filed: 02/22/2001
Published: 10/17/2002
Est. Priority Date: 09/29/2000
Status: Active Grant

Abstract

Bootstrapping a speech recognition system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is done, bootstrapping does not produce good initial models, and the new language data is not properly aligned to those models. The present invention provides techniques to generate context dependent labeling of the new language data using the recognition system of another language. The labeled data is then used to generate models for the new language phones.

20 Claims

1. A method of aligning speech data of a first language to a phone set associated with the first language using a speech recognition system trained in accordance with a second language, the method comprising the steps of:
- applying a mapping to a phonetic vocabulary built using the first language phone set to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
- aligning speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set using the speech recognition system trained in accordance with the second language; and
- realigning the aligned speech data to the first language phone set.

2. The method of claim 1, wherein the mapping applied to the phonetic vocabulary built using the first language phone set is a many-to-one mapping.

3. The method of claim 1, wherein the aligning step comprises labeling feature vectors, generated from the input speech data, by phones that the feature vectors represent in the phonetic space of the second language phone set.

4. The method of claim 3, wherein the realigning step comprises relabeling the feature vectors by clustering the feature vectors according to phones of the first language phone set.

5. The method of claim 4, wherein the relabeling step comprises sequential comparison of phonetic spellings of lexemes aligned to the second language phone set to phonetic spellings associated with the first language phone set.

6. The method of claim 1, wherein the speech recognition system trained in accordance with the second language is a large vocabulary continuous speech recognition system.
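
The mapping step of claims 1 and 2 can be illustrated with a minimal sketch. The phone symbols, the `PHONE_MAP` table, and the `map_vocabulary` helper below are hypothetical and chosen only for illustration; the claims do not prescribe any particular data structure. The sketch assumes the mapping is applied phone by phone, so a mapped spelling keeps the same length as the original spelling.

```python
from typing import Dict, List

# Hypothetical many-to-one mapping from first-language phones to
# second-language phones. All symbols here are illustrative only.
PHONE_MAP: Dict[str, str] = {
    "t": "T",
    "tt": "T",    # a first-language sound with no counterpart shares "T"
    "a": "AA",
    "aa": "AA",
    "k": "K",
}


def map_vocabulary(
    vocab: Dict[str, List[str]], phone_map: Dict[str, str]
) -> Dict[str, List[str]]:
    """Rewrite every phonetic spelling, phone by phone, through the mapping."""
    return {
        word: [phone_map[phone] for phone in spelling]
        for word, spelling in vocab.items()
    }


# A toy first-language phonetic vocabulary (word -> phonetic spelling).
first_language_vocab = {
    "tak": ["t", "a", "k"],
    "ttaak": ["tt", "aa", "k"],
}

mapped_vocab = map_vocabulary(first_language_vocab, PHONE_MAP)
```

Because the mapping is many-to-one, the two toy words end up with the same mapped spelling. That loss of distinction is what the later realigning step has to undo.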

7. A method of labeling input speech data of a first language with a phone set associated with the first language using a speech recognition system trained in accordance with a second language, the method comprising the steps of:
- obtaining speech data uttered in the first language;
- using the speech recognition system trained in accordance with the second language to label the speech data uttered in the first language using a phonetic vocabulary of the first language which has been mapped to a phone set associated with the second language; and
- relabeling the labeled speech data using the first language phone set by sequentially comparing lexeme contexts in the first and second languages.
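
One way to picture the relabeling step is as a position-by-position walk over a lexeme's two spellings. The sketch below is an assumption-laden illustration, not the patented procedure: the `Segment` tuple, the `relabel_segments` function, and the phone symbols are hypothetical, and it assumes the recognizer's alignment yields exactly one segment per phone of the mapped spelling.

```python
from typing import List, Tuple

# (phone label, first frame index, last frame index); a hypothetical
# representation of one aligned phone segment.
Segment = Tuple[str, int, int]


def relabel_segments(
    aligned: List[Segment],
    mapped_spelling: List[str],
    original_spelling: List[str],
) -> List[Segment]:
    """Swap each second-language label for the first-language phone that
    occupies the same position in the lexeme's original spelling."""
    if not (len(aligned) == len(mapped_spelling) == len(original_spelling)):
        raise ValueError("alignment and spellings must have the same length")

    relabeled: List[Segment] = []
    for (label, start, end), mapped, original in zip(
        aligned, mapped_spelling, original_spelling
    ):
        # Sequential comparison: the aligned label must match the mapped
        # spelling at this position before it is replaced.
        if label != mapped:
            raise ValueError(f"expected {mapped!r}, found {label!r}")
        relabeled.append((original, start, end))
    return relabeled


# Alignment of one lexeme as produced with second-language phones.
aligned = [("T", 0, 4), ("AA", 5, 12), ("K", 13, 17)]
relabeled = relabel_segments(aligned, ["T", "AA", "K"], ["tt", "aa", "k"])
```

The frame boundaries are left untouched; only the labels change, so the feature vectors in each segment become training data for the first-language phone.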

8. A method of generating a speech recognition system for a first language using a speech recognition system previously generated for a second language, the method comprising the steps of:
- applying a mapping to a phonetic vocabulary built using a phone set of the first language to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
- aligning training speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set using the speech recognition system previously generated for the second language;
- realigning the aligned training speech data to the first language phone set;
- constructing acoustic models using the realigned training speech data; and
- associating the constructed acoustic models with a speech recognition engine for subsequent use in recognizing real-time input speech data uttered in the first language.
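
The model construction step can be sketched once feature vectors carry first-language phone labels. The claim does not say what form the acoustic models take, so the example below swaps in a deliberately simple stand-in: one diagonal Gaussian (per-dimension mean and variance) per phone. The function name `fit_phone_gaussians` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance
from typing import Dict, List, Sequence, Tuple

# (per-dimension means, per-dimension variances) for one phone.
Gaussian = Tuple[List[float], List[float]]


def fit_phone_gaussians(
    labeled: Sequence[Tuple[str, Sequence[float]]]
) -> Dict[str, Gaussian]:
    """Group feature vectors by phone label and fit one diagonal
    Gaussian per phone (an assumed, simplified model form)."""
    by_phone: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for phone, vector in labeled:
        by_phone[phone].append(vector)

    models: Dict[str, Gaussian] = {}
    for phone, vectors in by_phone.items():
        dims = list(zip(*vectors))  # transpose: one tuple per dimension
        means = [fmean(d) for d in dims]
        variances = [pvariance(d, mu) for d, mu in zip(dims, means)]
        models[phone] = (means, variances)
    return models


# Toy realigned training data: (first-language phone, feature vector).
labeled = [
    ("tt", [1.0, 2.0]),
    ("tt", [3.0, 4.0]),
    ("aa", [0.0, 0.0]),
]
models = fit_phone_gaussians(labeled)
```

A production system would typically use richer, context dependent models; the point here is only that the relabeled data supplies per-phone training sets for the first language.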

9. Apparatus for aligning speech data of a first language to a phone set associated with the first language using a speech recognizer trained in accordance with a second language, the apparatus comprising:
- at least one processor operative to:
  - (i) apply a mapping to a phonetic vocabulary built using the first language phone set to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
  - (ii) align speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set using the speech recognizer trained in accordance with the second language; and
  - (iii) realign the aligned speech data to the first language phone set; and
- memory, coupled to the at least one processor, operative to store at least one of results associated with the mapping, aligning and realigning operations.

10. The apparatus of claim 9, wherein the mapping applied to the phonetic vocabulary built using the first language phone set is a many-to-one mapping.

11. The apparatus of claim 9, wherein the aligning operation comprises labeling feature vectors, generated from the input speech data, by phones that the feature vectors represent in the phonetic space of the second language phone set.

12. The apparatus of claim 11, wherein the realigning operation comprises relabeling the feature vectors by clustering the feature vectors according to phones of the first language phone set.

13. The apparatus of claim 12, wherein the relabeling operation comprises sequentially comparing phonetic spellings of lexemes aligned to the second language phone set to phonetic spellings associated with the first language phone set.

14. The apparatus of claim 9, wherein the speech recognizer trained in accordance with the second language is a large vocabulary continuous speech recognizer.

15. Apparatus for labeling input speech data of a first language with a phone set associated with the first language using a speech recognizer trained in accordance with a second language, the apparatus comprising:
- at least one processor operative to:
  - (i) obtain speech data uttered in the first language;
  - (ii) use the speech recognizer trained in accordance with the second language to label the speech data uttered in the first language using a phonetic vocabulary of the first language which has been mapped to a phone set associated with the second language; and
  - (iii) relabel the labeled speech data using the first language phone set by sequentially comparing lexeme contexts in the first and second languages; and
- memory, coupled to the at least one processor, operative to store at least one of results associated with the obtaining, labeling and relabeling operations.

16. Apparatus for generating a speech recognizer for a first language using a speech recognizer previously generated for a second language, the apparatus comprising:
- at least one processor operative to:
  - (i) apply a mapping to a phonetic vocabulary built using a phone set of the first language to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
  - (ii) align training speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set using the speech recognizer previously generated for the second language;
  - (iii) realign the aligned training speech data to the first language phone set;
  - (iv) construct acoustic models using the realigned training speech data; and
  - (v) associate the constructed acoustic models with a speech recognition engine for subsequent use in recognizing real-time input speech data uttered in the first language; and
- memory, coupled to the at least one processor, operative to store at least one of results associated with the applying, aligning, realigning, constructing and associating operations.

17. A speech data alignment system, comprising:
- a mapping module which applies a first language-to-a second language mapping to a phonetic vocabulary built using a phone set associated with the first language to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
- a speech recognizer trained in accordance with the second language, coupled to the mapping module, which aligns speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set; and
- a lexeme context comparator, coupled to the speech recognizer, which realigns the aligned speech data to the first language phone set by sequentially comparing lexeme contexts in the first and second languages.

18. The system of claim 17, wherein the speech recognizer trained in accordance with the second language is a large vocabulary continuous speech recognizer.

19. An article of manufacture for aligning speech data of a first language to a phone set associated with the first language using a speech recognition system trained in accordance with a second language, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
- applying a mapping to a phonetic vocabulary built using the first language phone set to generate a first language phonetic vocabulary mapped to a phone set associated with the second language;
- aligning speech data, input in the first language, to the first language phonetic vocabulary mapped to the second language phone set using the speech recognition system trained in accordance with the second language; and
- realigning the aligned speech data to the first language phone set.

20. An article of manufacture for labeling input speech data of a first language with a phone set associated with the first language using a speech recognition system trained in accordance with a second language, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
- obtaining speech data uttered in the first language;
- using the speech recognition system trained in accordance with the second language to label the speech data uttered in the first language using a phonetic vocabulary of the first language which has been mapped to a phone set associated with the second language; and
- relabeling the labeled speech data using the first language phone set by sequentially comparing lexeme contexts in the first and second languages.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Subramaniam, L. Venkata; Rajput, Nitendra; Verma, Ashish; Neti, Chalapathy Venkata

Granted Patent: US 7,295,979 B2
US Class Current: 704/236
CPC Class Codes:
- G10L 15/06 Creation of reference templ...
- G10L 15/187 Phonemic context, e.g. pron...
