Continuous speech recognition apparatus, continuous speech recognition method, continuous speech recognition program, and program recording medium

US 20050075876A1
Filed: 12/13/2002
Published: 04/07/2005
Est. Priority Date: 01/16/2002
Status: Abandoned Application

Abstract

Phoneme context dependent acoustic models are used even at word boundaries to maintain recognition accuracy, while the increase in processing load is suppressed even in large-vocabulary continuous speech recognition. A phoneme context dependent acoustic model storage unit contains phoneme state trees. In each tree, state sequences, each consisting of a preceding phoneme state, a center phoneme state, and a succeeding phoneme state, are organized in a tree structure that collects the triphone models sharing the same preceding phoneme and the same center phoneme. By referencing the phoneme state trees, the language models stored in a language model storage unit, and a word lexicon, a forward matching unit therefore needs to develop only one phoneme hypothesis regardless of the leading phoneme of the succeeding word. Hypotheses are thus easy to develop both within words and at word boundaries, and the amount of computation needed to match them against the feature parameter sequences from an acoustic analysis unit is greatly reduced.
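
The phoneme state tree described in the abstract can be illustrated with a minimal sketch. It assumes each triphone model is given as a list of tied-state identifiers and merges the models that share a preceding and a center phoneme into one prefix tree. The class `StateNode`, the functions `build_state_trees` and `count_nodes`, and the state IDs `s1` to `s6` are hypothetical names chosen for illustration and do not come from the application.

```python
from collections import defaultdict


class StateNode:
    """One node of a phoneme state tree (hypothetical structure)."""

    def __init__(self, state_id=None):
        self.state_id = state_id   # tied HMM state ID; None for the root
        self.children = {}         # state_id -> StateNode
        self.succeeding = set()    # succeeding phonemes whose model ends here


def build_state_trees(triphones):
    """Map {(prev, center, succ): [state_id, ...]} to {(prev, center): root}."""
    trees = defaultdict(StateNode)
    for (prev, center, succ), states in triphones.items():
        node = trees[(prev, center)]
        for sid in states:
            # Reuse an existing child when the state prefix is shared.
            if sid not in node.children:
                node.children[sid] = StateNode(sid)
            node = node.children[sid]
        node.succeeding.add(succ)
    return dict(trees)


def count_nodes(node):
    """Count the state nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical tied-state sequences for three triphones k-a+t, k-a+p, k-a+n.
triphones = {
    ("k", "a", "t"): ["s1", "s2", "s3"],
    ("k", "a", "p"): ["s1", "s2", "s4"],
    ("k", "a", "n"): ["s1", "s5", "s6"],
}
trees = build_state_trees(triphones)
root = trees[("k", "a")]
print(count_nodes(root))  # 6 tree nodes instead of 9 separate states
```

In this toy example a single tree stands in for all three triphones, which is the sense in which only one hypothesis needs to be developed regardless of the succeeding phoneme.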

17 Citations

8 Claims

1. A continuous speech recognition apparatus which uses, as a recognition unit, a sub-word determined depending on an adjacent sub-word and which uses context dependent acoustic models dependent on sub-word context to recognize a continuous input speech, comprising:
- an acoustic analysis section analyzing the input speech to obtain feature parameter time series;
  
  a word lexicon in which each of words included in vocabulary is stored in a form of a sub-word network or in a sub-word tree structure;
  
  a language model storage unit in which language models representing information regarding connection between words is stored;
  
  a context dependent acoustic model storage unit in which the context dependent acoustic models are stored in a form of sub-word state trees in each of which state sequences of a plurality of sub-word models of the context dependent acoustic models are organized in a tree structure;
  
  a matching unit developing hypotheses of sub-words by referencing the sub-word state tree representing the context dependent acoustic models, the word lexicon and the language models, and performing matching between the feature parameter time series and the developed hypotheses so as to output, as a word lattice, word information including a word, an accumulated score and a beginning start frame with respect to a hypothesis representing a word end portion; and
  
  a search unit for searching the word lattice to generate recognition results.
- Dependent claims (2, 3, 4, 5, 7, 8):
  - 2. The continuous speech recognition apparatus as defined in claim 1, wherein the context dependent acoustic models stored in the context dependent acoustic model storage unit are context dependent acoustic models in which a center sub-word depends on sub-words preceding and succeeding the center sub-word respectively, and the state sequences of sub-word models having identical preceding sub-words and identical center sub-words are organized in a tree structure.
  - 3. The continuous speech recognition apparatus as defined in claim 2, wherein the context dependent acoustic models are state sharing models in which a plurality of sub-word models share states.
  - 4. The continuous speech recognition apparatus as defined in claim 1, wherein when developing the hypotheses by referencing the sub-word state tree, the matching unit puts a flag on states connectable to each other in the sub-word state trees that represent the hypotheses, by using information on connectable sub-words obtained from the word lexicon and the language model.
  - 5. The continuous speech recognition apparatus as defined in claim 1, wherein during a matching operation, the matching unit calculates scores of the developed hypotheses based on the feature parameter time series, and prunes the hypotheses in conformity to criteria including a threshold value of the scores or a quantity of hypotheses.
  - 7. A continuous speech recognition program that makes a computer function as the acoustic analysis section, the word lexicon, the language model storage unit, the context dependent acoustic model storage unit, the matching unit and the search unit as recited in claim 1.
  - 8. A program recording medium readable by computer, having the continuous speech recognition program as defined in claim 7 stored therein.
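
The pruning recited in claim 5 can be sketched as follows. The claim names a score threshold or a quantity of hypotheses as criteria. This sketch applies both, and it assumes that scores are log-likelihoods (higher is better) and that the threshold is relative to the best current score. The function `prune` and its parameters `beam_width` and `max_hypotheses` are hypothetical names, not taken from the application.

```python
def prune(hypotheses, beam_width, max_hypotheses):
    """Keep hypotheses within `beam_width` of the best score, capped in number.

    hypotheses: list of (hypothesis_id, log_score) pairs; higher is better.
    """
    if not hypotheses:
        return []
    best = max(score for _, score in hypotheses)
    # Score criterion: drop anything too far below the current best.
    survivors = [(h, s) for h, s in hypotheses if s >= best - beam_width]
    # Quantity criterion: keep only the top-scoring hypotheses.
    survivors.sort(key=lambda pair: pair[1], reverse=True)
    return survivors[:max_hypotheses]


# Hypothetical scores for four active hypotheses at one frame.
active = [("h1", -10.0), ("h2", -12.5), ("h3", -30.0), ("h4", -11.0)]
kept = prune(active, beam_width=5.0, max_hypotheses=2)
print(kept)  # [('h1', -10.0), ('h4', -11.0)]
```

Here `h3` fails the score criterion and `h2` is then dropped by the quantity limit.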

6. A continuous speech recognition method which uses, as a recognition unit, a sub-word determined depending on an adjacent sub-word and which uses context dependent acoustic models dependent on sub-word context to recognize a continuous input speech, comprising:
- analyzing the input speech to obtain feature parameter time series by an acoustic analysis section;
  
  developing hypotheses of sub-words by referencing a sub-word state tree formed by placing state sequences of the context dependent acoustic models in a tree structure, a word lexicon describing each of words included in vocabulary in a form of a sub-word network or in a sub-word tree structure, and a language model representing information regarding connection between words, and performing matching between the feature parameter time series and the developed hypotheses so as to generate, as a word lattice, word information including a word, an accumulated score and a beginning start frame with respect to a hypothesis regarding a word end portion, by a matching unit; and
  
  searching the word lattice to generate recognition results by a search unit.
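
The final step, searching the word lattice, can be illustrated with a simplified sketch. The claims state that each lattice entry carries a word, an accumulated score, and a beginning start frame. The end frame stored here, the half-open frame convention, and the greedy backtrace are assumptions made for illustration. The sketch also ignores the language model, which an actual search unit would presumably consult. `LatticeEntry` and `best_path` are hypothetical names.

```python
from collections import namedtuple

# A word covers frames [start, end); `score` is the accumulated log score
# of the best path from the utterance start through this word (assumption).
LatticeEntry = namedtuple("LatticeEntry", "word start end score")


def best_path(lattice, last_frame):
    """Trace back from `last_frame`, taking the best word ending at each frame."""
    by_end = {}
    for entry in lattice:
        current = by_end.get(entry.end)
        if current is None or entry.score > current.score:
            by_end[entry.end] = entry
    words = []
    frame = last_frame
    while frame > 0:
        entry = by_end.get(frame)
        # Give up if no word ends here or the entry would not move backwards.
        if entry is None or entry.start >= frame:
            return None
        words.append(entry.word)
        frame = entry.start
    return list(reversed(words))


# Hypothetical lattice over 100 frames.
lattice = [
    LatticeEntry("speech", 0, 40, -120.0),
    LatticeEntry("peach", 5, 40, -150.0),
    LatticeEntry("recognition", 40, 100, -300.0),
    LatticeEntry("wreck", 40, 60, -190.0),
    LatticeEntry("ignition", 60, 100, -330.0),
]
print(best_path(lattice, last_frame=100))  # ['speech', 'recognition']
```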

Current Assignee: Sharp Kabushiki Kaisha (Hon Hai Precision Industry Co., Ltd.)
Original Assignee: Sharp Kabushiki Kaisha (Hon Hai Precision Industry Co., Ltd.)
Inventors: Tsuruta, Akira
Application Number: US10/501,502
Publication Number: US 20050075876A1
US Class Current: 704/251
CPC Class Codes: G10L 15/187 Phonemic context, e.g. pron...
