Continuous speech recognition method and system using inter-word phonetic information

US 20040172247A1
Filed: 02/24/2004
Published: 09/02/2004
Est. Priority Date: 02/24/2003
Status: Active Grant

Abstract

A continuous speech recognition method and system are provided. The continuous speech recognition method includes: constructing a pronunciation dictionary database including at least one pronunciation representation for each word which is influenced by applying phonological rules, wherein the pronunciation representation for the coda of a first word or the pronunciation representation for the onset of a second word following the first word is additionally indexed with an identifier if it does not match the phonetic pronunciation of its spelling; forming inter-word phonetic information in matrix form by combination of all probable phonetic pairs, each of which is basically comprised of the coda of a first word and the onset of a second word following the first word, wherein the coda of the first word or the onset of the second word is indexed with an identifier if it undergoes phonological changes; and performing speech recognition on feature vectors extracted from an input speech signal with reference to the pronunciation dictionary database and the inter-word phonetic information.
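The two data structures named in the abstract can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration, not the patent's actual layout: the `Pronunciation` class, the boolean flags standing in for the "identifier", the use of the word-final and word-initial phone as stand-ins for coda and onset, and the romanized toy words. The one linguistic fact relied on is Korean nasalization of a final k before a nasal onset (guk + mul is pronounced gungmul).

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Pronunciation:
    """One pronunciation representation of a word (hypothetical layout)."""
    phones: tuple
    coda_changed: bool = False   # identifier: final phone differs from the spelling
    onset_changed: bool = False  # identifier: initial phone differs from the spelling

    @property
    def coda(self):
        # Simplification: the word-final phone stands in for the coda.
        return (self.phones[-1], self.coda_changed)

    @property
    def onset(self):
        # Simplification: the word-initial phone stands in for the onset.
        return (self.phones[0], self.onset_changed)


# Toy dictionary: "guk" has a second entry whose coda is nasalized (k -> ng)
# and therefore carries the identifier.
DICTIONARY = {
    "guk": [Pronunciation(("g", "u", "k")),
            Pronunciation(("g", "u", "ng"), coda_changed=True)],
    "mul": [Pronunciation(("m", "u", "l"))],
    "gil": [Pronunciation(("g", "i", "l"))],
}


def build_interword_matrix(dictionary, blocked_pairs):
    """Map every (coda, onset) pair in the dictionary to 1 (linkable) or 0."""
    prons = [p for entries in dictionary.values() for p in entries]
    codas = {p.coda for p in prons}
    onsets = {p.onset for p in prons}
    return {(c, o): int((c, o) not in blocked_pairs)
            for c, o in product(codas, onsets)}


K, NG_CHANGED = ("k", False), ("ng", True)
M, G = ("m", False), ("g", False)
# Hand-written for this toy only: unchanged k may not precede m, and the
# nasalized coda may not precede a non-nasal onset. Pairs not listed are left
# linkable for brevity, which is not a linguistic claim.
BLOCKED = {(K, M), (NG_CHANGED, G)}
MATRIX = build_interword_matrix(DICTIONARY, BLOCKED)
```

Keeping the identifier as part of the coda and onset keys lets the changed and unchanged forms of the same phone occupy separate rows of the matrix, which is what allows them to be treated differently at a word boundary.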

8 Claims

1. A continuous speech recognition method comprising:
(a) constructing a pronunciation dictionary database including at least one pronunciation representation for each word which is influenced by applying phonological rules, wherein the pronunciation representation for the coda of a first word or the pronunciation representation for the onset of a second word following the first word is additionally indexed with an identifier if it does not match the phonetic pronunciation of its spelling;

(b) forming inter-word phonetic information in matrix form by combination of a number of all probable phonetic pairs, each of which is basically comprised of the coda of a first word and the onset of a second word following the first word, wherein the coda of the first word or the onset of the second word is indexed with an identifier if they undergo phonological changes; and

(c) performing speech recognition on feature vectors extracted from an input speech signal with reference to the pronunciation dictionary database and the inter-word phonetic information.

2. The continuous speech recognition method of claim 1, wherein in step (c), a pronunciation representation for the coda of a first word and a pronunciation representation for the onset of a second word following the first word, which do not comply with the phonological rules, are constrained based on the inter-word phonetic information so as not to be linked to each other.

3. The continuous speech recognition method of claim 1, wherein more than one crossword information value is assigned to each phonetic pair in a matrix.

4. A computer readable medium having embodied thereon a computer program for the method according to claim 1.
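The linking constraint of claim 2, together with one possible reading of the multiple values of claim 3, can be sketched as follows. All names and notation are hypothetical: the trailing "*" stands in for the identifier, the `INTERWORD` table is hand-written for three romanized toy words, and the "adjacent" and "pause" contexts are an assumed interpretation of "more than one crossword information value", not something the claims spell out. A real decoder would apply the check while expanding word-boundary hypotheses during search rather than enumerating sequences as done here.

```python
from itertools import product

# Hypothetical notation: a trailing "*" marks a coda changed by a phonological rule.
DICTIONARY = {
    "guk": ["g u k", "g u ng*"],
    "mul": ["m u l"],
    "gil": ["g i l"],
}

# Hypothetical inter-word information: each (coda, onset) pair maps to the set
# of boundary contexts in which the two may be linked. An empty set or a
# missing pair means the pair must never be linked.
INTERWORD = {
    ("k", "m"): {"pause"},            # unchanged k before m only across a pause
    ("k", "g"): {"adjacent", "pause"},
    ("ng*", "m"): {"adjacent"},       # nasalized coda only directly before m
    ("ng*", "g"): set(),
    ("l", "m"): {"adjacent", "pause"},
    ("l", "g"): {"adjacent", "pause"},
}


def can_link(prev_pron, next_pron, context="adjacent"):
    """Check the coda of one pronunciation against the onset of the next."""
    coda = prev_pron.split()[-1]
    onset = next_pron.split()[0]
    return context in INTERWORD.get((coda, onset), set())


def allowed_sequences(words, context="adjacent"):
    """List the pronunciation sequences whose every word boundary may be linked."""
    results = []
    for combo in product(*(DICTIONARY[w] for w in words)):
        if all(can_link(a, b, context) for a, b in zip(combo, combo[1:])):
            results.append(combo)
    return results


print(allowed_sequences(["guk", "mul"]))           # only the nasalized variant survives
print(allowed_sequences(["guk", "mul"], "pause"))  # only the unchanged variant survives
```

Pruning non-compliant pairs in this way shrinks the set of pronunciation sequences the recognizer has to score, while the identifier keeps the changed variant from being matched in contexts where the rule could not have applied.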

5. A continuous speech recognition system including an acoustic model database and a language model database which are previously established through learning, the system comprising:
an inter-word phonetic information storing unit which stores inter-word phonetic information by combination of all probable phonemic pairs, each of which is basically comprised of a coda of last syllable of a first word and an onset of initial syllable of a second word following the first word, wherein the coda of the first word or the onset of the second word is indexed with an identifier if it does not match the phonetic pronunciation of its spelling due to phonological interaction between the first and second words;

a pronunciation dictionary database including at least one pronunciation representation for each word based on phonological rules, wherein the pronunciation representation for the coda of a first word or the pronunciation representation for the onset of a second word following the first word is additionally indexed with an identifier if it does not match the phonetic pronunciation of the spelling;

a feature extraction unit which extracts information that is useful for recognition from an input speech signal and converts the extracted information into feature vectors; and

a search unit which searches for the most likely word sequences from the feature vectors obtained in the feature extraction unit using the inter-word phonetic information and with reference to the acoustic model database, the pronunciation dictionary database, and the language model database, and outputs the most likely word sequences in text form as a recognition result.

6. The continuous speech recognition system of claim 5, wherein more than one crossword information value is assigned to each phonetic pair in a matrix.

7. The continuous speech recognition system of claim 5, wherein the search unit constrains linking between a pronunciation representation for the coda of a first word and a pronunciation representation for the onset of a second word following the first word, which do not comply with the phonological rules.

8. The continuous speech recognition system of claim 5, further comprising a post-processing unit which converts an intra-word biphone model into an inter-word triphone model, rescores the acoustic models for the most likely word sequences obtained in the search unit, recalculates the scores of candidate sentences, and selects the best candidate sentence as a recognition result.
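The post-processing step of claim 8 can be pictured roughly as below. This is a sketch under stated assumptions rather than the claimed unit: the "left-center+right" triphone labels, the "sil" padding at sentence edges, the candidate dictionaries, and the toy log-likelihood table are all hypothetical. It shows only the two ideas the claim names, namely relabeling phones with context that crosses word boundaries and reselecting the best candidate after rescoring.

```python
def interword_triphones(word_phones):
    """Label each phone as left-center+right, with context crossing word boundaries."""
    flat = [p for word in word_phones for p in word]
    padded = ["sil"] + flat + ["sil"]  # assumed silence context at sentence edges
    return [f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
            for i in range(1, len(padded) - 1)]


def rescore(candidates, acoustic_score, lm_weight=1.0):
    """Return the candidate with the highest recalculated score."""
    def total(candidate):
        triphones = interword_triphones(candidate["phones"])
        return acoustic_score(triphones) + lm_weight * candidate["lm_score"]
    return max(candidates, key=total)


# Toy stand-in for an acoustic model: log-likelihood per triphone, with a
# flat penalty for triphones that are not listed.
TOY_LOGLIK = {"u-ng+m": -1.0, "ng-m+u": -1.0}


def toy_acoustic_score(triphones):
    return sum(TOY_LOGLIK.get(t, -2.0) for t in triphones)


candidates = [
    {"text": "guk gil", "phones": [["g", "u", "k"], ["g", "i", "l"]], "lm_score": -3.5},
    {"text": "guk mul", "phones": [["g", "u", "ng"], ["m", "u", "l"]], "lm_score": -3.0},
]
best = rescore(candidates, toy_acoustic_score)
print(best["text"])
```

Because the triphone labels span the word boundary, the boundary phones are scored in the context they were actually spoken in, which is where the changed coda or onset matters most.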

Current Assignee
Samsung Electronics Co. Ltd.
Original Assignee
Samsung Electronics Co. Ltd.
Inventors
Choi, In-jeong; Kim, Nam-hoon; Yoon, Su-yeon

Granted Patent

US 7,299,178 B2
US Class Current

704/251
CPC Class Codes

G10L 15/02 Feature extraction for spee...

G10L 15/187 Phonemic context, e.g. pron...