Method and device for speech recognition

US 5,694,520 A
Filed: 01/11/1996
Issued: 12/02/1997
Est. Priority Date: 06/29/1994
Status: Expired due to Term

Abstract

A method and device for recognizing dialectal variations in a language. Incoming speech is subjected to a speech recognition procedure and, in parallel, its fundamental tone curve is extracted. The speech recognition produces an allophone string which, together with the fundamental tone curve, is used to detect the maximum and minimum values of the fundamental tone. The recognized speech is compared with a lexicon containing orthography and transcription in order to find suitable word candidates. The word candidates found are then analyzed with regard to syntax. The syntactic and lexical information obtained in this way is used to create a model of the speech. The fundamental tone curve of the model is compared with that of the speech: the maximum and minimum values of both curves are determined, and a difference between the model and the speech is obtained. This difference is then used to adjust the model so that it corresponds to the given speech. The model adjusted in this way is then used for the speech recognition, which increases the ability of a machine to understand the different dialects of a language.
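The extremum-detection step mentioned in the abstract can be illustrated with a minimal sketch. The record does not say how the maxima and minima are located, so the neighbor comparison below, the function name `find_f0_extrema`, and the sample values are all assumptions made for illustration only.

```python
from typing import List, Tuple

# (time in seconds, F0 in Hz, "max" or "min"); this tuple layout is an assumption.
Extremum = Tuple[float, float, str]


def find_f0_extrema(times: List[float], f0: List[float]) -> List[Extremum]:
    """Return the local maxima and minima of an F0 contour with their positions.

    A sample counts as an extremum when it is strictly higher (or lower) than
    both neighbors. Flat plateaus and the two end points are ignored here.
    """
    if len(times) != len(f0):
        raise ValueError("times and f0 must have the same length")
    extrema: List[Extremum] = []
    for i in range(1, len(f0) - 1):
        if f0[i - 1] < f0[i] > f0[i + 1]:
            extrema.append((times[i], f0[i], "max"))
        elif f0[i - 1] > f0[i] < f0[i + 1]:
            extrema.append((times[i], f0[i], "min"))
    return extrema


# Hypothetical contour sampled every 10 ms (values invented for the example).
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
f0 = [110.0, 125.0, 140.0, 130.0, 115.0, 120.0, 135.0]
print(find_f0_extrema(times, f0))
```

A real pitch track would normally be smoothed and restricted to voiced frames before a comparison like this is applied; those steps are left out of the sketch.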

20 Claims

1. A method for recognizing spoken language comprising the steps of:
- identifying a number of phonemes from a segment of input speech;
- interpreting the phonemes as possible word combinations to establish a model of the speech with word and sentence accents according to a standardized pattern;
- determining the fundamental tone curve of the input speech;
- determining the maximum and minimum values of the fundamental tone curve of the input speech and their respective positions;
- determining the maximum and minimum values of the fundamental tone curve of the speech model;
- comparing the fundamental tone curve of the input speech and the fundamental tone curve of the speech model to identify a time difference between the maximum and minimum values of the fundamental tone curve of the incoming speech in relation to the maximum and minimum values of the fundamental tone curve of the speech model;
- adjusting the intonation pattern of the speech model utilizing the identified time difference to modify the speech model to conform with the dialectal characteristics of the input speech.
- - 2. Method according to claim 1, characterized by the time difference being appointed in relation to a reference, preferably a CV-limit, where C indicates a consonant and V a vowel.
  - 3. Method according to claim 2, characterized by the fundamental tone outline being based on lexical and syntactical information, and information about orthography and phonetic transcription.
  - 4. Method according to claim 3, characterized by the speech on the basis of F0-events being classified in dialect categories depending on stored information descriptions.
  - 5. Method according to claim 2, characterized by the transcription containing a lexical abstract accent information type stressed syllable, tonal word accents, accent I and accent II as well as location of secondary accent, i.e., information normally provided by dictionary.
  - 6. Method according to claim 2, characterized by the speech on the basis of F0-events being classified in dialect categories depending on stored information descriptions.
  - 7. Method according to claim 1, characterized by the fundamental tone outline being based on lexical and syntactical information, and information about orthography and phonetic transcription.
  - 8. Method according to claim 7, characterized by the transcription containing a lexical abstract accent information type stressed syllable, tonal word accents, accent I and accent II as well as location of secondary accent, i.e., information normally provided by dictionary.
  - 9. Method according to claim 7, characterized by the speech on the basis of F0-events being classified in dialect categories depending on stored information descriptions.
  - 10. Method according to claim 1, characterized by the transcription containing a lexical abstract accent information type stressed syllable, tonal word accents, accent I and accent II as well as location of secondary accent, i.e., information normally provided by dictionary.
  - 11. Method according to claim 10, characterized by the speech on the basis of F0-events being classified in dialect categories depending on stored information descriptions.
  - 12. Method according to claim 1, characterized by the speech on the basis of F0-events being classified in dialect categories depending on stored information descriptions.
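The comparison and adjustment steps of claim 1, with the timing measured against a reference point as in claim 2, can be sketched as follows. The claims do not state how speech and model events are paired or how the model is modified, so this sketch assumes a one-to-one pairing in order of occurrence and a plain shift of each model event. The names `F0Event`, `timing_differences`, and `adjust_model` are hypothetical, as are the numbers in the usage lines.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class F0Event:
    kind: str      # "max" or "min"
    offset: float  # seconds relative to the reference (e.g. a CV boundary)


def timing_differences(speech: List[F0Event], model: List[F0Event]) -> List[float]:
    """Return speech offset minus model offset for each paired event.

    Assumes both lists hold the same sequence of event kinds in the same order.
    """
    if len(speech) != len(model):
        raise ValueError("speech and model must contain the same number of events")
    diffs: List[float] = []
    for s, m in zip(speech, model):
        if s.kind != m.kind:
            raise ValueError(f"event kinds do not match: {s.kind} vs {m.kind}")
        diffs.append(s.offset - m.offset)
    return diffs


def adjust_model(model: List[F0Event], diffs: List[float]) -> List[F0Event]:
    """Shift each model event by the measured difference (one possible adjustment)."""
    return [replace(m, offset=m.offset + d) for m, d in zip(model, diffs)]


# Hypothetical values: the speaker realizes both events later than the model does.
model = [F0Event("max", 0.02), F0Event("min", 0.10)]
speech = [F0Event("max", 0.06), F0Event("min", 0.15)]
diffs = timing_differences(speech, model)
adjusted = adjust_model(model, diffs)
print(diffs)
print(adjusted)
```

Expressing every event as an offset from the reference point keeps the differences comparable across words of different length, which is presumably why claim 2 ties the measurement to such a reference.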

13. A device for recognizing spoken language comprising:
- speech recognition means for identifying a number of phonemes from a segment of input speech;
- interpretation means for interpreting the phonemes as possible word combinations to establish a model of speech having word and sentence accents according to a standardized pattern;
- extraction means for extracting a fundamental tone curve of the input speech;
- first analyzing means for determining the maximum and minimum values of the fundamental tone curve and their respective positions;
- second analyzing means for determining the maximum and minimum values of the fundamental tone curve of the speech model and their respective positions;
- comparison means for comparing the input speech with the speech model to identify a time difference between the occurrence of the maximum and minimum values of the fundamental tone curve of the incoming speech in relation to the maximum and minimum values of the fundamental tone curve of the speech model;
- correction means for adjusting the intonation pattern of the speech model utilizing the identified time difference, to modify the speech model to conform with the dialectal characteristics of the input speech.
- - 14. Device according to claim 13, characterized by the model instrument being arranged to, from a lexicon with orthography and transcription, create a model of the speech.
  - 15. Device according to claim 14, characterized by the model instrument being arranged to analyze the lexical information regarding syntax at which the model is brought to correspond to one in the language standardized pattern.
  - 16. Device according to claim 15, characterized by the device being arranged to categorize the speech in different dialect categories on the basis of stored intonation descriptions in the device.
  - 17. Device according to claim 14, characterized by the device being arranged to categorize the speech in different dialect categories on the basis of stored intonation descriptions in the device.
  - 18. Device according to claim 13, characterized by the model instrument being arranged to analyze the lexical information regarding syntax at which the model is brought to correspond to one in the language standardized pattern.
  - 19. Device according to claim 18, characterized by the device being arranged to categorize the speech in different dialect categories on the basis of stored intonation descriptions in the device.
  - 20. Device according to claim 13, characterized by the device being arranged to categorize the speech in different dialect categories on the basis of stored intonation descriptions in the device.
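The dialect categorization recited in claims 16, 17, 19, and 20 could be approximated by matching observed event timings against stored intonation descriptions. The claims do not define the format of those descriptions or the matching rule. This sketch assumes each stored description is a list of typical event offsets and picks the category with the smallest mean squared difference. The function name `classify_dialect`, the labels `dialect_A` and `dialect_B`, and all numbers are invented for the example.

```python
from typing import Dict, List


def classify_dialect(observed: List[float], stored: Dict[str, List[float]]) -> str:
    """Return the stored category whose event offsets lie closest to the observed ones.

    Each list holds offsets in seconds of F0 events relative to a reference point.
    """
    if not stored:
        raise ValueError("no stored intonation descriptions")
    if not observed:
        raise ValueError("no observed offsets")

    def distance(reference: List[float]) -> float:
        if len(reference) != len(observed):
            raise ValueError("descriptions must have one offset per observed event")
        return sum((o - r) ** 2 for o, r in zip(observed, reference)) / len(observed)

    return min(stored, key=lambda name: distance(stored[name]))


# Hypothetical stored descriptions; the labels and timings are placeholders.
stored = {
    "dialect_A": [0.02, 0.10],
    "dialect_B": [0.07, 0.16],
}
print(classify_dialect([0.06, 0.15], stored))
```

A nearest-match rule is only one option. Any classifier that maps timing patterns to stored categories would fit the wording of the claims equally well.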

Current Assignee: Intellectual Ventures I LLC (Intellectual Ventures LLC)
Original Assignee: Telia AB (Government of Norway)
Inventors: Lyberg, Bertil
Primary Examiner(s): Tung, Kee M.
Application Number: US08/532,823
Time in Patent Office: 691 Days
Field of Search: 395/2.86, 395/2.6, 395/2.63, 395/2.64, 395/2.66, 395/2.09, 395/2.14, 395/2.16, 395/2.2
US Class Current: 704/254
CPC Class Codes:
G10L 15/07   to the speaker
G10L 15/1807   using prosody or stress
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/19   Grammatical context, e.g. d...
G10L 2015/025   Phonemes, fenemes or fenone...
G10L 25/90   Pitch determination of spee...
