Method for generating morphemes
Abstract
The invention concerns a method of generating morphemes for speech recognition and understanding. The method may include receiving training speech, selecting candidate sub-morphemes from the training speech, selecting salient sub-morphemes from the candidate sub-morphemes based on salience measurements, and clustering the salient sub-morphemes based on semantic and syntactic similarities into morphemes. The morphemes may be acoustic and/or non-acoustic. The sub-morphemes may represent any sub-unit of communication including phones, phone-phrases, grammars, diphones, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
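The first three steps described above (receiving training speech, selecting candidate sub-morphemes, and selecting salient ones) may be illustrated with a minimal sketch. The sketch rests on assumptions that are not taken from the text: the training speech is already decoded into phone sequences paired with a semantic label, candidates are phone n-grams observed in at least `min_count` utterances, and the salience measurement is the peak relative frequency of a single label among the utterances containing the phrase. The names `ngrams`, `select_candidates`, `select_salient`, and every threshold are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(phones, min_len, max_len):
    """Return the set of phone n-grams (as tuples) with min_len <= n <= max_len."""
    return {
        tuple(phones[i:i + n])
        for n in range(min_len, max_len + 1)
        for i in range(len(phones) - n + 1)
    }


def select_candidates(utterances, min_len=2, max_len=3, min_count=2):
    """Keep phone-phrases observed in at least min_count utterances (assumed filter)."""
    counts = Counter()
    for phones, _label in utterances:
        counts.update(ngrams(phones, min_len, max_len))
    return {phrase for phrase, count in counts.items() if count >= min_count}


def select_salient(utterances, candidates, min_len=2, max_len=3, threshold=0.8):
    """Keep candidates whose most frequent label reaches the assumed salience threshold."""
    label_counts = defaultdict(Counter)
    for phones, label in utterances:
        for phrase in ngrams(phones, min_len, max_len) & candidates:
            label_counts[phrase][label] += 1
    salient = {}
    for phrase, per_label in label_counts.items():
        best_label, best_count = per_label.most_common(1)[0]
        score = best_count / sum(per_label.values())
        if score >= threshold:
            salient[phrase] = (best_label, score)
    return salient


# Hypothetical training data: (phone sequence, semantic label) pairs.
utterances = [
    (["k", "ax", "l", "eh", "k", "t"], "collect"),
    (["ax", "k", "ax", "l", "eh", "k", "t", "k", "ao", "l"], "collect"),
    (["k", "r", "eh", "d", "ih", "t"], "credit"),
    (["m", "ay", "k", "r", "eh", "d", "ih", "t"], "credit"),
    (["k", "ao", "l"], "other"),
]
candidates = select_candidates(utterances)
salient = select_salient(utterances, candidates)
```

In this toy data a phrase that co-occurs with only one label is retained, while a phrase spread evenly over two labels falls below the threshold and is discarded. A statistical test for significance could replace the fixed threshold.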
Claims
1. A method of generating acoustic morphemes, comprising:
receiving training speech;
selecting candidate phone-phrases from the training speech;
selecting salient phone-phrases from the candidate phone-phrases based on salience measurements;
clustering the salient phone-phrases based on semantic and syntactic similarities into acoustic morphemes.
2. The method of claim 1, further comprising:
storing the acoustic morphemes in a database.
3. The method of claim 2, wherein the acoustic morpheme database is used by a speech recognition and understanding system.
4. The method of claim 1, wherein the step of selecting candidate phone-phrases includes:
filtering the training speech;
selecting all observed phone sequences of a predetermined length; and
selecting as candidate phone-phrases the phone sequences that are of at least the predetermined length.
5. The method of claim 1, wherein the training speech includes at least one of verbal and non-verbal speech.
6. The method of claim 5, wherein the non-verbal speech includes the use of at least one of gestures, body movements, head movements, non-responses, text, keyboard entries, keypad entries, mouse clicks, DTMF codes, pointers, stylus, cable set-top box entries, graphical user interface entries and touchscreen entries.
7. The method of claim 1, wherein the training speech includes multimodal forms.
8. The method of claim 1, wherein the training speech is untranscribed.
9. The method of claim 1, wherein the training speech is transcribed.
10. The method of claim 1, wherein the salient phone-phrases are selected using a test for significance.
11. The method of claim 1, wherein the salient phone-phrases are clustered into acoustic morphemes using a distortion measure between the salient phone-phrases.
12. The method of claim 11, wherein the distortion measure is based on at least one of string distortion, semantic distortion and syntactic distortion.
13. A method of generating morphemes, comprising:
receiving training speech;
selecting candidate sub-morphemes from the training speech;
selecting salient sub-morphemes from the candidate sub-morphemes based on salience measurements;
clustering the salient sub-morphemes based on semantic and syntactic similarities into morphemes.
14. The method of claim 13, wherein the morphemes are at least one of acoustic and non-acoustic.
15. The method of claim 13, further comprising:
storing the morphemes in a database.
16. The method of claim 15, wherein the morpheme database is used by a speech recognition and understanding system.
17. The method of claim 13, wherein the step of selecting candidate sub-morphemes includes:
filtering the training speech;
selecting all observed sub-morpheme sequences of a predetermined length; and
selecting as candidate sub-morphemes the sub-morpheme sequences that are of at least the predetermined length.
18. The method of claim 13, wherein the training speech includes at least one of verbal and non-verbal speech.
19. The method of claim 18, wherein the non-verbal speech includes the use of at least one of gestures, body movements, head movements, non-responses, text, keyboard entries, keypad entries, mouse clicks, DTMF codes, pointers, stylus, cable set-top box entries, graphical user interface entries and touchscreen entries.
20. The method of claim 13, wherein the training speech includes multimodal forms.
21. The method of claim 13, wherein the training speech is untranscribed.
22. The method of claim 13, wherein the training speech is transcribed.
23. The method of claim 13, wherein the salient sub-morphemes are selected using a test for significance.
24. The method of claim 13, wherein the salient sub-morphemes are clustered into morphemes using a distortion measure between the salient sub-morphemes.
25. The method of claim 24, wherein the distortion measure is based on at least one of string distortion, semantic distortion and syntactic distortion.
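By way of illustration only, the clustering by distortion measure recited in claims 11, 12, 24 and 25 may be sketched as follows. The claims do not define the distortion measure, so everything here is an assumption: string distortion is taken as length-normalized Levenshtein edit distance between phone sequences, semantic distortion as 0 or 1 depending on whether two phrases share an associated label, the two are combined with hypothetical weights, and syntactic distortion is omitted. The grouping step is a simple greedy procedure, not a method described in the claims.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def distortion(p, q, labels, w_string=0.5, w_semantic=0.5):
    """Weighted sum of assumed string and semantic distortions, in the range 0..1."""
    string_d = edit_distance(p, q) / max(len(p), len(q))
    semantic_d = 0.0 if labels[p] == labels[q] else 1.0
    return w_string * string_d + w_semantic * semantic_d


def cluster(phrases, labels, max_distortion=0.3):
    """Greedy grouping: a phrase joins the first cluster whose members are all close."""
    clusters = []
    for p in sorted(phrases):
        for members in clusters:
            if all(distortion(p, q, labels) <= max_distortion for q in members):
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters


# Hypothetical salient phone-phrases mapped to their associated semantic label.
labels = {
    ("k", "ax", "l", "eh"): "collect",
    ("k", "ax", "l", "eh", "k"): "collect",
    ("k", "r", "eh", "d"): "credit",
}
morphemes = cluster(labels.keys(), labels)
```

Each resulting cluster stands in for one morpheme. The weights and the `max_distortion` threshold control how aggressively variant phrases are merged and would need tuning on real data.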
Specification