METHOD AND APPARATUS FOR SPEECH RECOGNITION AND GENERATION OF SPEECH RECOGNITION ENGINE

US 20160027437A1
Filed: 02/05/2015
Published: 01/28/2016
Est. Priority Date: 07/28/2014
Status: Active Grant

First Claim

1. A method of speech recognition, the method comprising:

receiving a speech input;

transmitting the speech input to a speech recognition engine; and

receiving a speech recognition result from the speech recognition engine, wherein the speech recognition engine is configured to obtain a phoneme sequence from the speech input and provide the speech recognition result based on a phonetic distance of the phoneme sequence.

Abstract

A method and apparatus for speech recognition and for generation of a speech recognition engine, and a speech recognition engine, are provided. The method of speech recognition involves receiving a speech input, transmitting the speech input to a speech recognition engine, and receiving a speech recognition result from the speech recognition engine, in which the speech recognition engine obtains a phoneme sequence from the speech input and provides the speech recognition result based on a phonetic distance of the phoneme sequence.

20 Claims

1. A method of speech recognition, the method comprising:
- receiving a speech input;
- transmitting the speech input to a speech recognition engine; and
- receiving a speech recognition result from the speech recognition engine, wherein the speech recognition engine is configured to obtain a phoneme sequence from the speech input and provide the speech recognition result based on a phonetic distance of the phoneme sequence.
- Dependent claims (2, 3, 4):
  - 2. The method of claim 1, wherein the speech recognition engine is configured to provide the speech recognition result based on a phoneme sequence mapped to an embedding vector closest in the phonetic distance to the obtained phoneme sequence among embedding vectors arranged on an N-dimensional embedding space.
  - 3. The method of claim 1, wherein the speech recognition engine comprises an inter-word distance matrix indicating phonetic distances between words determined based on phonetic similarities between phoneme sequences of the words.
  - 4. The method of claim 1, wherein the speech recognition engine comprises embedding vectors arranged on an N-dimensional embedding space obtained by applying a multidimensional scaling method to an inter-word distance matrix.
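
The inter-word distance matrix of claim 3 could be sketched as follows. This is a minimal illustration, not the patented method: it assumes a weighted edit distance over phoneme sequences, and the names `SUB_PROB`, `phoneme_cost`, `phonetic_distance`, `distance_matrix`, and `LEXICON`, along with every probability value and the tiny lexicon, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical substitution probabilities between phoneme pairs.
# The values are illustrative assumptions, not taken from the patent.
SUB_PROB: Dict[Tuple[str, str], float] = {
    ("p", "b"): 0.7,
    ("t", "d"): 0.7,
    ("k", "g"): 0.7,
}


def phoneme_cost(a: str, b: str) -> float:
    """Cost of substituting phoneme a with b: high probability, low cost."""
    if a == b:
        return 0.0
    prob = SUB_PROB.get((a, b), SUB_PROB.get((b, a), 0.0))
    return 1.0 - prob


def phonetic_distance(x: Sequence[str], y: Sequence[str]) -> float:
    """Weighted edit distance between two phoneme sequences."""
    prev = [float(j) for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [float(i)]
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1.0,                     # delete a
                curr[j - 1] + 1.0,                 # insert b
                prev[j - 1] + phoneme_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


def distance_matrix(
    lexicon: Dict[str, List[str]],
) -> Tuple[List[str], List[List[float]]]:
    """Return the sorted word list and the pairwise phonetic distances."""
    words = sorted(lexicon)
    matrix = [
        [phonetic_distance(lexicon[u], lexicon[v]) for v in words]
        for u in words
    ]
    return words, matrix


# Hypothetical four-word lexicon with hand-written phoneme sequences.
LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "bad": ["b", "ae", "d"],
    "cat": ["k", "ae", "t"],
}
WORDS, D = distance_matrix(LEXICON)
```

In this sketch, words that differ only by an easily confused phoneme pair (such as "pat" and "bat") end up closer together than words that differ by an unrelated phoneme (such as "pat" and "cat").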

5. A method of generating a speech recognition engine, the method comprising:
- obtaining phoneme sequences of words;
- determining phonetic similarities between the phoneme sequences by comparing phonemes comprised in the phoneme sequences;
- calculating phonetic distances between the words based on the determined phonetic similarities between the phoneme sequences; and
- generating embedding vectors based on the calculated phonetic distances between the words.
- Dependent claims (6, 7, 8, 9, 10):
  - 6. The method of claim 5, wherein the calculating comprises assigning values to phonetic distances such that, when a phonetic similarity between phoneme sequences is large, a phonetic distance between words corresponding to the phoneme sequences is small.
  - 7. The method of claim 5, wherein the determining comprises:
    - calculating a substitution probability between phonemes comprised in the phoneme sequences; and
    - determining the phonetic similarities between the phoneme sequences to be high when the calculated substitution probability between the phonemes is high.
  - 8. The method of claim 5, wherein the generating comprises generating an embedding vector by applying a multidimensional scaling method to an inter-word distance matrix indicating phonetic distances between the words.
  - 9. The method of claim 5, wherein the calculating comprises calculating a phonetic distance between words using a calculating method based on a phonetic distance between phonemes obtained by comparing the phonemes comprised in the phoneme sequences.
  - 10. The method of claim 5, wherein the generating comprises:
    - predicting a word using the embedding vectors generated by applying a multidimensional scaling method to the inter-word distance matrix.
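
The multidimensional scaling step named in claims 8 and 10 could look roughly like the sketch below. The claims do not say which MDS variant is used, so this assumes classical (Torgerson) MDS, computed with power iteration and deflation so that only the standard library is needed. The function names and the three-point example are hypothetical. The sketch also assumes the distances are close to Euclidean, so that the leading eigenvalues are positive; non-positive eigenvalues are simply left as zero coordinates.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def double_center(dist: Matrix) -> Matrix:
    """Turn a symmetric distance matrix into the Gram matrix B = -1/2 J D^2 J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
         for j in range(n)]
        for i in range(n)
    ]


def power_iteration(b: Matrix, iterations: int = 1000) -> Tuple[float, List[float]]:
    """Approximate the dominant eigenpair of a symmetric matrix."""
    n = len(b)
    v = [1.0 + 0.1 * i for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * bv[i] for i in range(n)), v


def classical_mds(dist: Matrix, dims: int) -> Matrix:
    """Embed n items in `dims` dimensions from their pairwise distances."""
    b = double_center(dist)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        eigenvalue, v = power_iteration(b)
        if eigenvalue <= 1e-12:
            break  # remaining axes stay at zero
        scale = math.sqrt(eigenvalue)
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate: remove the component just extracted.
        b = [[b[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords


def euclidean(p: List[float], q: List[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical distances of a 3-4-5 right triangle, which embed exactly in 2-D.
EXAMPLE = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
POINTS = classical_mds(EXAMPLE, dims=2)
```

For distances that embed exactly, as in the triangle example, the pairwise Euclidean distances between the returned points reproduce the input matrix up to numerical error. A production system would more likely rely on a linear-algebra library for the eigendecomposition.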

11. A method of speech recognition, the method comprising:
- receiving a speech input;
- obtaining a phoneme sequence from the speech input;
- selecting an embedding vector closest in a phonetic distance to the phoneme sequence among embedding vectors arranged on an N-dimensional embedding space; and
- outputting a speech recognition result based on a phoneme sequence mapped to the selected embedding vector.
- Dependent claims (12, 13):
  - 12. The method of claim 11, wherein the embedding vectors arranged on the N-dimensional embedding space are generated based on phonetic distances between words determined based on phonetic similarities between phoneme sequences of the words.
  - 13. The method of claim 11, wherein the embedding vectors are generated by applying a multidimensional scaling method to an inter-word distance matrix indicating phonetic distances between words.
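
The selection step of claim 11 can be illustrated with a nearest-neighbour lookup. The sketch assumes the input phoneme sequence has already been mapped to a query point in the embedding space (the claims do not specify how), and that closeness is measured with Euclidean distance. `EMBEDDINGS`, its coordinates, and `select_nearest` are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical 2-dimensional embedding vectors keyed by word.
# A real engine would generate these from phonetic distances.
EMBEDDINGS: Dict[str, Tuple[float, float]] = {
    "bat": (0.0, 0.0),
    "pat": (0.3, 0.0),
    "cat": (1.0, 0.2),
}


def select_nearest(
    query: Sequence[float],
    embeddings: Dict[str, Tuple[float, float]],
) -> Tuple[str, float]:
    """Return the word whose embedding vector is closest to the query point."""
    best = min(embeddings, key=lambda word: math.dist(query, embeddings[word]))
    return best, math.dist(query, embeddings[best])


# Assumed query point for a noisy utterance that lies near "pat".
word, gap = select_nearest((0.25, 0.05), EMBEDDINGS)
```

A linear scan is enough for a small vocabulary; a larger one would probably call for a spatial index, which is outside the scope of this sketch.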

14. An apparatus comprising:
- a microphone configured to receive a speech input;
- a phoneme sequence processor configured to obtain a phoneme sequence from the speech input; and
- a speech recognition engine configured to generate a speech recognition result based on a phonetic distance of the phoneme sequence.
- Dependent claims (15, 16, 17):
  - 15. The apparatus of claim 14, further comprising a command recognition unit configured to provide a speech command interface based on the speech recognition result.
  - 16. The apparatus of claim 14, wherein the speech recognition engine comprises an inter-word distance matrix stored in a memory.
  - 17. The apparatus of claim 14, wherein the speech recognition engine comprises an embedded vector processor configured to select an embedding vector corresponding to the phoneme sequence among embedding vectors arranged on an embedding space.

18. A speech recognition engine, comprising:
- an embedded vector processor configured to select an embedding vector corresponding to a phoneme sequence among embedding vectors arranged on an embedding space; and
- a speech recognition result synthesizer configured to recognize a word in the speech input based on the selected embedding vector.
- Dependent claims (19, 20):
  - 19. The speech recognition engine of claim 18, further comprising an inter-word distance matrix stored in a memory.
  - 20. The speech recognition engine of claim 18, further comprising a phoneme sequence processor configured to parse a speech input to obtain the phoneme sequence.

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: HONG, Seok Jin; CHOI, Young Sang; YOO, Sang Hyun; CHOI, Hee Youl

Granted Patent: US 9,779,730 B2
US Class Current: 1/1
CPC Class Codes: G10L 15/187 Phonemic context, e.g. pron...
