Speech recognition using electronic device and server

US 9,640,183 B2
Filed: 04/07/2015
Issued: 05/02/2017
Est. Priority Date: 04/07/2014
Status: Active Grant

Abstract

An electronic device is provided. The electronic device includes a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory, and a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server. The electronic device may perform different operations according to a confidence score of a result of the ASR. Various other embodiments that can be understood from the specification are also possible.

20 Claims

1. An electronic device comprising:
- a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory; and
  
  a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server,
  
  wherein the processor is further configured to:
  
  perform an operation corresponding to a result of the ASR if a confidence score of the result of the ASR is higher than a first threshold value,
  
  perform the speech instruction, which is received from the server, if the confidence score is between the first threshold value and a second threshold value, and
  
  decrease the first threshold value if the result of the ASR corresponds to the speech instruction that is received from the server.
- - 2. The electronic device of claim 1, wherein the processor is further configured to provide a feedback if the confidence score of the result of the ASR is lower than the second threshold value.
  - 3. The electronic device of claim 1, wherein, if the confidence score is higher than the first threshold value, the processor is configured to perform the operation regardless of receipt of the speech instruction from the server.
  - 4. The electronic device of claim 3, wherein the performing of the operation includes performing at least one function executable by the processor, at least one application, or at least one input based on the result of the ASR.
  - 5. The electronic device of claim 1, wherein the providing of the feedback comprises providing a message or audio output to indicate that the speech input has not been recognized or there is low confidence in the result of the ASR.
  - 6. The electronic device of claim 1, wherein the speech instruction received from the server corresponds to a result of speech recognition to the provided speech input, which is performed at the server, based on a speech recognition model different from the speech recognition model stored in the memory.
  - 7. The electronic device of claim 6, wherein the speech recognition performed in the server is configured to comprise natural language processing (NLP).
  - 8. The electronic device of claim 1, wherein the processor is further configured to:
    - provide an audio signal, in which a pre-processing is applied to the speech input, to an ASR engine performing the ASR; and
      
      provide the speech input itself to the server through the communication module.
  - 9. The electronic device of claim 1, wherein the processor is further configured to:
    - if the confidence score is higher than the first threshold value, compare the result of the ASR to the speech instruction received from the server; and
      
      change the first threshold value based on a result of the comparison.
  - 10. The electronic device of claim 1, wherein the processor is further configured to increase the first threshold value if the result of the ASR does not correspond to the speech instruction.
  - 11. The electronic device of claim 1, wherein the processor is further configured to:
    - if the confidence score is lower than the first threshold value, compare the result of the ASR to the speech instruction received from the server; and
      
      update the speech recognition model based on a result of the comparison.
  - 12. The electronic device of claim 11, wherein the communication module is further configured to receive a confidence score with the speech instruction from the server, wherein the processor is further configured to add the speech instruction and the confidence score of the speech instruction, for the speech input, to the speech recognition model.
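
The threshold logic recited in claims 1, 2, and 10 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `HybridRecognizer`, `route`, and `adapt`, the starting thresholds (0.8 and 0.4), the step size, the handling of scores that fall exactly on a threshold, and the reading of "corresponds to" as simple equality are all hypothetical choices that the claims do not specify.

```python
from dataclasses import dataclass


@dataclass
class HybridRecognizer:
    """Hypothetical holder for the two thresholds named in claim 1."""
    first_threshold: float = 0.8   # assumed starting value
    second_threshold: float = 0.4  # assumed starting value
    step: float = 0.05             # assumed adjustment size

    def route(self, confidence: float) -> str:
        """Choose which result to act on for a given on-device ASR confidence."""
        if confidence > self.first_threshold:
            return "local"      # perform the operation from the on-device ASR result
        if confidence > self.second_threshold:
            return "server"     # perform the speech instruction received from the server
        return "feedback"       # claim 2: report that recognition failed or is unreliable

    def adapt(self, local_result: str, server_instruction: str) -> None:
        """Lower the first threshold on agreement, raise it on disagreement."""
        if local_result == server_instruction:   # assumed meaning of "corresponds to"
            self.first_threshold -= self.step    # claim 1
        else:
            self.first_threshold += self.step    # claim 10
        # Assumption: keep the first threshold above the second and at most 1.0.
        floor = self.second_threshold + self.step
        self.first_threshold = min(1.0, max(floor, self.first_threshold))
```

Under these assumptions, each agreement between the device and the server widens the range of scores the device handles on its own, and each disagreement narrows it again.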

13. A method of executing speech recognition in an electronic device, the method comprising:
- obtaining a speech input from a user;
  
  generating a speech signal corresponding to the obtained speech;
  
  performing first speech recognition on at least a part of the speech signal;
  
  acquiring first operation information and a first confidence score;
  
  transmitting at least a part of the speech signal to a server for second speech recognition;
  
  receiving second operation information, which corresponds to the transmitted signal, from the server;
  
  performing a function corresponding to the first operation information if the first confidence score is higher than a first threshold value;
  
  performing a function corresponding to the second operation information if the first confidence score is between the first threshold value and a second threshold value; and
  
  decreasing the first threshold value if the function corresponding to the first operation information is identical to the function corresponding to the second operation information.
- - 14. The method of claim 13, wherein, if the first confidence score is higher than the first threshold value, the performing of the function corresponding to the first operation information is performed before the receiving of the second operation information.
  - 15. The method of claim 13, further comprising:
    - providing a feedback if the first confidence score is lower than the second threshold value.
  - 16. The method of claim 13, further comprising:
    - increasing the first threshold if the function corresponding to the first operation information is different from the function corresponding to the second operation information.
  - 17. The method of claim 13, wherein the receiving of the second operation information further includes receiving a second confidence score of the second operation information.
  - 18. The method of claim 17, further comprising:
    - if the first confidence score is lower than the first threshold value, adding the second operation information and the second confidence score, for the speech input, to a speech recognition model that is used in the first speech recognition.
  - 19. The method of claim 17, further comprising:
    - if the first operation information corresponds to the second operation information, adding the second operation information and the second confidence score to a speech recognition model, which is used in the first speech recognition, based on the first confidence score and second confidence score.
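
The ordering described in claims 13 to 15 and 18 (send the signal to the server, recognize locally in the meantime, act early on a confident local result, otherwise fall back to the server result and record it) might look something like the sketch below. Everything named here is hypothetical: `recognize_locally`, `recognize_on_server`, and `handle_speech` are stand-ins, the "speech recognition model" is reduced to a plain dictionary purely for illustration, and the threshold values are assumed. The threshold adjustment step of claim 13 is left out to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_locally(signal, model):
    """Stand-in for on-device recognition: returns (operation, confidence)."""
    return model.get(signal, (None, 0.0))


def recognize_on_server(signal):
    """Stand-in for the server round trip: returns (operation, confidence)."""
    return ("open_camera", 0.95)


def handle_speech(signal, model, first_threshold=0.8, second_threshold=0.4):
    """Return the list of actions performed for one speech signal."""
    performed = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Transmit the signal to the server without waiting for the reply.
        pending = pool.submit(recognize_on_server, signal)

        first_op, first_score = recognize_locally(signal, model)
        if first_score > first_threshold:
            # Claim 14: act on the local result before the server reply is read.
            performed.append(first_op)

        second_op, second_score = pending.result()  # receive the server result

    if first_score <= first_threshold:
        if first_score > second_threshold:
            performed.append(second_op)    # middle band: use the server result
        else:
            performed.append("feedback")   # claim 15: low confidence, notify the user
        # Claim 18: store the server result and its score for this input.
        model[signal] = (second_op, second_score)
    return performed
```

With an empty dictionary as the model, a first call for an unknown signal yields feedback and stores the server result; a second call for the same signal is then handled locally.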

20. A non-transitory computer readable recording medium having instructions recorded thereon, the instructions implement a method of executing speech recognition in an electronic device, the method comprising:
- obtaining a speech input from a user;
  
  generating a speech signal corresponding to the obtained speech;
  
  performing first speech recognition on at least a part of the speech signal;
  
  acquiring first operation information and a first confidence score;
  
  transmitting at least a part of the speech signal to a server for second speech recognition;
  
  receiving second operation information, which corresponds to the transmitted signal, from the server;
  
  performing a function corresponding to the first operation information if the first confidence score is higher than a first threshold value;
  
  performing a function corresponding to the second operation information if the first confidence score is between the first threshold value and a second threshold value; and
  
  decreasing the first threshold value if the function corresponding to the first operation information is identical to the function corresponding to the second operation information.

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Jung, Seok Yeong; Kim, Kyung Tae
Primary Examiner(s): Singh, Satwant
Application Number: US14/680,444
Publication Number: US 20150287413A1
Time in Patent Office: 756 Days
Field of Search: None

CPC Class Codes
G10L 15/01   Assessment or evaluation of...
G10L 15/08   Speech classification or se...
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
G10L 17/22   Interactive procedures; Man...
G10L 2015/223   Execution procedure of a sp...
G10L 2015/225   Feedback of the input speech
