METHODS AND APPARATUS FOR REDUCING LATENCY IN SPEECH RECOGNITION APPLICATIONS
First Claim
1. A computing device including a speech-enabled application installed thereon, the computing device comprising:
an input interface configured to receive first audio comprising speech from a user of the computing device;
an automatic speech recognition (ASR) engine configured to:
detect based, at least in part, on a threshold time for endpointing, an end of speech in the first audio; and
generate a first ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech; and
at least one processor programmed to:
determine whether a valid action can be performed by the speech-enabled application using the first ASR result; and
instruct the ASR engine to process second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the first ASR result.
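The claim recites endpointing against a threshold time but does not say how the end of speech is located. As an illustration only, one common approach is to declare an end of speech once trailing silence has lasted at least the threshold time. The sketch below assumes per-frame energy values and a fixed energy floor; the function name, parameters, and default values are hypothetical and are not taken from the patent.

```python
from typing import Optional, Sequence


def detect_end_of_speech(
    frame_energies: Sequence[float],
    frame_ms: int = 10,          # assumed frame length
    threshold_ms: int = 300,     # assumed threshold time for endpointing
    energy_floor: float = 0.01,  # assumed silence/speech boundary
) -> Optional[int]:
    """Return the index of the first silent frame after speech, or None.

    An end of speech is reported only after speech has been heard and the
    silence that follows has lasted at least threshold_ms.
    """
    silent_ms = 0
    heard_speech = False
    for i, energy in enumerate(frame_energies):
        if energy >= energy_floor:
            # Speech frame: reset the trailing-silence timer.
            heard_speech = True
            silent_ms = 0
        elif heard_speech:
            silent_ms += frame_ms
            if silent_ms >= threshold_ms:
                # Index where the current run of silence began.
                return i - silent_ms // frame_ms + 1
    return None
```

A shorter threshold returns a result sooner but risks cutting the user off mid-utterance, which is the situation the claimed validity check is meant to catch.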
Abstract
Methods and apparatus for reducing latency in speech recognition applications. The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
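The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each audio segment is assumed to end at a detected end of speech, the second audio is assumed to be appended to the first before recognition is retried (the abstract does not say how the two are combined), and `fake_recognize` and `needs_contact` are hypothetical stand-ins for the ASR engine and the application's validity check.

```python
from typing import Callable, Iterable, Optional


def recognize_until_actionable(
    audio_segments: Iterable[bytes],
    recognize: Callable[[bytes], str],
    can_act: Callable[[str], bool],
) -> Optional[str]:
    """Return the first ASR result the application can act on, or None."""
    buffered = b""
    for segment in audio_segments:
        # Assumption: later audio is appended to what was already received.
        buffered += segment
        result = recognize(buffered)
        if can_act(result):
            return result
        # No valid action is possible yet, so keep processing further audio.
    return None


def fake_recognize(audio: bytes) -> str:
    # Hypothetical stand-in for an ASR engine: the "audio" is plain text.
    return audio.decode("utf-8").strip()


def needs_contact(text: str) -> bool:
    # Hypothetical validity check: "call" is actionable only with a name.
    words = text.split()
    return len(words) >= 2 and words[0] == "call"


if __name__ == "__main__":
    print(recognize_until_actionable([b"call ", b"mom"], fake_recognize, needs_contact))
```

In this sketch a short pause after "call" produces a first result that the application cannot act on, so the loop continues with the next segment instead of returning an incomplete command.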
20 Claims
1. A computing device including a speech-enabled application installed thereon, the computing device comprising:
an input interface configured to receive first audio comprising speech from a user of the computing device;
an automatic speech recognition (ASR) engine configured to:
detect based, at least in part, on a threshold time for endpointing, an end of speech in the first audio; and
generate a first ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech; and
at least one processor programmed to:
determine whether a valid action can be performed by the speech-enabled application using the first ASR result; and
instruct the ASR engine to process second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the first ASR result.
Dependent claims: 2-18.
19. A method, comprising:
receiving, by an input interface of a computing device, first audio comprising speech from a user of the computing device;
detecting, by an automatic speech recognition (ASR) engine of the computing device, an end of speech in the first audio;
generating, by the ASR engine, an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech;
determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result; and
instructing the ASR engine to process second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
20. A computer-readable storage medium encoded with a plurality of instructions that, when executed by a computing device, performs a method, the method comprising:
receiving first audio comprising speech from a user of the computing device;
detecting an end of speech in the first audio;
generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech;
determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result; and
processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
Specification