Techniques to provide a standard interface to a speech recognition platform
Abstract
Techniques and systems to provide speech recognition services over a network using a standard interface are described. In an embodiment, a technique includes accepting a speech recognition request that includes at least audio input, via an application program interface (API). The speech recognition request may also include additional parameters. The technique further includes performing speech recognition on the audio according to the request and any specified parameters; and returning a speech recognition result as a hypertext protocol (HTTP) response. Other embodiments are described and claimed.
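As an illustration of the request and response shapes the abstract describes, the following is a minimal sketch, not the patented implementation. The endpoint URL, the query parameter names (`grammar`, `grammarWeight`, `silenceMs`), and the XML element and attribute names other than a status attribute are assumptions made for this example; the source does not specify them.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; the source does not name one.
BASE_URL = "https://speech.example.com/recognize"


def build_request_url(grammar_uri, grammar_weight, silence_ms):
    """Encode the optional recognition parameters as a query string.

    The parameter names are assumptions chosen for illustration only.
    """
    params = {
        "grammar": grammar_uri,           # URI link to a grammar
        "grammarWeight": grammar_weight,  # weight applied to that grammar
        "silenceMs": silence_ms,          # length of silence to observe
    }
    return BASE_URL + "?" + urlencode(params)


def parse_response(xml_text):
    """Return (status, [(text, confidence), ...]) from an XML result document."""
    root = ET.fromstring(xml_text)
    status = root.get("status")  # overall success or failure
    results = [
        (node.text, float(node.get("confidence", "0")))
        for node in root.findall("result")
    ]
    return status, results


# Hypothetical response body, shaped only to show a status attribute
# alongside several recognition results.
SAMPLE_XML = """<recognition status="success">
  <result confidence="0.92">call home</result>
  <result confidence="0.41">call Rome</result>
</recognition>"""

url = build_request_url("https://example.com/grammars/dial.grxml", 0.8, 500)
status, results = parse_response(SAMPLE_XML)
print(url)
print(status, results)
```

In a real client the audio input would travel in the HTTP request body; that step is omitted here so the sketch stays runnable without a network connection.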
19 Claims
1. A computer-implemented method, comprising:
- accepting a speech recognition request via an application program interface (API), the request comprising an audio input and parameters including a uniform resource identifier (URI) link to a length of silence to observe, a grammar, and a grammar weight;
- performing speech recognition on the audio input according to the request; and
- upon observing the length of silence in the audio input, returning a plurality of speech recognition results based on the request as hypertext protocol (HTTP) responses comprising an extensible markup language (XML) document formatted in a format that includes a status attribute indicating an overall success or failure of speech recognition on the audio input, wherein at least one of the plurality of speech recognition results and the status attribute are returned prior to performing speech recognition on all of the audio input in the request.
Dependent claims: 2, 3, 4, 5, 6.
7. An article comprising a storage memory unit containing instructions that when executed enable a system to:
- provide an application program interface (API) operative to receive from a computing device a speech recognition request comprising an audio input and at least two parameters including a uniform resource identifier (URI) link to a grammar, a grammar weight, and a length of silence to observe in the audio input; and
- return, from a speech recognition engine to the computing device, a first speech recognition result associated with the streamed speech recognition request as a first hypertext protocol (HTTP) response after the length of silence is observed and prior to returning a second speech recognition result associated with the streamed speech recognition request as a second HTTP response, the first and second HTTP responses comprising an extensible markup language (XML) document formatted in a format that includes a status flag indicating an overall success or failure of the speech recognition result for the audio input.
Dependent claims: 8, 9, 10, 11, 12, 13.
14. An apparatus, comprising:
- a processor;
- an application program interface (API) operative to provide an interface for building a speech recognition request comprising audio input and parameters including a uniform resource identifier (URI) link to a grammar, a grammar weight, and a specified duration of the audio input;
- a speech recognition (SR) component operative on the processor to receive the speech recognition request, to convert a first portion of the audio input to a first recognition result associated with the speech recognition request, to return the first recognition result after silence is observed for the specified duration of the audio input, to convert a second portion of the audio input to a second recognition result associated with the speech recognition request, and to return the second recognition result, wherein the first and second recognition results are returned as a hypertext protocol (HTTP) responses comprising an extensible markup language (XML) document formatted in a format that includes a status flag indicating an overall success or failure of the recognition result.
Dependent claims: 15, 16, 17, 18, 19.
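The independent claims share one behavior: a recognition result is returned once a specified length of silence is observed, before the rest of the audio has been processed. The sketch below illustrates that idea only loosely and under stated assumptions. Audio is modeled as a list of per-frame energy values, the silence length is counted in frames, `fake_recognize` is a hypothetical stand-in for a real recognizer, and the XML element names and the `part` attribute are invented for the example.

```python
import xml.etree.ElementTree as ET


def split_on_silence(frames, silence_frames, threshold=0.1):
    """Yield speech segments, closing each one after `silence_frames`
    consecutive frames whose energy falls below `threshold`."""
    segment, quiet = [], 0
    for energy in frames:
        if energy < threshold:
            quiet += 1
            # Close the segment only if it already contains speech.
            if quiet >= silence_frames and segment:
                yield segment
                segment = []
        else:
            quiet = 0
            segment.append(energy)
    if segment:
        # Trailing speech with no closing silence still gets a result.
        yield segment


def fake_recognize(segment):
    """Hypothetical recognizer stub: reports how many speech frames it saw."""
    return f"{len(segment)} speech frames"


def stream_results(frames, silence_frames):
    """Yield one XML document per segment as soon as that segment closes."""
    segments = split_on_silence(frames, silence_frames)
    for index, segment in enumerate(segments, start=1):
        root = ET.Element("recognition", status="success", part=str(index))
        ET.SubElement(root, "result").text = fake_recognize(segment)
        yield ET.tostring(root, encoding="unicode")


# Two bursts of speech separated by two quiet frames.
frames = [0.5, 0.6, 0.0, 0.0, 0.7, 0.0, 0.8, 0.0, 0.0]
for document in stream_results(frames, silence_frames=2):
    print(document)
```

Because `stream_results` is a generator, the first document is available to the caller before the later frames have been examined, which mirrors the ordering the claims describe. In a networked service each yielded document would be sent as its own HTTP response.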
Specification