Speech processing for telephony API
Abstract
Systems, methods, and structures are discussed that enhance media processing. One aspect of the present invention includes a data structure to enhance media processing. The data structure includes a terminal data structure to instantiate terminal objects and a speech recognition terminal data structure that extends the terminal data structure. Another aspect of the present invention includes a data structure to enhance media processing. This data structure includes a terminal data structure to instantiate terminal objects and a speech generation terminal data structure that extends the terminal data structure. These data structures may be used to implement an internet protocol interactive voice response system, an internet protocol unified message system, and speech-enabled Web applications.
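The relationship described in the abstract, a base terminal data structure extended by speech-specific terminal data structures, can be pictured as a small class hierarchy. The sketch below is illustrative only: the class names, the `process` method, and the media type strings are hypothetical and are not taken from the patent or from any real telephony API.

```python
from dataclasses import dataclass


@dataclass
class Terminal:
    """Hypothetical base terminal: every terminal shares this uniform interface."""
    terminal_class_name: str
    media_type: str

    def process(self, payload: bytes) -> str:
        raise NotImplementedError("subclasses provide the telephony service")


class SpeechRecognitionTerminal(Terminal):
    """Extends the base terminal with a (placeholder) recognition service."""

    def process(self, payload: bytes) -> str:
        # Assumption: a real terminal would hand the audio to a recognition engine.
        return f"recognized {len(payload)} bytes of speech"


class SpeechGenerationTerminal(Terminal):
    """Extends the base terminal with a (placeholder) generation service."""

    def process(self, payload: bytes) -> str:
        # Assumption: a real terminal would synthesize audio from the text payload.
        return f"generated speech for {len(payload)} bytes of text"
```

Because both speech terminals inherit from `Terminal`, code that only knows the base interface can hold either one and call `process` without caring which kind it has.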
67 Citations
33 Claims
1. A computing device having a memory and a processor for enhancing media processing of a media stream containing speech data, comprising:
a terminal data structure to support instantiating terminal objects, each terminal object adhering to a uniform interface, providing a telephony service, and having a terminal class name and a media type;
a speech recognition terminal data structure that extends the terminal data structure;
a terminal manager to instantiate, based on the terminal data structure and the speech recognition terminal data structure, terminal objects including a speech recognition terminal object to recognize speech having a speech recognition media type; and
a TAPI application component providing a telephony API to form a connection with a client and to process the speech data by:
registering terminal objects including a speech recognition terminal object;
selecting, with the processor, the speech recognition terminal object from among a group of registered terminal objects based on the media type of the registered terminal objects;
instantiating the selected speech recognition terminal object using the terminal manager by providing the terminal class name, the media type, and a method of signaling events; and
providing the speech data to the instantiated speech recognition terminal object for recognition of the speech data.
Dependent claims: 2, 3, 4, 5
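The four processing steps recited in claim 1 (register, select by media type, instantiate through the terminal manager, provide speech data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `TerminalManager`, `TelephonyApp`, `feed`, and the event callback signature are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Tuple, Type

# Assumption: events are signaled through a callback taking (kind, detail).
EventHandler = Callable[[str, str], None]


class Terminal:
    def __init__(self, class_name: str, media_type: str, on_event: EventHandler):
        self.class_name = class_name
        self.media_type = media_type
        self.on_event = on_event


class SpeechRecognitionTerminal(Terminal):
    def feed(self, speech_data: bytes) -> None:
        # Placeholder for recognition: report how much audio arrived.
        self.on_event("recognized", f"{len(speech_data)} bytes of speech")


class TerminalManager:
    """Hypothetical manager that instantiates terminals by class name."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type[Terminal]] = {}

    def add_class(self, class_name: str, cls: Type[Terminal]) -> None:
        self._classes[class_name] = cls

    def create(self, class_name: str, media_type: str,
               on_event: EventHandler) -> Terminal:
        return self._classes[class_name](class_name, media_type, on_event)


class TelephonyApp:
    """Hypothetical stand-in for the application component."""

    def __init__(self, manager: TerminalManager) -> None:
        self.manager = manager
        self.registered: List[Tuple[str, str]] = []  # (class name, media type)

    def register(self, class_name: str, media_type: str,
                 cls: Type[Terminal]) -> None:
        self.manager.add_class(class_name, cls)
        self.registered.append((class_name, media_type))

    def select(self, media_type: str) -> Tuple[str, str]:
        # Selection is based on the media type of the registered terminals.
        for entry in self.registered:
            if entry[1] == media_type:
                return entry
        raise LookupError(f"no terminal registered for {media_type!r}")


events: List[Tuple[str, str]] = []
app = TelephonyApp(TerminalManager())
app.register("AudioRender", "audio", Terminal)
app.register("SRTerminal", "speech_recognition", SpeechRecognitionTerminal)

class_name, media_type = app.select("speech_recognition")
terminal = app.manager.create(
    class_name, media_type, lambda kind, detail: events.append((kind, detail))
)
terminal.feed(b"hello")
```

The manager receives exactly the three items the claim names (terminal class name, media type, and a way to signal events), and the recognition result comes back through the event callback rather than as a return value.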
6. A computer-readable storage medium containing computer-executable instructions that when executed by a computer having a memory and a processor cause the computer to perform a method for enhancing media processing, the method comprising:
providing a terminal data structure to support instantiating terminal objects, each terminal object adhering to a uniform interface, providing a telephony service, and having a terminal class name and a media type;
providing a speech recognition terminal data structure that extends the terminal data structure, wherein the speech recognition terminal data structure includes an engine token data structure; and
using a telephony API provided by a TAPI application component to form a connection with a client and to process speech data by:
registering terminal objects including a speech recognition terminal object;
selecting, with the processor, the speech recognition terminal object from among a group of registered terminal objects based on the media type of the registered terminal objects;
instantiating the selected speech recognition terminal object using a terminal manager, based on the terminal data structure and the speech recognition terminal data structure, by providing the terminal class name, the media type, and a method of signaling events; and
providing speech data to the instantiated speech recognition terminal object for recognition of the speech data.
Dependent claims: 7, 8
9. A computing system having a memory and a processor for enhancing media processing of speech, comprising:
a terminal data structure to support instantiating terminal objects, each terminal object adhering to a uniform interface, providing a telephony service, and having a terminal class name and a media type;
a speech recognition terminal data structure that extends the terminal data structure, wherein the speech recognition terminal data structure includes an enumeration engine data structure; and
a TAPI application component providing a telephony API to form a connection with a client in response to receiving a call from the client and to process the speech by:
registering terminal objects including a speech recognition terminal object;
selecting, with the processor, the speech recognition terminal object from among a group of registered terminal objects based on media type of the registered terminal objects;
instantiating the selected speech recognition terminal object using a terminal manager, based on the terminal data structure and the speech recognition terminal data structure, by providing the terminal class name, the media type, and a method of signaling events; and
providing the speech to the instantiated speech recognition terminal object for recognition of the speech data.
Dependent claim: 10
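Claims 6 and 9 mention an engine token data structure and an enumeration engine data structure without saying, in this section, what they contain. One plausible reading is that a token identifies an installed recognition engine and that the enumeration lets a caller walk the available tokens. The sketch below follows that assumption only; `EngineToken`, `EngineEnumerator`, their fields, and `first_for_language` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class EngineToken:
    """Hypothetical token identifying one recognition engine."""
    engine_id: str
    language: str


class EngineEnumerator:
    """Hypothetical enumeration over the engine tokens a terminal exposes."""

    def __init__(self, tokens: Iterable[EngineToken]) -> None:
        self._tokens: List[EngineToken] = list(tokens)

    def __iter__(self) -> Iterator[EngineToken]:
        return iter(self._tokens)

    def first_for_language(self, language: str) -> Optional[EngineToken]:
        # Return the first token whose language matches, or None if none does.
        return next((t for t in self._tokens if t.language == language), None)


engines = EngineEnumerator([
    EngineToken("engine-a", "en-US"),
    EngineToken("engine-b", "de-DE"),
])
chosen = engines.first_for_language("de-DE")
```

Keeping the token as a small immutable value makes it cheap to list and compare engines before any engine is actually loaded.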
11. A computing system having a memory and a processor for enhancing media processing, comprising:
a terminal data structure to support instantiating terminal objects, each terminal object adhering to a uniform interface, providing a telephony service, and having a terminal class name and a media type;
a speech recognition terminal data structure that extends the terminal data structure, wherein the speech recognition terminal data structure includes a speech recognition data structure; and
a TAPI application component providing a telephony API to form a connection with a client in response to receiving a call from the client, and to process speech by:
registering terminal objects including a speech recognition terminal object;
selecting, with the processor, the speech recognition terminal object from among a group of registered terminal objects;
instantiating the selected speech recognition terminal object using a terminal manager, based on terminal data structure and the speech recognition terminal data structure, by providing the terminal class name, the media type, and a method of signaling events; and
providing speech to the instantiated speech recognition terminal object for recognition of the speech.
Dependent claims: 12, 13, 14, 15
16. A computer-readable storage medium containing computer-executable instructions that when executed by a computer having a memory and a processor cause the computer to perform a method for enhancing media processing of speech, the method comprising:
receiving a terminal data structure to support instantiating terminal objects, each terminal object adhering to a uniform interface, providing a telephony service, and having a terminal class name and a media type;
receiving a speech recognition terminal data structure that extends the terminal data structure, wherein the speech recognition terminal data structure includes a recognition context data structure; and
providing by a TAPI application component a telephony API to form a connection with a client and to process the speech by:
registering terminal objects including a speech recognition terminal object;
selecting, with the processor, the speech recognition terminal object from among a group of registered terminal objects based on the media type of the registered terminal objects;
instantiating the selected speech recognition terminal object using a terminal manager, based on the terminal data structure and speech recognition terminal data structure, by providing the terminal class name, the media type, and a method of signaling events; and
providing speech data to the instantiated speech recognition terminal object for recognition of the speech data.
Dependent claims: 17, 18, 19, 20, 21, 22, 23
24. A method in a computing device having a memory and a processor for enhancing media processing, comprising:
providing terminal objects adhering to a uniform interface, providing telephony services, and having terminal class names and media types, the terminal objects including a speech recognition terminal object for recognizing speech;
with a processor, invoking a telephony API to listen for incoming calls, wherein the telephony API is provided by a TAPI application component, the TAPI application component for:
registering terminal objects that are selectable for instantiation with a terminal manager by providing a terminal class name, a media type, and a method of signaling events and providing a list of registered terminal objects including the media types of the registered terminal objects;
upon receiving a call from a client, with the processor, invoking the telephony API to form a connection with the client and to select the speech recognition terminal object from among a group of registered terminal objects;
with a processor, invoking a method of the selected speech recognition terminal object to get a desired speech recognition engine; and
setting a speech recognition context based on the desired speech recognition terminal object to recognize speech associated with the received call.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 33
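The call-handling sequence in claim 24 (receive a call, select the speech recognition terminal, get a desired engine from it, set a recognition context) can be sketched as a single function. Everything named here is an assumption made for illustration: `handle_call`, `get_engine`, `RecognitionContext`, and the language-keyed engine table do not come from the patent text.

```python
from typing import Dict, List, Optional


class RecognitionContext:
    """Hypothetical context binding recognition to one engine."""

    def __init__(self, engine: str) -> None:
        self.engine = engine

    def recognize(self, audio: bytes) -> str:
        # Placeholder: a real context would return recognized text.
        return f"{self.engine} processed {len(audio)} bytes"


class AudioTerminal:
    media_type = "audio"


class SpeechRecognitionTerminal:
    media_type = "speech_recognition"

    def __init__(self, engines: Dict[str, str]) -> None:
        self._engines = engines  # assumption: language tag -> engine name
        self.context: Optional[RecognitionContext] = None

    def get_engine(self, language: str) -> str:
        return self._engines[language]


def handle_call(registered: List[object], language: str, audio: bytes) -> str:
    # Select the speech recognition terminal from the registered terminals.
    terminal = next(
        t for t in registered if t.media_type == "speech_recognition"
    )
    # Ask the selected terminal for the desired engine.
    engine = terminal.get_engine(language)
    # Set a recognition context, then recognize the call's speech with it.
    terminal.context = RecognitionContext(engine)
    return terminal.context.recognize(audio)


registered = [AudioTerminal(), SpeechRecognitionTerminal({"en-US": "engine-a"})]
result = handle_call(registered, "en-US", b"\x00" * 160)
```

In this sketch the context is stored on the terminal, so later audio from the same call could reuse it instead of selecting an engine again.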
32. A system for enhancing processing of speech of a media stream, comprising:
a processor;
a speech terminal object to recognize the speech of the media stream; and
a TAPI application component providing a telephony API to form a connection with a client and to process the speech of the media stream by:
registering terminal objects including the speech terminal object, each terminal object adhering to a uniform interface;
selecting the speech terminal object from among a group of registered terminal objects;
instantiating the selected speech terminal object using the terminal manager by providing the terminal class name, the media type, and a method of signaling events; and
providing speech of the media stream to the instantiated speech terminal object for recognition of the speech of the media stream,
wherein the speech terminal object includes computer-executable instructions stored in a memory for execution by the processor.
Specification