Distributed speech recognition server system for mobile internet/intranet communication
Abstract
This invention is a speech recognition server system for implementation in a communications network having a plurality of clients, at least one site communication server, at least one contents server, and at least one communications gateway server, said speech recognition server system comprising a site map including a table of site address words; a speech server daemon, communicable with the wireless communications gateway server and the site communications server, for managing speech information; a voice recognition server, communicable with said speech server daemon, for speech recognition of the speech information; a site map manager, communicable with said site map, for speech recognition of the site address words in said site map; a speaker model, communicable with said site map manager and said voice recognition server, for speech recognition of the site address words in said site map; and a site selector, communicable with said voice recognition server, said speech server daemon, and said site map, for selecting the site words responsive to words recognized by said voice recognition server.
66 Claims
1. A speech recognition server system for implementation in a communications network having a plurality of clients, at least one site server, at least one gateway server, and at least one content server, said speech recognition server system comprising:
a site map including a table of site address words;
a server daemon, communicable with the gateway server and the site server, for managing client information and request parameters;
a voice recognition server, communicable with said server daemon, for speech recognition of the speech information;
a site map manager, communicable with said site map, for speech recognition of the site address words in said site map;
a speaker model, communicable with said site map manager and said voice recognition server, for speech recognition of the site address words in said site map; and
a site selector, communicable with said voice recognition server, said server daemon, and said site map, for selecting the site words responsive to words recognized by said voice recognition server.
Dependent claims: 2 to 51
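As an illustration only, the relationship between the site map and the site selector recited in claim 1 might be sketched as follows. The names SiteMap and select_site, the table layout, and the example address are hypothetical and do not appear in the claims. The sketch assumes the voice recognition server has already produced a list of recognized words.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SiteMap:
    """Hypothetical table mapping site address words to site addresses."""
    table: Dict[str, str] = field(default_factory=dict)

    def add(self, word: str, address: str) -> None:
        # Store words in lower case so lookups are case-insensitive.
        self.table[word.lower()] = address


def select_site(site_map: SiteMap, recognized_words: List[str]) -> Optional[str]:
    """Return the address for the first recognized word found in the site map."""
    for word in recognized_words:
        address = site_map.table.get(word.lower())
        if address is not None:
            return address
    return None  # No recognized word matched a site address word.
```

A caller would populate the table once, for example with site_map.add("News", "http://news.example/"), and then pass each recognition result to select_site.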
52. A speech recognition server system for implementation in a communications network having at least one site server, at least one gateway server, at least one content server, and a plurality of clients each having a keypad and a micro-browser, said speech recognition server system comprising:
a hotkey, disposed on the keypad, for initializing a voice session;
a vocoder for generating voice frame data responsive to an input speech;
a client speech subroutine, coupled to said vocoder, for performing speech feature extraction on said voice frame data and to generate digitized voice signals therefrom;
a system-specific profile database for storing and transmitting system-specific client profiles;
a payload formatter, communicable with said client speech subroutine and said system-specific profile database, for formatting a client payload data flow received from said client speech subroutine with data received from said system-specific profile database;
a speech recognition server, communicable with the gateway server for speech recognition of the formatted client payload;
a transaction protocol (TP) socket, communicable with said payload formatter and the gateway server, for receiving the formatted client payload from said payload formatter, converting the client payload to a wireless speech TP query, and transmitting the wireless speech TP query via the gateway server through the communications network to said speech recognition server, and further for receiving a recognized wireless speech TP query from said speech recognition server, converting the recognized wireless speech TP query to a resource identifier, and transmitting the resource identifier to the micro-browser for identifying the resource responsive to the resource identifier;
a wireless transaction protocol socket, communicable with the micro-browser and gateway server, for receiving the resource query from the micro-browser, generating a wireless session resource query, and transmitting the resource query via the gateway server and through the communications network to the contents server, and further for receiving content from the content server via the site server, the communications network, and the gateway server, and transmitting the content via the micro-browser to the client for display; and
an event handler, communicable with said hotkey, said client speech subroutine, said TP socket, the micro-browser, and said payload formatter, for transmitting event command signals and synchronizing the voice session thereamong.
53. A speech recognition server system for implementation in a communications network having at least one site server, at least one gateway server, at least one content server, and a plurality of clients each having a keypad and a micro-browser, said speech recognition server system comprising:
a hotkey, disposed on the keypad, for initializing a voice session;
a vocoder for generating voice frame data responsive to an input speech;
a client speech subroutine, coupled to said vocoder, for performing speech feature extraction on said voice frame data and to generate digitized voice signals therefrom;
a system-specific profile database for storing and transmitting system-specific client profiles;
a payload formatter, communicable with said client speech subroutine and said system-specific profile database, for formatting the client payload received from said client speech subroutine with data received from said system-specific profile database;
a speech recognition server, communicable with the gateway server for speech recognition;
a transaction protocol (TP) socket, communicable with said payload formatter and the gateway server, for receiving the client payload from said payload formatter, converting the client payload to a TP tag, and transmitting the TP tag via the gateway server through the communications network to said speech recognition server;
a wireless transaction protocol socket, communicable with the micro-browser and the gateway server, for receiving a wireless push transmission from the gateway server responsive to a push access protocol transmission from said speech recognition server, and for receiving a resource transmission from the micro-browser and transmitting the resource transmission via the gateway server through the communications network to the site server, and further for receiving content from the content server via the site server, the communications network, and the gateway server, and transmitting the content via the micro-browser to the client for display; and
an event handler, communicable with said hotkey, said client speech subroutine, the micro-browser, and said payload formatter, for transmitting event command signals and synchronizing the voice session thereamong.
54. A speech recognition server system for implementation in a communications network having at least one site server, at least one gateway server, at least one contents server, and a plurality of clients each having a keypad and a micro-browser, said speech recognition server system comprising:
a hotkey, disposed on the keypad, for initializing a voice session;
a vocoder for generating voice frame data responsive to an input speech;
a client speech subroutine, coupled to said vocoder, for performing speech feature extraction on said voice frame data and to generate digitized voice signals therefrom;
a system-specific profile database for storing and transmitting system-specific client profiles;
a payload formatter, communicable with the micro-browser, said client speech subroutine and said system-specific profile database, for formatting a client payload received from said client speech subroutine with data received from said system-specific profile database;
a speech recognition server, communicable with the gateway server for receiving the client payload hypertext TP transmissions from the gateway server and for performing speech recognition on the client payload, and further for transmitting a recognized client payload to the gateway server;
a wireless transaction protocol socket, communicable with the micro-browser and the gateway server, for receiving a wireless query transmission from the micro-browser and transmitting a wireless session protocol transmission to the gateway server and thence to said speech recognition server, and further for receiving a wireless session protocol transmission from the gateway server responsive to a hypertext TP transmission from said speech recognition server, and for receiving a resource transmission from the micro-browser and transmitting the resource transmission via the gateway server through the communications network to the contents server, and further for receiving content from the content server via the site server, the communications network, and the gateway server, and transmitting the content via the micro-browser to the client for display; and
an event handler, communicable with said hotkey, said client speech subroutine, the micro-browser, and said payload formatter, for transmitting event command signals and synchronizing the voice session thereamong.
55. A speech recognition server system for implementation in a communications network having at least one site server, at least one gateway server, at least one content server, and a plurality of clients each having a keypad and a micro-browser, said speech recognition server system comprising:
a hotkey, disposed on the keypad, for initializing a voice session;
a vocoder for generating voice frame data responsive to an input speech;
a client speech subroutine, coupled to said vocoder, for performing speech feature extraction on said voice frame data and to generate digitized voice signals therefrom;
a system-specific profile database for storing and transmitting system-specific client profiles;
a payload formatter, communicable with the micro-browser, said client speech subroutine and said system-specific profile database, for formatting a client payload received from said client speech subroutine with data received from said system-specific profile database;
a speech recognition server, communicable with the gateway server for receiving the client payload hypertext TP transmissions from the gateway server and for performing speech recognition on the client payload, and further for transmitting a recognized client payload to the gateway server;
a wireless transaction protocol socket, communicable with the micro-browser, said payload formatter, and the gateway server, for receiving a wireless protocol query transmission from said payload formatter and transmitting a wireless session protocol transmission to the gateway server and thence to said speech recognition server, and further for receiving a wireless session protocol transmission from the gateway server responsive to a hypertext TP transmission from said speech recognition server, and for receiving a resource transmission from the micro-browser and transmitting the resource transmission via the gateway server through the communications network to the contents server, and further for receiving content from the content server via the site server, the communications network, and the gateway server, and transmitting the content via the micro-browser to the client for display; and
an event handler, communicable with said hotkey, said client speech subroutine, the micro-browser, and said payload formatter, for transmitting event command signals and synchronizing the voice session thereamong.
Dependent claims: 57, 58, 60
56. A distributed speech recognition system for implementation in a wireless mobile communications system, communicable with the Internet, having at least one website server, at least one wireless gateway proxy server, a wireless telephony applications (WTA) server, and a plurality of mobile communication devices each having a micro-browser, said distributed speech recognition system comprising:
a client speech processor, disposed in said mobile communication devices, for speech feature extraction; and
a server speech processor, disposed in the WTA server, for recognizing the speech features.
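The client/server split recited in claim 56 might be sketched as below, purely as an illustration. The claim does not specify the feature type or the recognition method. This sketch assumes per-frame log energy as a stand-in feature and nearest-template matching as a stand-in recognizer, and the names extract_features and recognize are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def extract_features(samples: Sequence[float], frame_size: int = 4) -> List[float]:
    """Client side: reduce raw samples to one log-energy value per full frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(s * s for s in frame) / frame_size
        features.append(math.log(energy + 1e-9))  # Small offset avoids log(0).
    return features


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared distance over the shared frames plus a length-mismatch penalty."""
    n = min(len(a), len(b))
    return sum((a[i] - b[i]) ** 2 for i in range(n)) + abs(len(a) - len(b))


def recognize(features: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Server side: return the label of the template closest to the features."""
    return min(templates, key=lambda label: _distance(features, templates[label]))
```

Only the compact feature list would cross the network in this arrangement; the raw samples stay on the mobile device.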
59. A distributed speech recognition system for implementation in a wireless mobile communications system communicable with an intranet system having at least one web server, at least one intranet wireless communications gateway proxy server, a firewall, and a plurality of mobile communication devices, said distributed speech recognition system comprising:
a client speech processor, disposed in said mobile communication devices, for speech feature extraction; and
a server speech processor, disposed in the intranet wireless communications gateway proxy server for recognizing the speech features.
61. A speech recognition server system for implementation in a communications network having a plurality of sites each having a site map and a plurality of sub-sites, said speech recognition server system comprising:
a site map table for mapping the site map at the plurality of sites;
mirroring means, coupled to said site map table, for mirroring the site map at the plurality of sites to said site map table;
speech recognition means for recognizing an input speech selecting one of said plurality of sites and sub-sites; and
first child process means, coupled to said speech recognition means, for launching one of the plurality of sites responsive to the input speech;
second child process means, coupled to said speech recognition means, for launching one of the plurality of sub-sites responsive to the input speech; and
third child process means, coupled to said speech recognition means, for launching information at the sub-site responsive to an input query.
Dependent claims: 62
63. In a network communication system including a plurality of sites and sub-sites each providing content, a method for speech-accessing the sites, sub-sites, and content comprising the steps of:
mirroring the sites and sub-sites onto a speech recognition system site map;
speaking a selected site name for one of the plurality of mirrored sites and sub-sites;
generating a first child process to launch a site responsive to said spoken site name;
speaking a sub-site name for one of the plurality of mirrored sub-sites;
generating a second child process to launch a sub-site responsive to said spoken sub-site name;
speaking a query for one of the plurality of mirrored sub-sites; and
generating a third child process to launch a content responsive to said spoken query.
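The three-stage navigation of claim 63 (site, then sub-site, then query) might be sketched as follows, as an illustration only. The mirrored site map contents, the VoiceSession class, and the speak method are hypothetical. Each call to speak stands in for one "child process" of the claim and is modeled here as a plain method call rather than an operating-system process. The spoken word is assumed to be already recognized.

```python
from typing import Any, Dict, List, Optional

# Hypothetical mirrored site map: site name -> sub-site name -> query -> content.
MIRRORED_SITE_MAP: Dict[str, Any] = {
    "news": {"sports": {"scores": "latest scores page"}},
}


class VoiceSession:
    """Tracks how far a speaker has navigated into the mirrored site map."""

    def __init__(self, site_map: Dict[str, Any]) -> None:
        self.site_map = site_map
        self.path: List[str] = []

    def speak(self, word: str) -> Optional[str]:
        """Advance one level if the word names an entry at the current level."""
        node: Any = self.site_map
        for key in self.path:
            node = node[key]
        if not isinstance(node, dict) or word not in node:
            return None  # Unknown word, or content was already reached.
        self.path.append(word)
        target = node[word]
        if isinstance(target, str):
            return target  # Third stage: the content for the spoken query.
        return "/".join(self.path)  # First or second stage: site or sub-site.
```

Speaking "news", "sports", and "scores" in turn walks the three levels; a word that is not in the mirrored map at the current level leaves the session where it was.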
64. In a network communication system including a plurality of sites and sub-sites, a method for charging a payment for speech-accessing the sites and sub-sites comprising the steps of:
(a) mirroring the sites and sub-sites onto a speech recognition system site map;
(b) speaking a site name for one of the plurality of mirrored sites and sub-sites;
(c) generating a first child process to launch a site responsive to said spoken site name;
(d) speaking a sub-site name for one of the plurality of mirrored sub-sites;
(e) generating a second child process to launch a sub-site responsive to said spoken sub-site name;
(f) speaking a query for one of the plurality of mirrored sub-sites;
(g) generating a third child process to launch a content responsive to said spoken query; and
(h) charging a payment for said steps (a) to (g).
Dependent claims: 65, 66
Specification