System and method for dynamically translating HTML to VoiceXML intelligently
First Claim
1. A system for dynamically translating a Hypertext Markup Language (HTML) document to Voice eXtensible Markup Language (VoiceXML) form comprising:
- a voice server for receiving a user request and, in response to the user request, making a Hypertext Transfer Protocol (HTTP) request;
a voice session manager for receiving the HTTP request from the voice server and, in response to the HTTP request, accessing the HTML document, translating the HTML document to a VoiceXML document and sending the VoiceXML document to the voice server, so that the voice server can send the VoiceXML document to the user in an audible form; and
a document structure analyzer java server page (DSA JSP) for partitioning the HTML document into a plurality of text sections and a plurality of link sections;
wherein the DSA JSP differentiates between the plurality of text sections and the plurality of link sections by calculating a link density D1 of a section, where the section may be a link section if the link density D1 is greater than about 0.75, or otherwise the section may be a text section;
wherein the link density D1 is given by the equation D1 = (Hc − K·I1)/Sc, where Hc is a number of non-tag characters in a section that appears inside HREF, a link tag in html, K is a weight value equal to about 5, I1 is a number of links within image maps in the section, and Sc is a total number of non-tag characters in the section.
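The link density test recited in the claim can be illustrated with a minimal sketch. It is not taken from the patent: the names `SectionCounter`, `link_density`, and `classify_section` are hypothetical, the choice to ignore whitespace when counting non-tag characters is an assumption, and returning 0.0 for a section with no text is an assumption as well. The formula, including its sign, is used exactly as it is printed in the claim text, with K defaulting to 5 and the threshold to 0.75.

```python
from html.parser import HTMLParser


class SectionCounter(HTMLParser):
    """Collects Hc, I1 and Sc (as named in the claim) for one HTML section."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._anchors = []  # one entry per open <a>; True if it carries an href
        self.hc = 0         # non-tag characters inside <a href=...>
        self.i1 = 0         # links inside image maps (<area href=...>)
        self.sc = 0         # all non-tag characters in the section

    def handle_starttag(self, tag, attrs):
        has_href = any(name == "href" for name, _ in attrs)
        if tag == "a":
            self._anchors.append(has_href)
        elif tag == "area" and has_href:
            self.i1 += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._anchors:
            self._anchors.pop()

    def handle_data(self, data):
        # Assumption: whitespace does not count as a non-tag character.
        n = len("".join(data.split()))
        self.sc += n
        if any(self._anchors):
            self.hc += n


def link_density(section_html, k=5.0):
    """Return D1 = (Hc - K*I1) / Sc, following the formula as printed in the claim."""
    counter = SectionCounter()
    counter.feed(section_html)
    counter.close()  # flush any buffered trailing text
    if counter.sc == 0:
        return 0.0  # assumption: a section with no text has zero density
    return (counter.hc - k * counter.i1) / counter.sc


def classify_section(section_html, threshold=0.75, k=5.0):
    """Label a section "link" if D1 exceeds the threshold, otherwise "text"."""
    return "link" if link_density(section_html, k) > threshold else "text"
```

For example, a navigation fragment such as `<p><a href="/a">News</a> <a href="/b">Sport</a></p>` has every non-tag character inside a link, so its density is 1.0 and it is labelled a link section, whereas a sentence with a single short link falls well below 0.75 and is labelled a text section.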
Abstract
A system and method for dynamically translating a Hypertext Markup Language (HTML) document to Voice eXtensible Markup Language (VoiceXML) form includes a VoiceXML server and a VoiceXML session manager. The VoiceXML server receives a user request and, in response to the user request, makes a Hypertext Transfer Protocol (HTTP) request. The VoiceXML session manager receives the HTTP request from the VoiceXML server and, in response to the HTTP request, accesses the HTML document, performs document structure analysis (DSA) and text summarization (TS) of the HTML document, translates the HTML document to a VoiceXML document that includes user profile information, and sends the VoiceXML document to the VoiceXML server, so that the VoiceXML server can send the VoiceXML document to the user in an audible form.
206 Citations
21 Claims
1. A system for dynamically translating a Hypertext Markup Language (HTML) document to Voice eXtensible Markup Language (VoiceXML) form comprising:
a voice server for receiving a user request and, in response to the user request, making a Hypertext Transfer Protocol (HTTP) request;
a voice session manager for receiving the HTTP request from the voice server and, in response to the HTTP request, accessing the HTML document, translating the HTML document to a VoiceXML document and sending the VoiceXML document to the voice server, so that the voice server can send the VoiceXML document to the user in an audible form; and
a document structure analyzer java server page (DSA JSP) for partitioning the HTML document into a plurality of text sections and a plurality of link sections;
wherein the DSA JSP differentiates between the plurality of text sections and the plurality of link sections by calculating a link density D1 of a section, where the section may be a link section if the link density D1 is greater than about 0.75, or otherwise the section may be a text section;
wherein the link density D1 is given by the equation D1 = (Hc − K·I1)/Sc, where Hc is a number of non-tag characters in a section that appears inside HREF, a link tag in html, K is a weight value equal to about 5, I1 is a number of links within image maps in the section, and Sc is a total number of non-tag characters in the section.
Dependent Claims: 2, 3, 4, 5, 6, 7
8. A method for dynamically translating an HTML document to VoiceXML form, comprising the steps of:
making an HTTP request in response to a request by a user;
accessing the HTML document in response to the HTTP request;
translating the HTML document to a VoiceXML document; and
sending the VoiceXML document to the user in an audible form; and
partitioning the HTML document into a plurality of text sections and a plurality of link sections;
wherein the plurality of text sections and the plurality of link sections are differentiated by calculating a link density D1 of a section, where the section may be a link section if the link density D1 is greater than about 0.75, or otherwise the section may be a text section;
wherein the link density D1 is given by the equation D1 = (Hc − K·I1)/Sc, where Hc is a number of non-tag characters in a section that appears inside HREF, a link tag in html, K is a weight value equal to about 5, I1 is a number of links within image maps in the section, and Sc is a total number of non-tag characters in the section.
Dependent Claims: 9, 10, 11, 12, 13, 14, 15, 16
17. A method for dynamically translating an HTML document to VoiceXML form, comprising the steps of:
making an HTTP request in response to a request by a user;
accessing the HTML document in response to the HTTP request;
translating the HTML document to a VoiceXML document;
sending the VoiceXML document to the user in an audible form;
partitioning the HTML document into a plurality of text sections and a plurality of link sections;
extracting a segment from the HTML document, the segment including a plurality of tag sequences;
processing the plurality of tag sequences;
finding the largest tag sequence of the plurality of tag sequences, if the plurality of tag sequences are section titles or text tags;
forming a plurality of segment sections, if the plurality of tag sequences are not section titles or text tags; and
collecting the plurality of segment sections;
processing the plurality of segment sections;
obtaining an HTML markup of a segment section if the segment section is a text section;
summarizing the HTML markup of the segment section;
forming an HTML markup object structure from the summarized HTML markup; and further comprising the steps of;
processing a plurality of tags in the HTML markup object structure;
adding a VoiceXML audio tag from a paragraph or text earcon;
creating java speech markup language (JSML) text for a text-to-speech (TTS) engine;
creating a grammar from embedded tags;
creating a VoiceXML prompt tag if a tag among the plurality of tags is a paragraph tag or a text tag; and
creating a VoiceXML form tag.
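The final steps of claim 17 (an audio tag for the earcon, a prompt tag per paragraph or text tag, and an enclosing form tag) can be pictured with a rough sketch. It is an illustration only, not the patented implementation: the function name `text_section_to_vxml`, the `earcon.wav` default, the `version="2.0"` attribute, and the use of a single `<block>` are assumptions, and the JSML and grammar steps are left out entirely. The input is assumed to be the already summarized paragraphs of one text section.

```python
import xml.etree.ElementTree as ET


def text_section_to_vxml(paragraphs, form_id="section", earcon_url="earcon.wav"):
    """Wrap summarized paragraphs in a VoiceXML form (hypothetical helper)."""
    vxml = ET.Element("vxml", {"version": "2.0"})       # assumed VoiceXML version
    form = ET.SubElement(vxml, "form", {"id": form_id})  # the VoiceXML form tag
    block = ET.SubElement(form, "block")
    for text in paragraphs:
        # Earcon played before each paragraph, then the paragraph as a prompt.
        ET.SubElement(block, "audio", {"src": earcon_url})
        prompt = ET.SubElement(block, "prompt")
        prompt.text = text  # ElementTree escapes &, < and > on serialization
    return ET.tostring(vxml, encoding="unicode")
```

Calling `text_section_to_vxml(["First paragraph.", "Second paragraph."], form_id="s1")` yields one form with two audio and prompt pairs, which a voice server could then render audibly.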
18. A system for dynamically translating a Hypertext Markup Language (HTML) document to Voice eXtensible Markup Language (VoiceXML) form comprising:
a voice server for receiving a user request and, in response to the user request, making a Hypertext Transfer Protocol (HTTP) request;
a voice session manager for receiving the HTTP request from the voice server and, in response to the HTTP request, accessing the HTML document, translating the HTML document to a VoiceXML document and sending the VoiceXML document to the voice server, so that the voice server can send the VoiceXML document to the user in an audible form;
a document structure analyzer java server page (DSA JSP) for partitioning the HTML document into a plurality of text sections and a plurality of link sections;
a text summarization java server page (TS JSP) for performing summarization of the plurality of text sections of the HTML document; and
a user profile java server page for interpreting user profile information stored in a database, including one or more of authentication information, bookmarks, a list of favorite Web sites, e-mail account information and user default options;
wherein the DSA JSP differentiates between the plurality of text sections and the plurality of link sections by calculating a link density D1 of a section, where the section may be a link section if the link density D1 is greater than about 0.75, or otherwise the section may be a text section;
wherein the link density D1 is given by the equation D1 = (Hc − K·I1)/Sc, where Hc is a number of non-tag characters in a section that appears inside HREF, a link tag in html, K is a weight value equal to about 5, I1 is a number of links within image maps in the section, and Sc is a total number of non-tag characters in the section.
Dependent Claims: 19, 20, 21
Specification