Hyper text mark up language document to speech converter

US 6,115,686 A
Filed: 04/02/1998
Issued: 09/05/2000
Est. Priority Date: 04/02/1998
Status: Expired due to Term

First Claim

Patent Images

1. A computer system for converting a hyper text markup language (HTML) document into audio signals comprising:

an HTML parser receiving data of an HTML formatted document for parsing out content text, HTML text tags that structure said content text and control rules used only for translating said received data into sound,an HTML to speech (HTS) control parser for parsing out of said control rules for converting said received data into sound, said HTS control parser modifying entries in one or more of a tag mapping table, an audio data table, a parameter set table, an enunciation modification table and a terminology translation table depending on each of said parsed control rules,a text normalizer for modifying enunciation of each text string of said content text for which said enunciation modification table has an entry, according to an enunciation modification indicated in said respective enunciation table entry, and for translating each text string of said content text for which said terminology translation table has an entry, according to a translation indicated in said respective terminology translation table entry,a tag converter for modifying an intonation and a speed of audio generated from said content text encapsulated by, and for inserting audio data at, each text tag for which said tag mapping table has an entry, as specified in corresponding entries of said parameter set table and said audio data table pointed to by pointers in entries of said tag mapping table indexed by each of said text tags, respectively, anda text to speech converter for converting said content text, as modified, translated and appended by said text normalizer and said tag converter, to speech audio.

View all claims

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

A system for converting a hyper text markup language (HTML) document to speech includes an HTML parser, an HTML to speech (HTS) control parser, a tag converter, a text normalizer and a TTS converter. The HTML parser receives data of an HTML formatted document and parses out content text, HTML text tags that structure the content text and control rules used only for translating the received data into sound. The HTS control parser parses control rules for converting the received data into sound. The HTS control parser modifies entries in one or more of a tag mapping table, an audio data table, a parameter set table, an enunciation modification table and a terminology translation table depending on each of the parsed control rules. The text normalizer modifies enunciation of each text string of the content text of the HTML document for which the enunciation modification table has an entry, according to an enunciation modification indicated in the respective enunciation table entry. The text normalizer also translates each text string of the content text of the HTML document for which the terminology translation table has an entry, according to a translation indicated in the respective terminology translation table entry. The tag converter modifies an intonation and a speed of audio generated from the content text of the HTML document encapsulated by each text tag for which the tag mapping table has an entry, as specified in corresponding entries of the parameter set table pointed to by pointers in the tag mapping table. The tag converter also inserts audio for each text tag for which the tag mapping table has an entry, as specified in corresponding entries of the audio data table pointed to by entries of the tag mapping table. The TTS converter converts the content text of the HTML document, as modified, translated and appended by the text normalizer and the tag converter, to speech audio.

322 Citations

15 Claims

1. A computer system for converting a hyper text markup language (HTML) document into audio signals comprising:
- an HTML parser receiving data of an HTML formatted document for parsing out content text, HTML text tags that structure said content text and control rules used only for translating said received data into sound,an HTML to speech (HTS) control parser for parsing out of said control rules for converting said received data into sound, said HTS control parser modifying entries in one or more of a tag mapping table, an audio data table, a parameter set table, an enunciation modification table and a terminology translation table depending on each of said parsed control rules,a text normalizer for modifying enunciation of each text string of said content text for which said enunciation modification table has an entry, according to an enunciation modification indicated in said respective enunciation table entry, and for translating each text string of said content text for which said terminology translation table has an entry, according to a translation indicated in said respective terminology translation table entry,a tag converter for modifying an intonation and a speed of audio generated from said content text encapsulated by, and for inserting audio data at, each text tag for which said tag mapping table has an entry, as specified in corresponding entries of said parameter set table and said audio data table pointed to by pointers in entries of said tag mapping table indexed by each of said text tags, respectively, anda text to speech converter for converting said content text, as modified, translated and appended by said text normalizer and said tag converter, to speech audio.

2. In a hyper text markup language (HTML) text to speech (HTS) control parser, a method for converting data of an HTML document to speech comprising the steps of:
- parsing one or more intonation/speed modification rules that specify intonation and speed modification parameters for generating speech encapsulated by particular text tags of an HTML document and one or more rules that specify audio data to be inserted for particular text tags of an HTML document, and generating a tag mapping table mapping said text tags to corresponding tag identifiers, a parameter set table of entries containing parameter sets pointed to by pointers in corresponding tagged entries of said tag mapping table, and an audio data table of entries containing audio data pointed to by pointers in corresponding tagged entries of said tag mapping table, according to said parsed intonation/speed modification and audio data rules, respectively, andparsing one or more rules for modifying enunciation of particular strings of content text of an HTML document and one or more rules for translating particular strings of said content text of an HTML document to terms that can be converted to speech by a text to speech converter, and generating an enunciation modification table mapping particular ones of said particular strings to replacement enunciation strings and a terminology translation table mapping particular ones of said particular strings to replacement terminology strings, according to said parsed enunciation modification and terminology translation rules, respectively.

3. In a parcer and text normalizer, a method for converting data of a hyper text markup language (HTML) document to speech audio comprising the steps of:
- parsing one or more HTML to speech (HTS) control rules, including generating a tag mapping table entry indexed by an HTML text tag specified in an audio data rule and containing a tag identifier unique to said HTML text tag, and generating an audio data table entry, pointed to by said entry of said tag mapping table indexed by said tag specified in said audio data rule, and containing audio data indicated by said audio data rule,replacing each instance of a string of one or more content text characters of an HTML document, for which an enunciation modification table has an entry, with an enunciation replacement string of text characters indicated in said entry, said enunciation replacement string being converted to speech audio of a particular one of multiple permissible enunciations of said replaced string of content text characters, andreplacing each instance of a second string of content text characters of an HTML document, for which a terminology translation table has an entry, with a translation string of text characters in said entry, said translation string of text characters being convertible to speech audio, and at least part of said second replaced string of content text characters being unconvertible to speech audio, by a predetermined text to speech converter.

4. In a tag converter for intonation modification and audio data insertion, a method for converting data of a hyper text markup language (HTML) document comprising the steps of:
- modifying the intonation and speed of speech audio generated for content text encapsulated by, and inserting audio data at, each instance of an HTML text tag for which a tag mapping table has an entry, an indication to access a parameter set table, and a first pointer to a particular entry of said parameter set table, according to intonation and speed parameters specified in said entry of said parameter set table pointed to by said first pointer, andgenerating a particular audio sound for each instance of an HTML text tag, for which said tag mapping table has an entry, an indication to access an audio data table, and a second pointer to a particular entry of said audio data table, from audio data specified in said entry of said audio table pointed to by said second pointer.

5. A method for converting data of a hyper text markup language (HTML) document to speech comprising the steps of:
- parsing one or more HTML to speech (HTS) control rules, said step of parsing comprising the steps of;
  
  in response to an intonation/speed rule, generating a tag mapping table entry indexed by an HTML text tag specified in said intonation/speed rule and containing a tag identifier unique to said HTML text tag, and generating a parameter set table entry, pointed to by said entry of said tag mapping table indexed by said tag specified in said intonation/speed rule, and containing a set of intonation and speed parameters indicated by said intonation/speed rule,in response to an audio data rule, generating a tag mapping table entry indexed by an HTML text tag specified in said audio data rule and containing a tag identifier unique to said HTML text tag, and generating an audio data table entry, pointed to by said entry of said tag mapping table indexed by said tag specified in said audio data rule, and containing audio data indicated by said audio data rule,in response to an enunciation rule, generating an enunciation table entry indexed by a text string in an HTML document and containing at least a replacement text string, that is converted to a particular audio sound of one of plural enunciations of said index text string, indicated by said enunciation rule, andin response to a terminology translation rule, generating a terminology translation table entry indexed by a text string in an HTML document that cannot be converted to an audio sound by a predetermined text to speech converter and containing a replacement text string that can be converted to an audio sound by said predetermined text to speech converter.
- View Dependent Claims (6, 7, 8, 9, 10, 11, 12, 13, 14)
- - 6. The method of claim 5 further comprising the step of extracting said HTS control rules from HTML comment text of an HTML document.
  - 7. The method of claim 5 further comprising the step of reading said HTS control rules independently from HTML document data.
  - 8. The method of claim 5 further comprising the steps of:
    - parsing data of an HTML document,in response to parsing an HTML text tag, attempting to index one or more entries of said tag mapping table using a particular parsed HTML text tag that encapsulates data yet to be parsed,using a pointer in each indexed tag mapping table entry, to identify entries of said intonation/speed table and said audio data table indicated by said indexed tag mapping table entries,modifying an intonation and speed by each set of parameters contained in each identified intonation/speed table entry, andinserting audio data contained in each identified audio table entry.
  - 9. The method of claim 8 further comprising the steps of:
    - in response to parsing a start HTML text tag, pushing said start HTML text tag onto a stack, andin response to parsing an end HTML text tag, popping an HTML text tag from said stack,wherein said particular parsed HTML text tag used in said step of attempting to index is an HTML text tag at a top of said stack.
  - 10. The method of claim 9 further comprising the steps of:
    - scanning content text of said HTML document,replacing each content text string of said HTML document that matches one of said text strings that indexes one of said entries in said terminology translation table with said replacement text string contained in said corresponding terminology translation table entry indexed by said matching text string, andreplacing each content text string of said HTML document that matches one of said text strings that indexes one of said entries of said enunciation table with said replacement text string contained in said corresponding enunciation translation table entry indexed by said matching text string.
  - 11. The method of claim 10 wherein a particular entry of said enunciation table further comprises a candidate text string, and wherein said content text string of said HTML document is only replaced with said replacement text string contained in said particular enunciation table entry if said content text string is contained in a second content text string of said HTML document that matches said candidate text string.
  - 12. The method of claim 11 further comprising the steps of:
    - generating an audible sound including sound generated from said audio data and speech audio generated by converting content text of said HTML document and said replacement text strings, if any, to speech audio according to said intonation and speed parameters.
  - 13. The method of claim 5 further comprising the steps of:
    - parsing data of an HTML document,scanning content text of said HTML document,replacing each content text string of said HTML document that matches one of said text strings that indexes one of said entries in said terminology translation table with said replacement text string contained in said corresponding terminology translation table entry indexed by said matching text string, andreplacing each content text string of said HTML, document that matches one of said text strings that indexes one of said entries of said enunciation table with said replacement text string contained in said corresponding enunciation translation table entry indexed by said matching text string.
  - 14. The method of claim 13 wherein a particular entry of said enunciation table further comprises a candidate text string, and wherein said content text string of said HTML document is only replaced with said replacement text string contained in said particular enunciation table entry if said content text string is contained in a second content text string of said HTML document that matches said candidate text string.

15. In a text normalizer and tag converter, a method for converting data of a hyper text markup language (HTML) document to speech audio comprising the steps of:
- replacing each instance of a string of one or more content text characters of an HTML document, for which an enunciation modification table has an entry, with an enunciation replacement string of text characters indicated in said entry, said enunciation replacement string being converted to speech audio of a particular one of multiple permissible enunciations of said replaced string of content text characters,replacing each instance of a second string of content text characters of an HTML document, for which a terminology translation table has an entry, with a translation string of text characters in said entry, said translation string of text characters being convertible to speech audio, and at least part of said second replaced string of content text characters being unconvertible to speech audio, by a predetermined text to speech converter, andinserting audio data at each text tag for which a tag mapping table has an entry, as specified in corresponding entries of an audio data table.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Industrial Technology Research Institute
Original Assignee
Industrial Technology Research Institute
Inventors
Chung, Jin-Chin, Chung, Chung-Ping, Hwang, Shaw-Hwa
Primary Examiner(s)
Hudspeth, David R.
Assistant Examiner(s)
Lerner, Martin

Application Number

US09/053,629
Time in Patent Office

887 Days
Field of Search

704/258, 704/260, 704/266, 704/267, 704/269, 704/270, 704/275, 379/88.16, 707/513
US Class Current

704/260
CPC Class Codes

G10L 13/08 Text analysis or generation...

Hyper text mark up language document to speech converter

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

322 Citations

15 Claims

Specification

Solutions

Use Cases

Quick Links

Hyper text mark up language document to speech converter

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

322 Citations

15 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links