Method and system for synchronizing audio and visual presentation in a multi-modal content renderer

US 6,745,163 B1
Filed: 09/27/2000
Issued: 06/01/2004
Est. Priority Date: 09/27/2000
Status: Expired due to Term

Abstract

A system and method are provided for a multi-modal browser/renderer that simultaneously renders content visually and verbally in a synchronized manner, without requiring changes to server applications. The system and method receive a document via a computer network, parse the text in the document, provide an audible component associated with the text, and simultaneously transmit the text and the audible component to output. The desired behavior for the renderer is that when some section of that content is being heard by the user, that section is visible on the screen and, furthermore, the specific visual content being audibly rendered is highlighted visually. The invention also reacts to input from either the visual component or the aural component. The invention also allows any application or server to be accessible to someone via audio instead of visual means by having the browser handle the Embedded Browser Markup Language (EBML) disclosed herein so that it is audibly read to the user. Existing EBML statements can also be combined so that what is audibly read to the user is related to, but not identical to, the EBML text. The present invention also solves the problem of synchronizing audio and visual presentation of existing content via markup language changes rather than by application code changes.
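
The synchronized behavior described in the abstract, where the content being heard is highlighted on screen at the same moment, can be illustrated with a minimal sketch. This is not the patented implementation: the function names `highlight_steps` and `render_synchronized`, the bracket highlighting, and the `speak` and `show` callbacks are all assumptions made for illustration, standing in for a real speech engine and display.

```python
from typing import Callable, Iterator, List, Tuple


def highlight_steps(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (word, display) pairs; display brackets the word being spoken."""
    words = text.split()
    for i, word in enumerate(words):
        display = " ".join(
            f"[{w}]" if j == i else w for j, w in enumerate(words)
        )
        yield word, display


def render_synchronized(
    text: str,
    speak: Callable[[str], None],  # hypothetical stand-in for speech output
    show: Callable[[str], None],   # hypothetical stand-in for the display
) -> None:
    """Update the display, then hand the same word to the audio side."""
    for word, display in highlight_steps(text):
        show(display)  # visual component: highlight the current word
        speak(word)    # audio component: speak the highlighted word


# Record what each side receives instead of driving real devices.
spoken: List[str] = []
frames: List[str] = []
render_synchronized("Hello multi-modal world", spoken.append, frames.append)
```

Here the lists `spoken` and `frames` simply capture what each channel would receive, so the pairing of each spoken word with its highlighted frame can be inspected.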

22 Claims

1. A process for rendering a document containing first, second and third text, first and second HTML tags and first and second types of non-HTML tags, said process comprising the steps of:
- reading said document to determine that said first text is associated with said first HTML tag and the first type of non-HTML tag, said first type of non-HTML tag indicating that said first text should be rendered visually but not audibly, and in response to said first type of non-HTML tag, rendering said first text visually but not audibly, and in response to said first HTML tag, said first text is rendered visually in accordance with said first HTML tag;
  
  reading said document to determine that said second text is associated with the second type of non-HTML tag, said second type of non-HTML tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  reading said document to determine that said third text is associated with said second HTML tag but is not associated with either said first type of non-HTML tag or said second type of non-HTML tag, and in response, rendering said third text both visually and audibly, and in response to said second type of HTML tag, said third text is rendered visually in accordance with said second HTML tag.
  - 2. A process as set forth in claim 1 wherein said third text is associated only with HTML tags such that an HTML web browser would render said third text visually but not audibly.
  - 3. A process as set forth in claim 1 wherein by default the absence of said first and second types of non-HTML tags in association with said third text indicates that said third text should be rendered both visually and audibly.
  - 4. A process as set forth in claim 1 wherein said first type of non-HTML tag comprises a starting tag portion and an ending tag portion which enclose said first text and said first HTML tag associated with said first text such that said first text is rendered visually but not audibly.
  - 5. A process as set forth in claim 1 wherein said second type of non-HTML tag comprises a starting tag portion and an ending tag portion which enclose said second text such that said second text is rendered audibly but not visually.
  - 6. A process as set forth in claim 1 wherein said second text is rendered audibly literally corresponding to said second text, and said third text is rendered audibly literally corresponding to said third text.
  - 7. A process as set forth in claim 1 wherein said third text is rendered audibly and visually synchronously, and as each word of said third text is rendered audibly, said each word is highlighted visually.
  - 8. A process as set forth in claim 1 further comprising the step of parsing said document to separate text to be rendered audibly from text to be rendered visually, before the steps of rendering said first, second and third text.
  - 9. A process as set forth in claim 1 wherein the steps of reading said document are performed by a browser.
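
The routing recited in claims 1 through 9 can be sketched roughly as follows, using Python's standard `html.parser` module. The tag names `visualonly` and `audioonly` are hypothetical, since the claims do not name the two non-HTML tag types, and the `ModalSplitter` class is an illustrative assumption rather than the patented renderer. Text inside the first tag type goes only to the visual stream, text inside the second goes only to the audio stream, and untagged text goes to both, with ordinary HTML tags preserved for visual rendering.

```python
from html.parser import HTMLParser
from typing import List

# Hypothetical names for the two non-HTML tag types; the claims do not name them.
VISUAL_ONLY = "visualonly"
AUDIO_ONLY = "audioonly"


class ModalSplitter(HTMLParser):
    """Separate text for the visual channel from text for the audio channel."""

    def __init__(self) -> None:
        super().__init__()
        self.visual: List[str] = []  # HTML markup and text for the display
        self.audio: List[str] = []   # plain text for speech output
        self._mode: List[str] = []   # stack of open modality tags

    def handle_starttag(self, tag, attrs):
        if tag in (VISUAL_ONLY, AUDIO_ONLY):
            self._mode.append(tag)
        elif AUDIO_ONLY not in self._mode:
            # Ordinary HTML tag: keep it so visual rendering honors it.
            self.visual.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in (VISUAL_ONLY, AUDIO_ONLY):
            if self._mode and self._mode[-1] == tag:
                self._mode.pop()
        elif AUDIO_ONLY not in self._mode:
            self.visual.append(f"</{tag}>")

    def handle_data(self, data):
        if AUDIO_ONLY not in self._mode:
            self.visual.append(data)  # shown unless marked audio-only
        if VISUAL_ONLY not in self._mode and data.strip():
            self.audio.append(data.strip())  # spoken unless marked visual-only


doc = (
    "<visualonly><b>Menu</b></visualonly>"
    "<audioonly>Main menu follows.</audioonly>"
    "<p>Welcome</p>"
)
splitter = ModalSplitter()
splitter.feed(doc)
splitter.close()
visual_markup = "".join(splitter.visual)
audio_text = splitter.audio
```

In this sketch the separation happens in a single parsing pass before any rendering, in the spirit of claim 8; a real renderer would then pass `visual_markup` to an HTML display and `audio_text` to a speech engine.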

10. A system for rendering a document containing first, second and third text, first and second HTML tags and first and second types of non-HTML tags, said system comprising:
- means for reading said document to determine that said first text is associated with said first HTML tag and the first type of non-HTML tag, said first type of non-HTML tag indicating that said first text should be rendered visually but not audibly, and in response to said first type of non-HTML tag, rendering said first text visually but not audibly, and in response to said first HTML tag, said first text is rendered visually in accordance with said first HTML tag;
  
  means for reading said document to determine that said second text is associated with the second type of non-HTML tag, said second type of non-HTML tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  means for reading said document to determine that said third text is associated with said second HTML tag but is not associated with either said first type of non-HTML tag or said second type of non-HTML tag, and in response, rendering said third text both visually and audibly, and in response to said second type of HTML tag, said third text is rendered visually in accordance with said second HTML tag.

11. A computer program product for rendering a document containing first, second and third text, first and second HTML tags and first and second types of non-HTML tags, said computer program product comprising:
- a computer readable medium;
  
  first program instruction means for reading said document to determine that said first text is associated with said first HTML tag and the first type of non-HTML tag, said first type of non-HTML tag indicating that said first text should be rendered visually but not audibly, and in response to said first type of non-HTML tag, rendering said first text visually but not audibly, and in response to said first HTML tag, said first text is rendered visually in accordance with said first HTML tag;
  
  second program instruction means for reading said document to determine that said second text is associated with the second type of non-HTML tag, said second type of non-HTML tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  third program instruction means for reading said document to determine that said third text is associated with said second HTML tag but is not associated with either said first type of non-HTML tag or said second type of non-HTML tag, and in response, rendering said third text both visually and audibly, and in response to said second type of HTML tag, said third text is rendered visually in accordance with said second HTML tag; and
  
  wherein said first, second and third program instruction means are recorded on said medium.

12. A process for rendering a document containing first, second and third text and first and second types of tags, said process comprising the steps of:
- reading said document to determine that said first text is associated with the first type of tag, said first type of tag indicating that said first text should be rendered visually but not audibly, and in response, rendering said first text visually but not audibly;
  
  reading said document to determine that said second text is associated with the second type of tag, said second type of tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  reading said document to determine that said third text should be rendered both visually and audibly, and in response, rendering said third text both visually and audibly.
  - 13. A process as set forth in claim 12 wherein said third text is associated with HTML tags such that an HTML web browser would render said third text visually but not audibly.
  - 14. A process as set forth in claim 12 wherein said third text is associated with HTML tags and is rendered visually and audibly in accordance with said HTML tags.
  - 15. A process as set forth in claim 12 wherein said document also includes HTML tags associated with said first and third text, and said web browser renders said first and third text visually in accordance with said HTML tags.
  - 16. A process as set forth in claim 15 wherein said first type of tag comprises a starting tag portion and an ending tag portion which enclose said first text and the HTML tags associated with said first text such that said first text is rendered visually but not audibly.
  - 17. A process as set forth in claim 12 wherein said first tag is not an HTML tag and said second tag is not an HTML tag.
  - 18. A process as set forth in claim 12 wherein said second text is rendered audibly literally corresponding to said second text, and said third text is rendered audibly literally corresponding to said third text.
  - 19. A process as set forth in claim 12 wherein said first text is rendered audibly and visually synchronously, and as each word of said first text is rendered audibly, said each word is highlighted visually.
  - 20. A process as set forth in claim 12 further comprising the step of parsing said document to separate text to be rendered audibly from text to be rendered visually, before the steps of rendering said first, second and third text.

21. A computer program product for rendering a document containing first, second and third text and first and second types of tags, said program product comprising:
- a computer readable medium;
  
  first program instructions for reading said document to determine that said first text is associated with the first type of tag, said first type of tag indicating that said first text should be rendered visually but not audibly, and in response, rendering said first text visually but not audibly;
  
  second program instructions for reading said document to determine that said second text is associated with the second type of tag, said second type of tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  third program instructions for reading said document to determine that said third text should be rendered both visually and audibly, and in response, rendering said third text both visually and audibly; and
  
  wherein said first, second and third program instructions are recorded on said medium.

22. A system for rendering a document containing first, second and third text and first and second types of tags, said system comprising:
- means for reading said document to determine that said first text is associated with the first type of tag, said first type of tag indicating that said first text should be rendered visually but not audibly, and in response, rendering said first text visually but not audibly;
  
  means for reading said document to determine that said second text is associated with the second type of tag, said second type of tag indicating that said second text should be rendered audibly but not visually, and in response, rendering said second text audibly but not visually; and
  
  means for reading said document to determine that said third text should be rendered both visually and audibly, and in response, rendering said third text both visually and audibly.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Pritko, Steven M.; Hennessy, James P.; Feustel, Stephen V.; Howland, Michael J.; Brocious, Larry A.
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US09/670,800
Time in Patent Office: 1,343 Days
Field of Search: 704/260, 704/258, 704/270, 704/275, 704/271, 704/276, 704/272, 704/277, 704/278, 707/513
US Class Current: 704/260
CPC Class Codes: G10L 13/00 Speech synthesis; Text to s...
