Navigation and orientation tools for speech synthesis

US 10,649,726 B2
Filed: 08/16/2017
Issued: 05/12/2020
Est. Priority Date: 01/25/2010
Status: Active Grant

Abstract

TTS has been a well-known technology for decades, used in applications ranging from artificial call-center attendants to PC software that allows people with visual impairments or reading disabilities to listen to written works on a home computer. To date, however, TTS has not been widely adopted by PC and mobile users for daily reading tasks such as reading emails, PDF and Word documents, website content, and books. The present invention offers a new user experience for operating TTS in day-to-day usage. More specifically, this invention describes a synchronization technique for following text being read by TTS engines, and specific interfaces for touch pads and touch and multi-touch screens. The invention also describes usage of other input methods such as touchpad, mouse, and keyboard.

43 Citations

17 Claims

1. A method for synchronizing speech output and display output of a text, said text being synthesized to the speech output, the method comprising:
- a. receiving a text portion of the text, wherein display of an entirety of said text portion requires at least two text areas;
  
  b. receiving a start event indicating a next text unit of said text portion to be displayed and synthesized, wherein the start event comprises moving a time indicator to a new position on a time line indicating a corresponding position of said next text unit in the text portion to be synthesized;
  
  c. in response to the new position in a time line, calculating display parameters associated with said next text unit, wherein said display parameters are designated to synchronize, on the basis of phonemes, the speech output and the display output of said next text unit, said display parameters including;
  
  1) a position of the next text unit in the text portion, and
  
  2) a position of the next text unit on a time line indicating a respective point in time of the next text unit in an entire playback time of the speech output of the text portion, wherein said entire playback time is calculated by multiplying an average time required to play back synthesized speech output of a single character by a number of total characters in the text portion;
  
  d. synchronizing, on the basis of phonemes, the speech output and the display output of the next text unit, including;
  
  displaying an indication of the next text unit according to said display parameters, executing a text to speech synthesis of the next text unit indicated by the new position on the time line, and outputting the speech output of the next text unit, said displaying including;
  
  i. portraying a text indicator indicating the position of the next text unit in the text portion, and
  
  ii. portraying said time indicator indicating the position of the next text unit on the time line; and
  
  e. repeating steps (c)-(d) with a subsequent text unit following the next text unit, the subsequent text unit becoming the next text unit of (c) upon repetition thereof.
- - 2. The method according to claim 1, wherein the text unit is selected from a group comprising:
    - a word, a character, a sentence, a line, a paragraph, a bookmark, and a page.
  - 3. The method according to claim 1, wherein said average time required to play back synthesized speech output of a single character is configurable according to a desired text playback rate.
  - 4. The method according to claim 1, wherein said text indicator includes a word indicator indicating a position of the next text unit for highlighting in a line.
  - 5. The method according to claim 1, wherein said calculated display parameters further include elapsed time indicating a proportion of text that has already been processed compared to the entire playback time of the text portion, and remaining time indicating a proportion of text that has not been processed compared to the entire playback time of the text portion;
    - and wherein said displaying further includes iii. portraying the elapsed time and the remaining time on a display.
  - 6. The method according to claim 1, wherein step e further includes repeating steps c-d until a last word of said text portion being displayed and synthesized.
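
The timing arithmetic recited in claims 1, 3, and 5 can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `estimate_timeline` and `TimelineEstimate`, the default of 0.06 seconds per character, and the use of a character offset to locate the text unit are all assumptions made for illustration. The sketch also assumes that the same per-character average is applied to the characters preceding the text unit in order to place it on the time line.

```python
from dataclasses import dataclass


@dataclass
class TimelineEstimate:
    total_seconds: float       # estimated playback time of the whole text portion
    unit_seconds: float        # estimated point in time at which the text unit starts
    elapsed_fraction: float    # proportion of the playback time already processed
    remaining_fraction: float  # proportion of the playback time still to come


def estimate_timeline(text: str, unit_offset: int,
                      avg_char_seconds: float = 0.06) -> TimelineEstimate:
    """Estimate time-line values for the text unit starting at unit_offset.

    avg_char_seconds is a hypothetical default; a caller could scale it to
    reflect a desired playback rate.
    """
    if not 0 <= unit_offset <= len(text):
        raise ValueError("unit_offset must lie within the text")
    # Entire playback time: average time per character times total characters.
    total = avg_char_seconds * len(text)
    # Assumed placement of the unit: same average applied to preceding characters.
    unit_time = avg_char_seconds * unit_offset
    elapsed = unit_time / total if total else 0.0
    return TimelineEstimate(total, unit_time, elapsed, 1.0 - elapsed)
```

For an 11-character text at 0.1 seconds per character, the estimated playback time is about 1.1 seconds, and a unit starting at character offset 6 sits at roughly 6/11 of the time line.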

7. A method for synchronizing speech output and display output of a text, said text being synthesized to the speech output, the method comprising:
- a) providing a display of a time line indicating an entire playback time of the speech output of the text, wherein said entire playback time is calculated by multiplying an average time required to play back synthesized speech output of a single character by a number of total characters in the text;
  
  b) providing a display of a time indicator on the time line, wherein a position of the time indicator on the time line indicates a point in time within the entire playback time of the speech output of the text;
  
  c) synchronizing the display of a text unit within the text that is next to be synthesized to the speech output, constituting a next text unit, with the position of the time indicator on the time line such that changing the displayed next text unit will cause the time indicator to be moved to the point on the time line corresponding to the time of the speech output corresponding to that of the next text unit and changing the position of the time indicator on the time line will cause the displayed next text unit to be that which occurs at the indicated position on the time line;
  
  d) displaying a selected next text unit;
  
  e) feeding the selected next text unit to a text-to-speech engine and executing a text to speech synthesis of the selected next text unit, as indicated by the position of the time indicator on the time line, thereby generating a speech output of the next text unit;
  
  f) outputting said speech output, and
  
  g) repeating steps (d)-(f) with a subsequent text unit following the next text unit, the subsequent text unit becoming the selected next text unit of (d) upon repetition thereof.
- - 8. The method of claim 7, wherein, in step (d), the next text unit that is displayed is set by moving the time indicator on the time line to a desired point in time for commencement of speech output.
  - 9. The method according to claim 8, wherein said step (g) comprises repeating steps (d)-(f) until a last text unit of said text is displayed and synthesized to speech output.
  - 10. The method according to claim 8, wherein the moving of the time indicator is accomplished by dragging the time indicator to a selected position on the time line.
  - 11. The method of claim 7, wherein display of an entirety of said text requires at least two text areas.
  - 12. The method according to claim 7, wherein the text unit is selected from the group consisting of a word, a character, a sentence, a line, a paragraph, a bookmark, and a page.
  - 13. The method according to claim 7, wherein said average time required to play back synthesized speech output of a single character is configurable according to a desired text playback rate.
  - 14. The method according to claim 7, wherein, in step (d), the indication of the position of the desired next text unit comprises highlighting the next text unit in a displayed text portion.
  - 15. The method according to claim 7, further including displaying an elapsed time indicating a proportion of the entire playback time for the text that has already been processed compared to the entire playback time of the text, and remaining time indicating a proportion of entire playback time for the text that has not been processed compared to the entire playback time of the text.
  - 16. The method according to claim 7, wherein the repetition of step (d) with the new selected next text is accompanied by an animated progressive movement of at least the time indicator on the time line from the last text unit output to speech to the selected next text unit.
  - 17. The method according to claim 7, wherein, said step (d) further includes displaying a line indicator indicating a position of the line where the selected next text unit resides.
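
The two-way link between the time indicator and the next text unit described in claim 7, step (c), together with the repetition of steps (d)-(f), can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class `WordTimeline`, its methods, the choice of whitespace-delimited words as text units, and the `speak` callback standing in for a text-to-speech engine are all hypothetical.

```python
import bisect
import re
from typing import Callable


class WordTimeline:
    """Maps between word units and positions on an estimated time line."""

    def __init__(self, text: str, avg_char_seconds: float = 0.06):
        if avg_char_seconds <= 0:
            raise ValueError("avg_char_seconds must be positive")
        self.avg = avg_char_seconds
        # Entire playback time: average time per character times total characters.
        self.total_seconds = avg_char_seconds * len(text)
        matches = list(re.finditer(r"\S+", text))
        self.starts = [m.start() for m in matches]  # character offset of each word
        self.words = [m.group() for m in matches]

    def time_of(self, index: int) -> float:
        """Text unit -> position of the time indicator, in seconds."""
        return self.starts[index] * self.avg

    def unit_at(self, seconds: float) -> int:
        """Position of the time indicator -> index of the next text unit."""
        if not self.starts:
            raise ValueError("text contains no words")
        seconds = min(max(seconds, 0.0), self.total_seconds)
        # Small tolerance so that time_of(i) maps back to i despite rounding.
        offset = seconds / self.avg + 1e-6
        return max(bisect.bisect_right(self.starts, offset) - 1, 0)

    def play_from(self, seconds: float, speak: Callable[[str], None]) -> None:
        """Feed each unit from the indicated position onward to speak()."""
        for index in range(self.unit_at(seconds), len(self.words)):
            speak(self.words[index])
```

A caller could, for example, pass `print` as `speak` to see which words would be sent to a synthesizer after the time indicator is dragged to a new position. Storing word start offsets in a sorted list keeps the time-to-unit lookup a binary search.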

Current Assignee: Dror Kalisky, Sharon Carmel
Original Assignee: Dror Kalisky, Sharon Carmel
Inventors: Kalisky, Dror; Carmel, Sharon
Primary Examiner(s): Wozniak, James S
Application Number: US15/678,615
Publication Number: US 20180032309A1
Time in Patent Office: 1,000 Days
Field of Search: 704258, 704260, 704267, 704270, 434167
US Class Current
CPC Class Codes

G06F 3/04883   for inputting data by handw...

G06F 3/167   Audio in a user interface, ...

G09B 5/062   Combinations of audio and p...

G10L 13/00   Speech synthesis; Text to s...

G10L 13/027   Concept to speech synthesis...
