Intelligent text-to-speech conversion

US 8,996,376 B2
Filed: 04/05/2008
Issued: 03/31/2015
Est. Priority Date: 04/05/2008
Status: Active Grant

1 Assignment

0 Petitions

Abstract

Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
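
The pre-conversion analysis mentioned in the abstract can be pictured as a planning pass that decides where cues fall between spoken passages. The following is a minimal sketch only, not the patented implementation: the names `Segment`, `CUES`, and `plan_audio`, the element types, and the cue names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str   # "speech" or "cue"
    value: str  # text to speak, or the name of a cue sound


# Hypothetical mapping from element type to cue name (not from the patent).
CUES = {"heading": "chime", "link": "click", "quote": "tone"}


def plan_audio(elements):
    """Turn (element_type, text) pairs into an ordered list of segments.

    A cue segment is placed before the speech of any element type
    that has an entry in CUES; other elements are spoken without a cue.
    """
    plan = []
    for element_type, text in elements:
        cue = CUES.get(element_type)
        if cue is not None:
            plan.append(Segment("cue", cue))
        if text:
            plan.append(Segment("speech", text))
    return plan
```

A real system would hand each speech segment to a speech synthesizer and each cue segment to a sound player; both are outside the scope of this sketch.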

887 Citations

18 Claims

1. A computer-implemented method of converting text to speech, the method comprising:
- selecting a document to be converted to speech, the selected document including base text and one or more links located within the base text;
- parsing the selected document, wherein the parsing comprises:
  - resolving at least one of the one or more links in the selected document; and
  - retrieving pre-existing text from one or more documents obtained by said resolving;
- appending at least a portion of the retrieved pre-existing text to the base text;
- generating speech by converting to speech the base text and the portion of the retrieved pre-existing text appended to the base text; and
- creating an audio file based on the converted text, wherein the audio file includes at least one audio cue configured to be beneficial to visually impaired listeners.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The computer-implemented method of claim 1, wherein the at least one audio cue is associated with a text element in the selected document.
  - 3. The computer-implemented method of claim 1, wherein the at least one audio cue is associated with a non-text element in the selected document.
  - 4. The computer-implemented method of claim 1, wherein parsing the selected document comprises:
    - identifying one or more text elements in the selected document, including:
      - determining a first subset of the one or more identified text elements, wherein the first subset of the one or more identified text elements includes one or more spoken text elements; and
      - determining a second subset of the identified text elements, wherein the second subset of the identified text elements includes one or more non-spoken text elements, wherein the second subset of the one or more identified text elements is excluded from the first subset of the identified text elements; and
    - determining an order in which to speak the first subset of the one or more identified text elements.
  - 5. The computer-implemented method of claim 1, further comprising:
    - storing the created audio file at a host computer for use by a media management application operating on the host computer.
  - 6. The computer-implemented method of claim 5, further comprising:
    - copying the audio file from the host computer to a portable media player where the audio file is stored on the portable media player in a predetermined organization.
  - 7. The computer-implemented method of claim 1, wherein the selected document is selected from the group consisting of:
    - an audio file, a webpage, a PDF, a text file, an RSS feed, an e-mail, a list of e-mail headers, a list of hyperlinks, and metadata information.
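
The parsing, link-resolving, and appending steps of claim 1, together with the spoken and non-spoken split of claim 4, can be illustrated with a short sketch. It is an illustration under stated assumptions rather than the claimed method itself: the document is assumed to be HTML, `fetch_text` is a hypothetical caller-supplied function that returns the text behind a link, and `LinkCollector`, `build_speech_text`, `NON_SPOKEN_KINDS`, and `spoken_in_order` are names invented here. Speech synthesis and audio file creation are left out.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects visible text and link targets from a small HTML document."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def build_speech_text(document_html, fetch_text):
    """Return the base text followed by text retrieved through its links.

    fetch_text is a hypothetical resolver: it takes a link target and
    returns the linked document's text, or None if nothing is available.
    """
    parser = LinkCollector()
    parser.feed(document_html)
    parser.close()
    base_text = " ".join(parser.text_parts)
    retrieved = [fetch_text(href) for href in parser.links]
    return " ".join([base_text] + [text for text in retrieved if text])


# Hypothetical element kinds treated as non-spoken (an assumption).
NON_SPOKEN_KINDS = {"script", "style", "navigation"}


def spoken_in_order(elements):
    """Keep only spoken elements and sort them by their position field."""
    spoken = [e for e in elements if e["kind"] not in NON_SPOKEN_KINDS]
    return sorted(spoken, key=lambda e: e["position"])
```

Passing the resolver in as a parameter keeps the sketch free of network access; a dictionary's `get` method is enough to try it out.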

8. A computer-implemented method of generating an audio summary for a document, the method comprising:
- parsing a document to extract metadata from the document;
- generating an audio summary for the parsed document based on the extracted metadata; and
- associating the audio summary with the parsed document, wherein the associating the audio summary to the parsed document includes at least embedding the audio summary into the parsed document.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The computer-implemented method of claim 8, wherein generating the audio summary for the parsed document comprises summarizing the parsed document based on textual information contained in the document.
  - 10. The computer-implemented method of claim 8, further comprising:
    - after associating the audio summary with the parsed document, receiving a user selection of the parsed document; and
    - presenting the audio summary for the parsed document.
  - 11. The computer-implemented method of claim 10, wherein the parsed document is selected from the group consisting of:
    - an audio file, a webpage, a PDF, a text file, an RSS feed, an e-mail, a list of e-mail headers, a list of hyperlinks, and metadata information.
  - 12. The computer-implemented method of claim 10, wherein user selection of the parsed document occurs upon a mouse-over event or a mouse-click event.
  - 13. The computer-implemented method of claim 10, wherein user selection of the parsed document occurs when the parsed document is selected using a portable media player.
  - 14. The computer-implemented method of claim 8, wherein the audio summary includes audio content generated by converting at least a portion of the extracted metadata to speech.
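
The metadata-to-summary flow of claims 8 and 14 can be sketched as follows. This is a hedged illustration only: the document is modeled as a plain dictionary, the metadata keys (`title`, `author`, `date`) are assumed, `synthesize` is a hypothetical caller-supplied text-to-speech function returning audio bytes, and `summary_text` and `embed_audio_summary` are names invented for the example.

```python
def summary_text(metadata):
    """Build a short spoken-summary script from selected metadata fields."""
    fields = [("title", "Title"), ("author", "Author"), ("date", "Date")]
    parts = [
        f"{label}: {metadata[key]}."
        for key, label in fields
        if metadata.get(key)
    ]
    return " ".join(parts)


def embed_audio_summary(document, synthesize):
    """Return a copy of the document with the audio summary embedded in it.

    synthesize is a hypothetical text-to-speech callable (text -> bytes);
    the original document dictionary is left unmodified.
    """
    audio = synthesize(summary_text(document["metadata"]))
    embedded = dict(document)
    embedded["audio_summary"] = audio
    return embedded
```

Because the audio travels inside the document, a later user selection can play the summary without running text-to-speech on the full document again.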

15. A computer-implemented method of generating an audio summary for a document, the method comprising:
- parsing a document to extract metadata from the document;
- generating an audio summary for the parsed document based on the extracted metadata; and
- associating the audio summary with the parsed document by creating a software pointer from the parsed document to the audio summary, and embedding the software pointer into the parsed document.
- Dependent claims (16):
  - 16. The computer-implemented method of claim 15, wherein the audio summary includes audio content generated by converting at least a portion of the extracted metadata to speech.
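
Claim 15 differs from claim 8 in that the document carries a pointer to the audio summary rather than the audio itself. One way to picture this, purely as an assumption-laden sketch, is to store the audio in a separate file and record its path in the document. The dictionary document model, the `id` key, the `.summary.audio` file suffix, and the `link_audio_summary` function are all hypothetical.

```python
from pathlib import Path


def link_audio_summary(document, audio_bytes, audio_dir):
    """Store the audio separately and embed a pointer to it in the document.

    The pointer here is simply a file path saved under the hypothetical
    key "audio_summary_ref"; the input dictionary is not modified.
    """
    audio_path = Path(audio_dir) / f"{document['id']}.summary.audio"
    audio_path.write_bytes(audio_bytes)
    linked = dict(document)
    linked["audio_summary_ref"] = str(audio_path)
    return linked
```

Keeping only a reference leaves the document small, at the cost of requiring the audio file to remain reachable at the recorded location.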

17. A non-transitory computer readable storage medium including at least computer program code for converting text to speech, comprising:
- computer program code for selecting a document to be converted to speech, the selected document including base text and one or more links located within the base text;
- computer program code for parsing the selected document, wherein the computer program code for parsing comprises:
  - computer program code for resolving at least one of the one or more links in the selected document; and
  - computer program code for retrieving pre-existing text from one or more documents obtained by the said resolving;
- computer program code for appending at least a portion of the retrieved pre-existing text to the base text;
- computer program code for generating speech by converting to speech the base text and the portion of the retrieved pre-existing text appended to the base text; and
- computer program code for creating an audio file based on the converted text, wherein the audio file includes at least one audio cue configured to be beneficial to visually impaired listeners.
- Dependent claims (18):
  - 18. The non-transitory computer readable storage medium as recited in claim 17, further comprising:
    - computer program code for copying the audio file from a host computer to a portable media player where the audio file is stored on the portable media player in a predetermined organization.
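
The copying step of claims 6 and 18 stores the audio file on the player "in a predetermined organization", which the claims do not spell out. The sketch below assumes one possible layout, a `Spoken Documents/<source kind>/` folder under the player's mount point; that layout, the `copy_to_player` function, and its parameters are illustrative assumptions, not details taken from the patent.

```python
import shutil
from pathlib import Path


def copy_to_player(audio_file, player_root, source_kind):
    """Copy an audio file into an assumed folder layout on the player.

    player_root is the player's mount point and source_kind a label such
    as "webpage" or "e-mail"; both the labels and the folder names are
    hypothetical. Returns the path of the copied file.
    """
    target_dir = Path(player_root) / "Spoken Documents" / source_kind
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(audio_file, target_dir))
```

Grouping by source kind is only one option; any fixed scheme known to both the host application and the player would fit the claim language equally well.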

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Fleizach, Christopher Brian; Hudson, Reginald Dean
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US12/098,417
Publication Number: US 20090254345A1
Time in Patent Office: 2,551 Days
Field of Search: 381/56, 381/320, 715/239, 715/854, 715/763, 715/716, 704/260, 704/9, 704/270.1, 704/243, 704/235, 707/999.003, 709/206, 700/94, 379/257, 365/232
US Class Current: 704/260
CPC Class Codes:
G06F 40/205   Parsing
G10L 13/00   Speech synthesis; Text to s...
G10L 13/027   Concept to speech synthesis...
G10L 13/08   Text analysis or generation...
G10L 19/018   Audio watermarking, i.e. em...
