SYSTEMS AND METHODS FOR TEXT NORMALIZATION FOR TEXT TO SPEECH SYNTHESIS

US 20100082348A1
Filed: 09/29/2008
Published: 04/01/2010
Est. Priority Date: 09/29/2008
Status: Active Grant

Abstract

Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.

371 Citations

10 Claims

1. A method for normalizing a text string, the method comprising:
- for each non-alphabetical character in the text string, identifying at least one alphabetical character or character string that corresponds to the non-alphabetical character;
- creating a set of test strings, each of which being a version of the text string that is modified to include a different one of the identified at least one alphabetical character or character string instead of the non-alphabetical character;
- retrieving a plurality of probabilities, each of which correspond to a probability of occurrence of a different one of the test strings; and
- substituting a test string having the highest probability of occurrence for the text string.

Dependent claims:
  - 2. The method of claim 1 wherein identifying at least one alphabetical character or character string comprises consulting a table listing all alphabetical characters or character strings that potentially correspond to each known non-alphabetical character.
  - 3. The method of claim 1 further comprising determining whether the test string is in vocabulary.
  - 4. The method of claim 3 wherein determining whether the test string is in vocabulary comprises consulting a table that includes a list of words that are known in all known languages.
  - 5. The method of claim 1 wherein the non-alphabetical character comprises a number, a punctuation mark or any other symbol.
  - 6. The method of claim 1 further comprising separating the text string into distinct words.
  - 7. The method of claim 6 wherein the identifying at least one alphabetical character or character string, the creating a set of test strings, the retrieving a plurality of probabilities, and the substituting a test string for the text string are implemented on a word by word basis.
  - 8. The method of claim 1 wherein the identifying at least one alphabetical character or character string, the creating a set of test strings, the retrieving a plurality of probabilities, and the substituting a test string for the text string are repeated until there are no non-alphabetical characters remaining in the text string.
  - 9. The method of claim 1 wherein, if the non-alphabetical character is one of several unique characters, the non-alphabetical character is replaced with a predetermined character or set of characters without having to create the set of test strings or having to retrieve the plurality of probabilities.
  - 10. The method of claim 1 wherein, if the text string is a recognized string, the text string is replaced with a predetermined text string without having to create the set of test strings or having to retrieve the plurality of probabilities.
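
The steps recited in claim 1, together with the table lookup of claim 2 and the word-by-word processing of claims 6 and 7, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the names `CANDIDATES`, `PROBABILITIES`, `normalize_word`, and `normalize_text` are hypothetical, the table entries and probability values are made up for the example, and all non-alphabetical characters in a word are expanded in a single pass rather than one at a time.

```python
from itertools import product

# Hypothetical candidate table (in the spirit of claim 2): each non-alphabetical
# character maps to alphabetical strings it may stand for. Entries are made up.
CANDIDATES = {
    "$": ["s", "dollar"],
    "!": ["i", ""],
    "3": ["e", "three"],
    "@": ["a", "at"],
}

# Hypothetical probabilities of occurrence, standing in for whatever corpus or
# language model a real system would consult. Values are illustrative only.
PROBABILITIES = {
    "kesha": 0.004,
    "pink": 0.02,
}


def normalize_word(word, candidates=CANDIDATES, probabilities=PROBABILITIES):
    """Return the most probable all-alphabetical version of `word`."""
    # Step 1: for each non-alphabetical character, identify candidate strings.
    options = []
    for ch in word:
        if ch.isalpha():
            options.append([ch])
        else:
            # Assumption: a symbol missing from the table is simply dropped.
            options.append(candidates.get(ch, [""]))

    # Step 2: create the set of test strings (sorted so ties resolve the same
    # way on every run).
    test_strings = sorted({"".join(parts) for parts in product(*options)})

    # Step 3: retrieve a probability of occurrence for each test string.
    scored = {s: probabilities.get(s.lower(), 0.0) for s in test_strings}

    # Step 4: substitute the test string with the highest probability.
    return max(test_strings, key=scored.get)


def normalize_text(text):
    """Apply the per-word procedure to each whitespace-separated word."""
    return " ".join(normalize_word(w) for w in text.split())


print(normalize_word("Ke$ha"))       # Kesha
print(normalize_text("Ke$ha P!nk"))  # Kesha Pink
```

With the made-up table above, "Ke$ha" yields the test strings "Kesha" and "Kedollarha", and only the former has a non-zero probability, so it is the one substituted. The shortcuts of claims 9 and 10 (fixed replacements for certain characters or recognized strings) could be added as dictionary lookups performed before `normalize_word` is called.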

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Jerome Bellegarda, Devang Naik, Kim Silverman, Kevin Lenzo
Granted Patent: US 8,355,919 B2
US Class Current: 704/260
CPC Class Codes: G10L 13/08 Text analysis or generation...
