Methods for controlling the generation of speech from text representing names and addresses
First Claim
1. A method of synthesizing speech from a series of characters representing an address, the series of characters including a plurality of address components including a street address component, each address component including at least one word where a word is any mixture of printable nonblank characters, the method comprising the steps of:
- analyzing a first word of a street address component to determine if the first word includes only digits;
- if it is determined that the first word includes only digits, analyzing a second word in the street address component to determine if the second word includes only alphabetic characters, or is a digit string followed by at least one letter;
- if it is determined that the first word includes only digits, and the second word includes only alphabetic characters, inserting, between the first and second words, a prosodic boundary including a pause having a first duration;
- if it is determined that the first word includes only digits, and the second word includes digits followed by at least one letter, inserting, between the first and second words, a prosodic boundary including a pause having a second duration that is longer than the first duration; and
- generating speech from the series of characters representing the address and any inserted prosodic boundaries.
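The decision logic recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patentee's implementation: the function names, the `("PAUSE", ms)` marker, and the two millisecond values are hypothetical (the claim only requires that the second duration be longer than the first), and "digits" and "alphabetic characters" are assumed here to mean ASCII `0-9` and `A-Z`/`a-z`. The final speech-generation step is left to whatever synthesizer consumes the marked-up word list.

```python
import re

# Hypothetical durations in milliseconds. The claim fixes no values; it only
# requires the second duration to be longer than the first.
SHORT_PAUSE_MS = 100
LONG_PAUSE_MS = 250


def street_address_pause(street_address):
    """Return the pause (ms) to insert after the first word, or None.

    Assumes words are separated by whitespace and that "digits" and
    "alphabetic characters" are ASCII-only.
    """
    words = street_address.split()
    if len(words) < 2:
        return None
    first, second = words[0], words[1]
    # Step 1: the first word must consist only of digits.
    if not re.fullmatch(r"[0-9]+", first):
        return None
    # Step 2a: second word is purely alphabetic -> pause of the first duration.
    if re.fullmatch(r"[A-Za-z]+", second):
        return SHORT_PAUSE_MS
    # Step 2b: second word is digits followed by at least one letter
    # (for example "42nd") -> pause of the longer, second duration.
    if re.fullmatch(r"[0-9]+[A-Za-z]+", second):
        return LONG_PAUSE_MS
    return None


def mark_boundaries(street_address):
    """Return the words with a ("PAUSE", ms) marker after the first word, if any."""
    words = street_address.split()
    pause = street_address_pause(street_address)
    if pause is None:
        return words
    return [words[0], ("PAUSE", pause)] + words[1:]
```

Under these assumptions, `mark_boundaries("123 Main Street")` places the shorter pause after `123`, while `mark_boundaries("123 42nd Street")` places the longer one, since `42nd` is a digit string followed by letters.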
6 Assignments
0 Petitions
Abstract
Improved automated synthesis of human audible speech from text is disclosed. Performance enhancement of the underlying text comprehensibility is obtained through prosodic treatment of the synthesized material, improved speaking rate treatment, and improved methods of spelling words or terms for the system user. Prosodic shaping of text sequences appropriate for the discourse in large groupings of text segments, with prosodic boundaries developed to indicate conceptual units within the text groupings, is implemented in a preferred embodiment.
277 Citations
12 Claims
1. Independent claim, set out in full under "First Claim" above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method of synthesizing speech from a first series of characters representing a name and a second series of characters representing an address, the series of characters representing the name and the series of characters representing the address each including at least one word where a word is any mixture of alphanumeric nonblank characters, the method comprising the steps of:
- determining, as a function of the complexity of the represented name, the length of a pause to be inserted between the series of characters representing the name and the series of characters representing the address;
- inserting a pause of the determined length between the series of characters representing the name and the series of characters representing the address; and
- generating speech from the series of characters representing the name and the address as a function of the inserted pause.
Dependent claims: 11, 12
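Claim 10 ties the pause between a name and its address to the "complexity" of the name, but the claim text does not say how complexity is measured. The Python sketch below assumes, purely for illustration, that complexity is the number of whitespace-separated words in the name and that the pause grows linearly with it up to a cap. All names and constants are hypothetical, and speech generation itself is left to the synthesizer that consumes the result.

```python
# All constants are assumed values for illustration; the claim specifies none.
BASE_PAUSE_MS = 200      # pause used for a name of zero complexity
PER_WORD_PAUSE_MS = 50   # extra pause per word in the name
MAX_PAUSE_MS = 600       # upper bound on the pause


def name_complexity(name):
    """Hypothetical complexity measure: the number of words in the name."""
    return len(name.split())


def pause_after_name(name):
    """Determine the pause length (ms) as a function of name complexity."""
    pause = BASE_PAUSE_MS + PER_WORD_PAUSE_MS * name_complexity(name)
    return min(pause, MAX_PAUSE_MS)


def name_then_address(name, address):
    """Insert a ("PAUSE", ms) marker between the name and the address."""
    return [name, ("PAUSE", pause_after_name(name)), address]
```

With these assumed constants, a two-word name such as `"Ann Lee"` is followed by a 300 ms pause, and a longer name receives a longer pause until the cap is reached. Any other monotone measure of complexity (syllable count, presence of titles or initials) could be substituted for `name_complexity` without changing the structure.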
Specification