Method and apparatus for identifying prosodic word boundaries

US 7,263,488 B2
Filed: 05/07/2001
Issued: 08/28/2007
Est. Priority Date: 12/04/2000
Status: Expired due to Fees

Abstract

A method and computer-readable medium are provided that identify prosodic word boundaries for a text. If the text is unsegmented, it is first segmented into lexical words. The lexical words are then converted into prosodic words using an annotated lexicon to divide large lexical words into smaller words and a model to combine the lexical words and/or the smaller words into larger prosodic words. The boundaries of the resulting prosodic words are used to set the prosody for the synthesized speech.
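The two-stage conversion the abstract describes (divide large lexical words using an annotated lexicon, then combine adjacent words using a model) can be sketched roughly as follows. This is only an illustration under stated assumptions: the lexicon contents, the function names, the greedy left-to-right merge order, and the toy merge rule standing in for the trained model are all hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, List

# Hypothetical annotated lexicon: maps a large lexical word to the smaller
# words it should be divided into. The single entry is illustrative only.
ANNOTATED_LEXICON: Dict[str, List[str]] = {
    "textbookstore": ["textbook", "store"],
}


def divide_words(words: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Replace every word that has a lexicon annotation with its smaller words."""
    divided: List[str] = []
    for word in words:
        divided.extend(lexicon.get(word, [word]))
    return divided


def combine_words(
    words: List[str], should_merge: Callable[[str, str], bool]
) -> List[List[str]]:
    """Merge adjacent words, left to right, whenever the model says so.

    Each inner list is one prosodic word. The greedy order is an assumption.
    """
    groups: List[List[str]] = []
    for word in words:
        if groups and should_merge(groups[-1][-1], word):
            groups[-1].append(word)
        else:
            groups.append([word])
    return groups


def to_prosodic_words(
    words: List[str],
    lexicon: Dict[str, List[str]],
    should_merge: Callable[[str, str], bool],
) -> List[List[str]]:
    """Divide first, then combine, mirroring the order given in the abstract."""
    return combine_words(divide_words(words, lexicon), should_merge)


def toy_model(left: str, right: str) -> bool:
    """Stand-in for the trained model: attach a short function word to the next word."""
    return left in {"the", "a", "of"}


prosodic = to_prosodic_words(["the", "textbookstore", "opened"], ANNOTATED_LEXICON, toy_model)
print(prosodic)
```

The resulting group boundaries, rather than the original lexical word boundaries, are what a synthesizer would consult when setting prosody.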

65 Citations

27 Claims

1. A method of identifying prosody for a synthesized speech segment that is formed from a string of lexical words, the method comprising:
- converting the string of lexical words into a string of prosodic words through steps comprising dividing at least one lexical word into smaller prosodic words, each prosodic word comprising at least one lexical word and the string of prosodic words having different word boundaries than the string of lexical words; and
  
  identifying the prosody from the string of prosodic words.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1 wherein dividing a lexical word into smaller prosodic words comprises accessing an annotated lexicon to determine how to divide the lexical word into smaller prosodic words.
  - 3. The method of claim 1 wherein converting the string of lexical words into a string of prosodic words further comprises:
    - dividing at least one lexical word in the string of lexical words into smaller prosodic words to form a modified string; and
      
      combining at least two words in the modified string into a prosodic word.
  - 4. The method of claim 1 wherein identifying the prosody from the string of prosodic words comprises identifying at least one prosodic feature from the set of prosodic features consisting of pitch contour, duration, pauses, word initial, word middle and word end.
  - 5. The method of claim 1 wherein converting the string of lexical words into a string of prosodic words further comprises concatenating at least two lexical words in the string of lexical words to form a prosodic word in the string of prosodic words.
  - 6. The method of claim 5 wherein combining at least two lexical words comprises:
    - identifying at least one category for each lexical word; and
      
      determining whether to concatenate the two lexical words based on the categories of the lexical words.
  - 7. The method of claim 6 wherein determining whether to concatenate the two lexical words comprises applying the categories of the lexical words to a classification and regression tree.
  - 8. The method of claim 6 wherein determining whether to concatenate the two lexical words comprises examining a probability that describes the likelihood that the lexical words form a prosodic word given the categories.
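As a rough illustration of the decision in claims 6 and 8, the sketch below looks up a probability for a pair of word categories and concatenates the pair when that probability is high enough. The category labels (part-of-speech style tags), the probability values, the 0.5 threshold, and the default for unseen pairs are all assumptions made for the example; the claims do not specify them.

```python
from typing import Dict, Tuple

# Hypothetical table: probability that two adjacent lexical words form one
# prosodic word, given the (left, right) categories. Values are invented.
MERGE_PROBABILITY: Dict[Tuple[str, str], float] = {
    ("DET", "NOUN"): 0.9,
    ("NOUN", "VERB"): 0.2,
}


def should_concatenate(
    left_category: str,
    right_category: str,
    probabilities: Dict[Tuple[str, str], float] = MERGE_PROBABILITY,
    threshold: float = 0.5,       # assumed decision threshold
    unseen_default: float = 0.0,  # assumed: never merge category pairs not in the table
) -> bool:
    """Return True when the category pair is likely to form one prosodic word."""
    probability = probabilities.get((left_category, right_category), unseen_default)
    return probability > threshold
```

With this table, `should_concatenate("DET", "NOUN")` is true and `should_concatenate("NOUN", "VERB")` is false.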

9. A method of training a model for converting a string of lexical words into a string of prosodic words, the method comprising:
- annotating a text comprising the string of lexical words with prosodic word boundaries based on a training speech signal produced by the recitation of the string of lexical words;
  
  determining that a pair of lexical words forms a single prosodic word based on the prosodic word boundary annotations;
  
  identifying categories for the pair of lexical words; and
  
  training the model based on the determination that the pair of lexical words forms a single prosodic word and the categories for the pair of lexical words.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17, 18):
  - 10. The method of claim 9 wherein training the model comprises training a classification and regression tree.
  - 11. The method of claim 9 wherein training the model comprises training a statistical model.
  - 12. The method of claim 11 wherein training a statistical model comprises:
    - identifying a set of categories for each pair of lexical words in the strings of lexical words;
      
      producing a category count for each set of categories by counting the number of pairs of lexical words for which the set of categories was identified;
      
      producing a prosodic word count for each set of categories by counting the number of pairs of lexical words that were determined to form a single prosodic word and for which the set of categories was identified; and
      
      using the prosodic word count and the category count to train the statistical model.
  - 13. The method of claim 12 further comprising using a weighting function with the prosodic word count and the category count to train the statistical model.
  - 14. The method of claim 13 wherein the weighting function gives preference to sets of categories that have a high category count.
  - 15. The method of claim 9 further comprising annotating a lexicon to indicate how to divide at least one lexical word into multiple prosodic words.
  - 16. The method of claim 15 wherein annotating a lexicon comprises:
    - removing words with more than a selected number of characters from a lexicon to form a short-word lexicon; and
      
      segmenting each removed word based on words in the short-word lexicon to produce smaller words.
  - 17. The method of claim 16 wherein annotating the lexicon further comprises:
    - combining at least some of the smaller words to form combined words, the combined words and the smaller words that are not combined forming prosodic words; and
      
      annotating the lexicon based on the prosodic words.
  - 18. The method of claim 17 wherein combining at least some of the smaller words comprises using the model to convert the smaller words into combined words.
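The counting procedure of claims 12 to 14 might be sketched as below. Each training example is assumed to be a category pair plus a flag saying whether the annotated speech showed the two words forming a single prosodic word. The weighting function `n / (n + prior_strength)` is a hypothetical choice that merely satisfies the stated property of preferring category sets with a high category count; the claims do not disclose the actual function.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

CategoryPair = Tuple[str, str]  # categories of (left word, right word)


def train_merge_model(
    examples: Iterable[Tuple[CategoryPair, bool]],
    prior_strength: float = 2.0,  # assumed constant for the weighting function
) -> Dict[CategoryPair, float]:
    """Estimate a weighted merge score for every observed category pair."""
    category_count: Counter = Counter()  # pairs seen with these categories
    prosodic_count: Counter = Counter()  # of those, pairs forming one prosodic word
    for categories, forms_prosodic_word in examples:
        category_count[categories] += 1
        if forms_prosodic_word:
            prosodic_count[categories] += 1

    model: Dict[CategoryPair, float] = {}
    for categories, n in category_count.items():
        ratio = prosodic_count[categories] / n
        weight = n / (n + prior_strength)  # approaches 1 as the category count grows
        model[categories] = weight * ratio
    return model


examples = (
    [(("DET", "NOUN"), True)] * 8
    + [(("DET", "NOUN"), False)] * 2
    + [(("ADJ", "ADV"), True)]
)
model = train_merge_model(examples)
```

Here the frequently observed pair scores higher than the pair seen only once, even though the rare pair merged every time it was observed.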

19. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform steps comprising:
- identifying lexical words in a string of characters;
  
  identifying prosodic words from the lexical words by concatenating at least two lexical words on the basis of a model, wherein concatenating at least two lexical words on the basis of a model comprises:
  
  determining at least one category for each lexical word;
  
  applying the categories to the model to determine whether to concatenate the lexical words into a prosodic word; and
  
  using the prosodic words when setting the prosody for synthesized speech formed from the string of characters.
- Dependent claims (20, 21, 22, 23, 24):
  - 20. The computer-readable storage medium of claim 19 wherein the model comprises a statistical model.
  - 21. The computer-readable storage medium of claim 19 wherein the model comprises a classification and regression tree.
  - 22. The computer-readable storage medium of claim 19 wherein the step of identifying prosodic words comprises:
    - dividing at least one lexical word into at least two prosodic words and replacing the lexical word with the prosodic words to form an intermediate string of words comprising at least one of the lexical words identified from the string of characters and the at least two prosodic words; and
      
      combining at least two words in the intermediate string of words to form a prosodic word.
  - 23. The computer-readable storage medium of claim 19 further comprising identifying prosodic words by dividing a lexical word into at least two prosodic words.
  - 24. The computer-readable storage medium of claim 23 wherein dividing a lexical word comprises:
    - accessing a lexicon to find an entry for the lexical word;
      
      retrieving information from the entry describing how the lexical word is to be divided; and
      
      dividing the lexical word based on the information.
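The three steps of claim 24 (find the entry, retrieve the division information, divide the word) could look something like the following. Storing the division information as character offsets is an assumption made for this sketch; the claim only says the entry describes how the word is to be divided.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: each entry stores the character offsets at which the
# lexical word is cut. An empty tuple means the word is not divided.
Lexicon = Dict[str, Tuple[int, ...]]

LEXICON: Lexicon = {
    "textbookstore": (8,),  # "textbook" + "store"
    "store": (),
}


def divide_lexical_word(word: str, lexicon: Lexicon) -> List[str]:
    """Look up the word and split it at the offsets stored in its entry."""
    split_points = lexicon.get(word)  # step 1: access the entry
    if not split_points:              # no entry, or entry with no division info
        return [word]
    cuts = [0, *split_points, len(word)]                # step 2: retrieve offsets
    return [word[a:b] for a, b in zip(cuts, cuts[1:])]  # step 3: divide
```

Words without an entry, or with an entry that carries no split points, pass through unchanged.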

25. A method of identifying prosody for a synthesized speech segment that is formed from a string of lexical words, the method comprising:
- converting the string of lexical words into a string of prosodic words by concatenating at least two lexical words in the string of lexical words to form a prosodic word, each prosodic word comprising at least one lexical word and the string of prosodic words having different word boundaries than the string of lexical words, wherein concatenating the two lexical words comprises:
  
  identifying at least one category for each lexical word; and
  
  determining whether to concatenate the two lexical words based on the categories of the lexical words; and
  
  identifying the prosody from the string of prosodic words.
- Dependent claims (26, 27):
  - 26. The method of claim 25 wherein determining whether to concatenate the two lexical words comprises applying the categories of the lexical words to a classification and regression tree.
  - 27. The method of claim 25 wherein determining whether to concatenate the two lexical words comprises examining a probability that describes the likelihood that the lexical words form a prosodic word given the categories.
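For the tree-based alternative in claim 26, a minimal sketch of applying word categories to a decision tree is shown below. The tree here is written by hand purely for illustration, whereas a real classification and regression tree would be learned from annotated data. The feature names (`left_cat`, `right_len`), the node layout, and the equality-only tests are all assumptions.

```python
from typing import Any, Dict, Union

# A node is either a boolean leaf (concatenate or not) or a dict that tests
# one feature for equality and branches to "yes" or "no".
Tree = Union[bool, Dict[str, Any]]

TOY_TREE: Tree = {
    "feature": "left_cat",
    "equals": "DET",
    "yes": True,
    "no": {
        "feature": "right_len",
        "equals": 1,
        "yes": True,
        "no": False,
    },
}


def apply_tree(tree: Tree, features: Dict[str, Any]) -> bool:
    """Walk from the root to a leaf and return the leaf's decision."""
    node = tree
    while isinstance(node, dict):
        branch = "yes" if features[node["feature"]] == node["equals"] else "no"
        node = node[branch]
    return node
```

For example, `apply_tree(TOY_TREE, {"left_cat": "NOUN", "right_len": 3})` reaches the final `False` leaf, so the pair would not be concatenated.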

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Chu, Min; Qian, Yao
Primary Examiner(s): Opsasnick; Michael N
Application Number: US 09/850,526
Publication Number: US 20020095289A1
Time in Patent Office: 2,304 Days
Field of Search: 704/256-260, 704/267, 704/251-253
US Class Current: 704/251
CPC Class Codes: G10L 13/10 Prosody rules derived from ...
