Method and apparatus for sculpting synthesized speech

US 20030229494A1
Filed: 04/17/2003
Published: 12/11/2003
Est. Priority Date: 04/17/2002
Status: Abandoned Application

Abstract

Methods and systems for sculpting synthesized speech using a graphic user interface are disclosed. An operator enters a stream of text that is used to produce a stream of target phonetic-units. The stream of target phonetic-units is then submitted to a unit-selection process to produce a stream of selected phonetic-units, each selected phonetic-unit drawn from a database of sample phonetic-units. After the stream of selected phonetic-units is produced, the operator can remove various selected phonetic-units from the stream, prune the database of sample phonetic-units, and edit various cost functions using the graphic user interface. The edited speech information can then be resubmitted to the unit-selection process to produce a second stream of selected phonetic-units.
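
The select, edit, re-select loop described in the abstract can be illustrated with a minimal sketch. Everything named here (`SampleUnit`, `TargetUnit`, `target_cost`, `select_units`, the `excluded` parameter, and the toy data) is a hypothetical assumption for illustration and is not taken from the application. The sketch uses a single duration-based target cost and ignores any join costs or search over whole sequences that a real unit-selection synthesizer would normally include.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleUnit:
    unit_id: int
    phone: str          # phonetic-unit group this sample belongs to
    duration_ms: float


@dataclass(frozen=True)
class TargetUnit:
    phone: str
    duration_ms: float


def target_cost(target: TargetUnit, sample: SampleUnit) -> float:
    # Hypothetical cost: absolute duration mismatch in milliseconds.
    return abs(target.duration_ms - sample.duration_ms)


def select_units(targets, database, excluded=frozenset()):
    """For each target, pick the cheapest non-excluded sample of the same group."""
    selected = []
    for target in targets:
        candidates = [
            s for s in database
            if s.phone == target.phone and s.unit_id not in excluded
        ]
        if not candidates:
            raise ValueError(f"no sample units left for group {target.phone!r}")
        selected.append(min(candidates, key=lambda s: target_cost(target, s)))
    return selected


database = [
    SampleUnit(1, "k", 60.0), SampleUnit(2, "k", 85.0),
    SampleUnit(3, "ae", 120.0), SampleUnit(4, "ae", 150.0),
    SampleUnit(5, "t", 70.0),
]
targets = [TargetUnit("k", 65.0), TargetUnit("ae", 125.0), TargetUnit("t", 70.0)]

# First pass: plain unit selection.
first_pass = select_units(targets, database)

# The operator designates the selected "ae" unit for removal; it is then
# precluded from re-selection in the second pass.
removed_ids = frozenset({first_pass[1].unit_id})
second_pass = select_units(targets, database, excluded=removed_ids)

print([u.unit_id for u in first_pass])
print([u.unit_id for u in second_pass])
```

Passing the removed identifiers as an exclusion set, rather than deleting entries from the database, keeps the first and second streams reproducible so that they can be compared side by side.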

40 Claims

1. A speech processor, comprising:
- a unit-selection device that processes a stream of target phonetic-units to produce a stream of respective selected phonetic-units, the selected phonetic-units being selected on the basis of at least a set of target-cost functions that determine target-costs between each target phonetic-unit and respective groups of sample phonetic-units; and
  
  a phonetic editor configured to enable an operator to selectively designate one or more selected phonetic-units in the stream of selected phonetic-units.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
  - 2. A speech processor as in claim 1, wherein the phonetic editor is configured so that designation causes the removal of the one or more selected phonetic-units from the stream of selected phonetic-units.
  - 3. A speech processor as in claim 2, wherein the one or more removed phonetic-units is precluded from re-selection by a subsequent unit-selection process.
  - 4. A speech processor as in claim 2, wherein the phonetic editor is further configured to prune one or more non-selected phonetic-units, each non-selected phonetic-unit relating to the same phonetic-unit group as a first removed selected phonetic-unit.
  - 5. A speech processor as in claim 1, wherein the phonetic editor is further configured to edit at least a first target-cost function.
  - 6. A speech processor as in claim 5, wherein the phonetic editor is configured to change at least one or more parameters of the first target-cost function.
  - 7. A speech processor as in claim 6, wherein the one or more parameters includes at least one of a center point and a standard deviation.
  - 8. A speech processor as in claim 5, wherein the edited target-cost function is a duration function.
  - 9. A speech processor as in claim 5, wherein the edited target-cost function is a pitch function.
  - 10. A speech processor as in claim 5, wherein the edited target-cost function is an amplitude function.
  - 11. A speech processor as in claim 1, wherein the phonetic editor is configured to enable an operator to compare two or more streams of speech with at least one stream of speech generated using one or more editing functions.
  - 12. A speech processor as in claim 1, wherein the unit-selection device is enabled to select a new selected phonetic-unit to replace at least one removed phonetic-unit.
  - 13. A speech processor as in claim 1, wherein the phonetic editor uses a graphic user interface to enable the operator to designate phonetic-units.
  - 14. A speech processor as in claim 13, wherein the graphic user interface is configured to display a number of selected phonetic-units, each phonetic-unit including one or more displayed parameters.
  - 15. A speech processor as in claim 13, wherein the graphic user interface is configured to simultaneously display portions of two or more streams of selected phonetic-units, each phonetic-unit including one or more displayed parameters.
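
Claims 5 to 10 refer to editing a target-cost function through parameters such as a center point and a standard deviation. As a hedged illustration only, the sketch below assumes an inverted-Gaussian cost shape. The class `GaussianTargetCost`, its formula, and the sample pitch values are assumptions made for this example, since the claims name the parameters but not the functional form.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianTargetCost:
    """Hypothetical target-cost: 0 at the center point, approaching 1 far away."""
    center: float
    std_dev: float

    def __post_init__(self):
        if self.std_dev <= 0:
            raise ValueError("std_dev must be positive")

    def __call__(self, value: float) -> float:
        z = (value - self.center) / self.std_dev
        return 1.0 - math.exp(-0.5 * z * z)


# Pitch values (Hz) of three candidate sample units for one target unit.
candidate_pitches_hz = [110.0, 125.0, 140.0]

pitch_cost = GaussianTargetCost(center=120.0, std_dev=10.0)
best_before = min(candidate_pitches_hz, key=pitch_cost)

# The operator edits the cost function by moving its center point upward.
pitch_cost.center = 138.0
best_after = min(candidate_pitches_hz, key=pitch_cost)

print(best_before, best_after)
```

Moving the center point changes which candidate is cheapest, while widening or narrowing the standard deviation changes how strongly distant candidates are penalized.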

16. A method for processing speech information, comprising:
- selecting a stream of selected phonetic-units from a database of sample phonetic-units, wherein the step of selecting is based on a stream of target phonetic-units with respective target-costs relating to the sample phonetic-units; and
  
  performing an editing function on the stream of selected phonetic-units, the editing function including selectively designating one or more selected phonetic-units.
- Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30, 35, 36, 40)
  - 17. A method as in claim 16, wherein performing an editing function includes removing one or more selected phonetic-units and, optionally, precluding said removed phonetic-unit from re-selection in a subsequent unit-selection process.
  - 18. A method as in claim 17, wherein performing an editing function includes pruning one or more non-selected phonetic-units of the same phonetic-unit group as a first removed phonetic-unit.
  - 19. A method as in claim 16, wherein performing an editing function includes editing at least one cost function.
  - 20. A method as in claim 19, wherein performing an editing function includes changing at least one or more parameters of a target-cost function.
  - 21. A method as in claim 20, wherein the one or more parameters include at least one of a center point and a standard deviation.
  - 22. A method as in claim 20, wherein the edited target-cost function is selected from one of a duration function, a pitch function and an amplitude function.
  - 23. A method as in claim 22, when dependent on claim 18, wherein the step of pruning comprises entering a value in a window of the graphic user interface.
  - 24. A method as in claim 22, when dependent on claim 18, wherein the step of pruning comprises defining a pruning threshold having regard to a reference phonetic-unit.
  - 25. A method as in claim 19, wherein the step of editing the at least one cost function includes re-drawing some or all of the cost function.
  - 26. A method as in claim 20, further comprising comparing two or more streams of speech with at least one stream of speech generated using one or more editing functions.
  - 27. A graphic user interface configured to perform a method according to claim 16.
  - 30. A graphic user interface as in claim 26, wherein the editing tool is configured to enable the operator to prune one or more non-selected phonetic-units from a group of phonetic-units, the group of phonetic-units relating to a first removed phonetic-unit.
  - 35. A graphic user interface as in claim 34, when dependent on claim 30, configured such that said manipulation is performed by entering a parameter value in a window.
  - 36. A graphic user interface as in claim 34, when dependent on claim 30, configured such that a pruning threshold is defined having regard to a reference phonetic-unit.
  - 40. A program code product, comprising program code means for performing a method according to claim 16.
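
Claims 18 and 24 describe pruning non-selected phonetic-units of the same group as a removed unit, with a pruning threshold defined having regard to a reference phonetic-unit. The claims do not say how the threshold is applied, so the sketch below assumes one possible reading: every sample in the reference unit's group whose pitch lies within a tolerance of the reference is pruned. The names `SampleUnit`, `prune_near_reference`, and `tolerance_hz` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleUnit:
    unit_id: int
    phone: str        # phonetic-unit group
    pitch_hz: float


def prune_near_reference(database, reference, tolerance_hz):
    """Split the database into kept units and the ids of pruned units.

    Assumed rule: a unit is pruned when it belongs to the reference unit's
    group and its pitch is within tolerance_hz of the reference pitch.
    """
    kept, pruned_ids = [], set()
    for unit in database:
        same_group = unit.phone == reference.phone
        if same_group and abs(unit.pitch_hz - reference.pitch_hz) <= tolerance_hz:
            pruned_ids.add(unit.unit_id)
        else:
            kept.append(unit)
    return kept, pruned_ids


database = [
    SampleUnit(1, "ae", 180.0),   # the removed unit, used as the reference
    SampleUnit(2, "ae", 176.0),   # same group, close to the reference
    SampleUnit(3, "ae", 130.0),   # same group, far from the reference
    SampleUnit(4, "t", 178.0),    # different group, never pruned here
]
reference = database[0]

kept, pruned_ids = prune_near_reference(database, reference, tolerance_hz=5.0)
print(sorted(pruned_ids), [u.unit_id for u in kept])
```

Returning the pruned identifiers separately lets a later unit-selection pass treat them as an exclusion set without modifying the stored samples.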

28. A graphic user interface associated with a speech synthesis system, comprising:
- a first display area that can display a portion of symbols representing a stream of selected phonetic-units; and
  
  an editing tool configured to enable an operator to edit the stream of selected phonetic-units.
- Dependent Claims (29, 31, 32, 33, 34, 37, 38)
  - 29. A graphic user interface as in claim 28, wherein the editing tool is configured to enable the operator to selectively remove one or more selected phonetic-units.
  - 31. A graphic user interface as in claim 28, wherein the editing tool includes a cost function editor.
  - 32. A graphic user interface as in claim 31, wherein the cost function editor is configured to manipulate at least one or more parameters of a target-cost function.
  - 33. A graphic user interface as in claim 31, wherein the cost function editor is configured to manipulate at least one of a center point and a standard deviation parameter.
  - 34. A graphic user interface as in claim 31, wherein the cost function editor is configured to manipulate at least one of a duration function, a pitch function and an amplitude function.
  - 37. A graphic user interface as in claim 31, wherein the cost function editor is configured to enable an operator to redraw at least a portion of a target-cost function.
  - 38. A graphic user interface as in claim 28, wherein the graphic user interface is configured to enable an operator to simultaneously display portions of two or more streams of selected phonetic-units, each phonetic-unit including one or more displayed parameters.

39. A graphic user interface substantially as described herein with reference to FIGS. 2 to 17.

Current Assignee: Rhetorical Systems Ltd. (Microsoft Corporation)
Original Assignee: Rhetorical Systems Ltd. (Microsoft Corporation)
Inventors: Rutten, Peter; Taylor, Paul Alexander

Application Number: US10/417,347
Publication Number: US 20030229494A1
US Class Current: 704/254
CPC Class Codes:
G10L 13/033 Voice editing, e.g. manipul...
G10L 13/06 Elementary speech units use...
