Speech analysis syllabic segmenter
Abstract
A speech pattern is partitioned into its syllabic subunits by generating signals representative of the speech energy and autocorrelation features of the time frame portions thereof. The peak energy time frames are identified from the frame energy signals and the minimum energy time frames between each pair of successive peak energy frames of the speech pattern are determined from the time frame energy and autocorrelation feature signals. Candidate syllabic subunits are formed responsive to the peak and minimum energy frame characteristics and the autocorrelation feature signals. Signals corresponding to the duration and the energy of each candidate syllabic subunit peak energy frame relative to the energy of the other peak energy frames and the maximum peak energy frame of the speech pattern are formed and these signals are combined to produce a figure of merit for each candidate syllabic subunit. The sequence of syllabic subunits for the speech pattern is selected from the candidates by comparing the figure of merit signals of the candidate subunits.
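As an illustration only, the energy, peak, and minimum steps summarized above might be sketched in Python as follows. The function names, the non-overlapping framing, and the use of the raw zeroth-order autocorrelation value as the frame energy are assumptions made for this sketch; the abstract does not specify them. The autocorrelation-based screening of candidate frames is omitted.

```python
from typing import List, Sequence


def frame_autocorrelation(samples: Sequence[float], frame_len: int,
                          max_lag: int = 1) -> List[List[float]]:
    """Per-frame autocorrelation values r[0..max_lag].

    Assumption: frames are non-overlapping and unwindowed.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        r = [sum(frame[n] * frame[n + k] for n in range(frame_len - k))
             for k in range(max_lag + 1)]
        frames.append(r)
    return frames


def frame_energy(autocorr: Sequence[Sequence[float]]) -> List[float]:
    """Assumption: frame energy is the zeroth-order autocorrelation value."""
    return [r[0] for r in autocorr]


def peak_frames(energy: Sequence[float]) -> List[int]:
    """Indices of local energy maxima (above the previous frame, not below the next)."""
    return [i for i in range(1, len(energy) - 1)
            if energy[i] > energy[i - 1] and energy[i] >= energy[i + 1]]


def minima_between(energy: Sequence[float], peaks: Sequence[int]) -> List[int]:
    """Index of the lowest-energy frame between each pair of successive peaks."""
    return [min(range(a + 1, b), key=lambda i: energy[i])
            for a, b in zip(peaks, peaks[1:])]
```

Given a list of frame energies, `peak_frames` followed by `minima_between` yields the alternating peak and minimum frames from which candidate subunits would be formed.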
25 Citations
23 Claims
1. Apparatus for partitioning a speech pattern into syllabic subunits comprising:
means for generating a frame sequence of autocorrelation signals corresponding to said speech pattern;
means responsive to said autocorrelation signal sequence for forming a sequence of signals representative of speech energy in the successive frames of the speech pattern;
means responsive to said speech pattern energy signals for generating a sequence of speech pattern peak energy frame signals;
means responsive to said speech energy signals sequence and said peak frame signal sequence for generating a signal representative of the minimum speech energy frame between each pair of successive peak energy frames;
means responsive to said peak and minimum energy frame signals and said autocorrelation signals for producing a sequence of candidate peak and minimum energy signals;
means responsive to said candidate peak and minimum energy frame signal sequences for forming a set of candidate syllabic subunit characteristic signals; and
means responsive to said candidate syllabic subunit characteristic signals for selecting a set of speech pattern syllabic subunits.
Dependent claims: 10, 14, 23
2. A method for partitioning a speech pattern into syllabic subunits comprising the steps of:
generating a frame sequence of autocorrelation signals responsive to said speech pattern;
forming a sequence of signals representative of the speech energy in successive frames of the speech pattern responsive to said frame sequence of autocorrelation signals;
generating a sequence of signals representative of the speech pattern peak energy frames responsive to said speech pattern energy signals;
generating a signal representative of the minimum speech energy frame between each pair of successive peak energy frames responsive to said speech energy signal sequence and said peak energy frame signal sequence;
producing a sequence of candidate syllabic subunit signals responsive to said peak and minimum energy frame signals and said autocorrelation signals;
forming a first signal representative of the speech energy of each candidate syllabic subunit peak energy frame relative to the speech energy of the adjacent candidate syllabic subunit peak energy frames responsive to the said peak and minimum energy frame signals;
forming a second signal representative of the energy of each candidate syllabic subunit peak energy frame relative to the energy of the maximum speech energy frame responsive to the said peak and minimum energy frame signals;
forming a third signal representative of the duration of each candidate syllabic subunit responsive to the said peak and minimum energy frame signals;
combining said first, second and third signals of each candidate syllabic subunit to form a signal corresponding to a figure of merit for said syllabic subunit; and
selecting a sequence of speech pattern syllabic subunits responsive to said candidate syllabic subunit figure of merit signals.
Dependent claims: 3, 4, 5, 6, 7, 8, 9
11. A method for partitioning a speech pattern into syllabic subunits comprising the steps of:
generating a frame sequence of zeroth order autocorrelation signals and a frame sequence of first order autocorrelation signals corresponding to said speech pattern;
forming a sequence of signals representative of speech energy in the successive frames of the speech pattern responsive to said zeroth order autocorrelation signal sequence;
generating a sequence of speech pattern peak energy frame signals responsive to said speech pattern energy signals;
generating a signal representative of the minimum speech energy frame between each pair of successive peak energy frames responsive to said speech energy signals sequence and said peak energy frame signal sequence;
producing a sequence of candidate peak and minimum energy signals responsive to said peak energy frame signal sequence, minimum energy frame signal sequence and said first order autocorrelation signal sequence;
forming a set of candidate syllabic subunit characteristic signals including forming a first signal representative of the speech energy of each candidate syllabic subunit peak energy frame relative to the speech energy of the adjacent candidate syllabic subunit peak energy frames responsive to the said peak and minimum energy frame signals, forming a second signal representative of the energy of each candidate syllabic subunit peak energy frame relative to the energy of the maximum speech energy frame responsive to the said peak and minimum energy frame signals, and forming a third signal representative of the duration of each candidate syllabic subunit responsive to the said peak and minimum energy frame signals;
combining said first, second and third signals of each candidate syllabic subunit to form a signal corresponding to a figure of merit for said candidate syllabic subunit; and
selecting a sequence of speech pattern syllabic subunits responsive to said candidate syllabic subunit figure of merit signals.
Dependent claims: 12, 13
15. Apparatus for partitioning a speech pattern into syllabic subunits comprising:
means responsive to said speech pattern for generating a frame sequence of autocorrelation signals;
means for forming a sequence of signals representative of the speech energy in successive frames of the speech pattern responsive to said frame sequence of autocorrelation signals;
means responsive to said speech pattern energy signals for generating a sequence of signals representative of the speech pattern peak energy frames;
means responsive to said speech energy signals sequence and said peak energy frame signal sequence for generating a signal representative of the minimum speech energy frame between each pair of successive peak energy frames;
means responsive to said peak and minimum energy frame signals and said autocorrelation signals for producing a sequence of candidate syllabic subunit signals;
means responsive to the said peak and minimum energy frame signals for forming a first signal representative of the speech energy for each candidate syllabic subunit energy frame relative to the speech energy of the adjacent candidate syllabic subunit peak energy frames;
means responsive to the said peak and minimum energy frame signals for forming a second signal representative of the energy of each candidate syllabic subunit peak energy frame relative to the energy of the maximum speech energy frame;
means responsive to the peak and minimum energy frame signals for forming a third signal representative of the duration of each candidate syllabic subunit;
means for combining said first, second and third signals of each candidate syllabic subunit to form a signal corresponding to a figure of merit for said candidate syllabic subunit; and
means responsive to said candidate syllabic subunit figure of merit signals for selecting a sequence of speech pattern syllabic subunits.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
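The three characteristic signals and the figure of merit recited in claims 2, 11 and 15 might be sketched as below. The claims as listed here do not state how the signals are normalized, how they are combined, or how the selection is made; the ratios, the weighted sum, the weights, and the threshold in this Python sketch are assumptions chosen only for illustration, and the `Candidate` type is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Candidate:
    """Hypothetical candidate subunit: bounding minimum frames and the peak frame."""
    start: int
    peak: int
    end: int


def figures_of_merit(energy: Sequence[float], cands: Sequence[Candidate],
                     weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)
                     ) -> List[float]:
    """Combine three signals per candidate (assumed weighted sum).

    Assumes all peak energies are positive, so the ratios are defined.
    """
    w1, w2, w3 = weights
    max_energy = max(energy)
    merits = []
    for i, c in enumerate(cands):
        # First signal: peak energy relative to the adjacent candidates' peaks.
        neighbours = [energy[cands[j].peak] for j in (i - 1, i + 1)
                      if 0 <= j < len(cands)]
        first = (energy[c.peak] / (sum(neighbours) / len(neighbours))
                 if neighbours else 1.0)
        # Second signal: peak energy relative to the maximum frame energy.
        second = energy[c.peak] / max_energy
        # Third signal: duration in frames, as a fraction of the whole pattern.
        third = (c.end - c.start) / len(energy)
        merits.append(w1 * first + w2 * second + w3 * third)
    return merits


def select_subunits(energy: Sequence[float], cands: Sequence[Candidate],
                    threshold: float) -> List[Candidate]:
    """Assumed selection rule: keep candidates whose merit reaches the threshold."""
    merits = figures_of_merit(energy, cands)
    return [c for c, m in zip(cands, merits) if m >= threshold]
```

A fixed threshold is only one way to compare figures of merit; ranking candidates against each other would fit the claim language equally well.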
Specification