PRODUCING PHONITOS BASED ON FEATURE VECTORS

US 20090271198A1
Filed: 10/23/2008
Published: 10/29/2009
Est. Priority Date: 10/24/2007
Status: Active Grant

Abstract

Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a first frame of the signal, the first frame comprising a voiced frame. One or more cords can be extracted from the voiced frame based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. The one or more cords can collectively comprise less than all of the frame. For example, each of the cords can begin with onset of a glottal pulse and extend to a point prior to an onset of neighboring glottal pulse but may exclude a portion of the frame prior to the onset of the neighboring glottal pulse. A phoneme for the voiced frame can be determined based on at least one of the extracted cords.
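The cord extraction described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes glottal pulse onsets have already been located by some detector (not shown) and are passed in as sample indices, and the `keep_ratio` parameter, which decides how much of each pulse-to-pulse interval is kept, is a hypothetical knob introduced here.

```python
from typing import List, Sequence


def extract_cords(
    frame: Sequence[float],
    onsets: Sequence[int],
    keep_ratio: float = 0.75,  # hypothetical: fraction of each period to keep
) -> List[List[float]]:
    """Return one cord per pair of neighboring glottal pulse onsets.

    Each cord starts at an onset and stops before the next onset, so the
    tail of every period is excluded and the cords together cover less
    than the whole frame. The last onset has no neighbor inside the frame
    and is skipped in this sketch.
    """
    if not 0.0 < keep_ratio < 1.0:
        raise ValueError("keep_ratio must be strictly between 0 and 1")

    cords: List[List[float]] = []
    for start, next_onset in zip(onsets, onsets[1:]):
        period = next_onset - start
        if period < 2:
            raise ValueError("onsets must be increasing and at least 2 samples apart")
        # Keep only the leading part of the period; drop the rest.
        length = max(1, int(period * keep_ratio))
        cords.append(list(frame[start:start + length]))
    return cords
```

With a 120-sample frame, onsets at samples 10, 50, and 90, and the default ratio, this yields two cords of 30 samples each (samples 10 to 39 and 50 to 79), that is, 60 of the frame's 120 samples.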

26 Citations

24 Claims

1. A method of processing a signal representing speech, the method comprising:
- receiving a first frame of the signal representing speech, the first frame comprising a voiced frame;
- extracting one or more cords from the voiced frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
- determining a phoneme for the voiced frame based on at least one of the one or more extracted cords.
- Dependent claims (2-8):
  - 2. The method of claim 1, wherein the one or more events comprise one or more glottal pulses.
  - 3. The method of claim 2, wherein each of the one or more cords begins with onset of a glottal pulse and extends to a point prior to an onset of neighboring glottal pulse but excludes a portion of the frame prior to the onset of the neighboring glottal pulse.
  - 4. The method of claim 1, wherein determining the phoneme for the voiced frame based on at least one of the one or more extracted cords comprises performing a spectral analysis on the extracted cords and performing a phoneme lookup based on results of the spectral analysis.
  - 5. The method of claim 4, further comprising providing the phoneme for the voiced frame to an automatic speech recognition engine.
  - 6. The method of claim 5, further comprising receiving a second frame of the signal representing speech, the second frame comprising an unvoiced frame.
  - 7. The method of claim 6, further comprising determining a phoneme for the unvoiced frame without extracting one or more cords from the unvoiced frame.
  - 8. The method of claim 7, further comprising providing the phoneme for the unvoiced frame to the automatic speech recognition engine.
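Claim 4 pairs a spectral analysis of the extracted cords with a phoneme lookup. The sketch below shows one way that pairing could look, under stated assumptions: a naive discrete Fourier transform stands in for the spectral analysis, the lookup is a nearest-template search over a caller-supplied table, and all cords are assumed to have the same length. The function names and the table format are hypothetical; the claims do not specify them.

```python
import cmath
import math
from typing import Dict, List, Sequence


def magnitude_spectrum(cord: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..n//2 of one cord (naive O(n^2) DFT)."""
    n = len(cord)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(cord)))
        for k in range(n // 2 + 1)
    ]


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def lookup_phoneme(
    cords: Sequence[Sequence[float]],
    table: Dict[str, Sequence[float]],  # hypothetical: label -> reference spectrum
) -> str:
    """Return the label whose reference spectrum is closest to the cords' mean."""
    if not cords or not table:
        raise ValueError("need at least one cord and one table entry")
    if len({len(c) for c in cords}) != 1:
        raise ValueError("this sketch assumes equal-length cords")

    spectra = [normalize(magnitude_spectrum(c)) for c in cords]
    # Average the normalized spectra bin by bin.
    mean = [sum(bins) / len(spectra) for bins in zip(*spectra)]

    def squared_distance(reference: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(mean, reference))

    return min(table, key=lambda label: squared_distance(table[label]))
```

A real table would hold reference spectra measured from labeled speech; any labels used to exercise this sketch are placeholders, not phonetic data.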

9. A system comprising:
- a classification module adapted to receive a first frame of a signal representing speech and classify the first frame as a voiced frame;
- a cord finder module communicatively coupled with the classification module and adapted to receive the voiced frame from the classification module and extract one or more cords from the voiced frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
- a phoneme determination module communicatively coupled with the cord finder module and adapted to receive the one or more extracted cords from the cord finder module and determine a phoneme for the voiced frame based on at least one of the one or more extracted cords.
- Dependent claims (10-16):
  - 10. The system of claim 9, wherein the one or more events comprise one or more glottal pulses.
  - 11. The system of claim 10, wherein each of the one or more cords begins with onset of a glottal pulse and extends to a point prior to an onset of neighboring glottal pulse but excludes a portion of the frame prior to the onset of the neighboring glottal pulse.
  - 12. The system of claim 9, wherein determining the phoneme for the voiced frame based on at least one of the one or more extracted cords comprises performing a spectral analysis on the extracted cords and performing a phoneme lookup based on results of the spectral analysis.
  - 13. The system of claim 12, wherein the phoneme determination module is further adapted to provide the phoneme for the voiced frame to an automatic speech recognition engine.
  - 14. The system of claim 13, wherein the classification module is further adapted to receive a second frame of the signal representing speech and classify the second frame as an unvoiced frame.
  - 15. The system of claim 14, wherein the classification module is communicatively coupled with the phoneme determination module and wherein the phoneme determination module is adapted to receive the unvoiced frame from the classification module and determine a phoneme for the unvoiced frame.
  - 16. The system of claim 15, wherein the phoneme determination module is further adapted to provide the phoneme for the unvoiced frame to the automatic speech recognition engine.

17. A machine-readable medium having stored thereon a series of instructions which, when executed by a processor, cause the processor to process a signal representing speech by:
- receiving a first frame of the signal representing speech, the first frame comprising a voiced frame;
- extracting one or more cords from the voiced frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
- determining a phoneme for the voiced frame based on at least one of the one or more extracted cords.
- Dependent claims (18-24):
  - 18. The machine-readable medium of claim 17, wherein the one or more events comprise one or more glottal pulses.
  - 19. The machine-readable medium of claim 18, wherein each of the one or more cords begins with onset of a glottal pulse and extends to a point prior to an onset of neighboring glottal pulse but excludes a portion of the frame prior to the onset of the neighboring glottal pulse.
  - 20. The machine-readable medium of claim 17, wherein determining the phoneme for the voiced frame based on at least one of the one or more extracted cords comprises performing a spectral analysis on the extracted cords and performing a phoneme lookup based on results of the spectral analysis.
  - 21. The machine-readable medium of claim 20, further comprising providing the phoneme for the voiced frame to an automatic speech recognition engine.
  - 22. The machine-readable medium of claim 21, further comprising receiving a second frame of the signal representing speech, the second frame comprising an unvoiced frame.
  - 23. The machine-readable medium of claim 22, further comprising determining a phoneme for the unvoiced frame without extracting one or more cords from the unvoiced frame.
  - 24. The machine-readable medium of claim 23, further comprising providing the phoneme for the unvoiced frame to the automatic speech recognition engine.

Current Assignee: Red Shift Company LLC
Original Assignee: Red Shift Company LLC
Inventors: Nyquist, Joel K.; Robinson, Matthew D.; Remillard, John F.; Reckase, Erik N.
Granted Patent: US 8,326,610 B2
US Class Current: 704/249
CPC Class Codes:
G10L 25/90 Pitch determination of spee...
G10L 25/93 Discriminating between voic...
