PRODUCING TIME UNIFORM FEATURE VECTORS

US 20090271183A1
Filed: 10/23/2008
Published: 10/29/2009
Est. Priority Date: 10/24/2007
Status: Active Grant

Abstract

Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a frame of the signal, the frame comprising a voiced frame. One or more cords can be extracted from the voiced frame based on the occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. The one or more cords can collectively comprise less than all of the frame. For example, each of the one or more cords can begin with the onset of a glottal pulse and extend to a point prior to the onset of a neighboring glottal pulse, but may exclude a portion of the frame prior to the onset of the neighboring glottal pulse. The one or more cords can be normalized on a time basis.
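The cord boundaries described in the abstract can be illustrated with a minimal Python sketch. It assumes the glottal pulse onsets are already known as sample indices (the abstract does not say how they are detected), and it uses a hypothetical `keep_fraction` parameter to decide how much of each pulse-to-pulse interval is kept. Neither the function name nor that parameter comes from the patent.

```python
from typing import List, Sequence


def extract_cords(
    frame: Sequence[float],
    onsets: Sequence[int],
    keep_fraction: float = 0.75,  # hypothetical: share of each interval kept
) -> List[List[float]]:
    """Cut one cord per pair of neighboring glottal pulse onsets.

    Each cord starts at an onset and stops before the next onset, so the
    tail of every interval is excluded and the cords together cover less
    than the whole frame.
    """
    if not 0.0 < keep_fraction < 1.0:
        raise ValueError("keep_fraction must be strictly between 0 and 1")
    cords = []
    for start, next_onset in zip(onsets, onsets[1:]):
        period = next_onset - start
        length = max(1, int(period * keep_fraction))
        cords.append(list(frame[start:start + length]))
    # Assumption: the last onset has no neighbor inside the frame,
    # so this sketch does not build a cord from it.
    return cords


# Toy frame whose sample values equal their indices, with assumed onsets.
frame = [float(i) for i in range(100)]
onsets = [10, 50, 90]
cords = extract_cords(frame, onsets)
```

With these toy inputs the sketch yields two cords of 30 samples each, covering 60 of the 100 samples in the frame.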

21 Claims

1. A method of processing a signal representing speech, the method comprising:
- receiving a frame of the signal representing speech, the frame comprising a voiced frame;
  
  extracting one or more cords from the voiced frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
  
  normalizing the one or more cords on a time basis.
- Dependent claims:
  - 2. The method of claim 1, wherein the one or more events comprise one or more glottal pulses.
  - 3. The method of claim 2, wherein each of the one or more cords begins with onset of a glottal pulse and extends to a point prior to an onset of neighboring glottal pulse but excludes a portion of the frame prior to the onset of the neighboring glottal pulse.
  - 4. The method of claim 1, wherein normalizing the one or more cords comprises:
    - determining whether the one or more cords comprise a plurality of cords;
      
      in response to determining the one or more cords comprise a plurality of cords, selecting one of the cords from the plurality of cords; and
      
      normalizing the selected cord.
  - 5. The method of claim 4, wherein normalizing the selected cord on a time basis comprises performing a function based re-sampling of the signal representing speech.
  - 6. The method of claim 4, wherein normalizing the selected cord on a time basis comprises regenerating the signal representing speech using the selected cord and performing a uniform framing process on the regenerated signal.
  - 7. The method of claim 4, wherein normalizing the selected cord on a time basis comprises resizing the selected cord to match the time basis.
  - 8. The method of claim 1, wherein the time basis comprises 10 milliseconds.
  - 9. The method of claim 1, further comprising providing the normalized one or more cords to an automatic speech recognition engine.
  - 10. The method of claim 1, further comprising providing the normalized one or more cords to an adaptive filter.
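One possible reading of claims 4, 7, and 8 is sketched below in Python: pick one cord when several are present, then resize it to a 10 millisecond time basis. The claims do not state how a cord is selected or how the resizing is done, so the middle-cord choice, the linear interpolation, the 8000 Hz sample rate, and all function names are assumptions made only for illustration.

```python
from typing import List, Sequence


def resize_cord(cord: Sequence[float], target_len: int) -> List[float]:
    """Resize a cord to target_len samples by linear interpolation (assumed method)."""
    n = len(cord)
    if n == 0 or target_len < 1:
        raise ValueError("cord must be non-empty and target_len positive")
    if n == 1 or target_len == 1:
        return [float(cord[0])] * target_len
    scale = (n - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale                # fractional position in the source cord
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(cord[lo] * (1.0 - frac) + cord[hi] * frac)
    return out


def normalize_cords(
    cords: Sequence[Sequence[float]],
    sample_rate_hz: int = 8000,        # assumed sample rate
    time_basis_ms: int = 10,           # time basis named in claim 8
) -> List[float]:
    """Select one cord (if there are several) and resize it to the time basis."""
    if not cords:
        raise ValueError("at least one cord is required")
    target_len = sample_rate_hz * time_basis_ms // 1000
    # Assumption: the middle cord is selected when a plurality is present.
    selected = cords[len(cords) // 2] if len(cords) > 1 else cords[0]
    return resize_cord(selected, target_len)


cords = [[0.0, 1.0, 2.0], [0.0, 10.0], [5.0, 5.0, 5.0, 5.0]]
normalized = normalize_cords(cords)
```

Every output then has the same length (80 samples at the assumed rate) regardless of the pitch period of the cord it came from, which is the sense in which the result is uniform in time.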

11. A system comprising:
- a classification module adapted to receive a frame of a signal representing speech and classify the frame as a voiced frame;
  
  a cord finder module communicatively coupled with the classification module and adapted to receive the frame from the classification module and extract one or more cords from the frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
  
  a time normalization module communicatively coupled with the cord finder module and adapted to receive the one or more extracted cords from the cord finder module and normalize the one or more cords on a time basis.
- Dependent claims:
  - 12. The system of claim 11, wherein the one or more events comprise one or more glottal pulses.
  - 13. The system of claim 12, wherein each of the one or more cords begins with onset of a glottal pulse and extends to a point prior to an onset of neighboring glottal pulse but excludes a portion of the frame prior to the onset of the neighboring glottal pulse.
  - 14. The system of claim 11, wherein normalizing the one or more cords comprises:
    - determining whether the one or more cords comprise a plurality of cords;
      
      in response to determining the one or more cords comprise a plurality of cords, selecting one of the cords from the plurality of cords; and
      
      normalizing the selected cord.
  - 15. The system of claim 14, wherein normalizing the selected cord on a time basis comprises performing a function based re-sampling of the signal representing speech.
  - 16. The system of claim 14, wherein normalizing the selected cord on a time basis comprises regenerating the signal representing speech using the selected cord and performing a uniform framing process on the regenerated signal.
  - 17. The system of claim 14, wherein normalizing the selected cord on a time basis comprises resizing the selected cord to match the time basis.
  - 18. The system of claim 11, wherein the time basis comprises 10 milliseconds.
  - 19. The system of claim 11, wherein the time normalization module is adapted to provide the normalized one or more cords to an automatic speech recognition engine.
  - 20. The system of claim 11, wherein the time normalization module is adapted to provide the normalized one or more cords to an adaptive filter.
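The coupling of the three modules in claim 11 can be pictured with a short Python sketch. Only the order of the stages (classification, cord finding, time normalization) follows the claim. The class name and the three stand-in functions are hypothetical placeholders with deliberately simple logic, and they do not represent the classifier, pulse detector, or normalizer used in the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]
Cord = List[float]


@dataclass
class SpeechPipeline:
    """Chains the three stages in the order given by claim 11."""
    classify_voiced: Callable[[Frame], bool]
    find_cords: Callable[[Frame], List[Cord]]
    normalize: Callable[[List[Cord]], List[Cord]]

    def process(self, frame: Frame) -> Optional[List[Cord]]:
        # Only frames classified as voiced reach the cord finder.
        if not self.classify_voiced(frame):
            return None
        return self.normalize(self.find_cords(frame))


def energy_voicing(frame: Frame, threshold: float = 0.01) -> bool:
    # Stand-in classifier (assumption): mean energy above a threshold.
    return sum(x * x for x in frame) / max(len(frame), 1) > threshold


def fixed_period_cords(frame: Frame, period: int = 40, keep: int = 30) -> List[Cord]:
    # Stand-in cord finder (assumption): pretend a pulse starts every
    # `period` samples and keep the first `keep` samples of each full interval.
    starts = range(0, len(frame), period)
    return [list(frame[s:s + keep]) for s in starts if s + period <= len(frame)]


def nearest_resample(cords: List[Cord], target_len: int = 80) -> List[Cord]:
    # Stand-in normalizer (assumption): nearest-neighbor resize to target_len.
    return [[c[i * len(c) // target_len] for i in range(target_len)] for c in cords]


pipeline = SpeechPipeline(energy_voicing, fixed_period_cords, nearest_resample)
voiced_out = pipeline.process([1.0] * 100)
silent_out = pipeline.process([0.0] * 100)
```

Passing the stages in as callables keeps each one replaceable, which mirrors the claim's description of separate, communicatively coupled modules.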

21. A machine-readable medium having stored thereon a series of instruction which, when executed by a processor, cause the processor to process a signal representing speech by:
- receiving a frame of the signal representing speech, the frame comprising a voiced frame;
  
  extracting one or more cords from the voiced frame based on occurrence of one or more events within the frame and wherein the one or more cords collectively comprise less than all of the frame; and
  
  normalizing the one or more cords on a time basis.

Current Assignee: Red Shift Company LLC
Original Assignee: Red Shift Company LLC
Inventors: Nyquist, Joel K.; Remillard, John F.; Reckase, Erik N.; Robinson, Matthew D.

Granted Patent: US 8,396,704 B2
US Class Current: 704/211
CPC Class Codes:
G10L 25/90 Pitch determination of spee...
G10L 25/93 Discriminating between voic...
