Producing time uniform feature vectors

US 8,396,704 B2
Filed: 10/23/2008
Issued: 03/12/2013
Est. Priority Date: 10/24/2007
Status: Expired due to Fees

Abstract

Generally speaking, embodiments of the present invention relate to speech processing such as, for example, speech recognition. Speech processing according to one embodiment of the present invention can be performed based on the occurrence of events within the electrical signals representing speech. Such events need not be instantaneous; an event may be an occurrence within the electrical signal that spans some period of time. Furthermore, the electrical signal can be analyzed based on the occurrence and location of these events so that less than all of the signal is analyzed. That is, the spoken sounds can be processed based on regions of the signal around and including the events but excluding other portions of the signal. For example, transition periods before the occurrence of the events may be excluded to eliminate noise or transients introduced at that part of the signal.
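The event-based segmentation described in the abstract can be illustrated with a minimal sketch. It assumes the glottal pulse onsets are already known as sample indices, and it uses a hypothetical `guard` parameter for the length of the excluded transition period before each following pulse. Neither the function name nor the guard length comes from the patent text; they are illustrative only.

```python
from typing import List, Tuple


def find_cords(onsets: List[int], guard: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges for each cord, end exclusive.

    Each cord starts at one pulse onset and stops `guard` samples before
    the next onset, so the transition into the next pulse is left out.
    """
    cords = []
    for start, next_onset in zip(onsets, onsets[1:]):
        end = next_onset - guard
        if end > start:  # skip pulse pairs too close together to leave a cord
            cords.append((start, end))
    return cords


# Hypothetical glottal pulse onsets, given as sample indices.
onsets = [100, 180, 262, 340]
print(find_cords(onsets, guard=16))
# [(100, 164), (180, 246), (262, 324)]
```

Note that the resulting cords differ in length (64, 66, and 62 samples here), which is why a later time normalization step is needed before they can serve as uniform feature vectors.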

16 Claims

1. A method of processing a signal representing speech, the method comprising:
- receiving a region of the signal representing speech, wherein the region comprises a portion of a frame of the signal representing speech classified as a voiced frame and wherein the region is marked based on one or more pitch estimates for the region;
- identifying a plurality of cords within the region of the signal based on occurrence of events within the region of the signal, wherein the events comprise glottal pulses and each cord begins with onset of a first glottal pulse and extends to a point prior to an onset of a second glottal pulse but excludes a portion of the region of the signal prior to the onset of the second glottal pulse; and
- normalizing the plurality of cords on a time basis, wherein the normalized plurality of cords each have a uniform duration on the time basis.

2. The method of claim 1, wherein normalizing the plurality of cords comprises:
- selecting one of the cords from the plurality of cords; and
- normalizing the selected cord.

3. The method of claim 2, wherein normalizing the selected cord on a time basis comprises performing a function based re-sampling of the signal representing speech.

4. The method of claim 2, wherein normalizing the selected cord on a time basis comprises regenerating the signal representing speech using the selected cord and performing a uniform framing process on the regenerated signal.

5. The method of claim 2, wherein normalizing the selected cord on a time basis comprises resizing the selected cord to match the time basis.

6. The method of claim 1, wherein the time basis comprises 10 milliseconds.

7. The method of claim 1, further comprising providing the normalized plurality of cords to an automatic speech recognition engine.

8. The method of claim 1, further comprising providing the normalized plurality of cords to an adaptive filter.
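As an illustration of the time normalization recited in claims 1, 5, and 6, the sketch below resizes cords of different lengths to one uniform duration. Linear interpolation is used here only as a stand-in; the claims do not specify the resampling function. The 8 kHz sample rate and the names `resample_linear`, `SAMPLE_RATE_HZ`, and `TARGET_LEN` are assumptions made for the example, while the 10 millisecond time basis comes from claim 6.

```python
from typing import List, Sequence


def resample_linear(cord: Sequence[float], target_len: int) -> List[float]:
    """Resize a cord to target_len samples by linear interpolation."""
    if target_len < 1 or len(cord) == 0:
        raise ValueError("cord must be non-empty and target_len positive")
    if len(cord) == 1 or target_len == 1:
        return [float(cord[0])] * target_len
    # Map output index i onto a fractional position in the input cord.
    scale = (len(cord) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(cord) - 1)
        frac = pos - lo
        out.append(cord[lo] * (1.0 - frac) + cord[hi] * frac)
    return out


SAMPLE_RATE_HZ = 8000   # assumption: not stated in the claims
TIME_BASIS_S = 0.010    # 10 milliseconds, as in claim 6
TARGET_LEN = round(SAMPLE_RATE_HZ * TIME_BASIS_S)  # 80 samples

# Two hypothetical cords of different lengths (64 and 97 samples).
cords = [[float(n) for n in range(64)], [float(n) for n in range(97)]]
normalized = [resample_linear(c, TARGET_LEN) for c in cords]
print([len(c) for c in normalized])  # [80, 80]
```

After this step every cord has the same number of samples, so each can be passed on as a fixed-length vector, for example to the speech recognition engine or adaptive filter of claims 7 and 8.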

9. A system comprising:
- a classification module adapted to receive a region of a signal representing speech, wherein the region comprises a portion of a frame of the signal representing speech and wherein the region is marked based on one or more pitch estimates for the region;
- a cord finder module communicatively coupled with the classification module and adapted to receive the frame from the classification module and identify a plurality of cords within the region of the signal based on occurrence of events within the region of the signal, wherein the events comprise glottal pulses and each cord begins with onset of a first glottal pulse and extends to a point prior to an onset of a second glottal pulse but excludes a portion of the region of the signal prior to the onset of the second glottal pulse; and
- a time normalization module communicatively coupled with the cord finder module and adapted to receive the plurality of extracted cords from the cord finder module and normalize the plurality of cords on a time basis, wherein the normalized plurality of cords each have a uniform duration on the time basis.

10. The system of claim 9, wherein normalizing the plurality of cords comprises:
- selecting one of the cords from the plurality of cords; and
- normalizing the selected cord.

11. The system of claim 10, wherein normalizing the selected cord on a time basis comprises performing a function based re-sampling of the signal representing speech.

12. The system of claim 10, wherein normalizing the selected cord on a time basis comprises regenerating the signal representing speech using the selected cord and performing a uniform framing process on the regenerated signal.

13. The system of claim 10, wherein normalizing the selected cord on a time basis comprises resizing the selected cord to match the time basis.

14. The system of claim 9, wherein the time basis comprises 10 milliseconds.

15. The system of claim 9, wherein the time normalization module is adapted to provide the normalized plurality of cords to an automatic speech recognition engine.

16. The system of claim 9, wherein the time normalization module is adapted to provide the normalized plurality of cords to an adaptive filter.
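Claims 4 and 12 describe an alternative normalization: regenerating the signal from a selected cord and then applying uniform framing. The claims do not say how the regeneration is done, so the sketch below assumes the simplest reading, repeating the cord end to end, and then cuts the result into equal, non-overlapping frames. The function names and the sample values are hypothetical.

```python
from typing import List, Sequence


def regenerate(cord: Sequence[float], total_len: int) -> List[float]:
    """Repeat the cord end to end until total_len samples are produced."""
    if len(cord) == 0:
        raise ValueError("cord must be non-empty")
    return [cord[i % len(cord)] for i in range(total_len)]


def uniform_frames(signal: Sequence[float], frame_len: int) -> List[List[float]]:
    """Cut the signal into consecutive, non-overlapping frames of frame_len.

    Any tail shorter than frame_len is dropped.
    """
    count = len(signal) // frame_len
    return [list(signal[i * frame_len:(i + 1) * frame_len]) for i in range(count)]


# Hypothetical 7-sample cord; real cords would be far longer.
cord = [0.0, 1.0, 0.5, -0.5, -1.0, -0.25, 0.0]
signal = regenerate(cord, 40)
frames = uniform_frames(signal, 10)
print(len(frames), [len(f) for f in frames])  # 4 [10, 10, 10, 10]
```

Under this reading, the frame length rather than the cord length fixes the duration of each output vector, so cords of any length yield frames on the same time basis.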

Current Assignee: Red Shift Company LLC
Original Assignee: Red Shift Company LLC
Inventors: Nyquist, Joel K.; Reckase, Erik N.; Robinson, Matthew D.; Remillard, John F.
Primary Examiner(s): Opsasnick, Michael N
Application Number: US12/256,710
Publication Number: US 20090271183A1
Time in Patent Office: 1,601 Days
Field of Search: 704/208, 704/211
US Class Current: 704/211
CPC Class Codes: G10L 25/90 Pitch determination of spee...; G10L 25/93 Discriminating between voic...
