Selective sampling for sound signal classification

US 20050043957A1
Filed: 08/21/2003
Published: 02/24/2005
Est. Priority Date: 08/21/2003
Status: Active Grant

Abstract

A system and method of selective sampling for sound signal classification is disclosed. The method of the present invention discloses the elements of: receiving a sound signal; specifying meta-data to be extracted from the sound signal; dividing the sound signal into a set of frames; applying a fitness function to the frames to create a set of fitness data; selecting a frame from the set of frames, if the frame's corresponding fitness datum within the set of fitness data exceeds a predetermined threshold value; extracting the meta-data from the selected frames; and classifying the sound signal based on the meta-data extracted from the selected frames. The system of the present invention discloses means for implementing the method.
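
The steps listed in the abstract can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the names `split_into_frames`, `energy`, and `classify_signal` are hypothetical, mean squared amplitude is only an assumed fitness function, the per-frame labeller is supplied by the caller, and the final majority vote is just one possible decision rule.

```python
from collections import Counter


def split_into_frames(signal, frame_len):
    """Divide the signal into consecutive equal-length frames.

    A trailing partial frame, if any, is dropped.
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def energy(frame):
    """Assumed fitness function: mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def classify_signal(signal, frame_len, threshold, label_frame, fitness=energy):
    """Classify a signal using only frames whose fitness exceeds the threshold.

    label_frame is a caller-supplied stand-in for the meta-data extractor;
    it maps one frame to a class label.
    """
    frames = split_into_frames(signal, frame_len)
    fitness_data = [fitness(f) for f in frames]

    # Keep a frame only if its own fitness datum exceeds the threshold.
    selected = [f for f, score in zip(frames, fitness_data) if score > threshold]
    if not selected:
        return None  # nothing passed the threshold, so no decision is made

    # Extract meta-data from the selected frames, then take a majority vote.
    labels = [label_frame(f) for f in selected]
    return Counter(labels).most_common(1)[0][0]
```

Returning `None` when no frame passes the threshold is a design choice of this sketch; the source does not say how that case is handled.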

19 Claims

1. A method for sound signal classification, comprising:
- receiving a sound signal;
  
  specifying meta-data to be extracted from the sound signal;
  
  dividing the sound signal into a set of frames;
  
  applying a fitness function to the frames to create a set of fitness data;
  
  selecting a frame from the set of frames, if the frame's corresponding fitness datum within the set of fitness data exceeds a predetermined threshold value;
  
  extracting the meta-data from the selected frames; and
  
  classifying the sound signal based on the meta-data extracted from the selected frames.
- - 2. The method of claim 1:
    - wherein the sound signal is a speech signal.
  - 3. The method of claim 1 wherein specifying includes:
    - specifying age range meta-data.
  - 4. The method of claim 1 wherein specifying includes:
    - specifying gender meta-data.
  - 5. The method of claim 4 wherein selecting includes:
    - setting the threshold so that a ratio of frames selected to frames not selected is between about 1:2 and about 1:3.
  - 6. The method of claim 1 wherein specifying includes:
    - specifying accent meta-data.
  - 8. The method of claim 1 wherein specifying includes:
    - specifying identity meta-data.
  - 9. The method of claim 1 wherein dividing includes:
    - dividing the sound signal into a set of time frames.
  - 10. The method of claim 1 wherein dividing includes:
    - dividing the sound signal into a set of equal length time frames.
  - 11. The method of claim 1 wherein applying includes:
    - calculating a signal strength of the sound signal frame.
  - 12. The method of claim 1 wherein selecting includes:
    - selecting a frame for meta-data extraction, if the frame's fitness datum exceeds a greatest fitness datum within the set of fitness data by a predetermined margin.
  - 13. The method of claim 1 wherein extracting includes:
    - extracting the meta-data from the selected frames using a Multi-Layer Perceptron (MLP) neural network.
  - 14. The method of claim 13 wherein extracting includes:
    - extracting the meta-data from the selected frames using a MLP neural network having an input layer with nodes corresponding to the sound signal's Mel-Cepstral components.
  - 15. The method of claim 1 further wherein classifying includes:
    - assigning the sound signal to that meta-data class to which a largest number of the selected frames have been assigned.
  - 17. The method of claim 1 further wherein classifying includes:
    - assigning the sound signal to that meta-data class having a statistically longest run-length.
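
Claims 5 and 11 together describe scoring each frame by signal strength and setting the threshold so that a chosen ratio of frames is selected. The sketch below shows one way this could be done. It is only an illustration: the function names are hypothetical, root-mean-square amplitude is an assumed measure of signal strength, and placing the threshold midway between two ranked scores is a choice made here, not something the claims specify.

```python
import math


def signal_strength(frame):
    """Assumed fitness measure: root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def threshold_for_ratio(fitness_data, selected=1, not_selected=2):
    """Return a threshold giving roughly selected:not_selected frames.

    With ratio s:u over n frames, the number of frames to keep is
    k = n * s / (s + u), rounded to an integer and kept at 1 or more.
    """
    n = len(fitness_data)
    k = max(1, round(n * selected / (selected + not_selected)))
    if k >= n:
        return float("-inf")  # every frame is selected

    ranked = sorted(fitness_data, reverse=True)
    # Midpoint between the k-th and (k+1)-th highest scores: exactly k
    # scores exceed it when those two scores differ. Tied scores can
    # shift the count.
    return (ranked[k - 1] + ranked[k]) / 2
```

For example, with twelve distinct scores a 1:2 ratio keeps four frames and a 1:3 ratio keeps three.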

7. The method of claim 1 wherein specifying includes:
- specifying dialect meta-data.

16. The method of claim 1 further wherein classifying includes:
- adding together each of the selected frame's confidence scores for each meta-data class; and
  
  assigning the sound signal to that meta-data class having a highest total confidence score.
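
Claims 15, 16, and 17 name three different ways of turning per-frame results into one decision for the whole signal. The sketch below places them side by side. The function names and the data shapes (a list of labels, or a list of per-frame dictionaries mapping class to confidence) are assumptions made for illustration, and "statistically longest run-length" is read here simply as the longest unbroken run of identical frame labels, which may be narrower than what the specification intends.

```python
from collections import Counter, defaultdict
from itertools import groupby


def majority_vote(labels):
    """Claim 15 style: the class assigned to the largest number of frames."""
    return Counter(labels).most_common(1)[0][0]


def highest_total_confidence(frame_scores):
    """Claim 16 style: sum each class's confidence over frames, take the max."""
    totals = defaultdict(float)
    for scores in frame_scores:
        for cls, confidence in scores.items():
            totals[cls] += confidence
    return max(totals, key=totals.get)


def longest_run(labels):
    """Claim 17 style (assumed reading): class with the longest unbroken run.

    When two runs tie in length, the earlier run wins.
    """
    best_label, best_len = None, 0
    for label, group in groupby(labels):
        run_len = sum(1 for _ in group)
        if run_len > best_len:
            best_label, best_len = label, run_len
    return best_label
```

The rules can disagree: a class that wins only a few frames with high confidence may take the highest total even when it loses the majority vote.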

18. A method for sound signal classification, comprising:
- receiving a speech signal;
  
  specifying meta-data to be extracted from the sound signal;
  
  dividing the sound signal into a set of equal length time frames;
  
  applying a fitness function to the frames to create a set of fitness data;
  
  selecting a frame for meta-data extraction, if the frame's fitness datum exceeds a greatest fitness datum within the set of fitness data by a predetermined margin;
  
  extracting the meta-data from the selected frames using a Multi-Layer Perceptron (MLP) neural network;
  
  adding together each of the selected frame's confidence scores for each meta-data class; and
  
  assigning the sound signal to that meta-data class having a highest total confidence score.
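
The selection step in claims 12 and 18 compares each frame against the greatest fitness datum rather than against a fixed threshold. Read literally, no frame can exceed the greatest datum, so the sketch below assumes the intended meaning is that a frame qualifies when its fitness lies within the predetermined margin of the greatest value. That reading, and the name `select_near_best`, are assumptions of this example and not statements about how the claim should be construed.

```python
def select_near_best(fitness_data, margin):
    """Return indices of frames whose fitness is within `margin` of the best.

    Assumed reading of the claim language: a frame is selected when
    fitness >= max(fitness_data) - margin.
    """
    best = max(fitness_data)
    cutoff = best - margin
    return [i for i, score in enumerate(fitness_data) if score >= cutoff]
```

Because the cutoff moves with the best frame, this rule adapts to overall signal level, whereas a fixed threshold does not.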

19. A system for sound signal classification comprising a:
- means for receiving a sound signal;
  
  means for specifying meta-data to be extracted from the sound signal;
  
  means for dividing the sound signal into a set of frames;
  
  means for applying a fitness function to the frames to create a set of fitness data;
  
  means for selecting a frame from the set of frames, if the frame's corresponding fitness datum within the set of fitness data exceeds a predetermined threshold value;
  
  means for extracting the meta-data from the selected frames; and
  
  means for classifying the sound signal based on the meta-data extracted from the selected frames.

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Inventors: Lin, Xiaofan

Granted Patent: US 7,340,398 B2
US Class Current: 704/277
CPC Class Codes: G10L 17/26 Recognition of special voic...