Systems and Methods for Estimating Age of a Child Based on Speech

US 20180053514A1
Filed: 08/22/2016
Published: 02/22/2018
Est. Priority Date: 08/22/2016
Status: Active Application

Abstract

There is provided a system comprising: a microphone configured to receive an input speech from an individual; an analog-to-digital (A/D) converter to convert the input speech to digital form and generate a digitized speech; a memory storing an executable code and an age estimation database; and a hardware processor executing the executable code to receive the digitized speech, identify a plurality of boundaries in the digitized speech delineating a plurality of phonemes, extract a plurality of formant-based feature vectors from each phoneme in the digitized speech based on at least one of a formant position, a formant bandwidth, and a formant dispersion, compare the plurality of formant-based feature vectors with age determinant formant-based feature vectors of the age estimation database, determine the age of the individual when the comparison finds a match in the age estimation database, and communicate an age-appropriate response to the individual.
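As an illustration of the boundary step in the abstract, the following is a minimal sketch of how digitized speech might be cut into phoneme segments once boundary positions are known. The function name `split_at_boundaries` and the representation of boundaries as sample indices are assumptions made for this example; the listing does not say how boundaries are detected or stored.

```python
def split_at_boundaries(samples, boundaries):
    """Split a sequence of audio samples into segments.

    `boundaries` holds sample indices where one phoneme ends and the
    next begins (hypothetical representation, not from the patent).
    Indices outside the open interval (0, len(samples)) are ignored.
    """
    cuts = sorted({b for b in boundaries if 0 < b < len(samples)})
    edges = [0] + cuts + [len(samples)]
    # Pair each edge with the next one to form half-open slices.
    return [samples[start:end] for start, end in zip(edges, edges[1:])]
```

Each returned segment could then be passed to a formant analysis step, which this sketch deliberately leaves out.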

20 Claims

1. A system comprising:
- a microphone configured to receive an input speech from an individual;
  
  an analog-to-digital (A/D) converter configured to convert the input speech from an analog form to a digital form and generate a digitized speech;
  
  a memory storing an executable code and an age estimation database including a plurality of age determinant formant-based feature vectors;
  
  a hardware processor executing the executable code to:
  
  receive the digitized speech from the A/D converter;
  
  identify a plurality of boundaries between a plurality of phonemes in the digitized speech;
  
  extract a plurality of formant-based feature vectors from one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries, based on at least one of a formant position, a formant bandwidth, and a formant dispersion, wherein the formant dispersion is a geometric mean of the formant spacing;
  
  compare the plurality of formant-based feature vectors with the age determinant formant-based feature vectors of the age estimation database;
  
  estimate the age of the individual when the comparison finds a match in the age estimation database; and
  
  communicate an age-appropriate response to the individual based on the estimated age of the individual.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 17, 18):
  - 2. The system of claim 1, wherein estimating the age of the individual includes a weighted combination of two or more age determinant formant-based feature vectors of the plurality of age determinant formant-based feature vectors.
  - 3. The system of claim 1, wherein the input speech is one of a predetermined sequence of phonemes and natural speech.
  - 4. The system of claim 1, wherein the age determinant formant-based feature vectors of the age estimation database include a plurality of formant-based feature vectors corresponding to a plurality of most predictive phonemes, wherein each of the plurality of most predictive phonemes corresponds to a different age.
  - 5. The system of claim 1, wherein the digitized speech includes at least one of a silence and a filled pause.
  - 6. The system of claim 1, wherein the input speech includes a plurality of formants where each formant of the plurality of formants is a resonance of a vocal tract of the individual.
  - 7. The system of claim 1, wherein the input speech is one of English and a language that is not English.
  - 8. The system of claim 1, wherein the age of the individual is estimated probabilistically.
  - 17. The system of claim 1, wherein the extracting of the plurality of formant-based feature vectors is based on the formant position, the formant bandwidth, and the formant dispersion.
  - 18. The system of claim 1, wherein prior to the extracting of the plurality of formant-based feature vectors, the hardware processor executes the executable code to identify a segment of one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries, and wherein the extracting extracts the plurality of formant-based feature vectors from the identified segment of the one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries.
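Claim 1 defines the formant dispersion as a geometric mean of the formant spacing. A minimal sketch of that calculation follows, assuming the formant frequencies of a phoneme have already been estimated in Hz and that "spacing" means the difference between adjacent formants. The function name `formant_dispersion` and that reading of "spacing" are assumptions for illustration, not details taken from the claims.

```python
import math

def formant_dispersion(formants_hz):
    """Geometric mean of the spacings between adjacent formants.

    `formants_hz` is a list such as [F1, F2, F3] in ascending order.
    Treating spacing as F(i+1) - F(i) is an assumption of this sketch.
    """
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    spacings = [hi - lo for lo, hi in zip(formants_hz, formants_hz[1:])]
    if any(s <= 0 for s in spacings):
        raise ValueError("formants must be strictly increasing")
    # Geometric mean computed in the log domain to avoid overflow.
    return math.exp(sum(math.log(s) for s in spacings) / len(spacings))
```

For example, formants at 100, 200, and 600 Hz have spacings of 100 and 400 Hz, whose geometric mean is 200 Hz.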

9. A method for use with a system having a microphone, an analog-to-digital (A/D) converter, a memory storing an executable code, and a hardware processor, the method comprising:
- receiving, using the hardware processor, a digitized speech from the A/D converter;
  
  identifying, using the hardware processor, a plurality of boundaries between a plurality of phonemes in the digitized speech;
  
  extracting, using the hardware processor, a plurality of formant-based feature vectors from one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries, based on at least one of a formant position, a formant bandwidth, and a formant dispersion, wherein the formant dispersion is a geometric mean of the formant spacing;
  
  comparing, using the hardware processor, the plurality of formant-based feature vectors with the age determinant formant-based feature vectors of the age estimation database;
  
  estimating, using the hardware processor, the age of the individual when the comparison finds a match in the age estimation database; and
  
  communicating, using the hardware processor, an age-appropriate response to the individual based on the estimated age of the individual.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 19, 20):
  - 10. The method of claim 9, wherein estimating the age of the individual includes a weighted combination of two or more age determinant formant-based feature vectors of the plurality of age determinant formant-based feature vectors.
  - 11. The method of claim 9, wherein the input speech is one of a predetermined sequence of phonemes and natural speech.
  - 12. The method of claim 9, wherein the age determinant formant-based feature vectors of the age estimation database include a plurality of formant-based feature vectors corresponding to a plurality of most predictive phonemes, wherein each of the plurality of most predictive phonemes corresponds to a different age.
  - 13. The method of claim 9, wherein the digitized speech includes at least one of a silence and a filled pause.
  - 14. The method of claim 9, wherein the input speech includes a plurality of formants where each formant of the plurality of formants is a resonance of a vocal tract of the individual.
  - 15. The method of claim 9, wherein the input speech is one of English and a language that is not English.
  - 16. The method of claim 9, wherein the age of the individual is estimated probabilistically.
  - 19. The method of claim 9, wherein the extracting of the plurality of formant-based feature vectors is based on the formant position, the formant bandwidth, and the formant dispersion.
  - 20. The method of claim 9, wherein prior to the extracting of the plurality of formant-based feature vectors, the method further comprises:
    - identifying a segment of one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries;
      
      wherein the extracting extracts the plurality of formant-based feature vectors from the identified segment of the one or more phonemes of the plurality of phonemes delineated by the plurality of boundaries.
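The comparing and estimating steps, together with the weighted combination of claim 10 and the probabilistic estimate of claim 16, can be sketched as follows. This is only one possible reading: the claims do not name a distance measure, a match criterion, or a weighting scheme, so the Euclidean distance, the `max_distance` threshold, the exponential weights, and the database layout below are all assumptions chosen for illustration.

```python
import math

def estimate_age(features, database, max_distance=300.0):
    """Return (expected_age, posterior), or None when nothing matches.

    `database` is a list of (age_years, reference_vector) pairs, a
    hypothetical stand-in for the age estimation database.
    """
    matches = []
    for age, reference in database:
        distance = math.dist(features, reference)  # Euclidean distance
        if distance <= max_distance:  # assumed match criterion
            matches.append((age, distance))
    if not matches:
        return None  # no match, so no age is estimated
    # Closer reference vectors receive larger weights (assumed scheme).
    weights = [math.exp(-d / max_distance) for _, d in matches]
    total = sum(weights)
    posterior = {}
    for (age, _), weight in zip(matches, weights):
        posterior[age] = posterior.get(age, 0.0) + weight / total
    expected_age = sum(age * p for age, p in posterior.items())
    return expected_age, posterior
```

Returning `None` when no reference vector is close enough mirrors the claim language that the age is estimated "when the comparison finds a match"; the posterior dictionary is one simple way to express the estimate probabilistically.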

Current Assignee: Disney Enterprises Incorporated (The Walt Disney Company)
Original Assignee: Disney Enterprises Incorporated (The Walt Disney Company)
Inventors: Singh, Rita; Lehman, Jill Fain
Granted Patent: US 10,269,356 B2
CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/08   Speech classification or se...

G10L 15/10   using distance or distortio...

G10L 17/02   Preprocessing operations, e...

G10L 17/26   Recognition of special voic...

G10L 25/15   the extracted parameters be...

G10L 25/51   for comparison or discrimin...
