System and method for expressive language, developmental disorder, and emotion assessment

US 9,355,651 B2
Filed: 04/29/2014
Issued: 05/31/2016
Est. Priority Date: 09/16/2004
Status: Active Grant

Abstract

In one embodiment, a method for detecting autism in a natural language environment using a microphone, sound recorder, and a computer programmed with software for the specialized purpose of processing recordings captured by the microphone and sound recorder combination, the computer programmed to execute the method, includes segmenting an audio signal captured by the microphone and sound recorder combination using the computer programmed for the specialized purpose into a plurality of recording segments. The method further includes determining which of the plurality of recording segments correspond to a key child. The method further includes determining which of the plurality of recording segments that correspond to the key child are classified as key child recordings. Additionally, the method includes extracting phone-based features of the key child recordings; comparing the phone-based features of the key child recordings to known phone-based features for children; and determining a likelihood of autism based on the comparing.

21 Claims

1. A method comprising:
- capturing an audio recording from a language environment of a key child;
  
  segmenting the audio recording into a plurality of segments using a Minimum Duration Gaussian Mixture Model (MD-GMM) technique, the MD-GMM technique comprising performing a maximum log-likelihood analysis to generate the plurality of segments having a minimum duration constraint;
  
  identifying a segment ID for each of the plurality of segments, the segment ID identifying a source for audio in the segment of the plurality of segments;
  
  identifying a plurality of key child segments from the plurality of segments, each of the plurality of key child segments having the key child as the segment ID;
  
  estimating key child segment characteristics based in part on at least one of the plurality of key child segments, wherein the key child segment characteristics are estimated independent of contents of the plurality of key child segments, wherein the contents are meanings of the plurality of key child segments;
  
  determining at least one metric associated with the language environment using the key child segment characteristics; and
  
  outputting the at least one metric.
- - 2. The method of claim 1, further comprising:
    - identifying a plurality of adult segments from the plurality of segments, each of the plurality of adult segments having an adult as the segment ID; and
      
      estimating adult segment characteristics based in part on at least one of a plurality of adult segments, wherein the adult segment characteristics are estimated independent of contents of the plurality of adult segments, wherein determining the at least one metric associated with the language environment comprises using the adult segment characteristics.
  - 3. The method of claim 2, wherein adult segment characteristics comprise at least one of:
    - a word count;
      
      a duration of speech;
      
      a vocalization count;
      
      or a parentese count.
  - 4. The method of claim 2, wherein the at least one metric comprises at least one of:
    - a quantity of key child vocalizations in a pre-set time period;
      
      a quantity of conversational turns, wherein the conversational turns comprise a sound from one of the adult or the key child, and a response to the sound from the other one of the adult or the key child;
      
      or a quantity of adult words directed to the key child in a pre-set time period.
  - 5. The method of claim 1, further comprising:
    - performing a first segmentation using a first MD-GMM of the MD-GMM technique, the first MD-GMM comprising a plurality of models; and
      
      generating a second MD-GMM of the MD-GMM technique by modifying at least one of the plurality of models, wherein:
      
      segmenting the audio recording into the plurality of segments comprises using the second MD-GMM; and
      
      identifying the segment ID comprises using the second MD-GMM.
  - 6. The method of claim 5, wherein:
    - the plurality of models comprise a key child model, an electronic device model, and an adult model;
      
      the key child model comprises criteria associated with sounds from a child;
      
      the electronic device model comprises criteria associated with sounds from an electronic device; and
      
      the adult model comprises criteria associated with sounds from adults.
  - 7. The method of claim 6, wherein generating the second MD-GMM comprises at least one of:
    - modifying the key child model using an age-dependent key child model, wherein the age-dependent key child model comprises criteria associated with sounds from children of a plurality of ages;
      
      modifying the electronic device model;
      
      modifying at least one of the key child model or the adult model using a loudness/clearness detection model, wherein the loudness/clearness detection model comprises a Likelihood Ratio Test;
      
      or modifying at least one of the key child model or the adult model using a parentese model, wherein the parentese model comprises complexity levels associated with sounds of adults.
  - 8. The method of claim 1, further comprising:
    - classifying each of the plurality of key child segments into one of:
      
      vocalizations;
      
      cries;
      
      vegetative sounds;
      
      or fixed signal sounds, wherein the key child segment characteristics are estimated using at least one of the plurality of key child segments classified into at least one of vocalizations or cries.
  - 9. The method of claim 8, wherein classifying each of the plurality of key child segments comprises using at least one of rule-based analysis or statistical processing.
  - 10. The method of claim 1, wherein the key child segment characteristics comprises at least one of:
    - a duration of cries;
      
      a quantity of squeals;
      
      a quantity of growls;
      
      a presence of canonical syllables;
      
      a quantity of canonical syllables;
      
      a presence of repetitive babbles;
      
      a quantity of repetitive babbles;
      
      a presence of protophones;
      
      a quantity of protophones;
      
      a duration of protophones;
      
      a presence of phoneme-like sounds;
      
      a quantity of phoneme-like sounds;
      
      a duration of phoneme-like sounds;
      
      a presence of phonemes;
      
      a quantity of phonemes;
      
      a duration of phonemes;
      
      a word count;
      
      or a vocalization count.
  - 11. The method of claim 1, wherein:
    - identifying the segment ID for each of the plurality of segments comprises using the MD-GMM technique, the MD-GMM technique further comprising correlating a maximum score for each of the plurality of segments to the source for the audio in the segment based on an association of the source with the maximum score.
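
The segmentation recited in claim 1 (a maximum log-likelihood analysis under a minimum duration constraint) and the maximum-score source assignment of claim 11 can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the patented MD-GMM implementation: it assumes per-frame log-likelihoods under each source model have already been computed (for example by GMMs), and the names `md_segment`, `loglik`, `labels`, and `min_dur` are hypothetical.

```python
from typing import List, Optional, Tuple

NEG_INF = float("-inf")


def md_segment(
    loglik: List[List[float]], labels: List[str], min_dur: int
) -> List[Tuple[int, int, str]]:
    """Split frames into segments of at least min_dur frames so that the
    summed log-likelihood is maximal. loglik[t][k] is the (assumed
    precomputed) log-likelihood of frame t under source model k.
    Returns half-open (start, end, label) segments."""
    n_frames, n_models = len(loglik), len(labels)

    # prefix[k][t] = sum of loglik[0..t-1][k], so any segment score is O(1).
    prefix = [[0.0] * (n_frames + 1) for _ in range(n_models)]
    for k in range(n_models):
        for t in range(n_frames):
            prefix[k][t + 1] = prefix[k][t] + loglik[t][k]

    # best[t] = best total score for frames [0, t); back[t] = (start, model).
    best = [NEG_INF] * (n_frames + 1)
    back: List[Optional[Tuple[int, int]]] = [None] * (n_frames + 1)
    best[0] = 0.0
    for end in range(min_dur, n_frames + 1):
        for start in range(0, end - min_dur + 1):
            if best[start] == NEG_INF:
                continue
            for k in range(n_models):
                score = best[start] + prefix[k][end] - prefix[k][start]
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, k)
    if best[n_frames] == NEG_INF:
        raise ValueError("recording is shorter than min_dur frames")

    # Walk the back-pointers; each segment carries its maximum-score source.
    segments = []
    end = n_frames
    while end > 0:
        start, k = back[end]
        segments.append((start, end, labels[k]))
        end = start
    segments.reverse()

    # Merge neighbours that received the same source label.
    merged: List[Tuple[int, int, str]] = []
    for seg in segments:
        if merged and merged[-1][2] == seg[2]:
            merged[-1] = (merged[-1][0], seg[1], seg[2])
        else:
            merged.append(seg)
    return merged


# Toy scores: frame 2 is a one-frame "adult" blip inside child speech.
child, adult = [-1.0, -5.0], [-5.0, -1.0]
toy = [child, child, adult, child, child] + [adult] * 5
print(md_segment(toy, ["key_child", "adult"], min_dur=3))
# [(0, 5, 'key_child'), (5, 10, 'adult')]
```

With `min_dur=1` the same input yields a separate one-frame adult segment at frame 2, which shows how the duration constraint suppresses implausibly short segments. The quadratic search over start frames is chosen for readability, not efficiency.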

12. A method comprising:
- capturing an audio recording from a language environment of a key child;
  
  segmenting the audio recording into a plurality of segments using a Minimum Duration Gaussian Mixture Model (MD-GMM) technique, the MD-GMM technique comprising performing a maximum log-likelihood analysis to generate the plurality of segments having a minimum duration constraint;
  
  identifying a segment ID for each of the plurality of segments, the segment ID identifying a source for audio in the segment of the plurality of segments, wherein the identifying the segment ID comprises comparing the plurality of segments to a plurality of models, wherein a model of the plurality of models includes a key child model and the identifying the segment ID includes identifying a plurality of key child segments from the plurality of segments;
  
  estimating key child segment characteristics based in part on at least one of the plurality of key child segments, wherein the key child segment characteristics are estimated independent of contents of the plurality of key child segments, wherein the contents are meanings of the plurality of key child segments;
  
  determining at least one metric associated with the language environment using the key child segment characteristics; and
  
  outputting the at least one metric.
- - 13. The method of claim 12, wherein the plurality of models further comprise models for other children, male adults, female adults, noise, and TV noise.
  - 14. The method of claim 12, wherein the plurality of models further comprise:
    - an adult model that includes characteristics of sounds from an adult;
      
      an electronic device model that includes characteristics of sounds from an electronic device;
      
      a noise model that includes characteristics of sounds attributable to noise;
      
      an other child model that includes characteristics of sounds from a child other than the key child;
      
      a parentese model that includes complexity level speech criteria of adult sounds;
      
      an age-dependent key child model that includes characteristics of sounds from children of a plurality of ages; and
      
      a loudness/clearness detection model that includes characteristics of sounds directed to a key child.
  - 15. The method of claim 12, wherein segmenting the audio recording into the plurality of segments using the Minimum Duration Gaussian Mixture Model (MD-GMM) technique comprises segmenting the audio recording into the plurality of segments using a maximum likelihood analysis.
  - 16. The method of claim 12, wherein:
    - identifying the segment ID for each of the plurality of segments comprises using the MD-GMM technique, the MD-GMM technique further comprising correlating a maximum score for each of the plurality of segments to the source for the audio in the segment based on an association of the source with the maximum score.
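
Claims 12 to 16 recite identifying the segment ID by comparing segments to a plurality of models and associating the maximum score with a source. A minimal sketch of that comparison follows, assuming diagonal-covariance Gaussian mixtures over generic acoustic feature vectors. The function names, the model names, and every parameter value are hypothetical toy choices for illustration and are not taken from the patent.

```python
import math
from typing import Dict, List, Sequence, Tuple

# One mixture component: (weight, per-dimension means, per-dimension variances).
Component = Tuple[float, Sequence[float], Sequence[float]]


def gmm_log_likelihood(frame: Sequence[float], gmm: List[Component]) -> float:
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    log_terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x, mu, var in zip(frame, means, variances):
            log_p += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(v - peak) for v in log_terms))


def identify_segment(
    frames: List[Sequence[float]], models: Dict[str, List[Component]]
) -> Tuple[str, Dict[str, float]]:
    """Score a segment against every source model and return the source
    with the maximum summed log-likelihood, plus all scores."""
    scores = {
        name: sum(gmm_log_likelihood(f, gmm) for f in frames)
        for name, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy one-dimensional models (made-up parameters, illustration only).
models = {
    "key_child": [(1.0, [3.0], [1.0])],
    "adult": [(1.0, [1.0], [1.0])],
    "noise": [(0.5, [-2.0], [1.0]), (0.5, [-4.0], [1.0])],
}
segment_id, scores = identify_segment([[2.8], [3.1], [2.5]], models)
print(segment_id)
```

In this toy setup the frames lie closest to the `key_child` mean, so that model receives the highest score. Further models, such as the other-child, electronic device, or parentese models listed in claim 14, would simply be additional dictionary entries.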

17. A method comprising:
- capturing an audio recording from a language environment of a key child;
  
  using a Minimum Duration Gaussian Mixture Model (MD-GMM) technique to simultaneously segment the audio recording and identify a segment ID for each of a plurality of segments segmented from the audio recording, the segment ID identifying a source for audio in the segment of the plurality of segments, wherein the identifying includes comparing the plurality of segments to a plurality of models, the MD-GMM technique comprising generating the plurality of segments having a minimum duration constraint and correlating a maximum score for each of the plurality of segments to the source for the audio in the segment based on an association of the source with the maximum score;
  
  determining at least one metric associated with the language environment based on the plurality of segments that have been identified; and
  
  outputting the at least one metric.
- - 18. The method of claim 17, wherein the MD-GMM technique comprises a maximum likelihood analysis.
  - 19. The method of claim 17, wherein:
    - a model of the plurality of models includes a key child model; and
      
      using the Minimum Duration Gaussian Mixture Model (MD-GMM) technique comprises identifying a plurality of key child segments from the plurality of segments.
  - 20. The method of claim 19, wherein the determining the at least one metric comprises:
    - using key child segment characteristics to determine the at least one metric.
  - 21. The method of claim 20, further comprising:
    - estimating the key child segment characteristics based in part on at least one of the plurality of key child segments, wherein:
      
      the key child segment characteristics are estimated independent of contents of the plurality of key child segments; and
      
      the contents are meanings of the plurality of key child segments.
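
The metrics named in the claims, such as a quantity of conversational turns (claim 4) or a vocalization count and duration (claim 10), depend only on segment labels and timing, not on the meaning of what was said. The sketch below shows one way such metrics could be derived from labeled segments. It rests on assumptions that are not in the claims: the 5-second response window `max_gap_s`, the rule that each response closes exactly one turn, and the label strings are all hypothetical.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, source label)

SPEAKERS = ("adult", "key_child")


def count_conversational_turns(segments: List[Segment], max_gap_s: float = 5.0) -> int:
    """Count sound/response pairs between the adult and the key child.
    A response must come from the other speaker and start within max_gap_s
    of the end of the initiating sound (an assumed threshold)."""
    turns = 0
    pending: Optional[Segment] = None  # last speech segment still unanswered
    for seg in sorted(segments):
        start, _end, label = seg
        if label not in SPEAKERS:
            continue  # noise, TV, etc. neither initiate nor answer
        if (
            pending is not None
            and label != pending[2]
            and start - pending[1] <= max_gap_s
        ):
            turns += 1
            pending = None  # the pair is complete; wait for a new initiation
        else:
            pending = seg
    return turns


def key_child_vocalization_stats(segments: List[Segment]) -> Tuple[int, float]:
    """Return (count, total duration in seconds) of key child segments."""
    child = [s for s in segments if s[2] == "key_child"]
    return len(child), sum(end - start for start, end, _ in child)


segments = [
    (0.0, 1.0, "adult"),
    (1.5, 2.0, "key_child"),   # answers the adult: turn 1
    (2.0, 4.0, "tv"),
    (4.5, 5.0, "key_child"),
    (5.2, 6.0, "adult"),       # answers the child: turn 2
    (20.0, 21.0, "key_child"), # no response follows
]
print(count_conversational_turns(segments))    # 2
print(key_child_vocalization_stats(segments))  # (3, 2.0)
```

Neither function inspects words or phonetic content, which mirrors the claim language that the characteristics are estimated independent of the meanings of the segments.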

Current Assignee: LENA Foundation
Original Assignee: LENA Foundation
Inventors: Xu, Dongxin D.; Paul, Terrance D.
Primary Examiner(s): KOVACEK, DAVID M
Application Number: US14/265,188
Publication Number: US 20140255887A1
Time in Patent Office: 763 Days
Field of Search: 704/1-10, 704/231-246, 704/247-251, 704/270-271, 704/E17.001-E17.016, 704/E15.001-E15.05, 704/E11.001-E11.007
US Class Current: 1/1
CPC Class Codes

A61B 2503/04   Babies, e.g. for SIDS detec...

A61B 2503/06   Children, e.g. for attentio...

A61B 5/168   Evaluating attention defici...

A61B 5/4803   Speech analysis specially a...

A61B 5/7264   Classification of physiolog...

A61B 5/7267   involving training the clas...

G10L 15/04   Segmentation; Word boundary...

G10L 15/06   Creation of reference templ...

G10L 15/14   using statistical models, e...

G10L 17/00   Speaker identification or v...

G10L 25/03   characterised by the type o...

G10L 25/66   for extracting parameters r...

G16H 50/20   for computer-aided diagnosi...
