System and method for detection and analysis of speech
First Claim
1. A method comprising:
capturing an audio recording from a language environment of a key child, segmenting the audio recording into a plurality of segments;
identifying a segment ID for each of the plurality of segments, the segment ID identifying a source for audio in the segment, wherein segmenting the audio recording into the plurality of segments and identifying the segment ID for each of the plurality of segments comprises using a Minimum Duration Gaussian Mixture Model (MD-GMM), and wherein the segments identified using the MD-GMM are at least a minimum duration D, and any segments with a duration longer than 2*D are broken down into several segments with a duration between D and 2*D;
estimating key child segment characteristics based in part on at least one of the plurality of key child segments, wherein the key child segment characteristics are estimated independent of content of the plurality of key child segments, wherein the content is the meaning of the plurality of key child segments;
determining at least one metric associated with the language environment using the key child segment characteristics; and
outputting the at least one metric.
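The duration constraint in the claim (every segment at least D long, and anything longer than 2*D broken into pieces between D and 2*D) can be illustrated with a minimal sketch. The claim does not say how the split points are chosen; the equal-length split below, the function name `split_long_segment`, and the parameter `min_dur` (standing in for D) are assumptions made only for illustration, and the MD-GMM scoring itself is not modeled.

```python
import math


def split_long_segment(start, end, min_dur):
    """Split [start, end] into pieces whose durations lie in [min_dur, 2 * min_dur].

    Hypothetical helper: equal-length splitting is an assumption of this
    sketch, not something the claim specifies.
    """
    if min_dur <= 0:
        raise ValueError("min_dur must be positive")
    duration = end - start
    if duration < min_dur:
        raise ValueError("segment is shorter than the minimum duration D")
    if duration <= 2 * min_dur:
        # Already within [D, 2*D]; nothing to split.
        return [(start, end)]
    # Fewest pieces such that no piece exceeds 2*D. Because duration > 2*D
    # implies n >= 2, each equal piece is also longer than D.
    n = math.ceil(duration / (2 * min_dur))
    piece = duration / n
    bounds = [start + i * piece for i in range(n)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))
```

For example, a 5-second span with D = 1 second is split into three pieces of about 1.67 seconds each, while a 1.5-second span is returned unchanged.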
2 Assignments
0 Petitions
Abstract
Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult, and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.
93 Citations
26 Claims
1. A method comprising:
capturing an audio recording from a language environment of a key child, segmenting the audio recording into a plurality of segments;
identifying a segment ID for each of the plurality of segments, the segment ID identifying a source for audio in the segment, wherein segmenting the audio recording into the plurality of segments and identifying the segment ID for each of the plurality of segments comprises using a Minimum Duration Gaussian Mixture Model (MD-GMM), and wherein the segments identified using the MD-GMM are at least a minimum duration D, and any segments with a duration longer than 2*D are broken down into several segments with a duration between D and 2*D;
estimating key child segment characteristics based in part on at least one of the plurality of key child segments, wherein the key child segment characteristics are estimated independent of content of the plurality of key child segments, wherein the content is the meaning of the plurality of key child segments;
determining at least one metric associated with the language environment using the key child segment characteristics; and
outputting the at least one metric.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13)
11. A method comprising:
capturing an audio recording from a language environment of a key child;
segmenting the audio recording into a plurality of segments and identifying a segment ID for at least one of the plurality of segments using a Minimum Duration Gaussian Mixture Model (MD-GMM), the segment ID identifying a key child, wherein the segments identified using the MD-GMM are at least a minimum duration D, and any segments with a duration longer than 2*D are broken down into several segments with a duration between D and 2*D;
estimating key child segment characteristics based in part on the at least one of the plurality of segments, wherein the key child segment characteristics are estimated independent of content of the plurality of segments, wherein the content is the meaning of the plurality of key child segments;
determining at least one metric associated with the language environment using the key child segment characteristics; and
outputting the at least one metric,
wherein the key child characteristics comprise a number of vowels and a number of consonants in the at least one of the plurality of segments,
wherein the determining at least one metric associated with the language environment using the key child segment characteristics comprises comparing the number of vowels and number of consonants in the at least one of the plurality of segments to attributes associated with a native language of the key child to determine a total number of words spoken by the key child.
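Claim 11 does not state how the vowel and consonant counts are compared to the native-language attributes. One hypothetical reading, sketched below, treats the attribute as an average number of phones (vowels plus consonants) per word and divides the total phone count by it. The function name `estimate_word_count` and the parameter `phones_per_word` are illustrative assumptions and do not appear in the patent.

```python
def estimate_word_count(num_vowels, num_consonants, phones_per_word):
    """Estimate a word count from phone counts, independent of word meaning.

    Hypothetical sketch: `phones_per_word` stands in for an attribute of the
    key child's native language (assumed here to be the average number of
    vowels plus consonants per word). The claim does not define this formula.
    """
    if phones_per_word <= 0:
        raise ValueError("phones_per_word must be positive")
    if num_vowels < 0 or num_consonants < 0:
        raise ValueError("phone counts must be non-negative")
    total_phones = num_vowels + num_consonants
    # Round to the nearest whole word.
    return round(total_phones / phones_per_word)
```

Under these assumptions, 12 vowels and 18 consonants with an average of 3 phones per word give an estimate of 10 words.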
14. A system comprising:
a recorder adapted to capture audio recordings from a language environment of a key child;
a processor-based device, wherein the recorder provides the audio recordings to the processor-based device, and the processor-based device comprising an application having an audio engine adapted to segment the audio recording into a plurality of segments and identify a segment ID for each of the plurality of segments, wherein at least one of the plurality of segments is associated with a key child segment ID, wherein the audio engine segments the audio recording and identifies a segment ID for each of the plurality of segments using a Minimum Duration Gaussian Mixture Model (MD-GMM), and wherein the segments identified using the MD-GMM are at least a minimum duration D, and any segments with a duration longer than 2*D are broken down into several segments with a duration between D and 2*D, the audio engine being further adapted to:
estimate key child segment characteristics based on the at least one of the plurality of segments, wherein the audio engine estimates key child segment characteristics independent of content of the at least one of the plurality of segments, wherein the content is the meaning of the plurality of key child segments;
determine at least one metric associated with the language environment using the key child segment characteristics; and
output the at least one metric to an output device.
Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26)
Specification