Speech end-pointer
Abstract
A rule-based end-pointer isolates spoken utterances contained within an audio stream from background noise and non-speech transients. The rule-based end-pointer includes a plurality of rules to determine the beginning and/or end of a spoken utterance based on various speech characteristics. The rules may analyze an audio stream or a portion of an audio stream based upon an event, a combination of events, the duration of an event, or a duration relative to an event. The rules may be manually or dynamically customized depending upon factors that may include characteristics of the audio stream itself, an expected response contained within the audio stream, or environmental conditions.
17 Claims
1. A system for determining at least one of a beginning or an end of a speech segment, the system comprising:
- a computer processing unit configured to access a memory to determine at least one of the beginning or the end of the speech segment, where the memory comprises,
- a voice triggering module executable on the computer processing unit to identify a triggering characteristic in a speech segment of an audio stream; and
- a rule module executable on the computer processing unit and in communication with the voice triggering module, the rule module comprising a first rule that counts a number of isolated energy events preceding the triggering characteristic, and a second rule that determines that a frame of the audio stream that precedes the triggering characteristic is outside of the beginning or the end of the speech segment when a number of allowed isolated energy events in the audio stream preceding the trigger characteristic is exceeded.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method of determining at least one of a beginning or end of an audio speech segment, the method comprising:
- receiving a portion of an audio stream that includes a speech segment;
- identifying a triggering characteristic in the speech segment;
- applying at least one decision rule to the speech segment of the audio stream to count a number of isolated energy events in the audio stream that precede the triggering characteristic; and
- determining that a frame of the audio stream is outside of an endpoint of the speech segment when a number of allowed isolated energy events is exceeded.
Dependent claims: 9, 10, 11, 12, 13, 14
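The counting rule in claim 8 can be sketched as a backward walk from the triggering frame. This is only an illustration under stated assumptions, not the patented implementation: the function name `find_segment_start`, the per-frame boolean input `energetic`, the parameter `max_isolated_events`, and the definition of an isolated energy event as a run of energetic frames bounded by quiet frames are all hypothetical choices that the claim does not specify.

```python
from typing import List


def find_segment_start(energetic: List[bool], trigger: int,
                       max_isolated_events: int = 1) -> int:
    """Return the index of the first frame kept inside the speech segment.

    energetic[i] is True when frame i carries energy above some threshold
    (how that threshold is chosen is outside this sketch).
    trigger is the index of a frame showing the triggering characteristic.
    """
    i = trigger
    # Energy contiguous with the trigger is treated as part of the trigger,
    # not as an isolated event (an assumption of this sketch).
    while i > 0 and energetic[i - 1]:
        i -= 1
    start = i   # earliest frame accepted so far
    events = 0  # isolated energy events counted before the trigger
    while i > 0:
        # Skip the quiet gap that precedes the current position.
        while i > 0 and not energetic[i - 1]:
            i -= 1
        if i == 0:
            break
        # An energetic run bounded by quiet frames: one isolated event.
        events += 1
        if events > max_isolated_events:
            # Allowed count exceeded: this event and every earlier frame
            # are treated as outside the segment.
            break
        while i > 0 and energetic[i - 1]:
            i -= 1
        start = i
    return start
```

With one allowed event, a stream whose frames are quiet, event, quiet, quiet, event, quiet, then three triggering frames keeps the nearer event and rejects the earlier one, so the segment would start at the nearer event.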
15. A system for determining at least one of a beginning or an end of an audio speech segment in an audio stream, the system comprising:
- a computer processing unit configured to access a memory to determine at least one of the beginning or the end of the audio speech segment in the audio stream, where the memory comprises,
- a voice triggering module executable on the computer processing unit to identify a portion of the audio stream comprising a periodic audio signal; and
- an end-pointer module executable on the computer processing unit and in communication with the voice triggering module, the end-pointer module configured to vary an amount of the audio stream input to a recognition device based on a plurality of rules, where the end-pointer module is further configured to determine whether one or more portions of the audio stream before or after the portion of the audio stream comprising the periodic audio signal contain speech by applying a rule that counts a number of isolated energy events in the audio stream and upon determination that more than a predetermined number of isolated energy events after the portion of the audio stream comprising the periodic audio signal occurred identifies a frame immediately preceding a last isolated energy event as the end of the audio speech segment, to exclude, from the audio speech segment input to the recognition device, a portion of the audio stream that contains one or more isolated energy events.
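The end-of-segment rule in claim 15 can be sketched as a forward walk from the periodic portion. As before, this is a hedged illustration rather than the claimed system: `find_segment_end`, the boolean `energetic` input, `periodic_end`, and `max_isolated_events` are hypothetical names, and returning the last frame of the buffer when the limit is never exceeded is an assumption the claim does not address.

```python
from typing import List


def find_segment_end(energetic: List[bool], periodic_end: int,
                     max_isolated_events: int = 1) -> int:
    """Return the index of the last frame kept inside the speech segment.

    energetic[i] is True when frame i carries energy above some threshold.
    periodic_end is the index of the last frame of the periodic portion.
    """
    n = len(energetic)
    i = periodic_end + 1
    # Energy contiguous with the periodic portion is treated as part of it
    # (an assumption of this sketch).
    while i < n and energetic[i]:
        i += 1
    events = 0  # isolated energy events counted after the periodic portion
    while i < n:
        # Skip the quiet gap before the next energetic run.
        while i < n and not energetic[i]:
            i += 1
        if i == n:
            break
        events += 1
        if events > max_isolated_events:
            # More than the predetermined number occurred: the frame
            # immediately preceding this last event becomes the end.
            return i - 1
        # Consume the allowed event and keep scanning.
        while i < n and energetic[i]:
            i += 1
    # Assumption: when the limit is never exceeded, keep the whole buffer.
    return n - 1
```

Lowering `max_isolated_events` trims more aggressively, which narrows the amount of audio passed to the recognition device; raising it tolerates more trailing transients.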
16. A non-transitory computer readable medium having stored therein data representing instructions executable by a programmed processor for determining at least one of a beginning or end of an audio speech segment, the non-transitory computer readable medium comprising instructions operative for:
- converting sound waves associated with an audio speech segment into electrical signals;
- analyzing the electrical signals to identify a periodic portion of the audio speech segment;
- analyzing the electrical signals to identify isolated energy events in the audio speech segment;
- counting a number of individual isolated energy events in the audio speech segment; and
- setting the end of the audio speech segment, upon determination that more than a predetermined number of individual isolated energy events occurred after the periodic portion of the audio speech segment, to exclude isolated energy events occurring after the predetermined number of isolated energy events.
Dependent claims: 17
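Claim 16 does not say how the periodic portion is identified. One common way to test a frame for periodicity is normalized autocorrelation, sketched below purely as an assumed stand-in: the function `is_periodic`, the lag bounds, and the 0.5 threshold are hypothetical and are not taken from the patent.

```python
import math
from typing import Sequence


def is_periodic(frame: Sequence[float], min_lag: int, max_lag: int,
                threshold: float = 0.5) -> bool:
    """Return True when the frame correlates strongly with a shifted copy.

    min_lag and max_lag bound the candidate periods, in samples.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return False  # a silent frame cannot be periodic
    best = 0.0
    last_lag = min(max_lag, len(frame) - 1)
    for lag in range(min_lag, last_lag + 1):
        # Correlate the frame with itself shifted by `lag` samples.
        corr = sum(frame[k] * frame[k + lag]
                   for k in range(len(frame) - lag))
        best = max(best, corr / energy)
    return best >= threshold


# A 200-sample sine with a 20-sample period versus a single click.
tone = [math.sin(2 * math.pi * k / 20) for k in range(200)]
click = [0.0] * 200
click[50] = 1.0
```

Under these assumptions the tone would be flagged as periodic, while the click, which has energy but no repetition, would be left to the isolated-energy-event counting.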
Specification