Method and electronic device for processing voice data
First Claim
1. A processing method, comprising:
acquiring voice data, the voice data being collected by at least two collecting devices from a voice source that generates a sound, each of the at least two collecting devices including a plurality of microphones forming one or more microphone arrays for performing signal processing locally;
calculating a distance between the voice source and each of the at least two collecting devices based on different timings that the sound reaches the plurality of microphones;
stitching the voice data based on a sequence of timings at which the voice data are collected to generate stitched voice data, the stitched voice data including first voice data and second voice data adjacent to each other in the sequence and being collected by different ones of the at least two collecting devices;
analyzing frequencies of the stitched voice data to determine whether a similarity between a first frequency waveform of the first voice data and a second frequency waveform of the second voice data exceeds a threshold;
determining, in response to determining that the similarity exceeds the threshold, that the stitched voice data includes a first content corresponding to the first frequency waveform and a second content corresponding to the second frequency waveform, the first content and the second content being the same as each other and being collected by different ones of the at least two collecting devices during two time periods that overlap with each other;
selecting, according to the calculated distance, one of the first content and the second content that is collected by one of the at least two collecting devices closer to the voice source as a target content; and
replacing the first content and the second content with the target content to obtain to-be-recognized voice data for recognition;
acquiring a recognition result of the to-be-recognized voice data, the recognition result corresponding to a voice generated by the voice source; and
in response to the recognition result, executing a corresponding command.
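The stitching, similarity, and selection steps recited above can be illustrated with a minimal Python sketch. The claim does not specify how similarity is measured or how segments are represented, so everything here is an assumption for illustration: the `Segment` type, the cosine similarity over magnitude spectra, the default threshold of 0.9, and the `distance` mapping (taken as already computed per device) are all hypothetical, and segments are assumed to have equal length.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass
class Segment:
    device: str     # hypothetical identifier of the collecting device
    start: float    # capture start time, in seconds
    end: float      # capture end time, in seconds
    samples: list   # mono samples as floats (equal length assumed across segments)


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for the first n/2 bins; adequate for a short sketch."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def spectral_similarity(a, b):
    """Cosine similarity between two magnitude spectra (assumed similarity measure)."""
    sa, sb = magnitude_spectrum(a), magnitude_spectrum(b)
    dot = sum(x * y for x, y in zip(sa, sb))
    norm = math.sqrt(sum(x * x for x in sa)) * math.sqrt(sum(y * y for y in sb))
    return dot / norm if norm else 0.0


def stitch_and_dedupe(segments, distance, threshold=0.9):
    """Order segments by capture time, then collapse adjacent duplicates.

    Two adjacent segments count as the same content when they come from
    different devices, their capture periods overlap, and their spectral
    similarity exceeds the threshold. The copy from the closer device is kept.
    """
    result = []
    for seg in sorted(segments, key=lambda s: s.start):
        prev = result[-1] if result else None
        is_duplicate = (
            prev is not None
            and prev.device != seg.device
            and seg.start < prev.end  # capture periods overlap
            and spectral_similarity(prev.samples, seg.samples) > threshold
        )
        if is_duplicate:
            # Replace both copies with the one from the closer device.
            if distance[seg.device] < distance[prev.device]:
                result[-1] = seg
        else:
            result.append(seg)
    return result
```

The returned list plays the role of the to-be-recognized voice data: each overlapping, matching pair is reduced to the single copy captured nearest the voice source before it is handed to a recognizer.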
Abstract
A data processing method includes acquiring voice data collected by at least two collecting devices from a voice source, acquiring a recognition result of the voice data that corresponds to a voice generated by the voice source, and executing a corresponding command in response to the recognition result.
47 Citations
14 Claims
1. A processing method (claim text as recited under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. An electronic device comprising:
a processor communicatively coupled to at least two collecting devices, each of the at least two collecting devices including a plurality of microphones forming one or more microphone arrays for performing signal processing locally, wherein the processor:
acquires voice data collected by the at least two collecting devices from a voice source;
calculates a distance between the voice source and each of the at least two collecting devices based on different timings that the sound reaches the plurality of microphones;
stitches the voice data based on a sequence of timings at which the voice data are collected to generate stitched voice data, the stitched voice data including first voice data and second voice data adjacent to each other in the sequence and being collected by different ones of the at least two collecting devices;
analyzes frequencies of the stitched voice data to determine whether a similarity between a first frequency waveform of the first voice data and a second frequency waveform of the second voice data exceeds a threshold;
determines, in response to determining that the similarity exceeds the threshold, that the stitched voice data includes a first content corresponding to the first frequency waveform and a second content corresponding to the second frequency waveform, the first content and the second content being the same as each other and being collected by different ones of the at least two collecting devices during two time periods that overlap with each other;
selects, according to the calculated distance, one of the first content and the second content that is collected by one of the at least two collecting devices closer to the voice source as a target content;
replaces the first content and the second content with the target content to obtain to-be-recognized voice data for recognition;
acquires a recognition result of the to-be-recognized voice data, the recognition result corresponding to a voice generated by the voice source; and
in response to the recognition result, executes a corresponding command.
Dependent claims: 12, 13, 14.
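Both independent claims base the distance calculation on the different timings at which the sound reaches the microphones of an array. The claims do not say how those timings are compared, so the following Python sketch assumes one common approach: estimate the arrival-time difference between two microphones by brute-force cross-correlation, then convert it to a path-length difference. The function names, the `max_lag` search window, and the speed-of-sound constant are assumptions for illustration. Turning path differences into an absolute source distance would additionally require the array geometry, which is not covered here.

```python
SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 °C (assumed constant)


def estimate_lag(ref, other, max_lag):
    """Return the lag, in samples, at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    def correlation(lag):
        # Sum of products over the indices where both signals are defined.
        return sum(
            ref[i] * other[i + lag]
            for i in range(len(ref))
            if 0 <= i + lag < len(other)
        )

    return max(range(-max_lag, max_lag + 1), key=correlation)


def path_difference(ref, other, sample_rate, max_lag=32):
    """Extra distance, in metres, the sound travelled to reach the second microphone."""
    lag = estimate_lag(ref, other, max_lag)
    return lag / sample_rate * SPEED_OF_SOUND
```

For example, if an impulse appears three samples later on the second microphone at a sample rate of 34,300 Hz, the path difference comes out at about 0.03 m.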
Specification