Speech signal similarity
First Claim
1. A method for determining a similarity between a first audio source and a second audio source, the method comprising:
for the first audio source, performing the steps of:
determining, using an analysis module of a computer, a first plurality of segments of the first audio source;
determining, using the analysis module, a first frequency of occurrence for each of a plurality of phoneme sequences in the first audio source;
determining, using the analysis module, a first weighted frequency for each of the plurality of phoneme sequences based on the first frequency of occurrence for the phoneme sequence;
wherein determining the first weighted frequency includes emphasizing phoneme sequences that occur in few segments of the first plurality of segments relative to phoneme sequences that occur in many segments of the first plurality of segments;
for the second audio source, performing the steps of:
determining, using the analysis module, a second plurality of segments of the second audio source;
determining, using the analysis module, a second frequency of occurrence for each of a plurality of phoneme sequences in the second audio source;
determining, using the analysis module, a second weighted frequency for each of the plurality of phoneme sequences based on the second frequency of occurrence for the phoneme sequence;
wherein determining the second weighted frequency includes emphasizing phoneme sequences that occur in few segments of the second plurality of segments relative to phoneme sequences that occur in many segments of the second plurality of segments;
comparing, using a comparison module of a computer, the first weighted frequency for each phoneme sequence with the second weighted frequency for the corresponding phoneme sequence; and
generating, using the comparison module, a similarity score representative of a similarity between the first audio source and the second audio source based on the results of the comparing.
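The steps of claim 1 can be illustrated with a minimal sketch. The claim does not fix a weighting formula or a comparison measure, so the choices here are assumptions made only for illustration: phoneme sequences are taken to be fixed-length n-grams, the weighting is an inverse-segment-frequency factor in the style of TF-IDF, and the comparison is a cosine similarity. Phoneme recognition and segmentation are assumed to have been done already, with each segment given as a list of phoneme labels. The function names `ngrams`, `weighted_frequencies`, and `similarity` are hypothetical.

```python
import math
from collections import Counter


def ngrams(phonemes, n=3):
    """Return every length-n phoneme sequence in a list of phoneme labels."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def weighted_frequencies(segments, n=3):
    """Weight each phoneme sequence's count by how few segments contain it.

    Assumed weighting (not taken from the claim):
        weight = count * log(1 + num_segments / segments_containing_sequence)
    """
    counts = Counter()          # frequency of occurrence over the whole source
    segment_counts = Counter()  # number of segments containing each sequence
    for segment in segments:
        grams = ngrams(segment, n)
        counts.update(grams)
        segment_counts.update(set(grams))
    num_segments = len(segments)
    return {
        gram: count * math.log(1 + num_segments / segment_counts[gram])
        for gram, count in counts.items()
    }


def similarity(weights_a, weights_b):
    """Cosine similarity between two sparse weighted-frequency vectors."""
    dot = sum(w * weights_b.get(gram, 0.0) for gram, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy input: each source is a list of segments, each a list of phoneme labels.
source_a = [["k", "ae", "t", "s"], ["k", "ae", "t"], ["d", "ao", "g"]]
source_b = [["m", "uw", "n"]]
weights_a = weighted_frequencies(source_a, n=2)
weights_b = weighted_frequencies(source_b, n=2)
score = similarity(weights_a, weights_b)
```

With this weighting, a sequence that appears in only one segment contributes more per occurrence than one that appears in most segments, which is the emphasis the "wherein" clauses describe.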
Abstract
A method for determining a similarity between a first audio source and a second audio source includes: for the first audio source, determining a first frequency of occurrence for each of a plurality of phoneme sequences and determining a first weighted frequency for each of the plurality of phoneme sequences based on the first frequency of occurrence for the phoneme sequence; for the second audio source, determining a second frequency of occurrence for each of a plurality of phoneme sequences and determining a second weighted frequency for each of the plurality of phoneme sequences based on the second frequency of occurrence for the phoneme sequence; comparing the first weighted frequency for each phoneme sequence with the second weighted frequency for the corresponding phoneme sequence; and generating a similarity score representative of a similarity between the first audio source and the second audio source based on the results of the comparing.
15 Claims
14. A method for determining a similarity between a first audio source and a second audio source, the method comprising:
generating, using a computer, a phonetic transcript of the first audio source, the phonetic transcript including a list of phonemes occurring in the first audio source;
selecting a plurality of sequences of phonemes from the list of phonemes, each sequence of phonemes being associated with a time interval in the first audio source;
searching, using the computer, the second audio source to identify occurrences of each of the plurality of sequences of phonemes, each identified occurrence being associated with a time interval in the second audio source and a search score;
forming a set of merged sequences of phonemes including merging at least some sequences of phonemes of the plurality of sequences of phonemes with overlapping time intervals;
forming a set of merged occurrences of sequences of phonemes including merging occurrences of sequences of phonemes with overlapping time intervals, including for each merged occurrence, forming an associated score by accumulating the search scores associated with the occurrences and forming an associated time duration by accumulating time durations associated with the occurrences; and
generating, using the computer, a score representative of a similarity between the first audio source and the second audio source, based on one or both of:
the scores associated with the merged set of occurrences of sequences of phonemes and the time durations associated with the merged set of occurrences of sequences of phonemes.
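The merging step of claim 14 can be sketched as follows. This is only an illustration under stated assumptions: the phonetic search is assumed to have already produced occurrences with start time, end time, and search score; "accumulating" is read as summation; and the final ratio of accumulated score to accumulated duration is one hypothetical way to combine the two quantities, since the claim allows either or both to be used. The names `Occurrence`, `merge_occurrences`, and `similarity_score` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Occurrence:
    start: float  # start time in the second audio source, in seconds
    end: float    # end time in the second audio source, in seconds
    score: float  # search score reported for this occurrence


def merge_occurrences(occurrences):
    """Merge occurrences whose time intervals overlap.

    Each merged occurrence accumulates (sums) the search scores and the
    individual durations of the occurrences it absorbs.
    """
    merged = []
    for occ in sorted(occurrences, key=lambda o: o.start):
        duration = occ.end - occ.start
        if merged and occ.start < merged[-1]["end"]:
            # Overlaps the current merged interval: extend and accumulate.
            current = merged[-1]
            current["end"] = max(current["end"], occ.end)
            current["score"] += occ.score
            current["duration"] += duration
        else:
            merged.append({
                "start": occ.start,
                "end": occ.end,
                "score": occ.score,
                "duration": duration,
            })
    return merged


def similarity_score(merged):
    """Assumed combination: accumulated score per unit of accumulated duration."""
    total_duration = sum(m["duration"] for m in merged)
    if total_duration == 0:
        return 0.0
    return sum(m["score"] for m in merged) / total_duration


# Toy input: two overlapping occurrences and one isolated occurrence.
hits = [
    Occurrence(0.0, 1.0, 0.5),
    Occurrence(0.5, 1.5, 0.25),
    Occurrence(3.0, 4.0, 1.0),
]
merged_hits = merge_occurrences(hits)
overall = similarity_score(merged_hits)
```

Sorting by start time first lets a single pass decide whether each occurrence extends the current merged interval or starts a new one.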
Specification