MUSIC INFORMATION RETRIEVAL USING A 3D SEARCH ALGORITHM
Abstract
The present invention generally relates to the field of content-based music information retrieval systems, in particular to a method and a query-by-humming (QbH) database system (100′) for processing queries in the form of analog audio sequences which encompass recorded parts of sung, hummed or whistled tunes (102), recorded parts of a melody (300a) played on a musical instrument and/or a speaker's recorded voice (400) articulating at least one part of a song's lyrics to retrieve textual background information about a musical piece whose score is stored in an integrated database (103, 105) of said system after having analyzed and recognized said melody (300a).
According to one embodiment of the present invention, said method is characterized by the steps of recording (S1) said analog audio sequences (102, 300a, 400), extracting (S4a) and analyzing (S4b) various acoustic-phonetic speech characteristics of the speaker's voice and pronunciation from spoken parts (400) of a recorded song's lyrics (102″) and recognizing (S4c) syntax and semantics of said lyrics (102″). The method further comprises the steps of extracting (S2a), analyzing (S2b) and recognizing (S2c) musical key characteristics from the analog audio sequences (102, 300a, 400), which are given by the semitone numbers of the particular notes, the intervals and/or interval directions of the melody and the time values of the notes and pauses the rhythm of said melody is composed of, the key, beat, tempo, volume, agogics, dynamics, phrasing, articulation, timbre and instrumentation of said melody, the harmonies of accompaniment chords and/or electronic sound effects generated by said musical instrument. The invention is characterized by the step of calculating (S3a) a similarity measure indicating the similarity of melody and lyrics of the recorded audio sequence (102, 300a) compared to melody and lyrics of various music files stored in said database (103, 105) by performing a Viterbi search algorithm on a three-dimensional search space, said search space having a first dimension (t) for the time, a second dimension (S) for an appropriate coding of the acoustic-phonetic speech characteristics and a third dimension (H) for an appropriate coding of the musical key characteristics, and generating (S3b) a ranked list (107) of said music files.
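The search over the (t, S, H) space described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `viterbi_3d_cost` and `rank_songs`, the unit mismatch cost, the syllable and interval-direction codings, and the transition rule (at each time frame the path either stays or advances by one step along S and along H) are all assumptions chosen for illustration.

```python
from itertools import product

INF = float("inf")


def viterbi_3d_cost(query, lyric_states, melody_states):
    """Cost of the cheapest path through a (t, S, H) grid (hypothetical helper).

    query         -- one (speech_symbol, music_symbol) pair per time frame t
    lyric_states  -- reference speech coding of a stored song (S axis)
    melody_states -- reference music coding of a stored song (H axis)
    """
    S, H = len(lyric_states), len(melody_states)
    if not query or S == 0 or H == 0:
        return INF

    def local(obs, s, h):
        # Assumed local cost: one unit per mismatching dimension.
        speech, music = obs
        return (speech != lyric_states[s]) + (music != melody_states[h])

    # prev[s][h]: best cost of any path ending in state (s, h) at the last frame.
    prev = [[INF] * H for _ in range(S)]
    prev[0][0] = local(query[0], 0, 0)  # assumed: paths start at the song's beginning

    for obs in query[1:]:
        cur = [[INF] * H for _ in range(S)]
        for s, h in product(range(S), range(H)):
            # Predecessors: stay or advance by one along S and along H.
            best = min(
                prev[ps][ph]
                for ps in (s, s - 1) if ps >= 0
                for ph in (h, h - 1) if ph >= 0
            )
            if best < INF:
                cur[s][h] = best + local(obs, s, h)
        prev = cur

    # The query may cover only part of a song, so any end state is accepted.
    return min(min(row) for row in prev)


def rank_songs(query, database):
    """Return song titles ordered from best to worst match."""
    scored = [
        (viterbi_3d_cost(query, song["lyrics"], song["melody"]), title)
        for title, song in database.items()
    ]
    return [title for _, title in sorted(scored)]


# Toy data: lyrics coded as syllables, melody coded as interval directions
# ("*" = first note, "U" = up, "D" = down, "R" = repeat). Purely illustrative.
database = {
    "Song A": {"lyrics": ["twin", "kle", "lit", "tle", "star"],
               "melody": ["*", "R", "U", "R", "U"]},
    "Song B": {"lyrics": ["hap", "py", "birth", "day"],
               "melody": ["*", "R", "U", "D"]},
}
query = [("twin", "*"), ("kle", "R"), ("lit", "U"), ("tle", "R")]
ranking = rank_songs(query, database)
```

Keeping only the previous time slice in memory reduces storage from T·S·H to S·H cells per song. A real system would presumably replace the unit mismatch cost with acoustic and melodic likelihoods.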
10 Claims
1. A method for the retrieval of music information based on audio input (102, 300a), the method comprising the following steps:
- pre-storing (S11a) a defined set of music sequences with associated information, entering (S11b) speech (400) and/or music information (102, 300a) and arranging (S11c) a coding representing said speech and music information, respectively, as a first (S) and a second dimension (H) of a three-dimensional search space, time (t) being the third dimension, and carrying out (S11d) a search in the three-dimensional search space in order to find the music sequence out of the set of music sequences matching best to the entered speech (400) and/or music information (102, 300a).
Dependent claims: 2, 3, 4, 5, 6, 10.
7. A music information retrieval system based on audio input (102, 300a), said system comprising:
- a database (103, 105) for prestoring (S11a) a defined set of music sequences with associated information, means (101) for entering (S11b) speech (400) and/or music information (102, 300a), coding means (100′, 104″) for arranging (S11c) a coding representing said speech and music information respectively as a first (S) and a second dimension (H) of a three-dimensional search space, time (t) being the third dimension, characterized by matching means (106) for carrying out (S11d) a search in the three-dimensional search space in order to find the music sequence out of the set of music sequences matching best to the entered speech (400) and/or music information (102, 300a).
Dependent claims: 8, 9.
Specification