Automatic conversion of speech into song, rap or other audible expression having target meter or rhythm
Abstract
Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
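As a rough illustration of the automated segmentation mentioned above, the following minimal sketch marks an onset wherever short-time energy rises across a threshold and then cuts the sample sequence at those onsets. The energy-threshold detector, the function names, and the `frame` and `threshold` parameters are assumptions chosen for illustration; the text here does not specify which onset detection technique is used.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame]) / frame
        for i in range(0, len(samples) - frame + 1, frame)
    ]


def detect_onsets(samples: List[float], frame: int = 4,
                  threshold: float = 0.1) -> List[int]:
    """Sample indices of frames where energy rises across the threshold.

    Hypothetical detector: a simple energy gate, not a spectral method.
    """
    onsets: List[int] = []
    prev = 0.0
    for k, energy in enumerate(frame_energies(samples, frame)):
        if energy >= threshold and prev < threshold:
            onsets.append(k * frame)
        prev = energy
    return onsets


def segment_by_onsets(samples: List[float],
                      onsets: List[int]) -> List[Tuple[int, int]]:
    """Half-open (start, end) sample ranges, each beginning at an onset.

    Samples before the first onset are treated as leading silence and dropped.
    """
    bounds = list(onsets) + [len(samples)]
    return [(bounds[i], bounds[i + 1]) for i in range(len(onsets))]


# Two bursts of sound separated by silence.
samples = [0.0] * 8 + [1.0] * 8 + [0.0] * 8 + [0.5] * 8
onsets = detect_onsets(samples, frame=4, threshold=0.1)
segments = segment_by_onsets(samples, onsets)
```

Each resulting range is a successive run of samples delimited by onsets; a practical detector would likely work on spectral features rather than raw energy.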
28 Claims
1. A computational method for transforming an input audio encoding of speech into an output that is rhythmically consistent with a target song, the method comprising:
segmenting the input audio encoding of the speech into plural segments, the segments corresponding to successive sequences of samples of the audio encoding and delimited by onsets identified therein;
mapping individual ones of the plural segments to respective sub-phrase portions of a phrase template for the target song, the mapping establishing one or more phrase candidates;
temporally aligning at least one of the phrase candidates with a rhythmic skeleton for the target song; and
preparing a resultant audio encoding of the speech in correspondence with the temporally aligned phrase candidate-mapped from onset-delimited segments of the input audio encoding.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. An apparatus comprising:
a portable computing device; and
machine readable code embodied in a non-transitory medium and executable on the portable computing device to transform an input audio encoding of speech into an output that is rhythmically consistent with a target song, the machine readable code including instructions executable to segment the input audio encoding of the speech into plural segments, the segments corresponding to successive sequences of samples of the audio encoding and delimited by onsets identified therein;
the machine readable code further executable to map individual ones of the plural segments to respective sub-phrase portions of a phrase template for the target song, the mapping establishing one or more phrase candidates;
the machine readable code further executable to temporally align at least one of the phrase candidates with a rhythmic skeleton for the target song; and
the machine readable code further executable to prepare a resultant audio encoding of the speech in correspondence with the temporally aligned phrase candidate-mapped from onset-delimited segments of the input audio encoding.
Dependent claims: 24, 25
26. A computer program product encoded in non-transitory media and including instructions executable to transform an input audio encoding of speech into an output that is rhythmically consistent with a target song, the computer program product encoding and comprising:
instructions executable to segment the input audio encoding of the speech into plural segments, the segments corresponding to successive sequences of samples of the audio encoding and delimited by onsets identified therein;
instructions executable to map individual ones of the plural segments to respective sub-phrase portions of a phrase template for the target song, the mapping establishing one or more phrase candidates;
instructions executable to temporally align at least one of the phrase candidates with a rhythmic skeleton for the target song; and
instructions executable to prepare a resultant audio encoding of the speech in correspondence with the temporally aligned phrase candidate-mapped from onset-delimited segments of the input audio encoding.
Dependent claims: 27, 28
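The mapping and temporal alignment steps recited in the independent claims can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the phrase template is reduced to a count of sub-phrase portions, the rhythmic skeleton to a list of pulse times in seconds, and the names `map_to_template` and `align_to_skeleton` are hypothetical. Only one phrase candidate is produced, and the time stretching needed to prepare the resultant audio is left out.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s) of an onset-delimited segment
Placement = Tuple[Segment, float, float]  # (segment, output start, stretch ratio)


def map_to_template(segments: List[Segment],
                    n_portions: int) -> List[List[Segment]]:
    """Assign consecutive segments to sub-phrase portions as evenly as possible.

    Assumption: an even split stands in for whatever scoring a real
    system would use to choose among several phrase candidates.
    """
    base, extra = divmod(len(segments), n_portions)
    candidate: List[List[Segment]] = []
    i = 0
    for p in range(n_portions):
        size = base + (1 if p < extra else 0)
        candidate.append(segments[i:i + size])
        i += size
    return candidate


def align_to_skeleton(candidate: List[List[Segment]],
                      skeleton: List[float]) -> List[Placement]:
    """Fit each portion into the slot between consecutive skeleton pulses.

    Expects len(skeleton) >= len(candidate) + 1. Segments in a portion are
    laid end to end from the slot start and share one stretch ratio.
    """
    placements: List[Placement] = []
    for p, portion in enumerate(candidate):
        slot_start, slot_end = skeleton[p], skeleton[p + 1]
        total = sum(end - start for start, end in portion)
        if total <= 0:
            continue  # empty portion: leave the slot silent
        ratio = (slot_end - slot_start) / total
        t = slot_start
        for start, end in portion:
            placements.append(((start, end), t, ratio))
            t += (end - start) * ratio
    return placements


segments = [(0.0, 0.5), (0.5, 1.0), (1.0, 2.0)]
candidate = map_to_template(segments, n_portions=2)
placements = align_to_skeleton(candidate, skeleton=[0.0, 2.0, 3.0])
```

Here the first two segments share a two-second slot and are stretched by a factor of 2, while the third fills a one-second slot unchanged. A later stage would apply each ratio with a time-stretching algorithm to render the output audio.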
Specification