
SYSTEMS AND METHODS FOR AUTOMATIC CREATION OF SOUNDTRACKS FOR SPEECH AUDIO

  • US 20180032610A1
  • Filed: 07/28/2017
  • Published: 02/01/2018
  • Est. Priority Date: 07/29/2016
  • Status: Active Grant
First Claim

1. A method of automatically generating a digital soundtrack intended for synchronised playback with associated speech audio, the soundtrack comprising one or more audio regions configured for synchronised playback with corresponding speech regions of the speech audio, the method executed by a processing device or devices having associated memory, the method comprising:

  • (a) receiving or retrieving raw text data representing or corresponding to the speech audio into memory;

  • (b) applying natural language processing (NLP) to the raw text data to generate processed text data comprising token data that identifies individual tokens in the raw text, the tokens at least identifying distinct words or word concepts;

  • (c) applying semantic analysis to a series of text segments of the processed text data based on a continuous emotion model defined by a predefined number of emotional category identifiers each representing an emotional category in the model, the semantic analysis being configured to parse the processed text data to generate, for each text segment, a segment emotional data profile based on the continuous emotion model;

  • (d) identifying a series of text or speech regions comprising a text segment or a plurality of adjacent text segments having an emotional association by processing the segment emotional data profiles of the text segments with respect to predefined rules, and generating audio region data defining the intended audio regions corresponding to the identified text regions, each audio region being defined by a start position indicative of the position in the text at which the audio region is to commence playback, a stop position indicative of the position in the text at which the audio region is to cease playback, and a generated region emotional profile based on the segment emotional profiles of the text segments within its associated text region;

  • (e) processing an accessible audio database or databases comprising audio data files and associated audio profile information to select an audio data file for playback in each audio region, the selection at least partly based on the audio profile information of the audio data file corresponding to the region emotional profile of the audio region, and defining audio data for each audio region representing the selected audio data file for playback; and

  • (f) generating soundtrack data representing the created soundtrack for synchronised playback with the speech audio, the soundtrack data comprising data representing the generated audio regions and audio data associated with those audio regions.
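
Steps (b) through (d) of the claim describe a pipeline that tokenizes the raw text, scores each text segment against a continuous emotion model, and merges adjacent segments sharing an emotional association into audio regions. The sketch below is purely illustrative of that kind of processing, not the implementation disclosed in the application; the emotion category identifiers, keyword lexicon, segment length, and merging rule are all hypothetical placeholders.

```python
import re
from dataclasses import dataclass, field

# Hypothetical continuous emotion model: a small set of emotional category
# identifiers, each scored 0.0-1.0 for a text segment (assumption).
EMOTION_CATEGORIES = ("joy", "sadness", "fear", "tension")

# Toy keyword lexicon standing in for real semantic analysis (assumption).
LEXICON = {
    "joy": {"happy", "laughed", "smiled", "bright"},
    "sadness": {"wept", "grief", "alone", "grey"},
    "fear": {"dark", "trembled", "shadow", "scream"},
    "tension": {"suddenly", "ran", "chase", "danger"},
}

@dataclass
class Segment:
    start: int            # token index at which the segment/region begins
    stop: int             # token index at which the segment/region ends (exclusive)
    profile: dict = field(default_factory=dict)  # segment emotional data profile

def tokenize(raw_text):
    """Step (b): crude NLP pass that identifies word tokens (illustrative only)."""
    return re.findall(r"[A-Za-z']+", raw_text.lower())

def profile_segment(tokens):
    """Step (c): score a segment against each emotional category identifier."""
    counts = {c: sum(t in LEXICON[c] for t in tokens) for c in EMOTION_CATEGORIES}
    total = sum(counts.values()) or 1
    return {c: counts[c] / total for c in EMOTION_CATEGORIES}

def dominant(profile):
    return max(profile, key=profile.get)

def build_regions(raw_text, segment_len=20):
    """Steps (c)-(d): profile fixed-length segments, then merge adjacent
    segments whose dominant emotional category matches into one region."""
    tokens = tokenize(raw_text)
    segments = []
    for i in range(0, len(tokens), segment_len):
        seg = Segment(start=i, stop=min(i + segment_len, len(tokens)))
        seg.profile = profile_segment(tokens[seg.start:seg.stop])
        segments.append(seg)

    regions = []
    for seg in segments:
        if regions and dominant(regions[-1].profile) == dominant(seg.profile):
            # Extend the current region and average the profiles (one simple
            # stand-in for the claim's "predefined rules").
            last = regions[-1]
            last.stop = seg.stop
            last.profile = {c: (last.profile[c] + seg.profile[c]) / 2
                            for c in EMOTION_CATEGORIES}
        else:
            regions.append(Segment(seg.start, seg.stop, dict(seg.profile)))
    return regions
```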

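Steps (e) and (f) then match each region's emotional profile against profile metadata attached to audio files and serialize the result as soundtrack data. Continuing the sketch above (regions as produced by build_regions), one plausible matching rule is cosine similarity over the same emotional category identifiers; the catalogue entries, field names, and output schema below are hypothetical and not those of the application.

```python
import json
import math

EMOTION_CATEGORIES = ("joy", "sadness", "fear", "tension")

# Hypothetical audio database: each entry pairs an audio data file with an
# emotional profile expressed over the same category identifiers.
AUDIO_DB = [
    {"file": "warm_strings.ogg",  "profile": {"joy": 0.8, "sadness": 0.1, "fear": 0.0, "tension": 0.1}},
    {"file": "slow_piano.ogg",    "profile": {"joy": 0.1, "sadness": 0.8, "fear": 0.0, "tension": 0.1}},
    {"file": "low_drone.ogg",     "profile": {"joy": 0.0, "sadness": 0.2, "fear": 0.6, "tension": 0.2}},
    {"file": "driving_drums.ogg", "profile": {"joy": 0.1, "sadness": 0.0, "fear": 0.2, "tension": 0.7}},
]

def cosine(a, b):
    """Similarity between two emotional profiles over the shared categories."""
    dot = sum(a[c] * b[c] for c in EMOTION_CATEGORIES)
    na = math.sqrt(sum(a[c] ** 2 for c in EMOTION_CATEGORIES))
    nb = math.sqrt(sum(b[c] ** 2 for c in EMOTION_CATEGORIES))
    return dot / (na * nb) if na and nb else 0.0

def select_audio(region_profile):
    """Step (e): pick the catalogue entry whose profile best matches the region."""
    return max(AUDIO_DB, key=lambda entry: cosine(entry["profile"], region_profile))

def generate_soundtrack(regions):
    """Step (f): emit soundtrack data pairing each audio region with its audio file."""
    soundtrack = []
    for region in regions:
        chosen = select_audio(region.profile)
        soundtrack.append({
            "start_position": region.start,   # text position where playback commences
            "stop_position": region.stop,     # text position where playback ceases
            "audio_file": chosen["file"],
            "region_profile": region.profile,
        })
    return json.dumps(soundtrack, indent=2)
```

As a usage illustration, `generate_soundtrack(build_regions(text))` would return a JSON document listing each audio region's start and stop positions in the text together with the selected audio file, which is one way the claim's "soundtrack data" could be represented.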