SYSTEMS AND METHODS FOR ALIGNING LYRICS USING A NEURAL NETWORK
First Claim
1. A method, comprising:
- at an electronic device having one or more processors and memory storing instructions for execution by the one or more processors:
receiving audio data for a media item;
generating, from the audio data, a plurality of samples, each sample having a predefined maximum length;
using a neural network trained to predict text probabilities, generating a probability matrix of textual units for a first portion of a first sample of the plurality of samples, wherein the probability matrix includes:
information about the textual units, timing information, and respective probabilities of respective textual units at respective times;
identifying, for the first portion of the first sample, a first sequence of textual units based on the generated probability matrix.
Abstract
An electronic device receives audio data for a media item. The electronic device generates, from the audio data, a plurality of samples, each sample having a predefined maximum length. The electronic device, using a neural network trained to predict textual unit probabilities, generates a probability matrix of textual units for a first portion of a first sample of the plurality of samples. The probability matrix includes information about textual units, timing information, and respective probabilities of respective textual units at respective times. The electronic device identifies, for the first portion of the first sample, a first sequence of textual units based on the generated probability matrix.
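The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the helper names (`split_into_samples`, `greedy_sequence`), the tiny vocabulary, the 20 ms frame length, and the hard-coded probability matrix standing in for the neural network output are all assumptions made for illustration. The decoding rule (take the most probable textual unit per time frame, collapse repeats, drop a "blank" symbol) is one common way to read a sequence out of such a matrix; the source does not say which rule is used.

```python
from typing import List, Sequence, Tuple

# Hypothetical vocabulary of textual units (single characters here).
# Index 0 is an assumed "blank" symbol meaning "no unit at this time".
VOCAB = ["<blank>", "h", "i", " "]

def split_into_samples(audio: Sequence[float], max_len: int) -> List[Sequence[float]]:
    """Cut audio into consecutive samples, none longer than max_len values."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [audio[i:i + max_len] for i in range(0, len(audio), max_len)]

def greedy_sequence(
    prob_matrix: Sequence[Sequence[float]],
    vocab: Sequence[str],
    frame_ms: int,
) -> List[Tuple[str, int]]:
    """Read a sequence of (textual unit, start time in ms) from the matrix.

    Each row of prob_matrix is one time frame; each column is the
    probability of one textual unit at that time.
    """
    sequence = []
    previous = None
    for t, row in enumerate(prob_matrix):
        # Index of the most probable textual unit in this frame.
        best = max(range(len(row)), key=row.__getitem__)
        # Emit only when the unit changes and is not the blank symbol.
        if best != previous and best != 0:
            sequence.append((vocab[best], t * frame_ms))
        previous = best
    return sequence

# Stand-in for the network output on one portion of one sample:
# 4 time frames x 4 textual units (values are made up).
PROBS = [
    [0.10, 0.80, 0.05, 0.05],  # frame 0: "h"
    [0.10, 0.70, 0.10, 0.10],  # frame 1: "h" again (collapsed)
    [0.90, 0.05, 0.03, 0.02],  # frame 2: blank
    [0.10, 0.10, 0.70, 0.10],  # frame 3: "i"
]

print(split_into_samples(list(range(10)), 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(greedy_sequence(PROBS, VOCAB, 20))       # [('h', 0), ('i', 60)]
```

In this sketch the matrix carries the three things the abstract lists: the column index identifies the textual unit, the row index gives the timing, and each cell holds the probability of that unit at that time.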
13 Claims
1. A method, comprising:
at an electronic device having one or more processors and memory storing instructions for execution by the one or more processors:
receiving audio data for a media item;
generating, from the audio data, a plurality of samples, each sample having a predefined maximum length;
using a neural network trained to predict text probabilities, generating a probability matrix of textual units for a first portion of a first sample of the plurality of samples, wherein the probability matrix includes:
information about the textual units, timing information, and respective probabilities of respective textual units at respective times;
identifying, for the first portion of the first sample, a first sequence of textual units based on the generated probability matrix.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A first electronic device comprising:
one or more processors; and
memory storing instructions for execution by the one or more processors, the instructions including instructions for:
receiving audio data for a media item;
generating, from the audio data, a plurality of samples, each sample having a predefined maximum length;
using a neural network trained to predict text probabilities, generating a probability matrix of textual units for a first portion of a first sample of the plurality of samples, wherein the probability matrix includes:
information about the textual units, timing information, and respective probabilities of respective textual units at respective times;
identifying, for the first portion of the first sample, a first sequence of textual units based on the generated probability matrix.
13. A non-transitory computer-readable storage medium storing instructions that, when executed by an electronic device, cause the electronic device to:
receive audio data for a media item;
generate, from the audio data, a plurality of samples, each sample having a predefined maximum length;
using a neural network trained to predict text probabilities, generate a probability matrix of textual units for a first portion of a first sample of the plurality of samples, wherein the probability matrix includes:
information about the textual units, timing information, and respective probabilities of respective textual units at respective times;
identify, for the first portion of the first sample, a first sequence of textual units based on the generated probability matrix.
Specification