SYSTEM AND METHOD FOR IMPROVING SPEAKER SEGMENTATION AND RECOGNITION ACCURACY IN A MEDIA PROCESSING ENVIRONMENT
Abstract
A method is provided and includes estimating an approximate list of potential speakers in a file from one or more applications. The file (e.g., an audio file, video file, or any suitable combination thereof) includes a recording of a plurality of speakers. The method also includes segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and recognizing particular speakers in the file based on the approximate list of potential speakers.
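The three steps in the abstract (estimate a candidate list, segment, recognize) can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the implementation described in the patent. The source names (`calendar`, `conference_roster`), the per-speaker `voiceprints` reference vectors, the two-dimensional frame features, and the nearest-neighbor matching rule are all hypothetical choices made for the example. The point it shows is that both segmentation and recognition are restricted to the approximate list of potential speakers, so a voice model outside that list (here, `Dave`) is never considered.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Segment:
    start: int    # index of the first frame in the segment (inclusive)
    end: int      # index one past the last frame (exclusive)
    speaker: str  # name taken from the approximate speaker list


def estimate_potential_speakers(app_records):
    """Merge the names reported by each application into one ordered,
    de-duplicated list of potential speakers."""
    speakers = []
    for names in app_records.values():
        for name in names:
            if name not in speakers:
                speakers.append(name)
    return speakers


def segment_and_recognize(frames, voiceprints, candidates):
    """Label each frame with the closest candidate speaker, then merge
    consecutive frames with the same label into segments.

    Only candidates that have a (hypothetical) reference voiceprint are
    considered; voiceprints of people outside the list are ignored.
    """
    known = [name for name in candidates if name in voiceprints]
    if not known:
        raise ValueError("no candidate speaker has a reference voiceprint")

    segments = []
    for i, frame in enumerate(frames):
        speaker = min(known, key=lambda name: dist(frame, voiceprints[name]))
        if segments and segments[-1].speaker == speaker:
            segments[-1].end = i + 1  # extend the current segment
        else:
            segments.append(Segment(i, i + 1, speaker))
    return segments


if __name__ == "__main__":
    # Hypothetical application data: two sources that each name attendees.
    app_records = {
        "calendar": ["Alice", "Bob"],
        "conference_roster": ["Bob", "Carol"],
    }
    # Hypothetical reference vectors; "Dave" is not on any application's list.
    voiceprints = {
        "Alice": (0.0, 0.0),
        "Bob": (1.0, 1.0),
        "Carol": (0.0, 1.0),
        "Dave": (0.9, 0.9),
    }
    # Toy per-frame feature vectors standing in for real audio features.
    frames = [(0.1, 0.0), (0.0, 0.1), (0.9, 1.0), (0.95, 0.9), (0.1, 0.9)]

    candidates = estimate_potential_speakers(app_records)
    for seg in segment_and_recognize(frames, voiceprints, candidates):
        print(seg)
```

In this sketch the fourth frame lies closest to `Dave`'s reference vector, but because `Dave` is not on the approximate list the frame is attributed to `Bob`. A production system would use real acoustic features and statistical speaker models in place of the toy vectors.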
20 Claims
1. A method, comprising:
estimating an approximate list of potential speakers in a file from one or more applications, wherein the file includes a recording of a plurality of speakers;
segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and
recognizing particular speakers in the file based on the approximate list of potential speakers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. Logic encoded in non-transitory media that includes instructions for execution and when executed by a processor, is operable to perform operations comprising:
estimating an approximate list of potential speakers in a video/audio file from one or more applications, wherein the video/audio file includes a recording of a plurality of speakers;
segmenting the video/audio file according to the approximate list of potential speakers, such that each segment corresponds to at least one speaker; and
recognizing particular speakers in the video/audio file based on the approximate list of potential speakers.
Dependent claims: 12, 13, 14, 15
16. An apparatus, comprising:
a memory element for storing data; and
a processor that executes instructions associated with the data, wherein the processor and the memory element cooperate such that the apparatus is configured to:
estimate an approximate list of potential speakers in a video/audio file from one or more applications, wherein the video/audio file includes a recording of a plurality of speakers;
segment the video/audio file according to the approximate list of potential speakers, such that each segment corresponds to at least one speaker; and
recognize particular speakers in the video/audio file based on the approximate list of potential speakers.
Dependent claims: 17, 18, 19, 20
Specification