Identifying qualified audio of a plurality of audio streams for display in a user interface

US 8,942,987 B1
Filed: 03/21/2014
Issued: 01/27/2015
Est. Priority Date: 12/11/2013
Status: Active Grant

Abstract

A clear picture of who is speaking in a setting where there are multiple input sources (e.g., a conference room with multiple microphones) can be obtained by comparing input channels against each other. The data from each channel can not only be compared, but can also be organized into portions which logically correspond to statements by a user. These statements, along with information regarding who is speaking, can be presented in a user friendly format via an interactive timeline which can be updated in real time as new audio input data is received.
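
The channel comparison described above can be illustrated with a minimal sketch. It assumes each detected signal has already been reduced to a stream index, a start time, and a peak amplitude; the `Signal` type, the `qualified_audio` function, and the `start_tolerance` parameter are hypothetical names chosen for illustration and do not come from the patent. Signals that start at nearly the same time are treated as "similar" (as in claim 2), and only the loudest of each such group is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    stream: int       # index of the microphone / audio input data stream
    start: float      # start time in seconds
    amplitude: float  # peak amplitude, arbitrary units


def qualified_audio(signals, amplitude_threshold, start_tolerance):
    """Return the signals treated as qualified audio.

    Assumption: signals are 'similar' when they start within
    start_tolerance seconds of the first signal in their group.
    """
    # Keep only signals that exceed the amplitude threshold.
    loud = sorted(
        (s for s in signals if s.amplitude > amplitude_threshold),
        key=lambda s: s.start,
    )
    # Group signals whose start times nearly coincide.
    groups = []
    for s in loud:
        if groups and s.start - groups[-1][0].start <= start_tolerance:
            groups[-1].append(s)
        else:
            groups.append([s])
    # A unique signal forms a group of one; otherwise keep the loudest.
    return [max(g, key=lambda s: s.amplitude) for g in groups]
```

In this sketch, a voice picked up strongly on one microphone and faintly on a neighboring one yields a single qualified signal on the louder stream.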

71 Citations

16 Claims

1. A method comprising:
- a) receiving a plurality of audio input data streams, wherein the plurality of audio input data streams comprise different audio input data streams representing concurrent outputs of a plurality of different microphones;
  
  b) identifying qualified audio on each of the plurality of audio input data streams by:
  
  i) identifying any unique signals on any of the plurality of audio input data streams which exceed an amplitude threshold as qualified audio; and
  
  ii) when similar signals exceeding the amplitude threshold are detected on multiple audio input data streams, identifying only the loudest of the similar signals as qualified audio;
  
  c) using a computer, organizing qualified audio into speech blocks, each of which has a status and a start time, and is associated with a single audio input data stream; and
  
  d) presenting a speech block interface to a user, wherein the speech block interface displays, for each audio input data stream, a timeline of speech blocks for the audio input data stream;
  
  wherein:
  
  a) organizing qualified audio into speech blocks comprises:
  
  i) adding identified qualified audio to an existing speech block if and only if the identified qualified audio is on the same audio input data stream as the existing speech block and either:
  
  A) the identified qualified audio is separated from an end time for the existing speech block by no more than a first intra-block duration threshold and the existing speech block is subject to being, but has not been, discarded;
  
  or B) the identified qualified audio is separated from the end time for the existing speech block by no more than a second intra-block duration threshold and the existing speech block is not subject to being discarded;
  
  ii) creating a new speech block which is subject to being, but has not been, discarded, with the identified qualified audio if and only if the identified qualified audio is not added to the existing speech block and there are no other speech blocks between the identified qualified audio and the existing speech block on the same audio input data stream as the identified qualified audio; and
  
  b) the first intra-block duration threshold is different from the second intra-block duration threshold.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein signals are treated as unique or similar based on whether they start at substantially the same time.
  - 3. The method of claim 1, wherein:
    - a) organizing qualified audio into speech blocks comprises adding identified qualified audio to an existing speech block if and only if the identified qualified audio is on the same audio input data stream as the existing speech block and either:
      
      i) the identified qualified audio is after an end time for the existing speech block and is separated from the end time for the existing speech block by no more than a first intra-block duration threshold;
      
      or ii) the identified qualified audio is before the start time for the existing speech block and is separated from the start time for the existing speech block by no more than a second intra-block duration threshold;
      
      b) the first intra-block duration threshold is different from the second intra-block duration threshold.
  - 4. The method of claim 1, wherein the first and second intra-block duration thresholds correspond to normal durations for pauses between words in speech.
  - 5. The method of claim 1, wherein the method is performed in real time on a frame by frame basis.
  - 6. The method of claim 5, wherein the method comprises:
    - a) determining whether any speech block treated as subject to being, but not having been, discarded should be treated as either discarded or not subject to being discarded;
      
      b) updating the speech block interface by performing a set of acts comprising, for each speech block displayed by the speech block interface and subject to being, but not having been, discarded:
      
      i) if the speech block is updated to not subject to being discarded, changing a color used to display that speech block;
      
      or ii) if the speech block is updated by being discarded, remove that speech block from the timeline for the audio input data stream associated with the speech block.
  - 7. The method of claim 5, wherein:
    - a) each speech block has a status; and
      
      b) the method comprises treating the existing speech block as one which is:
      
      i) discarded;
      
      ii) subject to being, but has not been, discarded;
      
      or iii) not subject to being discarded;
      
      based on the status for the existing speech block.
  - 8. The method of claim 7 comprising, determining whether to update speech block status using a computer configured to:
    - a) update any speech block having:
      
      i) a duration less than a minimum block duration threshold; and
      
      ii) an end time followed by a period without qualified audio exceeding the first intra-block duration threshold;
      
      to have a status indicating the speech block has been discarded;
      
      b) update any speech block having:
      
      i) a duration greater than the minimum block duration threshold; and
      
      ii) a status indicating the speech block is subject to being, but has not been, discarded;
      
      to have a status indicating that the speech block is not subject to being discarded.
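
The block-organizing rules of claims 1, 7, and 8 can be read as a small state machine over three statuses. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `Status`, `SpeechBlock`, `BlockOrganizer`, `add_audio`, and `update_statuses` are hypothetical, qualified audio is assumed to arrive in time order per stream, and only the most recent block on a stream is considered for extension.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "subject to being, but has not been, discarded"
    CONFIRMED = "not subject to being discarded"
    DISCARDED = "discarded"


@dataclass
class SpeechBlock:
    stream: int
    start: float
    end: float
    status: Status = Status.PENDING

    @property
    def duration(self) -> float:
        return self.end - self.start


class BlockOrganizer:
    def __init__(self, first_gap, second_gap, min_duration):
        self.first_gap = first_gap        # first intra-block duration threshold
        self.second_gap = second_gap      # second intra-block duration threshold
        self.min_duration = min_duration  # minimum block duration threshold
        self.blocks = {}                  # stream index -> list of SpeechBlock

    def add_audio(self, stream, start, end):
        """Extend the latest block on this stream, or open a new pending one."""
        blocks = self.blocks.setdefault(stream, [])
        last = blocks[-1] if blocks else None
        if last is not None and last.status is not Status.DISCARDED:
            # Pending blocks use the first threshold, confirmed blocks the second.
            limit = (self.first_gap if last.status is Status.PENDING
                     else self.second_gap)
            if start - last.end <= limit:
                last.end = max(last.end, end)
                return last
        block = SpeechBlock(stream, start, end)  # new blocks start as PENDING
        blocks.append(block)
        return block

    def update_statuses(self, now):
        """Confirm long pending blocks; discard short ones followed by silence."""
        for blocks in self.blocks.values():
            for b in blocks:
                if b.status is not Status.PENDING:
                    continue
                if b.duration > self.min_duration:
                    b.status = Status.CONFIRMED
                elif (b.duration < self.min_duration
                      and now - b.end > self.first_gap):
                    b.status = Status.DISCARDED
```

Using two different gap thresholds means a block that has already been confirmed as speech can tolerate a different pause length than a block that might still turn out to be noise.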

9. A machine comprising:
- a) a plurality of microphones;
  
  b) a computer comprising a computer readable medium having stored thereon data operable to configured the computer to perform a method comprising:
  
  i) receiving a plurality of audio input data streams, wherein the plurality of audio input data streams comprise different audio input data streams representing concurrent outputs of different microphones from the plurality of microphones;
  
  ii) identifying qualified audio on each of the plurality of audio input data streams by:
  
  A) identifying any unique signals on any of the plurality of audio input data streams which exceed an amplitude threshold as qualified audio; and
  
  B) when similar signals exceeding the amplitude threshold are detected on multiple audio input data streams, identifying only the loudest of the similar signals as qualified audio;
  
  iii) organizing qualified audio into speech blocks, each of which has a status and a start time, and is associated with a single audio input data stream; and
  
  iv) presenting a speech block interface to a user, wherein the speech block interface displays, for each audio input data stream, a timeline of speech blocks for the audio input data stream;
  
  wherein, in the method the data stored on the computer readable medium is operable to configure the computer to perform:
  
  a) organizing qualified audio into speech blocks comprises:
  
  i) adding identified qualified audio to an existing speech block if and only if the identified qualified audio is on the same audio input data stream as the existing speech block and either:
  
  A) the identified qualified audio is separated from an end time for the existing speech block by no more than a first intra-block duration threshold and the existing speech block is subject to being, but has not been, discarded;
  
  or B) the identified qualified audio is separated from the end time for the existing speech block by no more than a second intra-block duration threshold and the existing speech block is not subject to being discarded;
  
  ii) creating a new speech block which is subject to being, but has not been, discarded, with the identified qualified audio if and only if the identified qualified audio is not added to the existing speech block and there are no other speech blocks between the identified qualified audio and the existing speech block on the same audio input data stream as the identified qualified audio; and
  
  b) the first intra-block duration threshold is different from the second intra-block duration threshold.
- Dependent claims (10, 11, 12, 13, 14, 15, 16):
  - 10. The machine of claim 9, wherein the data stored on the computer readable medium is operable to configure the computer to treat signals as unique or similar based on whether they start at substantially the same time.
  - 11. The machine of claim 9, wherein, in the method the data stored on the computer readable medium is operable to configure the computer to perform:
    - a) organizing qualified audio into speech blocks comprises adding identified qualified audio to an existing speech block if and only if the identified qualified audio is on the same audio input data stream as the existing speech block and either:
      
      i) the identified qualified audio is after an end time for the existing speech block and is separated from the end time for the existing speech block by no more than a first intra-block duration threshold;
      
      or ii) the identified qualified audio is before the start time for the existing speech block and is separated from the start time for the existing speech block by no more than a second intra-block duration threshold;
      
      b) the first intra-block duration threshold is different from the second intra-block duration threshold.
  - 12. The machine of claim 9, wherein, in the method the set of data stored on the computer readable medium is operable to configure the computer to perform, the first and second intra-block duration thresholds correspond to normal durations for pauses between words in speech.
  - 13. The machine of claim 9, wherein the data stored on the computer readable medium is operable to configure the computer to perform the method in real time on a frame by frame basis.
  - 14. The machine of claim 13, wherein the method the data stored on the computer readable medium is operable to configure the computer to perform comprises:
    - a) determining whether any speech block treated as subject to being, but not having been, discarded should be treated as either discarded or not subject to being discarded;
      
      b) updating the speech block interface by performing a set of acts comprising, for each speech block displayed by the speech block interface and subject to being, but not having been, discarded:
      
      i) if the speech block is updated to not subject to being discarded, changing a color used to display that speech block;
      
      or ii) if the speech block is updated by being discarded, remove that speech block from the timeline for the audio input data stream associated with the speech block.
  - 15. The machine of claim 13, wherein:
    - a) the data stored on the computer readable medium is operable to configure the computer to maintain a status for each speech block; and
      
      b) the method the data stored on the computer readable medium is operable to configure the computer to perform comprises treating the existing speech block as one which is:
      
      i) discarded;
      
      ii) subject to being, but has not been, discarded;
      
      or iii) not subject to being discarded;
      
      based on the status for the existing speech block.
  - 16. The machine of claim 15, wherein the method the data stored on the computer readable medium is operable to configure the computer to perform comprises determining whether to update speech block status using a computer configured to:
    - a) update any speech block having:
      
      i) a duration less than a minimum block duration threshold; and
      
      ii) an end time followed by a period without qualified audio exceeding the first intra-block duration threshold;
      
      to have a status indicating the speech block has been discarded;
      
      b) update any speech block having:
      
      i) a duration greater than the minimum block duration threshold; and
      
      ii) a status indicating the speech block is subject to being, but has not been, discarded;
      
      to have a status indicating that the speech block is not subject to being discarded.
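
The interface behavior recited in claims 6 and 14 (recoloring a block once it is no longer subject to being discarded, and removing it from the timeline once it is discarded) can be approximated in plain text. This is only an illustrative sketch: the `render_row` function, the status strings, and the use of different glyphs as a stand-in for different colors are all assumptions, not details taken from the patent.

```python
import math

PENDING, CONFIRMED, DISCARDED = "pending", "confirmed", "discarded"

# Different glyphs stand in for the different display colors.
GLYPH = {PENDING: "?", CONFIRMED: "#"}


def render_row(blocks, total_seconds, cell_seconds=1.0):
    """Render one stream's timeline as a string of fixed-width cells.

    blocks is an iterable of (start, end, status) tuples in seconds.
    """
    cells = ["."] * int(round(total_seconds / cell_seconds))
    for start, end, status in blocks:
        glyph = GLYPH.get(status)
        if glyph is None:
            # Discarded blocks are removed from the timeline entirely.
            continue
        first = int(start // cell_seconds)
        last = min(len(cells), max(first + 1, math.ceil(end / cell_seconds)))
        for i in range(first, last):
            cells[i] = glyph
    return "".join(cells)


def render_timeline(streams, total_seconds, cell_seconds=1.0):
    """Render one labeled row per audio input data stream."""
    return [
        f"{name}: {render_row(blocks, total_seconds, cell_seconds)}"
        for name, blocks in streams.items()
    ]
```

Re-rendering after each status update gives the real-time effect: a `?` run turns into a `#` run when its block is confirmed, and disappears when its block is discarded.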

Current Assignee: Jefferson Audio Video Systems, Inc.
Original Assignee: Jefferson Audio Video Systems, Inc.
Inventors: Bader, Matthew David; Cole, Nathan David
Primary Examiner(s): Lerner, Martin
Application Number: US14/222,124
Time in Patent Office: 312 Days
Field of Search: 704/201, 704/215, 704/270, 704/278, 704/276, 379/202.01
US Class Current: 704/276

CPC Class Codes
- G06F 3/165   Management of the audio str...
- G10L 15/08   Speech classification or se...
- G10L 21/0272   Voice signal separating
- G10L 21/10   Transforming into visible i...
- G11B 20/10527   Audio or video recording; D...
- G11B 2020/10546   specifically adapted for au...
- H04L 65/403   Arrangements for multi-part...
- H04L 65/765   intermediate
