Methods for presenting speech blocks from a plurality of audio input data streams to a user in an interface

US 8,719,032 B1
Filed: 12/11/2013
Issued: 05/06/2014
Est. Priority Date: 12/11/2013
Status: Active Grant



Abstract

A clear picture of who is speaking in a setting where there are multiple input sources (e.g., a conference room with multiple microphones) can be obtained by comparing input channels against each other. The data from each channel can not only be compared, but can also be organized into portions which logically correspond to statements by a user. These statements, along with information regarding who is speaking, can be presented in a user-friendly format via an interactive timeline which can be updated in real time as new audio input data is received.


2 Claims

1. A method comprising:
- a) using a computer, identifying qualified audio on each of a plurality of audio input data streams by:
  
  i) identifying any unique signals on any of the plurality of audio input data streams which exceed an amplitude threshold as qualified audio; and
  
  ii) when similar signals exceeding the amplitude threshold are detected on multiple audio input data streams, identifying only the loudest of the similar signals as qualified audio;
  
  b) for each of the audio input data streams, identifying a set of speech blocks, each of which has a status and a start time, by, for each frame in the audio input data stream:
  
  i) executing code configured to add the current frame to a most recently created speech block if and only if:
  
  A) the most recent preceding frame corresponding to qualified audio has a time which differs from a time for the current frame by less than a first intervening duration threshold, and the status of the most recently created speech block is pending; or
  
  B) the most recent preceding frame corresponding to qualified audio has a time which differs from the time for the current frame by less than a second intervening duration threshold, and the status of the most recently created speech block is committed;
  
  ii) if the current frame is not added to the most recently created speech block, and the current frame corresponds to qualified audio, executing code configured to create a new speech block, wherein:
  
  A) the start time for the new speech block is the time for the current frame; and
  
  B) the status for the new speech block is pending;
  
  iii) if the status of the most recently created speech block is pending, executing code configured to change the status of the most recently created speech block to discarded if and only if:
  
  A) the current frame does not correspond to qualified audio;
  
  B) the most recent preceding frame corresponding to qualified audio has a time which differs from the time for the current frame by more than the first intervening duration threshold; and
  
  C) the status of the most recently created speech block is pending;
  
  iv) if the status of the most recently created speech block is pending, executing code configured to change the status of the most recently created speech block to committed if and only if:
  
  A) the current frame corresponds to qualified audio; and
  
  B) the start time for the most recently created speech block precedes the time for the current frame by more than a minimum block duration threshold;
  
  c) presenting a speech block interface to a user, wherein:
  
  i) the speech block interface displays, for each audio input data stream, a timeline of speech blocks for the audio input data stream, the timeline being updated in real time as the qualified audio for the audio input data streams is identified;
  
  ii) the speech block interface is configured to allow the user to play a portion of an audio input data stream corresponding to a speech block by selecting the speech block to be played;
  
  iii) the speech block interface is configured to allow the user to skip from each displayed speech block to a previous or subsequent displayed speech block; and
  
  iv) the speech block interface is configured not to display discarded speech blocks, to display pending speech blocks semitransparently, and to display committed speech blocks opaquely.
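
As an illustration only, the per-frame rules of step b) of claim 1 can be read as a small state machine. The sketch below is one possible reading of the claim language, not the patentees' implementation. The names `SpeechBlock`, `identify_speech_blocks`, and `display_opacity`, the `end_ms` field, the use of integer millisecond timestamps, and the 0.5 opacity value are all assumptions introduced for the example. Each frame is assumed to arrive as a `(time, is_qualified)` pair, with qualification under step a) already decided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class Status(Enum):
    PENDING = "pending"
    COMMITTED = "committed"
    DISCARDED = "discarded"


@dataclass
class SpeechBlock:
    start_ms: int                    # start time required by step b)
    end_ms: int                      # assumed field: time of the last frame added
    status: Status = Status.PENDING


def identify_speech_blocks(
    frames: Iterable[Tuple[int, bool]],  # (frame time in ms, frame is qualified audio)
    first_gap_ms: int,                   # first intervening duration threshold
    second_gap_ms: int,                  # second intervening duration threshold
    min_block_ms: int,                   # minimum block duration threshold
) -> List[SpeechBlock]:
    blocks: List[SpeechBlock] = []
    last_qualified_ms: Optional[int] = None  # most recent preceding qualified frame

    for time_ms, qualified in frames:
        block = blocks[-1] if blocks else None
        gap = None if last_qualified_ms is None else time_ms - last_qualified_ms

        # b)(i): extend the most recent block when the gap is short enough
        # for that block's current status.
        added = False
        if block is not None and gap is not None:
            if (block.status is Status.PENDING and gap < first_gap_ms) or (
                block.status is Status.COMMITTED and gap < second_gap_ms
            ):
                block.end_ms = time_ms
                added = True

        # b)(ii): a qualified frame that was not added starts a new pending block.
        if not added and qualified:
            block = SpeechBlock(start_ms=time_ms, end_ms=time_ms)
            blocks.append(block)

        # b)(iii) and b)(iv): only a pending block can change status.
        if block is not None and block.status is Status.PENDING:
            if not qualified and gap is not None and gap > first_gap_ms:
                block.status = Status.DISCARDED
            elif qualified and time_ms - block.start_ms > min_block_ms:
                block.status = Status.COMMITTED

        if qualified:
            last_qualified_ms = time_ms

    return blocks


def display_opacity(status: Status) -> Optional[float]:
    """c)(iv): None means the block is not drawn; 0.5 is an arbitrary
    stand-in for 'semitransparent'."""
    return {Status.DISCARDED: None, Status.PENDING: 0.5, Status.COMMITTED: 1.0}[status]
```

Integer timestamps are used so that the strict "less than" and "more than" comparisons in the claim are not affected by floating-point rounding. Under this reading, a short burst followed by silence longer than the first threshold ends up discarded, while sustained speech is committed and then tolerates pauses up to the second threshold.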

2. A method comprising:
- a) filtering a plurality of audio input data streams by, for any overlapping period, wherein an overlapping period is a period in which signals differing only in volume are included in two or more audio input data streams from the plurality of audio input data streams, excluding the signal from each of the two or more audio input data streams except for the loudest of the two or more audio input data streams for the overlapping period;
  
  b) for each of the filtered audio input data streams, defining a set of speech blocks using a computer, wherein, for each speech block for the filtered audio input data stream:
  
  i) the speech block comprises a base period, wherein the base period has a duration longer than a first duration threshold, and wherein the filtered audio input data stream's volume exceeds a volume threshold throughout the base period except for lapses corresponding to normal pauses between words in speech;
  
  ii) the speech block comprises a set of additional periods, wherein, for each of the additional periods:
  
  A) the filtered audio input data stream's volume exceeds the volume threshold throughout the additional period except for lapses corresponding to normal pauses between words in speech; and
  
  B) if the additional period has a duration longer than the first duration threshold, there is no intervening period between the additional period's end and the base period's start which has a duration longer than a second duration threshold and which has a volume which is lower than the volume threshold during the intervening period;
  
  C) if the additional period has a duration which is not longer than the first duration threshold, there is no intervening period between the additional period's end and the base period's start which has a duration longer than a third duration threshold and which has a volume which is lower than the volume threshold during the intervening period, wherein the third duration threshold is shorter than the second duration threshold;
  
  iii) the start of the earliest period comprised by the speech block is defined as the speech block's start; and
  
  iv) the end of the latest period comprised by the speech block is defined as the speech block's end;
  
  c) presenting a speech block interface to a user, wherein:
  
  i) the speech block interface displays, for each audio input data stream, a timeline of each speech block from the audio input data stream, the timeline being updated in real time as the audio input data streams are received;
  
  ii) the speech block interface is configured to allow a user to play a portion of an audio input data stream corresponding to a speech block from the audio input data stream by selecting the speech block to be played; and
  
  iii) the speech block interface is configured to allow the user to skip from each block to a previous or subsequent speech block.
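
The filtering in step a) of claim 2 (keep only the loudest of several signals that differ only in volume) might be sketched as follows for one time-aligned frame per stream. This is a hedged illustration, not the claimed method itself. The claim does not say how "differing only in volume" is detected, so the cosine-similarity test, the 0.95 cutoff, RMS as the loudness measure, muting by zero-filling, and the function names are all assumptions made for the example. The sketch also ignores any time delay between microphones.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of a non-empty frame, used here as 'loudness'."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Scale-invariant similarity: close to 1.0 when b is a louder or quieter copy of a."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent frame is treated as similar to nothing
    return dot / (norm_a * norm_b)


def keep_loudest(
    frames: Sequence[Sequence[float]],  # one frame of samples per stream, same instant
    min_similarity: float = 0.95,       # assumed cutoff for "differ only in volume"
) -> List[List[float]]:
    """Return a copy of the frames in which every stream that carries a quieter
    copy of another stream's signal has been muted (zero-filled)."""
    levels = [rms(f) for f in frames]
    filtered = [list(f) for f in frames]
    for i, frame_i in enumerate(frames):
        for j, frame_j in enumerate(frames):
            if i == j or cosine_similarity(frame_i, frame_j) < min_similarity:
                continue
            # Stream j wins if it is louder; ties go to the lower stream index.
            if levels[j] > levels[i] or (levels[j] == levels[i] and j < i):
                filtered[i] = [0.0] * len(frame_i)
                break
    return filtered
```

Comparisons are always made against the original frames rather than the already-muted copies, so a chain of three or more similar streams still resolves to the single loudest one.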


Current Assignee: Jefferson Audio Video Systems, Inc.
Original Assignee: Jefferson Audio Video Systems, Inc.
Inventors: Bader, Matthew David; Cole, Nathan David
Primary Examiner(s): Lerner, Martin
Application Number: US14/103,369
Time in Patent Office: 146 Days
Field of Search: 704/248, 704/276, 704/278, 704/270, 715/727, 370/260, 370/261, 379/202.01
US Class Current: 704/270
CPC Class Codes:

G06F 3/165   Management of the audio str...

G10L 15/08   Speech classification or se...

G10L 21/0272   Voice signal separating

G10L 21/10   Transforming into visible i...

G11B 20/10527   Audio or video recording; D...

G11B 2020/10546   specifically adapted for au...

H04L 65/403   Arrangements for multi-part...

H04L 65/765   intermediate
