METHOD AND APPARATUS FOR VIDEO CONFERENCING HAVING DYNAMIC LAYOUT BASED ON KEYWORD DETECTION

US 20070285505A1
Filed: 05/29/2007
Published: 12/13/2007
Est. Priority Date: 05/26/2006
Status: Abandoned Application


Abstract

The present invention provides a method and system for conferencing. The method includes connecting at least two sites to a conference; receiving at least two video signals and two audio signals from the connected sites; consecutively analyzing the audio data from the at least two sites by converting at least a part of the audio data to acoustical features and extracting keywords and speech parameters from the acoustical features using speech recognition; comparing said extracted keywords to predefined words; deciding, based on said speech parameters, whether said extracted keywords are to be considered a call for attention; defining an image layout based on said decision; processing the received video signals to provide a composite video signal according to the defined image layout; and transmitting the composite video signal to at least one of the at least two connected sites.
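The flow described in the abstract can be illustrated with a minimal sketch. It is not taken from the application: it assumes speech recognition has already produced a word list per site, and the names `choose_focus`, `define_layout`, the site identifiers, and the `name_first` rule (a name counts as a call for attention only when it opens the utterance) are hypothetical placeholders for the keyword comparison, the decision, and the layout definition.

```python
from typing import Callable, Dict, List, Optional


def choose_focus(
    words_by_site: Dict[str, List[str]],
    predefined: Dict[str, str],
    is_call: Callable[[List[str], int], bool],
) -> Optional[str]:
    """Return the site that should receive focus, or None.

    words_by_site: recognized words per speaking site (assumed input).
    predefined:    predefined word (lower case) -> site it refers to.
    is_call:       placeholder for the call-for-attention decision.
    """
    for speaker_site, words in words_by_site.items():
        for i, word in enumerate(words):
            target = predefined.get(word.lower())
            # Ignore words that are not predefined, and (as an assumption)
            # names that point back at the speaker's own site.
            if target is None or target == speaker_site:
                continue
            if is_call(words, i):
                return target
    return None


def define_layout(sites: List[str], focus: Optional[str]) -> dict:
    """Put the focused site in the main window; tile everything else."""
    if focus is None:
        return {"main": None, "tiles": list(sites)}
    return {"main": focus, "tiles": [s for s in sites if s != focus]}


# Hypothetical conference with three sites.
sites = ["site_a", "site_b", "site_c"]
predefined = {"jenny": "site_b", "john": "site_c"}
name_first = lambda words, i: i == 0  # stand-in decision rule

focus = choose_focus(
    {"site_a": ["Jenny", "what", "do", "you", "think"]}, predefined, name_first
)
layout = define_layout(sites, focus)
```

Passing the decision in as a callable keeps the keyword comparison separate from the call-for-attention logic, so a rule based on speech parameters could be substituted without touching the layout step.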

14 Claims

1. A method of conferencing comprising:
- connecting at least two sites to a conference;
- receiving at least two video signals and two audio signals from the connected sites;
- consecutively analyzing the audio data from the at least two sites connected in the conference by converting at least a part of the audio data to acoustical features and extracting keywords and speech parameters from the acoustical features using speech recognition;
- comparing said extracted keywords to predefined words, and deciding if said extracted keywords are to be considered a call for attention based on said speech parameters;
- defining an image layout based on said decision;
- processing the received video signals to provide a video signal according to the defined image layout; and
- transmitting the processed video signal to at least one of the at least two connected sites.

2. The method according to claim 1, wherein the method further comprises the steps of:
- predefining words, where the words are defined as being one or more of the following:
  - names of participants in the conference, groups of participants in the conference, aliases of said names;
  - other predefined keywords, wherein said keywords are speech parameters.

3. The method according to claim 2, further comprising, at the detection of a name, gathering speech parameters relating to said detected name, wherein each parameter weighs positive or negative when determining the likeliness of said name being a call for attention.

4. The method according to one of the claims 2-3, further comprising, upon a positive call for attention decision:
- redefining the image layout, focusing on the video signal associated with said detected predefined name or alias;
- processing the received video signals to provide a second composite video signal according to the redefined image layout; and
- transmitting the second composite video signal to at least one of the connected sites.

5. The method according to one of the claims 2-4, further comprising the step of:
- extracting said names of participants, and/or names of groups of participants, from a conference management system if said conference has been booked through a booking service.

6. The method according to one of the claims 2-4, further comprising the steps of:
- acquiring each site's unique ID or URI; and
- processing said unique ID or URI to automatically extract said names of participants, and/or groups of participants.

7. The method according to one of the claims 2-3, further comprising the step of:
- deriving a set of aliases for each said name by means of an algorithm and/or a database of commonly used aliases.
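Claim 3 describes speech parameters that each weigh positive or negative in the call-for-attention decision. The sketch below shows one way such a weighting could look. The claims do not list specific parameters, weights, or a threshold, so every field of `NameDetection`, every value in `WEIGHTS`, and the `threshold` default are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NameDetection:
    """One detected name plus hypothetical speech parameters around it."""
    name: str
    pause_before_s: float          # silence before the name, in seconds
    pause_after_s: float           # silence after the name, in seconds
    rising_pitch: bool             # name spoken with rising intonation
    preceded_by_preposition: bool  # e.g. "with Jenny", "to John"


# Hypothetical weights: positive values support a call for attention,
# negative values count against it.
WEIGHTS = {
    "pause_after": 1.0,
    "pause_before": 0.5,
    "rising_pitch": 0.5,
    "preposition": -1.5,
}
MIN_PAUSE_S = 0.3  # assumed minimum pause length that counts as a cue


def attention_score(d: NameDetection) -> float:
    """Sum the weights of all parameters observed for this detection."""
    score = 0.0
    if d.pause_after_s >= MIN_PAUSE_S:
        score += WEIGHTS["pause_after"]
    if d.pause_before_s >= MIN_PAUSE_S:
        score += WEIGHTS["pause_before"]
    if d.rising_pitch:
        score += WEIGHTS["rising_pitch"]
    if d.preceded_by_preposition:
        score += WEIGHTS["preposition"]
    return score


def is_call_for_attention(d: NameDetection, threshold: float = 1.0) -> bool:
    """Positive decision when the weighted evidence reaches the threshold."""
    return attention_score(d) >= threshold


# "Jenny? ... what do you think?" versus "I had lunch with Jenny yesterday."
addressed = NameDetection("jenny", 0.4, 0.5, True, False)
mentioned = NameDetection("jenny", 0.0, 0.0, False, True)
```

With these assumed values, `addressed` scores 2.0 and is treated as a call for attention, while `mentioned` scores -1.5 and is not. On a positive decision, the layout would be redefined to focus on the site associated with the name, as in claim 4.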

8. A system for conferencing comprising:
- an interface unit for receiving at least audio and video signals from at least two sites connected in a conference;
- a speech recognition unit for analyzing the audio data from the at least two sites connected in the conference by converting at least a part of the audio data to acoustical features and extracting keywords and speech parameters from the acoustical features using speech recognition;
- a processing unit configured to compare said extracted keywords to predefined words, and to decide if said extracted keywords are to be considered a call for attention based on said speech parameters;
- a control processor for dynamically defining an image layout based on said decision; and
- a video processor for processing the received video signals to provide a processed video signal according to the defined image layout.

9. The system according to claim 8, wherein the system is further configured to redefine the image layout based on said decision, focusing on the video signal corresponding to said extracted predefined keywords, to process the received video signals to provide a second composite video signal according to the redefined image layout, and to transmit the second composite video signal to at least one of the connected sites.

10. The system according to claim 8, wherein said predefined words are categorized as one or more of the following:
- names of participants in the conference, groups of participants in the conference, aliases of said names;
- other predefined keywords, wherein said keywords are speech parameters.

11. The system according to claim 8, wherein the speech recognition unit, upon the detection of a name, is further configured to:
- gather said speech parameters relating to said detected name, and determine the likeliness of said detected name being a call for attention based on said speech parameters, wherein each said speech parameter weighs positive or negative in the decision process.

12. The system according to one of the claims 8-11, wherein the speech recognition unit further comprises means for extracting said names of participants, and/or names of groups of participants, from a conference management system if said conference was booked through a booking service.

13. The system according to one of the claims 8-12, wherein the speech recognition unit further comprises means for acquiring each site's unique ID or URI, and means for processing said unique ID or URI to automatically extract said names of participants, and/or groups of participants.

14. The system according to one of the claims 8-13, wherein the speech recognition unit further comprises means for deriving a set of aliases for each said participant or group of participants based on algorithms and/or a database of commonly used aliases.
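Claims 13 and 14 refer to extracting names from a site's unique ID or URI and deriving aliases from an algorithm and/or a database. The following sketch illustrates the idea under stated assumptions: the URI is assumed to carry the name in its user part (for example `sip:jennifer.smith@example.com`), and `COMMON_ALIASES` is a tiny hypothetical stand-in for an alias database. None of these names or rules come from the application.

```python
import re
from typing import Dict, List, Set

# Hypothetical alias database: full name -> commonly used short forms.
COMMON_ALIASES: Dict[str, List[str]] = {
    "jennifer": ["jenny", "jen"],
    "robert": ["rob", "bob"],
}


def names_from_uri(uri: str) -> List[str]:
    """Split the user part of a URI into candidate names.

    Assumes a 'scheme:user@host' or 'user@host' shape; tokens that are
    not purely alphabetic (e.g. 'room42') are dropped.
    """
    user = uri.split(":", 1)[-1].split("@", 1)[0]
    return [part.lower() for part in re.split(r"[._\-+]", user) if part.isalpha()]


def aliases_for(name: str) -> Set[str]:
    """Return the name itself plus any aliases found in the table."""
    name = name.lower()
    return {name, *COMMON_ALIASES.get(name, [])}


def build_vocabulary(site_uris: Dict[str, str]) -> Dict[str, str]:
    """Map every name and alias to the site it was extracted from."""
    vocabulary: Dict[str, str] = {}
    for site, uri in site_uris.items():
        for name in names_from_uri(uri):
            for alias in aliases_for(name):
                vocabulary[alias] = site
    return vocabulary


vocabulary = build_vocabulary({"site_b": "sip:jennifer.smith@example.com"})
```

The resulting mapping could serve as the set of predefined words against which extracted keywords are compared. A real deployment would need to handle name collisions between sites, which this sketch resolves arbitrarily by letting the last site win.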

Current Assignee: Tandberg Telecom AS (Cisco Systems, Inc.)
Original Assignee: Tandberg Telecom AS (Cisco Systems, Inc.)
Inventors: KORNELIUSSEN, Jan
Application Number: US11/754,651
Publication Number: US 20070285505A1
US Class Current: 348/14.80
CPC Class Codes:
- G10L 2015/088 Word spotting
- H04N 7/147 Communication arrangements,...
