Speaker detection system, speaker detection method, and audio/video conferencing system

  • CN 103,581,608 B
  • Filed: 07/20/2012
  • Issued: 02/01/2019
  • Est. Priority Date: 07/20/2012
  • Status: Active Grant
First Claim

1. A speaker detection system based on visual voice activity detection and acoustic voice activity detection, comprising:

  • a video camera for obtaining video information of multiple participants;

    a microphone for obtaining audio information of the audio/video conference;

    a processing module configured to detect the visual speech activity of each of the multiple participants in the video information, so as to generate a visual speech activity detection signal for each of the multiple participants, and configured to detect the acoustic voice activity in the audio information, so as to generate an acoustic voice activity detection signal;

    a comparison module for comparing each visual speech activity detection signal with the acoustic voice activity detection signal, and for determining as the current speaker the participant corresponding to the visual speech activity detection signal that has the greatest degree of correlation with the acoustic voice activity detection signal;

    wherein the visual speech activity is the lip motion of each of the multiple participants, and wherein:

    the processing module performs independent visual speech activity detection on each of the multiple participants; the processing module obtains the lip outline from the difference between lip color and face color and, based on the difference in brightness and/or color between the upper and lower lips and the gap between them, determines the area of the gap within the lip outline; when the difference in that area across consecutive video frames exceeds a preset threshold, the visual speech activity detection signal for that lip is output as "1"; otherwise, the visual speech activity detection signal for that lip is output as "0";

    the processing module obtains the acoustic voice activity detection signal by detecting the audio information; when speech is present in the audio information, the acoustic voice activity detection signal is output as "1"; otherwise, the acoustic voice activity detection signal is output as "0".
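
The signal flow recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the per-frame lip-gap areas have already been measured, it stands in a simple energy threshold for the unspecified speech detector, and it uses the fraction of matching frames as a stand-in for the "degree of correlation", which the claim does not define. All function names, parameters, and sample values are hypothetical.

```python
from typing import Dict, List, Sequence


def visual_vad(gap_areas: Sequence[float], threshold: float) -> List[int]:
    """Output 1 when the lip-gap area changes by more than `threshold`
    between consecutive frames, otherwise 0."""
    signal = [0]  # assumption: the first frame has no predecessor, so report 0
    for prev, curr in zip(gap_areas, gap_areas[1:]):
        signal.append(1 if abs(curr - prev) > threshold else 0)
    return signal


def acoustic_vad(frame_energies: Sequence[float], energy_threshold: float) -> List[int]:
    """Assumed stand-in for speech detection: 1 when frame energy exceeds
    `energy_threshold`, otherwise 0."""
    return [1 if e > energy_threshold else 0 for e in frame_energies]


def agreement(visual: Sequence[int], acoustic: Sequence[int]) -> float:
    """Assumed correlation measure: fraction of frames where both binary
    signals have the same value."""
    n = min(len(visual), len(acoustic))
    if n == 0:
        return 0.0
    return sum(v == a for v, a in zip(visual, acoustic)) / n


def current_speaker(visual_signals: Dict[str, List[int]], acoustic: Sequence[int]) -> str:
    """Pick the participant whose visual signal best matches the acoustic signal."""
    return max(visual_signals, key=lambda name: agreement(visual_signals[name], acoustic))


# Hypothetical measurements for two participants over six frames.
gap_areas = {
    "alice": [0.0, 5.0, 0.0, 6.0, 0.0, 0.0],  # lips opening and closing
    "bob": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],    # lips not moving
}
visual_signals = {name: visual_vad(areas, threshold=2.0) for name, areas in gap_areas.items()}
acoustic_signal = acoustic_vad([0.0, 0.9, 0.8, 0.9, 0.7, 0.0], energy_threshold=0.5)
speaker = current_speaker(visual_signals, acoustic_signal)
```

In this sketch each participant is processed independently, as the claim requires, and the comparison step reduces to an argmax over per-participant scores. A different similarity measure could be substituted in `agreement` without changing the rest of the flow.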
