
Systems and methods for providing a multi-modal evaluation of a presentation

  • US 10,311,743 B2
  • Filed: 04/08/2014
  • Issued: 06/04/2019
  • Est. Priority Date: 04/08/2013
  • Status: Active Grant
First Claim

1. A computer-implemented system for providing a multi-modal evaluation of a presentation, comprising:

  • a motion capture device configured to generate motion data representing motion of an examinee giving a presentation, the motion data generated by the motion capture device representing three dimensional depth information, motion based on anchor points at respective positions of the examinee, or video frames;

  • an audio recording device configured to generate audio data representing audio of the examinee giving the presentation; and

  • a processing system configured to:

    generate a plurality of non-verbal metrics of the presentation based on the motion data, the non-verbal metrics selected from the group consisting of a metric of gesticulation, a metric of posture, a metric of eye contact, and a metric of facial expression,

    wherein the metric of gesticulation is generated based on the depth measurements indicating an amount of hand gesturing and based on a magnitude or a rate of pixel value changes between the video frames;

    wherein the metric of posture is generated based on changes in relative distances among the anchor points;

    wherein the metric of eye contact or facial expression is generated based on analysis of the video frames;

    generate a plurality of audio metrics of the presentation based on the audio data, wherein the audio metrics are selected from the group consisting of a content metric, a non-content transcript based metric, and a non-content metric,

    wherein the content metric is generated based on generating a first transcript based on the audio data, and then comparing the first transcript to a model transcript or to a presentation topic prompt;

    wherein the non-content transcript based metric is generated based on the first transcript and comparing sounds produced by the examinee at points in the first transcript to proper pronunciation of words at the points in the first transcript;

    wherein the non-content metric is generated based on one or more of stresses, accents, and discontinuities in the audio data; and

    generate and output a presentation score indicating an evaluation of the presentation based on inputting the non-verbal metrics and the audio metrics to a model comprising weights for a plurality of the non-verbal and audio metrics, the weights being based on correlations between human scores and the non-verbal and audio metrics within a collection of human-scored presentations.
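
The wherein clauses above describe how the gesticulation and posture metrics are derived from the motion data. The sketch below is an editorial illustration only, assuming video frames arrive as grayscale NumPy arrays and anchor points as per-frame (x, y, z) coordinates; the function names, the equal weighting of the two gesticulation cues, and the use of NumPy are assumptions added for clarity, not the patented implementation.

    # Illustrative sketch of the claimed non-verbal metrics (not the patented method).
    # Assumes frames are grayscale numpy arrays and anchor_tracks has shape
    # (num_frames, num_anchors, 3) with one (x, y, z) position per anchor point.
    import numpy as np

    def gesticulation_metric(frames, depth_hand_activity):
        """Blend inter-frame pixel change with a depth-derived hand-activity value."""
        # Magnitude of pixel value changes between consecutive video frames.
        diffs = [np.abs(b.astype(float) - a.astype(float)).mean()
                 for a, b in zip(frames, frames[1:])]
        pixel_change = float(np.mean(diffs)) if diffs else 0.0
        # depth_hand_activity: e.g., mean hand displacement taken from the depth stream.
        return 0.5 * pixel_change + 0.5 * float(depth_hand_activity)

    def posture_metric(anchor_tracks):
        """Temporal variation in relative distances among body anchor points."""
        tracks = np.asarray(anchor_tracks, dtype=float)
        n_frames = tracks.shape[0]
        # Pairwise distances among anchor points for each frame.
        dists = np.linalg.norm(
            tracks[:, :, None, :] - tracks[:, None, :, :], axis=-1
        ).reshape(n_frames, -1)
        # Larger variation in relative distances over time -> less stable posture.
        return float(dists.std(axis=0).mean())
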
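The content metric is recited as a comparison between a first transcript generated from the audio data and a model transcript or presentation topic prompt. A minimal stand-in for such a comparison, assuming a plain bag-of-words cosine similarity rather than whatever comparison the specification discloses, could look like this:

    # Hedged sketch of a transcript-to-reference comparison (bag-of-words cosine
    # similarity). The actual content metric in the patent may differ substantially.
    import math
    from collections import Counter

    def content_metric(first_transcript: str, reference_text: str) -> float:
        """Cosine similarity between word-count vectors of the two texts."""
        a = Counter(first_transcript.lower().split())
        b = Counter(reference_text.lower().split())
        dot = sum(a[w] * b[w] for w in set(a) & set(b))
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

Here reference_text would be either the model transcript or the topic prompt; a value near 1.0 indicates close lexical overlap with the reference.
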
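The final limitation scores the presentation by feeding the metrics into a model whose weights reflect correlations between human scores and the metrics in a human-scored collection. The sketch below assumes Pearson correlation as the weighting statistic and a simple weighted sum as the combining model; both choices are illustrative assumptions.

    # Sketch of correlation-derived weights plus a weighted-sum score (illustrative).
    import numpy as np

    def fit_weights(metric_matrix, human_scores):
        """metric_matrix: (num_presentations, num_metrics); human_scores: (num_presentations,)."""
        X = np.asarray(metric_matrix, dtype=float)
        y = np.asarray(human_scores, dtype=float)
        # One weight per metric: its Pearson correlation with the human scores.
        weights = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
        return np.nan_to_num(weights)  # guard against zero-variance metrics

    def presentation_score(metrics, weights):
        """Weighted combination of one presentation's non-verbal and audio metrics."""
        return float(np.dot(np.asarray(metrics, dtype=float), np.asarray(weights)))

With metric vectors such as [gesticulation, posture, eye contact, content, pronunciation, prosody], fit_weights would be run once over the human-scored collection and presentation_score applied to each new examinee's metrics.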
