Public speaking trainer with 3-D simulation and real-time feedback

  • US 10,446,055 B2
  • Filed: 08/11/2015
  • Issued: 10/15/2019
  • Est. Priority Date: 08/13/2014
  • Status: Active Grant
First Claim

1. A method of public speaking training, comprising:

    providing a speech analysis engine;

    using the speech analysis engine executing on a first computer system to extract a plurality of features from a plurality of prerecorded speeches;

    providing manual ratings from public speaking experts for an overall quality of each of the plurality of prerecorded speeches;

    using a machine learning algorithm executing on the first computer system to compare the manual ratings of the prerecorded speeches to the plurality of features extracted from the prerecorded speeches, wherein the machine learning algorithm generates a predictive model defining correlations between the plurality of features and the manual ratings, and wherein the predictive model includes a plurality of rating scales with thresholds for the plurality of features, wherein a first rating scale for a first feature of the plurality of features includes a plurality of thresholds for rating the first feature and a first threshold of the plurality of thresholds is above a minimum and below a maximum of the first rating scale;

    providing a second computer system including a display monitor, a microphone, and a video capture device;

    presenting a first interface on the display monitor of the second computer system allowing entry of an environment configuration, an audience configuration, and a presentation configuration by a user, wherein the environment configuration, audience configuration, and presentation are separately configurable by the user independent from each other, wherein the audience configuration includes a size of the audience, a type of audience, and a technical expertise of the audience that are independently configurable by the user, wherein the type of audience includes executives, upper management, technical professionals, or students, wherein the presentation configuration includes a desired presentation length and a presentation topic that are independently configurable by the user;

    providing a prompt allowing input of presentation materials;

    receiving a presentation material onto the second computer system through the prompt, wherein the presentation material includes a plurality of slides;

    rendering a simulated environment on the display monitor using the second computer system in accordance with the environment configuration and including a number of simulated audience members in the simulated environment determined from the size of the audience of the audience configuration;

    recording a presentation by the user onto the second computer system using the microphone and the video capture device;

    extracting, by the second computer system executing the speech analysis engine while recording the presentation, the plurality of features from the presentation, wherein the plurality of features includes a pitch variability, volume variability, pace, pace variability, and length and timing of pauses;

    providing an audio signal from the microphone to a speech-to-text application executing on the second computer system to generate a transcript of the presentation in real time;

    performing natural language processing on the transcript of the presentation, using the second computer system, to determine a linguistic complexity of the presentation, wherein the linguistic complexity is included in the plurality of features;

    analyzing the transcript of the presentation, by the second computer system, to provide a metric of proper use of domain specific terms for the technical expertise of the audience, wherein the proper use of domain specific terms is included in the plurality of features;

    analyzing, by the second computer system, usage of the presentation material while recording the presentation, including an amount of eye contact with the simulated audience members versus eye contact with the presentation material, wherein the amount of eye contact is included in the plurality of features;

    analyzing the presentation by comparing the plurality of features against the thresholds of the rating scales of the predictive model, wherein a rating for the first feature is determined by comparing the first feature extracted from the presentation against the plurality of thresholds associated with the first feature in the predictive model, and wherein the analyzing is tailored to the technical expertise of the audience, the desired presentation length, and the presentation topic entered by the user;

    animating the simulated audience member in response to at least one of the plurality of features;

    drawing a real-time metric graph on the display monitor for the first feature while recording the presentation; and

    presenting a second interface on the display monitor of the second computer system after recording the presentation to play the recording of the presentation along with a graph illustrating a second feature of the plurality of features on a scale obtained from the predictive model, wherein the second feature is selected based on the type of audience, and wherein the graph moves along with the recording of the presentation.
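
The feature extraction and threshold comparison recited in the claim can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the threshold values, the band labels, and the word-timestamp input format (as a speech-to-text application might supply) are all assumptions made for the example. Only pace and pauses are shown; the other claimed features would be rated against their own scales in the same way.

```python
from bisect import bisect_right

# Hypothetical rating scale for one feature (pace, in words per minute).
# The thresholds lie strictly inside the scale and split it into bands.
# All values and labels are illustrative, not taken from the patent.
PACE_THRESHOLDS = [100.0, 130.0, 170.0, 200.0]
PACE_LABELS = ["too slow", "slow", "good", "fast", "too fast"]


def rate_feature(value, thresholds, labels):
    """Return the label of the band that `value` falls into.

    `thresholds` must be sorted ascending; a value equal to a threshold
    is placed in the band above it.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    return labels[bisect_right(thresholds, value)]


def pace_wpm(word_times):
    """Words per minute from (start, end) timestamps in seconds, one pair per word."""
    duration = word_times[-1][1] - word_times[0][0]
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")
    return 60.0 * len(word_times) / duration


def pauses(word_times, min_gap=0.5):
    """Lengths of silent gaps between consecutive words of at least `min_gap` seconds."""
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(word_times, word_times[1:])]
    return [g for g in gaps if g >= min_gap]


if __name__ == "__main__":
    # Four words with one long pause between the second and third.
    words = [(0.0, 0.4), (0.5, 0.9), (2.0, 2.4), (2.5, 3.0)]
    wpm = pace_wpm(words)
    print(wpm, rate_feature(wpm, PACE_THRESHOLDS, PACE_LABELS))
    print(pauses(words))
```

In a trained system the thresholds would come from the predictive model rather than being hard-coded, and a separate set could be selected per audience configuration.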
