SYSTEM AND METHOD FOR COGNITIVE MULTILINGUAL SPEECH TRAINING AND RECOGNITION

  • US 20190303797A1
  • Filed: 03/30/2018
  • Published: 10/03/2019
  • Est. Priority Date: 03/30/2018
  • Status: Active Grant
First Claim

1. A method of analyzing human speech during natural language processing interactions between humans and computers, the method comprising:

    (A) selecting, by a computer system, multiple human language tutorial videos from a plurality of human language tutorial videos, each of the plurality of human language tutorial videos having a visual track, a corresponding audio track and captions, wherein the visual track contains visual information, the audio track contains pronunciations of words or phrases spoken by humans regarding the visual information, and the captions include text of the spoken words or phrases of the audio track;

    (B) stream processing simultaneously in parallel channels, by the computer system, the selected multiple human language tutorial videos by chunking frames of the multiple videos into processor modules, said processor modules each including a video recognition module, a text recognition module, and an audio recognition module, wherein for each said frame of the multiple videos said video recognition module analyzes and identifies the visual information of the video track, said audio recognition module analyzes and identifies the pronunciations of the spoken words or phrases of the audio track, and said text recognition module analyzes and identifies the captions using optical character recognition;

    (C) correlating, by correlation modules of the computer system for each said frame of the multiple videos, the identified visual information with the identified captions and the identified pronunciations of the spoken words or phrases;

    (D) determining, by determination modules of the computer system, confidence scores of accuracy in the identifying the pronunciations of the spoken words or phrases, by comparing the identified audio pronunciations of the spoken words or phrases with a list of pronunciations of benchmark words or phrases stored in files on the computer;

    (E) assigning, by the computer system, the identified pronunciations of the words or phrases having confidence scores equal to or above a predetermined threshold value to the list of pronunciations of benchmark words or phrases stored in the files on the computer; and

    (F) selecting, by the computer system, different human language tutorial videos from the plurality of human language tutorial videos, then repeating steps (B) through (E) on the selected different human language tutorial videos.
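
For illustration only, steps (D) and (E) can be read as a score-and-threshold update of a benchmark list. The Python sketch below is a minimal interpretation under stated assumptions, not the claimed implementation: the claim does not specify how confidence is computed, so a string similarity ratio from the standard library's `difflib.SequenceMatcher` stands in for it; pronunciations are assumed to be space-separated phoneme strings; and the names `confidence`, `update_benchmarks`, and the threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def confidence(identified: str, benchmarks: list[str]) -> float:
    """Step (D), sketched: score an identified pronunciation against benchmarks.

    Assumption: the score is the best similarity ratio (0.0 to 1.0) between the
    identified pronunciation and any stored benchmark pronunciation.
    """
    return max(
        (SequenceMatcher(None, identified, b).ratio() for b in benchmarks),
        default=0.0,  # no benchmarks stored yet, so nothing to compare against
    )


def update_benchmarks(identified, benchmarks, threshold=0.8):
    """Step (E), sketched: keep pronunciations scoring at or above the threshold.

    identified: iterable of (word, pronunciation) pairs from the audio module.
    benchmarks: dict mapping word -> list of benchmark pronunciations (mutated).
    """
    for word, pronunciation in identified:
        known = benchmarks.setdefault(word, [])
        score = confidence(pronunciation, known)
        if score >= threshold and pronunciation not in known:
            known.append(pronunciation)
    return benchmarks


if __name__ == "__main__":
    benchmarks = {"tomato": ["t ah m ey t ow"]}
    batch = [("tomato", "t ah m aa t ow"), ("tomato", "b ah n ae n ah")]
    print(update_benchmarks(batch, benchmarks))
```

Step (F) would then amount to calling `update_benchmarks` again with pronunciations identified from a different batch of videos, so the benchmark list grows across iterations. In this sketch a word with no stored benchmark always scores 0.0 and is never added, which matches a reading of the claim in which the benchmark list exists before processing begins.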
