SYSTEMS AND METHODS FOR IDENTIFYING CONCEPTS AND KEYWORDS FROM SPOKEN WORDS IN TEXT, AUDIO, AND VIDEO CONTENT
First Claim
1. A system for identifying, summarizing, and communicating topics and keywords included within spoken content, which comprises a server that is configured to:
(a) receive one or more input files containing spoken content from an external source;
(b) process the input files using speech-to-text transcription when the spoken content is formatted as a video or audio file; and
(c) apply an algorithm to the transcribed text in order to analyze the spoken content, wherein the algorithm calculates a total score for each word included within the transcribed text, wherein the total score is calculated using a plurality of metrics which comprise:
(i) a length of each word in relation to a mean length of words;
(ii) frequency of letter groups used within each word;
(iii) frequency of repetition of each word and word sequences;
(iv) a part of speech that is represented by each word; and
(v) membership of each word within a custom set of words.
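The five metrics recited in element (c) can be illustrated with a minimal Python sketch. Several things here are assumptions and do not come from the claim: "letter groups" are treated as character trigrams, "word sequences" as adjacent word pairs, the part of speech is read from a caller-supplied lookup table (a real system would presumably use a tagger), and the metrics are combined by an unweighted sum. The names `score_words`, `char_ngrams`, `pos_lookup`, and `pos_weights` are hypothetical.

```python
import re
from collections import Counter


def char_ngrams(word, n=3):
    """Letter groups of length n; a word no longer than n is its own group."""
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def score_words(text, custom_words=(), pos_lookup=None, pos_weights=None, n=3):
    """Return {word: total score} for a transcript.

    Assumption: the total is a plain sum of the five metrics; the claim
    does not state how the metrics are weighted or combined.
    """
    pos_lookup = pos_lookup or {}    # hypothetical word -> part-of-speech tag
    pos_weights = pos_weights or {}  # hypothetical tag -> score contribution
    custom = {w.lower() for w in custom_words}

    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}

    mean_len = sum(len(t) for t in tokens) / len(tokens)
    word_freq = Counter(tokens)
    gram_freq = Counter(g for t in tokens for g in char_ngrams(t, n))
    total_grams = sum(gram_freq.values())
    bigram_freq = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for word in word_freq:
        # (i) word length relative to the mean word length
        length_score = len(word) / mean_len
        # (ii) mean relative frequency of the word's letter groups
        grams = char_ngrams(word, n)
        group_score = sum(gram_freq[g] for g in grams) / (len(grams) * total_grams)
        # (iii) repetition of the word and of word pairs that contain it
        repeated = [c for pair, c in bigram_freq.items() if word in pair and c > 1]
        repetition_score = (word_freq[word] + max(repeated, default=0)) / len(tokens)
        # (iv) part of speech, via the caller-supplied lookup
        pos_score = pos_weights.get(pos_lookup.get(word), 0.0)
        # (v) membership in the custom word set
        custom_score = 1.0 if word in custom else 0.0
        scores[word] = (length_score + group_score + repetition_score
                        + pos_score + custom_score)
    return scores
```

For example, `score_words("the budget review covers the budget forecast", custom_words={"forecast"})` returns one score per distinct word, with "forecast" raised by its membership in the custom set.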
Abstract
Systems for identifying, summarizing, and communicating topics and keywords included within an input file are disclosed. The systems include a server that receives one or more input files from an external source; conducts a speech-to-text transcription (when the input file is an audio or video file); and applies an algorithm to the text in order to analyze the content therein. The algorithm calculates a total score for each word included within the text using a variety of metrics that include: a length of each word in relation to a mean length of words, the frequency of letter groups used within each word, the frequency of repetition of each word and word sequences, a part of speech that is represented by each word, and membership of each word within a custom set of words. The systems are further capable of generating a graphical representation of each input file, which distinguishes those parts of the input file that exhibit a higher total score from those that do not. In addition, the systems allow users to publish commentary, through an email interface, to such graphical representations of the input files.
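The graphical representation described in the abstract could be approximated, purely for illustration, by averaging per-word scores over consecutive segments of the transcript and marking the segments that meet a threshold. The abstract does not say how the graphic is built, so the fixed-size window, the threshold, and the text-bar output below are all assumptions, and `segment_scores` and `render_bar` are hypothetical names.

```python
def segment_scores(tokens, scores, window=5):
    """Mean per-word score for each consecutive window of tokens.

    Assumption: segments are fixed-size word windows; words without a
    score count as 0.0.
    """
    segments = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        segments.append(sum(scores.get(t, 0.0) for t in chunk) / len(chunk))
    return segments


def render_bar(segments, threshold):
    """One character per segment: '#' for higher-scoring parts, '.' otherwise."""
    return "".join("#" if s >= threshold else "." for s in segments)
```

A caller would pass the transcript's word list and a word-to-score mapping, then choose a threshold (for instance, the mean of the segment scores) to decide which parts are shown as higher scoring.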
10 Claims
Specification