Method and system for automating the analysis of word frequencies
First Claim
1. A method for an automated analysis of word frequencies, the method comprising:
collecting a plurality of opening statements including a plurality of words, the plurality of words including multiple different transcriptions of a particular word or group of words;
specifying a synonym group including the multiple different transcriptions of the particular word or group of words;
establishing a cluster size;
removing a plurality of punctuation from the opening statements;
automatically counting each unique word in the opening statements, wherein the multiple different transcriptions of the particular word or group of words specified in the synonym group are treated as a single unique word;
determining a frequency of occurrence for each of the unique words in the opening statements;
automatically locating a plurality of clusters in the opening statements;
determining a cluster frequency of occurrence for each of the clusters;
storing each of the unique words and a corresponding frequency of occurrence and each of the clusters and a corresponding cluster frequency of occurrence; and
creating an output file sorted by the frequency of occurrence and the cluster frequency of occurrence and including each unique word, the corresponding frequency of occurrence, each cluster, and the corresponding cluster frequency of occurrence.
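The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a cluster is a run of `cluster_size` consecutive words within one statement, that the frequency of occurrence is a raw count, and that words are compared case-insensitively; none of these is stated in the claim. The function names `tokenize`, `analyze`, and `write_report` and the tab-separated output layout are hypothetical.

```python
import string
from collections import Counter


def tokenize(statement, synonyms):
    """Strip punctuation, lowercase (assumption), and collapse synonym variants."""
    cleaned = statement.translate(str.maketrans("", "", string.punctuation))
    # Each variant transcription maps to one canonical token, so all variants
    # are counted as a single unique word.
    return [synonyms.get(word, word) for word in cleaned.lower().split()]


def analyze(statements, synonyms, cluster_size):
    """Return (word_counts, cluster_counts) for a list of statements."""
    word_counts = Counter()
    cluster_counts = Counter()
    for statement in statements:
        words = tokenize(statement, synonyms)
        word_counts.update(words)
        # Assumption: a cluster is cluster_size consecutive words in one statement.
        for i in range(len(words) - cluster_size + 1):
            cluster_counts[" ".join(words[i:i + cluster_size])] += 1
    return word_counts, cluster_counts


def write_report(path, word_counts, cluster_counts):
    """Write words, then clusters, each sorted by descending count."""
    with open(path, "w", encoding="utf-8") as out:
        for word, count in word_counts.most_common():
            out.write(f"{word}\t{count}\n")
        for cluster, count in cluster_counts.most_common():
            out.write(f"{cluster}\t{count}\n")
```

For example, calling `analyze(["Okay, thank you for calling.", "OK thank you for calling"], {"ok": "okay"}, 2)` counts "okay" twice, because the variant "ok" is folded into the same unique word after punctuation is removed.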
Abstract
A method and system for automating the analysis of word frequencies includes a frequency system automatically analyzing a plurality of statements, a count engine, and a cluster engine. The count engine allows for the counting of unique words in the statements and the determination of a frequency of occurrence for each unique word. The frequency system further includes a phrase file allowing for the count engine to specify groups of words as single unique words and a synonym file allowing for the count engine to group one or more words together in synonym groups to be specified as single unique words. The cluster engine locates a plurality of clusters in the statements and determines a cluster frequency of occurrence for each of the clusters. The automated analysis of the statements allows for cost savings, more efficient use of time, and more reliable and consistent word frequency results.
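The role of the phrase file described in the abstract can be sketched as follows. This is an illustration only: the file format (one phrase per line), the case-insensitive matching, and the names `load_phrases` and `merge_phrases` are assumptions, not details taken from the abstract. Each listed group of words is merged into one token so that a later count treats it as a single unique word.

```python
def load_phrases(lines):
    """Parse phrase-file lines into word tuples (assumed: one phrase per line)."""
    return [tuple(line.lower().split()) for line in lines if line.strip()]


def merge_phrases(words, phrases):
    """Replace each occurrence of a listed phrase with a single joined token."""
    # Try longer phrases first so overlapping phrases resolve deterministically.
    ordered = sorted(phrases, key=len, reverse=True)
    merged = []
    i = 0
    while i < len(words):
        for phrase in ordered:
            if tuple(words[i:i + len(phrase)]) == phrase:
                merged.append(" ".join(phrase))
                i += len(phrase)
                break
        else:
            # No phrase starts at this position; keep the word unchanged.
            merged.append(words[i])
            i += 1
    return merged
```

With the hypothetical phrases "credit card" and "thank you", the word list for "thank you for your credit card number" becomes five tokens, two of which are the merged phrases.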
33 Claims
2. A method for automating an analysis of word frequencies, the method comprising:
accessing a plurality of transcribed statements including a plurality of words, the plurality of words including multiple different transcriptions of a particular word or group of words;
specifying a synonym group including the multiple different transcriptions of the particular word or group of words;
automatically analyzing the words in the statements;
determining a frequency of occurrence for each unique word in the statements, wherein the multiple different transcriptions of the particular word or group of words specified in the synonym group are treated as a single unique word; and
creating an output file including each of the unique words and a corresponding frequency of occurrence.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system for automating an analysis of word frequencies, the system comprising:
a plurality of statements including a plurality of words, the plurality of words including multiple different transcriptions of a particular word or group of words;
a synonym file specifying a synonym group including the multiple different transcriptions of the particular word or group of words;
a count engine operable to analyze the statements and determine a frequency of occurrence for each unique word in the statements, wherein the multiple different transcriptions of the particular word or group of words specified in the synonym group are treated as a single unique word; and
a cluster engine associated with the count engine, the cluster engine operable to locate a plurality of clusters in the statements and determine a cluster frequency of occurrence for each of the clusters.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
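One way a synonym file of the kind recited in claim 14 might be read and applied is sketched below. The claim does not define a file format, so the layout used here ("canonical: variant, variant" on each line) is an assumption, as are the function names `parse_synonym_file` and `normalize` and the case-insensitive matching. Matching is done on the text rather than on single tokens so that a variant made of a group of words can also be replaced.

```python
import re


def parse_synonym_file(lines):
    """Map each variant transcription to its canonical form.

    Assumed line format: "canonical: variant, variant, ...".
    """
    groups = {}
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines
        canonical, variants = line.split(":", 1)
        for variant in variants.split(","):
            variant = variant.strip().lower()
            if variant:
                groups[variant] = canonical.strip().lower()
    return groups


def normalize(text, groups):
    """Rewrite every listed variant in the text to its canonical form."""
    text = text.lower()
    if not groups:
        return text
    # Longest variants first so a multi-word variant wins over a shorter one.
    alternatives = sorted(groups, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, alternatives)) + r")\b")
    return pattern.sub(lambda match: groups[match.group(1)], text)
```

After normalization, a count engine would see only the canonical form, so every transcription in a synonym group contributes to the same unique word.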
25. A computer program product for automating an analysis of word frequencies, the computer program product embodied in a computer-readable storage medium and operable to:
access a plurality of transcribed statements including a plurality of words, the plurality of words including multiple different transcriptions of a particular word or group of words;
specifying a synonym group including the multiple different transcriptions of the particular word or group of words;
automatically analyze the words in the statements;
determine a frequency of occurrence for each unique word in the statements, wherein the multiple different transcriptions of the particular word or group of words specified in the synonym group are treated as a single unique word; and
create an output file including each of the unique words and a corresponding frequency of occurrence.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 33
Specification