SYSTEMS, METHODS AND COMPUTER PROGRAM PRODUCTS FOR BUILDING A DATABASE ASSOCIATING N-GRAMS WITH COGNITIVE MOTIVATION ORIENTATIONS
First Claim
1. A computer-implemented method for building an analysis database associating each of a plurality of n-grams with corresponding respective cognitive motivation orientations, comprising:
receiving a training corpus of training documents in electronic form;
each training document comprising a plurality of meaningfully arranged words;
each training document having at least one annotated word sequence therein;
wherein within each training document, each particular annotated word sequence is annotated with a corresponding word-sequence-level annotation identifying at least one cognitive motivation orientation that is associated with that particular annotated word sequence;
for each training document:
for each annotated word sequence in that particular training document:
extracting n-grams overlapping that particular annotated word sequence; and
associating each extracted n-gram with the at least one cognitive motivation orientation associated with that particular annotated word sequence;
generating a set of indicator candidate n-grams wherein:
each indicator candidate n-gram represents all instances of a particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator candidate n-gram being associated with every cognitive motivation orientation that is associated with at least one instance of the particular n-gram represented by that particular indicator candidate n-gram;
applying at least one relevance filter to each indicator candidate n-gram in the set of indicator candidate n-grams to obtain a set of indicator n-grams, wherein:
the set of indicator n-grams is a subset of the set of indicator candidate n-grams, so that each indicator n-gram corresponds to only one indicator candidate n-gram and thereby each indicator n-gram represents all instances of a corresponding particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator n-gram is associated with only a single cognitive motivation orientation; and
each indicator n-gram has, as its associated single cognitive motivation orientation, that single cognitive motivation orientation with which the instances of the particular n-gram represented by that particular indicator n-gram are most frequently associated.
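The extraction, association and candidate-generation steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the data layout (token lists with annotations given as `(start, end, orientations)` token spans), the function names `ngrams_overlapping` and `build_candidates`, the `n_max` parameter and the orientation labels are all assumptions made for illustration. "Overlapping" is read here as "sharing at least one word with the annotated word sequence".

```python
from collections import Counter, defaultdict


def ngrams_overlapping(tokens, start, end, n_max=3):
    """Yield every n-gram (n = 1..n_max) that shares at least one token
    with the annotated span tokens[start:end]."""
    for n in range(1, n_max + 1):
        first = max(0, start - n + 1)         # earliest start that still reaches the span
        last = min(end - 1, len(tokens) - n)  # latest start that overlaps and fits the document
        for i in range(first, last + 1):
            yield tuple(tokens[i:i + n])


def build_candidates(corpus, n_max=3):
    """Build indicator candidate n-grams (hypothetical layout).

    corpus: list of (tokens, annotations), where annotations is a list of
            (start, end, orientations) token spans.
    Returns {n-gram: {"orientations": {label: count}, "corpus_count": int}}.
    """
    orientation_counts = defaultdict(Counter)
    for tokens, annotations in corpus:
        for start, end, orientations in annotations:
            for gram in ngrams_overlapping(tokens, start, end, n_max):
                # Associate the extracted n-gram with every orientation of the span.
                for label in orientations:
                    orientation_counts[gram][label] += 1

    # A candidate represents all instances of its n-gram in the corpus,
    # including instances that fall outside any annotated span.
    corpus_counts = Counter()
    for tokens, _ in corpus:
        for n in range(1, n_max + 1):
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                if gram in orientation_counts:
                    corpus_counts[gram] += 1

    return {
        gram: {"orientations": dict(counts), "corpus_count": corpus_counts[gram]}
        for gram, counts in orientation_counts.items()
    }
```

For example, calling `build_candidates([("we must avoid risk".split(), [(2, 4, ["ORIENT_A"])])], n_max=2)` would produce candidates for the n-grams that touch "avoid risk", each tagged with the placeholder label `ORIENT_A`.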
Abstract
Computer-implemented methods can transform a corpus of meaningful text sequences into a generalized computer-usable repository of neurolinguistic information that can be applied by one or more computer systems. The computer system(s) can use the neurolinguistic information to neurolinguistically analyze meaningful text sequences to derive statistical information and identify dominant cognitive motivation orientations expressed in those text sequences. The identified dominant cognitive motivation orientations can be used to improve the efficacy of both human-generated and machine-generated communications. The computer system(s) thereby transform a meaningful text sequence into actionable information about the dominant cognitive motivation orientation(s) of the author of that text sequence within the context in which the text sequence was composed. Computer systems and computer-program products for implementing the methods are also described.
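The analysis step described in the abstract, in which stored n-gram information is applied to a new text sequence to identify its dominant orientation(s), might look roughly like the sketch below. The abstract does not say how dominance is computed, so the simple vote count used here is an assumption, as are the function name `dominant_orientations`, the `indicators` mapping from n-gram tuples to a single orientation label, and the `n_max` parameter.

```python
from collections import Counter


def dominant_orientations(tokens, indicators, n_max=3):
    """Tally the orientations of all indicator n-grams found in tokens.

    indicators: {n-gram tuple: orientation label} (hypothetical layout).
    Returns (tally, dominant), where dominant lists the label(s) with the
    highest count, sorted alphabetically; it is empty if nothing matched.
    """
    tally = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            label = indicators.get(tuple(tokens[i:i + n]))
            if label is not None:
                tally[label] += 1  # one vote per matching n-gram instance
    if not tally:
        return tally, []
    best = max(tally.values())
    return tally, sorted(label for label, count in tally.items() if count == best)
```

Returning the full tally alongside the dominant label(s) keeps the statistical information available to the caller, and returning a list rather than a single label leaves ties visible instead of resolving them arbitrarily.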
18 Claims
1. A computer-implemented method for building an analysis database associating each of a plurality of n-grams with corresponding respective cognitive motivation orientations, comprising:
receiving a training corpus of training documents in electronic form;
each training document comprising a plurality of meaningfully arranged words;
each training document having at least one annotated word sequence therein;
wherein within each training document, each particular annotated word sequence is annotated with a corresponding word-sequence-level annotation identifying at least one cognitive motivation orientation that is associated with that particular annotated word sequence;
for each training document:
for each annotated word sequence in that particular training document:
extracting n-grams overlapping that particular annotated word sequence; and
associating each extracted n-gram with the at least one cognitive motivation orientation associated with that particular annotated word sequence;
generating a set of indicator candidate n-grams wherein:
each indicator candidate n-gram represents all instances of a particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator candidate n-gram being associated with every cognitive motivation orientation that is associated with at least one instance of the particular n-gram represented by that particular indicator candidate n-gram;
applying at least one relevance filter to each indicator candidate n-gram in the set of indicator candidate n-grams to obtain a set of indicator n-grams, wherein:
the set of indicator n-grams is a subset of the set of indicator candidate n-grams, so that each indicator n-gram corresponds to only one indicator candidate n-gram and thereby each indicator n-gram represents all instances of a corresponding particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator n-gram is associated with only a single cognitive motivation orientation; and
each indicator n-gram has, as its associated single cognitive motivation orientation, that single cognitive motivation orientation with which the instances of the particular n-gram represented by that particular indicator n-gram are most frequently associated.
Dependent claims: 2, 3, 4, 5, 6
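The final two limitations of claim 1, the relevance filter and the reduction to a single most frequently associated orientation, can be sketched as follows. The claim does not specify what the relevance filter is; the two thresholds used here (`min_count` and `min_share`) are assumptions chosen only to make the example concrete, as are the function name `select_indicators` and the input layout `{n-gram: {orientation: count}}`.

```python
def select_indicators(candidates, min_count=2, min_share=0.6):
    """Reduce indicator candidates to indicator n-grams (illustrative only).

    candidates: {n-gram: {orientation label: association count}}.
    A candidate survives the assumed relevance filter if it was associated
    at least min_count times in total and its most frequent orientation
    accounts for at least min_share of those associations.
    Returns {n-gram: single orientation label}.
    """
    indicators = {}
    for gram, counts in candidates.items():
        total = sum(counts.values())
        # Most frequently associated orientation; on a tie, max() keeps the
        # first label encountered in the dict's insertion order.
        label, top = max(counts.items(), key=lambda item: item[1])
        if total >= min_count and top / total >= min_share:
            indicators[gram] = label
    return indicators
```

Because the function only ever drops entries, its output is a subset of the candidate set, and each surviving n-gram maps to exactly one orientation, which matches the structure the claim requires of the indicator n-grams.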
7. A database-building data processing system configured for building an analysis database associating each of a plurality of n-grams with corresponding respective cognitive motivation orientations, the system comprising:
a host computer with memory and at least one processor coupled to the memory; and
a database-building module, the database-building module comprising program code that, when executed in the memory of the host computer:
receives a training corpus of training documents, wherein:
each training document comprises a plurality of meaningfully arranged words in electronic form;
each training document has at least one annotated word sequence therein;
wherein within each training document, each particular annotated word sequence is annotated with a corresponding word-sequence-level annotation identifying at least one cognitive motivation orientation that is associated with that particular annotated word sequence;
for each training document:
for each annotated word sequence in that particular training document:
extracts n-grams overlapping that particular annotated word sequence; and
associates each extracted n-gram with the at least one cognitive motivation orientation associated with that particular annotated word sequence;
generates a set of indicator candidate n-grams wherein:
each indicator candidate n-gram represents all instances of a particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator candidate n-gram being associated with every cognitive motivation orientation that is associated with at least one instance of the particular n-gram represented by that particular indicator candidate n-gram;
applies at least one relevance filter to each indicator candidate n-gram in the set of indicator candidate n-grams to obtain a set of indicator n-grams, wherein:
the set of indicator n-grams is a subset of the set of indicator candidate n-grams, so that each indicator n-gram corresponds to only one indicator candidate n-gram and thereby each indicator n-gram represents all instances of a corresponding particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document; and
each indicator n-gram is associated with only a single cognitive motivation orientation;
wherein each indicator n-gram has, as its associated single cognitive motivation orientation, that single cognitive motivation orientation with which the instances of the particular n-gram represented by that particular indicator n-gram are most frequently associated.
Dependent claims: 8, 9, 10, 11, 12
13. A computer program product for building an analysis database associating each of a plurality of n-grams with corresponding respective cognitive motivation orientations, the computer program product comprising:
a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code adapted to, when executed by a computer, cause the computer to receive a training corpus of training documents, wherein:
each training document comprises a plurality of meaningfully arranged words in electronic form;
each training document has at least one annotated word sequence therein;
wherein, within each training document, each particular annotated word sequence is annotated with a corresponding word-sequence-level annotation identifying at least one cognitive motivation orientation that is associated with that particular annotated word sequence;
computer readable program code adapted to, when executed by a computer, cause the computer to, for each training document:
for each annotated word sequence in that particular training document:
extract n-grams overlapping that particular annotated word sequence; and
associate each extracted n-gram with the at least one cognitive motivation orientation associated with that particular annotated word sequence;
computer readable program code adapted to, when executed by a computer, cause the computer to generate indicator candidate n-grams wherein:
each indicator candidate n-gram represents all instances of a particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document;
each indicator candidate n-gram being associated with every cognitive motivation orientation that is associated with at least one instance of the particular n-gram represented by that particular indicator candidate n-gram;
computer readable program code adapted to, when executed by a computer, cause the computer to apply at least one relevance filter to each indicator candidate n-gram in the set of indicator candidate n-grams to obtain a set of indicator n-grams, wherein:
the set of indicator n-grams is a subset of the set of indicator candidate n-grams, so that each indicator n-gram corresponds to only one indicator candidate n-gram and thereby each indicator n-gram represents all instances of a corresponding particular n-gram in the training corpus for which at least one instance of that particular n-gram was extracted from any annotated word sequence in any training document; and
each indicator n-gram is associated with only a single cognitive motivation orientation;
wherein each indicator n-gram has, as its associated single cognitive motivation orientation, that single cognitive motivation orientation with which the instances of the particular n-gram represented by that particular indicator n-gram are most frequently associated.
Dependent claims: 14, 15, 16, 17, 18
Specification