Information processing analysis system for sorting and scoring text
Abstract
A method and system for text analysis provides that text messages perceived by a population can be scored to determine the extent to which the messages favor one or more specified positions on a specified issue. A method and system for predicting public opinion based on message scores provides that the extent to which messages favor one or more specified positions can be used to determine the effect on the opinions of a specified population and to determine changes in the percentages of subpopulations within said specified population which favor said one or more specified positions.
38 Claims
1. A system for determining the presence within a text message of one or more predetermined ideas wherein the text of said message is in digital form and includes components in a human language, said system comprising:
a) means for searching said message for a plurality of predetermined scoring words and identifying words in said message which match with said scoring words;
b) means for determining the sequence within said message of matching scoring words and the distance between certain ones of said matching scoring words; and
c) means for identifying which said ideas are present in said message according to one or more predetermined rules, each of said rules specifying a relationship between one or more of said matching scoring words.
Dependent claims: 2-14
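
The three elements of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the tokenizer, the `Match` and `Rule` types, the example scoring words, and the choice to measure distance in word positions are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    word: str      # the scoring word that matched
    position: int  # index of the word in the tokenized message


@dataclass(frozen=True)
class Rule:
    first: str         # scoring word that must come first
    second: str        # scoring word that must follow it
    max_distance: int  # largest allowed gap, in word positions (assumed unit)
    idea: str          # idea reported when the rule fires


def find_matches(message, scoring_words):
    """Element (a): find words in the message that match scoring words."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return [Match(tok, i) for i, tok in enumerate(tokens) if tok in scoring_words]


def identify_ideas(matches, rules):
    """Elements (b) and (c): apply sequence and distance rules to the matches."""
    ideas = set()
    for rule in rules:
        for a in matches:
            for b in matches:
                gap = b.position - a.position
                if (a.word == rule.first and b.word == rule.second
                        and 0 < gap <= rule.max_distance):
                    ideas.add(rule.idea)
    return ideas


# Hypothetical scoring words and rules.
scoring_words = {"support", "oppose", "tax"}
rules = [
    Rule("support", "tax", 4, "pro-tax"),
    Rule("oppose", "tax", 4, "anti-tax"),
]
message = "Voters in the district strongly support the new tax."
matches = find_matches(message, scoring_words)
print(identify_ideas(matches, rules))  # {'pro-tax'}
```

Here "support" is found at word position 5 and "tax" at position 8, so the first rule fires because the words appear in the required order and within four positions of each other.
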
15. A system for reducing a text message into its essential message components and identifying the presence within said message of one or more predetermined ideas wherein the text of said message is stored in digital form and includes components in a human language, said system comprising:
computer means including;
a) a listing of predetermined concept categories and a dictionary of predetermined identifying words wherein each of said identifying words is a text representation corresponding to one of said concept categories, and wherein said concept categories represent a predetermined concept;
b) a set of predetermined text analysis rules wherein said text analysis rules define relationships between one or more concept categories;
c) means for dividing the text of said message into specified blocks of text, wherein said blocks can include all or a subset of the text of said message;
d) means for searching each said block of text for a first plurality of words in said message which match with said predetermined identifying words, such instance of a plurality being a set of matching words;
e) means for determining the sequence of said matching words in said block of text and the distance between pairs of said matching words wherein distance is a numeric representation of the quantity of text between the said pair of matching words in said text;
f) means for analyzing said matching words, said sequence of matching words, and said distances between pairs of matching words to select blocks of text in said message according to one or more of said text analysis rules, wherein said text analysis rules define a relationship between one or more matching words that identifies said blocks of text as relevant to a said idea;
g) means for searching each said relevant block of text in said message for a plurality of words which match with said predetermined identifying words;
h) means for determining the sequence within said block of text of matching identifying words and the distance between each pair of matching identifying words; and
i) means for determining within each relevant block of text using said predetermined text analysis rules, the quantities of text favoring each of said ideas.
Dependent claims: 16-33
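
The two-pass structure of claim 15 (filter blocks for relevance, then score the relevant ones) can be sketched as follows. Everything specific here is an assumption for illustration: the dictionary entries, the category names, treating a paragraph as a block, the relevance rule, and using a block's word count as the "quantity of text" credited to an idea.

```python
import re
from collections import defaultdict

# Hypothetical dictionary: identifying word -> concept category (element a).
DICTIONARY = {
    "tax": "ISSUE", "levy": "ISSUE",
    "support": "PRO", "favor": "PRO",
    "oppose": "CON", "reject": "CON",
}

# Hypothetical rules (element b): (first category, second category, max distance, idea).
SCORING_RULES = [
    ("PRO", "ISSUE", 5, "pro-tax"),
    ("CON", "ISSUE", 5, "anti-tax"),
]


def split_blocks(message):
    """Element (c): here a block is a paragraph delimited by a blank line."""
    return [b.strip() for b in re.split(r"\n\s*\n", message) if b.strip()]


def tokenize(block):
    return re.findall(r"[a-z']+", block.lower())


def category_matches(tokens):
    """Elements (d), (e), (g), (h): (position, category) pairs in text order."""
    return [(i, DICTIONARY[t]) for i, t in enumerate(tokens) if t in DICTIONARY]


def is_relevant(matches):
    """Element (f): assumed relevance rule, the block must mention the issue."""
    return any(cat == "ISSUE" for _, cat in matches)


def score_message(message):
    """Element (i): words of text credited to each idea across relevant blocks."""
    totals = defaultdict(int)
    for block in split_blocks(message):
        tokens = tokenize(block)
        matches = category_matches(tokens)
        if not is_relevant(matches):
            continue
        for first, second, max_dist, idea in SCORING_RULES:
            fired = any(
                c1 == first and c2 == second and 0 < p2 - p1 <= max_dist
                for p1, c1 in matches for p2, c2 in matches
            )
            if fired:
                totals[idea] += len(tokens)  # block length as the quantity of text
    return dict(totals)


message = """Most residents support the proposed tax.

The weather was mild all week.

Business owners oppose any new levy on sales."""
print(score_message(message))  # {'pro-tax': 6, 'anti-tax': 8}
```

The middle paragraph contains no identifying words, so it is discarded in the relevance pass and contributes nothing to either total.
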
34. A system for determining the presence within a text message of one or more predetermined ideas wherein the text of said message is in digital form and includes components in a human language, said system comprising:
a) means for searching said message for a plurality of predetermined scoring words and identifying words in said message which match with said scoring words;
b) means for determining the sequence within said message of matching scoring words and the distance between certain ones of said matching scoring words; and
c) means for identifying which said ideas are present in said message according to one or more predetermined rules, each of said rules specifying a relationship between one or more of said matching scoring words; said means further including;
d) means for dividing the text of said message into specified blocks of text, wherein said blocks can include all or a subset of the text of said message;
e) means for making summary representations of each said block of text wherein each important concept conveyed by said message is reduced to a single concept symbol, and wherein each important concept is identified from;
   1) one or more specified words within said block of text;
   2) a quantity of text between said one or more specified words within said block of text;
   3) the sequence within said message of said one or more specified words; and
f) means for altering specified blocks of text.
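
One possible reading of the summary representation in element (e) of claim 34 is a list of concept symbols that preserves word order and spacing. The sketch below assumes a hypothetical symbol table and a hypothetical negation rule that rewrites the summary; the claim itself does not specify either, and the rewrite step is only one way a block's representation might be altered.

```python
import re

# Hypothetical concept symbols: each specified word collapses to one symbol.
SYMBOLS = {"not": "N", "never": "N", "support": "P", "favor": "P", "tax": "I"}


def summarize(block):
    """Reduce a block to (symbol, position) pairs, keeping sequence and spacing."""
    tokens = re.findall(r"[a-z']+", block.lower())
    return [(SYMBOLS[t], i) for i, t in enumerate(tokens) if t in SYMBOLS]


def apply_negation(summary, max_distance=2):
    """Assumed rewrite rule: 'N' closely followed by 'P' becomes one 'C' symbol."""
    out = []
    i = 0
    while i < len(summary):
        sym, pos = summary[i]
        nxt = summary[i + 1] if i + 1 < len(summary) else None
        if sym == "N" and nxt and nxt[0] == "P" and nxt[1] - pos <= max_distance:
            out.append(("C", nxt[1]))  # merged symbol takes the later position
            i += 2
        else:
            out.append((sym, pos))
            i += 1
    return out


summary = summarize("The council does not support the tax.")
print(summary)                  # [('N', 3), ('P', 4), ('I', 6)]
print(apply_negation(summary))  # [('C', 4), ('I', 6)]
```

Working on the symbol list rather than the raw text means later rules only need to compare symbols, their order, and the gaps between their positions.
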
35. A system for determining the presence within a text message of one or more predetermined ideas wherein the text of said message is in digital form and includes components in a human language, said system comprising:
a) means for searching said message for a plurality of predetermined scoring words and identifying words in said message which match with said scoring words;
b) means for determining the sequence within said message of matching scoring words and the distance between certain ones of said matching scoring words;
c) means for transforming said message to make a summary representation thereof;
d) means for identifying which said ideas are present in said message according to one or more predetermined rules, each of said rules specifying a relationship between one or more of said matching scoring words.
Dependent claims: 36
37. A system for reducing a text message into its essential message components and identifying the presence within said message of one or more predetermined ideas wherein the text of said message is stored in digital form and includes components in a human language, said system comprising:
computer means including;
a) a listing of predetermined concept categories and a dictionary of predetermined identifying words wherein each of said identifying words is a text representation corresponding to one of said concept categories, and wherein each said concept category represents a predetermined concept;
b) a set of predetermined text analysis rules wherein said text analysis rules define relationships between one or more concept categories;
c) means for dividing the text of said message into specified blocks of text, wherein said blocks can include all or a subset of the text of said message;
d) means for searching each said block of text for a first plurality of words in said message which match with said predetermined identifying words, such instance of a plurality being a set of matching words;
e) means for determining the sequence of said matching words in said block of text and the distance between pairs of said matching words wherein distance is a numeric representation of the quantity of text between the said pair of matching words in said text;
f) means for analyzing said matching words, said sequence of matching words, and said distances between pairs of matching words to select blocks of text in said message according to one or more of said text analysis rules, wherein said text analysis rules define a relationship between one or more matching words and identify relevant blocks of text as relevant to said ideas;
g) means for searching each said relevant block of text in said message for a plurality of words which match with said predetermined identifying words;
h) means for determining the sequence within said block of text of matching identifying words and the distance between each pair of matching identifying words;
i) means for transforming said block of text to make a summary representation thereof;
j) means for determining within said relevant block of text using said predetermined text analysis rules, the quantities of text favoring each of said ideas.
Dependent claims: 38
Specification