System and method for generating semantic analysis of textual information
First Claim
1. A system for receiving an information stream in the form of textual information comprising a series of information elements and generating therefrom respective classifications for said information elements from a plurality of predetermined classifications, said system comprising:
A. a token generator configured to receive the information stream, parse the information stream to identify the respective information elements, identify for each information element one of a plurality of element types, and generate for each information element a token identifying the information element's element type;
B. a token classifier configured to receive the tokens and generate a classification to classify each said token in relation to the element type associated with said respective token, classifications generated for previously-classified tokens and the types of previous and successive tokens, thereby to determine the semantic content of the information associated with the tokens.
Abstract
A system receives an information stream comprising the textual information whose semantic content is to be determined, divides the information stream into a series of elements and classifies each element into one of a plurality of predetermined classifications. The system includes a token generator and a token classifier. The token generator receives the textual information stream, parses the stream to identify the respective elements, identifies for each element one of a plurality of element types, and generates a token identifying the element type for each element. At least some of the tokens also include a pointer pointing to the actual information associated with the element. The token classifier receives the tokens and classifies them in order. In that operation, the token classifier classifies each token in relation to the token's type, classifications for previously-classified tokens and the types of successive tokens, thereby to determine the semantic content of the information associated with the tokens. After the tokens are classified, the information associated therewith can be loaded into a database system according to their classifications, and conventional database tools used to obtain information therefrom.
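The two-stage arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the element types (`WORD`, `NUMBER`, `COLON`, `PUNCT`), the classification labels (`FIELD_NAME`, `SEPARATOR`, `QUANTITY`, `UNIT`, `FIELD_VALUE`, `OTHER`), the rules, and the names `Token`, `generate_tokens` and `classify_tokens` are all assumptions chosen for illustration. Each token carries its element type plus start and end offsets that stand in for the pointer to the underlying text, and the classifier walks the tokens in order, consulting the token's own type, the classification given to the preceding token, and the type of the following token.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical element types; the abstract does not enumerate them.
TOKEN_PATTERN = re.compile(
    r"(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<WORD>[A-Za-z]+)"
    r"|(?P<COLON>:)"
    r"|(?P<PUNCT>[^\sA-Za-z0-9])"
)


@dataclass
class Token:
    element_type: str  # one of the hypothetical element types above
    start: int         # offsets act as the "pointer" to the actual text
    end: int


def generate_tokens(stream: str) -> List[Token]:
    """Token generator: parse the stream and emit one typed token per element."""
    return [
        Token(m.lastgroup, m.start(), m.end())
        for m in TOKEN_PATTERN.finditer(stream)
    ]


def classify_tokens(tokens: List[Token]) -> List[str]:
    """Token classifier: label tokens in order using illustrative rules."""
    classes: List[str] = []
    for i, tok in enumerate(tokens):
        # Classification already assigned to the previous token, if any.
        prev_class: Optional[str] = classes[-1] if classes else None
        # Element type of the next (not yet classified) token, if any.
        next_type: Optional[str] = (
            tokens[i + 1].element_type if i + 1 < len(tokens) else None
        )
        if tok.element_type == "COLON":
            label = "SEPARATOR"
        elif tok.element_type == "WORD" and next_type == "COLON":
            label = "FIELD_NAME"
        elif tok.element_type == "NUMBER" and next_type == "WORD":
            label = "QUANTITY"
        elif tok.element_type == "WORD" and prev_class == "QUANTITY":
            label = "UNIT"
        elif tok.element_type in ("WORD", "NUMBER") and prev_class == "SEPARATOR":
            label = "FIELD_VALUE"
        else:
            label = "OTHER"
        classes.append(label)
    return classes


if __name__ == "__main__":
    text = "weight: 12 kg, color: red"
    tokens = generate_tokens(text)
    for tok, label in zip(tokens, classify_tokens(tokens)):
        print(text[tok.start:tok.end], tok.element_type, label)
```

In this sketch the labelled text could then be written to database columns keyed by label, in the spirit of the loading step the abstract mentions.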
70 Citations
15 Claims
1. A system for receiving an information stream in the form of textual information comprising a series of information elements and generating therefrom respective classifications for said information elements from a plurality of predetermined classifications, said system comprising:
A. a token generator configured to receive the information stream, parse the information stream to identify the respective information elements, identify for each information element one of a plurality of element types, and generate for each information element a token identifying the information element's element type;
B. a token classifier configured to receive the tokens and generate a classification to classify each said token in relation to the element type associated with said respective token, classifications generated for previously-classified tokens and the types of previous and successive tokens, thereby to determine the semantic content of the information associated with the tokens.
Dependent claims: 2, 3, 4, 5
6. A method of receiving an information stream in the form of textual information comprising a series of information elements and generating therefrom respective classifications for said information elements from a plurality of predetermined classifications, said method comprising the steps of:
A. receiving the information stream, parsing the information stream to identify the respective information elements, identifying for each information element one of a plurality of element types, and generating for each information element a token identifying the information element's element type;
B. generating for each token a classification to classify each said token in relation to the element type associated with said respective token, classifications generated for previously-classified tokens and the types of previous and successive tokens, thereby to determine the semantic content of the information associated with the tokens.
Dependent claims: 7, 8, 9, 10
11. A computer program product for controlling a computer to receiving an information stream in the form of textual information comprising a series of information elements and generating therefrom respective classifications for said information elements from a plurality of predetermined classifications, said computer program product comprising a machine-readable medium having encoded thereon:
A. a token generator module configured to enable the computer to receive the information stream, parse the information stream to identify the respective information elements, identify for each information element one of a plurality of element types, and generate for each information element a token identifying the information element's element type;
B. a token classifier module configured to enable the computer to, for each token, generate a classification to classify each said token in relation to the element type associated with said respective token, classifications generated for previously-classified tokens and the types of previous and successive tokens, thereby to determine the semantic content of the information associated with the tokens.
Dependent claims: 12, 13, 14, 15
Specification