System and method for identifying word patterns in text
Abstract
A system and method for identifying word patterns in text operates in real time and is well suited to network and Internet use. The system comprises a semantic network, which may be compiled on a local computer or at a remote host, and a software text analysis module that receives the text to be analyzed, parses it, submits it to the semantic network, and receives the results. The method involves receiving a stream of text, breaking the stream into a plurality of threads, tokenizing the words in each thread, and comparing the words to identified words in the semantic network. Recognized words are then examined, together with surrounding words in the text, to determine whether they are part of a word pattern. Word patterns are located at nodes in the semantic network in a hierarchical structure, and certain word patterns correspond to objects of the semantic network. When all word patterns involving a word have been located, links are followed to the objects corresponding to those word patterns. Several nodes may point to a single object, but each object is represented only once in the semantic network. Objects may thus be identified in real time, as the text streams through the text analysis module.
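The structure the abstract describes (word patterns at hierarchical nodes, several nodes linking to one object that is stored only once) can be illustrated with a word-level trie. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `Node`, `SemanticNetwork`, `add_object`, `add_pattern`, and `find_objects`, the object ids, and the regex tokenizer are all hypothetical. For simplicity this version restarts matching at every word, so it may look at a word more than once.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node of the hypothetical network: the word pattern that ends here."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    object_id: Optional[str] = None  # link to an object, if the pattern names one


class SemanticNetwork:
    def __init__(self) -> None:
        self.root = Node()
        # Each object is stored once; any number of nodes may link to it by id.
        self.objects: Dict[str, str] = {}

    def add_object(self, object_id: str, name: str) -> None:
        self.objects[object_id] = name

    def add_pattern(self, pattern: str, object_id: str) -> None:
        """Register a word pattern and link its final node to an object."""
        node = self.root
        for word in pattern.lower().split():
            node = node.children.setdefault(word, Node())
        node.object_id = object_id

    def find_objects(self, text: str) -> List[Tuple[str, str]]:
        """Return (matched pattern, object id) pairs found in the text."""
        words = re.findall(r"[\w']+", text.lower())  # assumed tokenizer
        found: List[Tuple[str, str]] = []
        for start in range(len(words)):
            node = self.root
            # Keep adding following words while they extend a known pattern.
            for end in range(start, len(words)):
                node = node.children.get(words[end])
                if node is None:
                    break
                if node.object_id is not None:
                    found.append((" ".join(words[start:end + 1]), node.object_id))
        return found
```

For example, after registering the patterns "New York" and "New York City" against the same hypothetical object id, both patterns found in a sentence resolve to that one stored object.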
22 Claims
1. A method for identifying objects referenced in a stream of text, the method comprising:
receiving an incoming stream of text comprised of words;
consulting a semantic network to automatically identify one or more word patterns in the incoming stream of text with a single examination of each word; and
referencing a known object identified by a word pattern of the semantic network.
12. A method for identifying objects referenced in a stream of text, the method comprising:
loading a semantic network substantially entirely into RAM memory of a processor, the semantic network comprised of recognized words and patterns of words in a hierarchical order;
receiving an incoming stream of text comprised of words;
tokenizing the stream of text into individual words;
examining the individual words in the stream of text in a sequential order as the words are received by consulting the semantic network within the RAM memory to automatically identify one or more word patterns in the incoming stream of text with a single examination of each individual word in the order that the individual words are received, examining the individual words comprising:
finding a match between an individual word in the stream of text and an identified word in the semantic network and comparing the individual word and an adjacent word of the stream of text to a word pattern in the semantic network, and continually adding words of the stream of text to recognized word patterns and comparing the result to other word patterns in the semantic network until no more word patterns containing the individual word are located;
referencing a known object identified by a word pattern of the semantic network; and
formatting the stream of text to represent identified objects without persistently storing the stream of text.
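The steps of claim 12 can be sketched as a streaming pass in which each incoming word is read once, used to extend any partially matched patterns, and then emitted with tags for the objects it completes. This is only one possible reading of "a single examination of each individual word", not the patent's implementation. The nested-dict network, the reserved `OBJECT` key, the bracketed tag format, and the function names are all assumptions made for illustration, and the sketch assumes chunks never split a word.

```python
import re
from typing import Any, Dict, Iterable, Iterator, List

# Reserved key (assumption) marking a completed pattern. It contains
# characters the tokenizer never produces, so it cannot clash with a word.
OBJECT = "<object>"


def build_network(patterns: Dict[str, str]) -> Dict[str, Any]:
    """Build an in-memory network of nested dicts, one level per word."""
    root: Dict[str, Any] = {}
    for phrase, object_id in patterns.items():
        node = root
        for word in phrase.lower().split():
            node = node.setdefault(word, {})
        node[OBJECT] = object_id
    return root


def annotate(chunks: Iterable[str], network: Dict[str, Any]) -> Iterator[str]:
    """Yield each word as it arrives, tagged with every object whose
    pattern ends at that word. Only open partial matches are retained;
    the text itself is never stored."""
    active: List[Dict[str, Any]] = []  # nodes of patterns still being matched
    for chunk in chunks:
        for word in re.findall(r"[\w']+", chunk):  # tokenize into words
            key = word.lower()
            # Extend every open pattern with this word, and also try to
            # start a new pattern at the root of the network.
            active = [node[key] for node in active + [network] if key in node]
            tags = [node[OBJECT] for node in active if OBJECT in node]
            yield word + "".join(f"[{tag}]" for tag in tags)
```

Because `annotate` is a generator, a caller can forward each formatted word downstream as soon as it is produced, and patterns that span a chunk boundary are still matched because the open partial matches carry over.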
13. A system for identifying objects referenced in a stream of text, the system comprising:
an input pipeline configured to receive an incoming stream of text comprised of words;
a text analysis module configured to consult a semantic network to automatically identify one or more word patterns in the incoming stream of text with a single examination of each word; and
an object association module configured to reference a known object identified by a word pattern of the semantic network.
Specification