Processing Text with Domain-Specific Spreading Activation Methods
Abstract
A method for performing natural language processing of free text using domain-specific spreading activation. Embodiments of the present invention ontologize free text using an algorithm based on neurocognitive theory by simulating human recognition, semantic, and episodic memory approaches. Embodiments of the invention may be used to process clinical text for assignment of billing codes, analyze suicide notes or legal discovery materials, and for processing other collections of text. Further, embodiments of the invention may be used to more effectively search large databases, such as a database containing a large number of medical publications.
19 Claims
1. A method comprising:
identifying one or more of a plurality of groups of characters of a text as corresponding to at least one of a plurality of known words;
creating a list of the identified known words;
querying a first database to obtain a set of one or more semantic concepts associated with each of the identified known words, the first database comprising associations between the plurality of known words and a plurality of semantic concepts;
annotating the list of identified known words with the first set of semantic concepts associated with each identified known word;
querying a second database to obtain a set of one or more episodic concepts associated with the set of semantic concepts, the second database comprising associations between a plurality of episodic concepts and at least one of the plurality of known words and the plurality of semantic concepts, the plurality of episodic concepts being separate from the plurality of semantic concepts;
creating a semantic network having a plurality of nodes corresponding to the first and second sets of semantic and episodic concepts and weighted links between the first and second sets of semantic and episodic concepts;
utilizing spreading activation algorithms to refine the weighted links in the semantic network; and
selecting at least one of the concepts from the sets of semantic and episodic concepts based upon an associated weight for the at least one node derived from the step of utilizing spreading activation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
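The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the two dictionaries standing in for the first and second databases, every concept name, the `decay` parameter, and all function names are hypothetical and chosen only for illustration. As a simplification, the sketch keeps the link weights fixed and performs a single spreading pass that derives a weight for each episodic node, then selects the highest-weighted one.

```python
from collections import defaultdict

# Hypothetical toy databases; their contents are invented for illustration.
# "First database": known word -> semantic concepts.
SEMANTIC_DB = {
    "chest": ["body_region_thorax"],
    "pain": ["symptom_pain"],
    "cough": ["symptom_cough"],
}
# "Second database": semantic concept -> {episodic concept: link weight}.
EPISODIC_DB = {
    "body_region_thorax": {"dx_angina": 0.6, "dx_pneumonia": 0.3},
    "symptom_pain": {"dx_angina": 0.7},
    "symptom_cough": {"dx_pneumonia": 0.8},
}

def identify_known_words(text):
    """Return the tokens of `text` that appear in the known-word list."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [t for t in tokens if t in SEMANTIC_DB]

def annotate(known_words):
    """Attach the associated semantic concepts to each known word."""
    return {word: SEMANTIC_DB[word] for word in known_words}

def build_network(annotated):
    """Create weighted links from semantic nodes to episodic nodes."""
    links = defaultdict(dict)
    for concepts in annotated.values():
        for sem in concepts:
            for epi, weight in EPISODIC_DB.get(sem, {}).items():
                links[sem][epi] = weight
    return links

def spread_activation(links, decay=0.5):
    """Single spreading pass: every semantic node starts at 1.0 and sends
    decay * link weight to each episodic node it is linked to."""
    activation = defaultdict(float)
    for targets in links.values():
        for epi, weight in targets.items():
            activation[epi] += decay * 1.0 * weight
    return dict(activation)

def process(text):
    """Run the whole sketch; return (selected concept or None, weights)."""
    words = identify_known_words(text)
    activation = spread_activation(build_network(annotate(words)))
    if not activation:
        return None, activation
    return max(activation, key=activation.get), activation
```

Under these assumed tables, `process("Chest pain and cough.")` selects `dx_angina`, because two of the three semantic concepts feed activation into that node.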
9. A method for performing a method for processing a text containing natural language, the method comprising:
tagging parts of speech in the text;
recognizing known words in the text;
creating a semantic network, the semantic network including at least one of the recognized known words and at least one relationship with at least one semantic concept associated with at least one of the recognized known words; and
supplementing the semantic network by iteratively adding additional concepts and additional relationships to the semantic network until a termination requirement is met, each additional concept being associated with at least a prior one of the concepts and additional concepts in the semantic network by a respective additional relationship, at least one of the additional concepts being an episodic concept separate from the at least one semantic concept.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
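The supplementing step of claim 9 can be pictured as a breadth-first expansion that stops when a termination requirement is met. The sketch below is one possible reading, not the patent's algorithm: the `RELATIONS` table, the concept and relation names, and the two termination parameters (`max_rounds`, `max_nodes`) are all hypothetical. Part-of-speech tagging and word recognition are assumed to have happened upstream and to have produced the seed concepts.

```python
# Hypothetical relation table, invented for illustration:
# concept -> [(related concept, relation label, kind of related concept)]
RELATIONS = {
    "symptom_cough": [("finding_respiratory", "is_a", "semantic")],
    "finding_respiratory": [("dx_pneumonia", "seen_with", "episodic")],
    "dx_pneumonia": [("rx_antibiotic", "treated_by", "episodic")],
}

def expand_network(seed_concepts, max_rounds=5, max_nodes=10):
    """Iteratively add concepts and relationships to the network.

    Terminates when no new concepts are found, when `max_rounds`
    iterations have run, or when the network holds `max_nodes` nodes.
    Returns (nodes, edges): nodes maps concept -> kind, and edges is a
    list of (existing concept, relation label, added concept) triples.
    """
    nodes = {concept: "semantic" for concept in seed_concepts}
    edges = []
    frontier = list(seed_concepts)
    rounds = 0
    while frontier and rounds < max_rounds and len(nodes) < max_nodes:
        next_frontier = []
        for concept in frontier:
            for related, label, kind in RELATIONS.get(concept, []):
                # Skip concepts already present, and stop adding once full.
                if related in nodes or len(nodes) >= max_nodes:
                    continue
                nodes[related] = kind
                edges.append((concept, label, related))
                next_frontier.append(related)
        frontier = next_frontier
        rounds += 1
    return nodes, edges
```

Each added concept is linked to a concept already in the network, which mirrors the requirement that every additional concept be associated with a prior one by a respective relationship.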
17. A method for processing natural language, comprising:
identifying one or more of a plurality of groups of characters of a text as corresponding to at least one of a plurality of known words;
creating a list of the identified known words;
querying one or more databases to obtain a first set of semantic concepts associated with each of the identified known words, the one or more databases including associations between a plurality of known words and a plurality of concepts, and including quantitative values representative of a strength of a relationship between the plurality of concepts;
annotating the list of identified known words with the first set of semantic concepts associated with each identified known word;
creating a semantic network having a plurality of nodes corresponding to the first set of semantic concepts;
iteratively expanding the semantic network with additional concepts taken from the one or more databases and linked to respective nodes in the semantic network to iteratively add new nodes to the semantic network for such additional concepts, each new node including a weighted link with an existing node, the additional concepts being separate from the first set of semantic concepts and including at least one episodic concept; and
selecting at least one of the concepts from the combination of the first set of concepts and the additional concepts based upon a value of the weighted link included with the node associated with the at least one selected concept.
Dependent claims: 18, 19
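Claim 17 differs from the other independent claims in that the databases carry quantitative relationship strengths and the final selection is based on the weighted link stored with each added node. The following sketch shows one way this could work, under stated assumptions: the `STRENGTHS` table and its values are invented, and the rule that a new link's weight equals the parent node's weight multiplied by the database strength is an assumption of this example, as is the `min_weight` cutoff used to end the expansion.

```python
# Hypothetical strength table, invented for illustration:
# concept -> {related concept: relationship strength in [0, 1]}
STRENGTHS = {
    "symptom_cough": {"dx_pneumonia": 0.8, "dx_asthma": 0.5},
    "dx_pneumonia": {"billing_code_A": 0.9},
    "dx_asthma": {"billing_code_B": 0.9},
}

def expand_weighted(seed_concepts, min_weight=0.3):
    """Expand the network, storing one weighted link per new node.

    Returns a dict: new concept -> (existing parent concept, link weight).
    A candidate is dropped when its link weight falls below `min_weight`,
    so the expansion ends once no candidate clears the cutoff.
    """
    node_weight = {concept: 1.0 for concept in seed_concepts}
    links = {}
    frontier = list(seed_concepts)
    while frontier:
        next_frontier = []
        for parent in frontier:
            for child, strength in STRENGTHS.get(parent, {}).items():
                weight = node_weight[parent] * strength
                if child in node_weight or weight < min_weight:
                    continue
                node_weight[child] = weight
                links[child] = (parent, weight)
                next_frontier.append(child)
        frontier = next_frontier
    return links

def select_by_link_weight(links):
    """Pick the added concept whose stored link weight is largest."""
    if not links:
        return None
    return max(links, key=lambda concept: links[concept][1])
```

With the assumed table and a seed of `symptom_cough`, `select_by_link_weight(expand_weighted(["symptom_cough"]))` returns `dx_pneumonia`, the concept attached by the strongest link.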
Specification