Language Processing And Knowledge Building System
First Claim
1. A method for processing textual data, said method employing a language processing and knowledge building system comprising at least one processor configured to execute computer program instructions for performing said method, said method comprising:
receiving said textual data and a language object by said language processing and knowledge building system;
segmenting said received textual data into one or more sentences by said language processing and knowledge building system based on a plurality of sentence terminators predefined in said language object;
segmenting each of said one or more sentences into a plurality of words by said language processing and knowledge building system based on a plurality of word separators predefined in said language object;
generating a list of one or more natural language phrase objects for each of said words by said language processing and knowledge building system by identifying vocabulary classes and vocabulary class features for said each of said words based on vocabulary class feature differentiators predefined in said language object;
creating one or more sentence phrase lists by said language processing and knowledge building system using each said generated list of one or more natural language phrase objects, wherein each of said created one or more sentence phrase lists comprises a combination of one natural language phrase object selected for said each of said words from said each said generated list of one or more natural language phrase objects;
grouping two or more natural language phrase objects in said each of said created one or more sentence phrase lists by said language processing and knowledge building system based on word to word association rules predefined in said language object, said identified vocabulary classes, said identified vocabulary class features, and a position of each natural language phrase object in said each of said created one or more sentence phrase lists, and replacing each said grouped two or more natural language phrase objects in said each of said created one or more sentence phrase lists with a consolidated natural language phrase object;
mapping said segmented each of said one or more sentences to a sentence type by:
mapping each natural language phrase object present in said each of said created one or more sentence phrase lists at a current point in said processing of said received textual data to a sentence part type in a sentence type selected iteratively from a plurality of sentence types predefined in said language object by said language processing and knowledge building system, based on word to sentence part type association rules predefined in said language object, using said identified vocabulary classes, said identified vocabulary class features, and said position of said each natural language phrase object in said each of said created one or more sentence phrase lists at said current point in said processing of said received textual data, wherein said each natural language phrase object at said current point in said processing of said received textual data is one of:
one from said generated list of one or more natural language phrase objects and said consolidated natural language phrase object; and
identifying said sentence type of said segmented each of said one or more sentences by said language processing and knowledge building system from a sentence type with a highest number of successfully mapped sentence part types;
identifying, for said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, one or more of a plurality of semantic items corresponding to a root word of said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, from one or more of a discourse context and system knowledge by said language processing and knowledge building system;
selecting, for said mapped each natural language phrase object, one of said identified one or more of said semantic items by said language processing and knowledge building system based on predefined semantic disambiguation rules; and
identifying attributes of one of created semantic items and said selected one of said identified one or more of said semantic items, and further identifying relations between said one of said created semantic items and said selected one of said identified one or more of said semantic items and said semantic items in said one or more of said discourse context and said system knowledge, and adding said identified attributes to said one of said created semantic items and said selected one of said identified one or more of said semantic items and further adding said created semantic items and said identified relations to said discourse context and said system knowledge by said language processing and knowledge building system based on said identified sentence type and semantic consequence rules of said identified sentence type, predefined in said language object.
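The opening steps of this claim (segmentation by sentence terminators and word separators, generation of a list of natural language phrase objects per word, and creation of sentence phrase lists) can be illustrated with a minimal Python sketch. The names NLPO, LanguageObject, split_on, nlpo_lists and sentence_phrase_lists are hypothetical and do not come from the patent, and the toy lexicon merely stands in for the vocabulary class feature differentiators, whose actual form the claim does not specify.

```python
import re
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class NLPO:
    """Hypothetical natural language phrase object: one reading of a word."""
    text: str
    vocab_class: str      # e.g. "noun", "verb"
    features: tuple = ()  # e.g. ("plural",)


@dataclass
class LanguageObject:
    """Hypothetical container for the predefined, language-specific data."""
    sentence_terminators: str = ".?!"
    word_separators: str = " ,;"
    # Assumption: a toy lexicon stands in for the vocabulary class feature
    # differentiators; each word maps to its possible (class, features) readings.
    lexicon: dict = field(default_factory=dict)


def split_on(text, chars):
    """Split text on any character in chars, dropping empty pieces."""
    pattern = "[" + re.escape(chars) + "]+"
    return [p.strip() for p in re.split(pattern, text) if p.strip()]


def nlpo_lists(words, lang):
    """One list of candidate NLPOs per word; unknown words get class 'unknown'."""
    return [
        [NLPO(w, cls, feats)
         for cls, feats in lang.lexicon.get(w.lower(), [("unknown", ())])]
        for w in words
    ]


def sentence_phrase_lists(per_word):
    """Every combination that picks exactly one NLPO per word."""
    return [list(combo) for combo in product(*per_word)]


lang = LanguageObject(lexicon={
    "time": [("noun", ("singular",))],
    "flies": [("noun", ("plural",)), ("verb", ("present",))],
})
for sentence in split_on("Time flies. Time flies!", lang.sentence_terminators):
    words = split_on(sentence, lang.word_separators)
    lists = sentence_phrase_lists(nlpo_lists(words, lang))
    print(sentence, "->", len(lists), "sentence phrase lists")
```

Because each sentence phrase list selects one object per word, the number of lists is the product of the per-word list lengths, so an ambiguous word such as "flies" doubles the count in this sketch.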
Abstract
A method and a language processing and knowledge building system (LPKBS) for processing textual data are provided. The LPKBS receives textual data and a language object; segments the textual data into sentences and each sentence into words; generates a list of one or more natural language phrase objects (NLPOs) for each word by identifying vocabulary classes and vocabulary class features for each word based on vocabulary class feature differentiators; creates sentence phrase lists, each including a combination of one NLPO selected per word from each list of NLPOs; groups two or more NLPOs in each sentence phrase list based on word to word association rules, the vocabulary classes, the vocabulary class features, and a position of each NLPO; replaces each such group of NLPOs with a consolidated NLPO; maps each segmented sentence to a sentence type; identifies a semantic item for each mapped NLPO; and identifies and stores associated attributes and relations.
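The grouping and sentence type mapping steps summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the rule table ASSOCIATION_RULES, the SENTENCE_TYPES table, and the functions group and identify_sentence_type are hypothetical, NLPOs are reduced to (text, vocabulary class) pairs, and mapping is done by simple position-wise comparison.

```python
# Hypothetical word to word association rules: an adjacent pair of
# vocabulary classes collapses into one consolidated class.
ASSOCIATION_RULES = {
    ("adjective", "noun"): "noun",
    ("determiner", "noun"): "noun_phrase",
}

# Hypothetical sentence types: ordered sentence part types, each with the
# vocabulary classes it accepts at that position.
SENTENCE_TYPES = {
    "declarative": [("subject", {"noun", "noun_phrase"}),
                    ("action", {"verb"}),
                    ("object", {"noun", "noun_phrase"})],
    "imperative": [("action", {"verb"}),
                   ("object", {"noun", "noun_phrase"})],
}


def group(phrase_list):
    """Merge adjacent (text, class) pairs until no association rule applies."""
    items = list(phrase_list)
    merged = True
    while merged:
        merged = False
        for i in range(len(items) - 1):
            (t1, c1), (t2, c2) = items[i], items[i + 1]
            new_class = ASSOCIATION_RULES.get((c1, c2))
            if new_class:
                # Replace the pair with one consolidated phrase object.
                items[i:i + 2] = [(t1 + " " + t2, new_class)]
                merged = True
                break
    return items


def identify_sentence_type(phrase_list):
    """Return (type name, mapped count) for the type with the most matches."""
    best = (None, -1)
    for name, parts in SENTENCE_TYPES.items():
        mapped = sum(
            1 for (_, cls), (_, accepted) in zip(phrase_list, parts)
            if cls in accepted
        )
        if mapped > best[1]:  # on a tie, the earlier sentence type wins
            best = (name, mapped)
    return best


phrase_list = [("the", "determiner"), ("big", "adjective"), ("dog", "noun"),
               ("chased", "verb"), ("a", "determiner"), ("cat", "noun")]
grouped = group(phrase_list)
print(grouped)
print(identify_sentence_type(grouped))
```

Selecting the sentence type with the highest number of successfully mapped sentence part types follows the wording of the first claim; the tie-breaking behaviour is an assumption of this sketch.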
42 Claims
1. A method for processing textual data, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
22. A language processing and knowledge building system for processing textual data, said language processing and knowledge building system comprising:
a non-transitory computer readable storage medium configured to store computer program instructions defined by modules of said language processing and knowledge building system;
at least one processor communicatively coupled to said non-transitory computer readable storage medium, said at least one processor configured to execute said defined computer program instructions; and
said modules of said language processing and knowledge building system comprising:
a data reception module configured to receive said textual data and a language object;
a data segmentation module configured to segment said received textual data into one or more sentences based on a plurality of sentence terminators predefined in said language object;
said data segmentation module further configured to segment each of said one or more sentences into a plurality of words based on a plurality of word separators predefined in said language object;
a phrase object processing module configured to generate a list of one or more natural language phrase objects for each of said words by identifying vocabulary classes and vocabulary class features for said each of said words based on vocabulary class feature differentiators predefined in said language object;
said phrase object processing module further configured to create one or more sentence phrase lists using each said generated list of one or more natural language phrase objects, wherein each of said created one or more sentence phrase lists comprises a combination of one natural language phrase object selected for said each of said words from said each said generated list of one or more natural language phrase objects;
said phrase object processing module further configured to group two or more natural language phrase objects in said each of said created one or more sentence phrase lists based on word to word association rules predefined in said language object, said identified vocabulary classes, said identified vocabulary class features, and a position of each natural language phrase object in said each of said created one or more sentence phrase lists, and to replace each said grouped two or more natural language phrase objects in said each of said created one or more sentence phrase lists with a consolidated natural language phrase object;
a mapping module configured to map said segmented each of said one or more sentences to a sentence type by:
mapping each natural language phrase object present in said each of said created one or more sentence phrase lists at a current point in said processing of said received textual data to a sentence part type in a sentence type selected iteratively from a plurality of sentence types predefined in said language object, based on word to sentence part type association rules predefined in said language object, using said identified vocabulary classes, said identified vocabulary class features, and said position of said each natural language phrase object in said each of said created one or more sentence phrase lists at said current point in said processing of said received textual data, wherein said each natural language phrase object at said current point in said processing of said received textual data is one of:
one from said generated list of one or more natural language phrase objects and said consolidated natural language phrase objects; and
identifying said sentence type of said segmented each of said one or more sentences from a sentence type with a highest number of successfully mapped sentence part types;
a semantic item identification module configured to identify, for said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, one or more of a plurality of semantic items corresponding to a root word of said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, from one or more of a discourse context and system knowledge;
a semantic item processing module configured to select, for said mapped each natural language phrase object, one of said identified one or more of said semantic items based on predefined semantic disambiguation rules; and
said semantic item processing module further configured to identify attributes of one of created semantic items and said selected one of said identified one or more of said semantic items, and further to identify relations between said one of said created semantic items and said selected one of said identified one or more of said semantic items and said semantic items in said one or more of said discourse context and said system knowledge, and to add said identified attributes to said one of said created semantic items and said selected one of said identified one or more of said semantic items and further to add said created semantic items and said identified relations to said discourse context and said system knowledge based on said identified sentence type and semantic consequence rules of said identified sentence type, predefined in said language object.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42
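The behaviour recited for the semantic item identification module and the semantic item processing module of claim 22 could be approximated along the following lines. Everything named here (SemanticItem, KnowledgeStore, resolve, apply_declarative) is hypothetical. The disambiguation rule (prefer the most recent item in the discourse context, then system knowledge, otherwise create a new item) and the consequence rule for a subject-action-object sentence are assumptions made for illustration; the claim leaves both to rules predefined in the language object.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # items are compared by identity, not by value
class SemanticItem:
    root: str
    attributes: dict = field(default_factory=dict)


@dataclass
class KnowledgeStore:
    """Hypothetical store used for both discourse context and system knowledge."""
    items: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (subject, relation, object)

    def find(self, root):
        """Identify the semantic items whose root word matches."""
        return [it for it in self.items if it.root == root]


def resolve(root, discourse, system):
    """Identify candidates for a root word and select one of them.

    Assumed disambiguation rule: most recent discourse item first, then
    system knowledge, else a newly created item. Returns (item, created).
    """
    candidates = discourse.find(root)
    if candidates:
        return candidates[-1], False
    candidates = system.find(root)
    if candidates:
        return candidates[0], False
    return SemanticItem(root), True


def apply_declarative(subject_root, relation, object_root, discourse, system):
    """Assumed semantic consequence rule for a subject-action-object sentence."""
    resolved = []
    for root in (subject_root, object_root):
        item, created = resolve(root, discourse, system)
        if created:
            system.items.append(item)
        if item not in discourse.items:
            discourse.items.append(item)
        resolved.append(item)
    # Add an identified attribute to the subject item.
    resolved[0].attributes.setdefault("actions", []).append(relation)
    # Add the identified relation to both stores.
    triple = (resolved[0].root, relation, resolved[1].root)
    for store in (discourse, system):
        store.relations.append(triple)
    return triple


system = KnowledgeStore(items=[SemanticItem("dog")])
discourse = KnowledgeStore()
print(apply_declarative("dog", "chase", "cat", discourse, system))
print([item.root for item in system.items])
```

In this sketch the existing "dog" item is reused from system knowledge while "cat" is created, and the new relation is recorded in both the discourse context and system knowledge, mirroring the final element of the claim.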
Specification