Language Processing And Knowledge Building System

US 20160364377A1
Filed: 08/13/2015
Published: 12/15/2016
Est. Priority Date: 06/12/2015
Status: Active Grant

Abstract

A method and a language processing and knowledge building system (LPKBS) for processing textual data are provided. The LPKBS receives textual data and a language object; segments the textual data into sentences and each sentence into words; generates a list of one or more natural language phrase objects (NLPOs) for each word by identifying vocabulary classes and vocabulary class features for each word based on vocabulary class feature differentiators; creates sentence phrase lists, each including a combination of one NLPO selected per word from each list of NLPOs; groups two or more NLPOs in each sentence phrase list based on word to word association rules, the vocabulary classes, the vocabulary class features, and a position of each NLPO; replaces each such group of NLPOs with a consolidated NLPO; maps each segmented sentence to a sentence type; identifies a semantic item for each mapped NLPO; and identifies and stores associated attributes and relations.
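
The two segmentation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `LanguageObject` class, its field names, and the default terminators and separators are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageObject:
    """Hypothetical stand-in for the language object (names are assumptions)."""
    sentence_terminators: tuple = (".", "?", "!")
    word_separators: tuple = (" ", ",", ";")


def _split(text, delimiters, merge_runs=False):
    # Build an alternation so multi-character delimiters also work.
    pattern = "(?:" + "|".join(re.escape(d) for d in delimiters) + ")"
    if merge_runs:
        pattern += "+"  # treat ", " as a single separator
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]


def segment_sentences(text, lang):
    """Split text into sentences at the terminators predefined in lang."""
    return _split(text, lang.sentence_terminators)


def segment_words(sentence, lang):
    """Split one sentence into words at the separators predefined in lang."""
    return _split(sentence, lang.word_separators, merge_runs=True)
```

In this sketch the terminators are discarded after splitting; a fuller version would keep them, since the claims also use question terminators to tell questions from propositions.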

42 Claims

1. A method for processing textual data, said method employing a language processing and knowledge building system comprising at least one processor configured to execute computer program instructions for performing said method, said method comprising:
- receiving said textual data and a language object by said language processing and knowledge building system;
  
  segmenting said received textual data into one or more sentences by said language processing and knowledge building system based on a plurality of sentence terminators predefined in said language object;
  
  segmenting each of said one or more sentences into a plurality of words by said language processing and knowledge building system based on a plurality of word separators predefined in said language object;
  
  generating a list of one or more natural language phrase objects for each of said words by said language processing and knowledge building system by identifying vocabulary classes and vocabulary class features for said each of said words based on vocabulary class feature differentiators predefined in said language object;
  
  creating one or more sentence phrase lists by said language processing and knowledge building system using each said generated list of one or more natural language phrase objects, wherein each of said created one or more sentence phrase lists comprises a combination of one natural language phrase object selected for said each of said words from said each said generated list of one or more natural language phrase objects;
  
  grouping two or more natural language phrase objects in said each of said created one or more sentence phrase lists by said language processing and knowledge building system based on word to word association rules predefined in said language object, said identified vocabulary classes, said identified vocabulary class features, and a position of each natural language phrase object in said each of said created one or more sentence phrase lists, and replacing each said grouped two or more natural language phrase objects in said each of said created one or more sentence phrase lists with a consolidated natural language phrase object;
  
  mapping said segmented each of said one or more sentences to a sentence type by:
  
  mapping each natural language phrase object present in said each of said created one or more sentence phrase lists at a current point in said processing of said received textual data to a sentence part type in a sentence type selected iteratively from a plurality of sentence types predefined in said language object by said language processing and knowledge building system, based on word to sentence part type association rules predefined in said language object, using said identified vocabulary classes, said identified vocabulary class features, and said position of said each natural language phrase object in said each of said created one or more sentence phrase lists at said current point in said processing of said received textual data, wherein said each natural language phrase object at said current point in said processing of said received textual data is one of:
  
  one from said generated list of one or more natural language phrase objects and said consolidated natural language phrase object; and
  
  identifying said sentence type of said segmented each of said one or more sentences by said language processing and knowledge building system from a sentence type with a highest number of successfully mapped sentence part types;
  
  identifying, for said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, one or more of a plurality of semantic items corresponding to a root word of said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, from one or more of a discourse context and system knowledge by said language processing and knowledge building system;
  
  selecting, for said mapped each natural language phrase object, one of said identified one or more of said semantic items by said language processing and knowledge building system based on predefined semantic disambiguation rules; and
  
  identifying attributes of one of created semantic items and said selected one of said identified one or more of said semantic items, and further identifying relations between said one of said created semantic items and said selected one of said identified one or more of said semantic items and said semantic items in said one or more of said discourse context and said system knowledge, and adding said identified attributes to said one of said created semantic items and said selected one of said identified one or more of said semantic items and further adding said created semantic items and said identified relations to said discourse context and said system knowledge by said language processing and knowledge building system based on said identified sentence type and semantic consequence rules of said identified sentence type, predefined in said language object.
  - 2. The method of claim 1, wherein said discourse context is configured as a data structure comprising said semantic items, attributes of said semantic items, and relations between said semantic items, wherein said semantic items are one of associated and to be associated with said each of said words of said received textual data under process by said language processing and knowledge building system.
  - 3. The method of claim 1, wherein said system knowledge is configured as a data structure comprising zero or more of said semantic items, attributes of said semantic items, and relations between said semantic items, wherein said semantic items are one of associated and to be associated with said each of said words of textual data previously processed by said language processing and knowledge building system.
  - 4. The method of claim 1, wherein each of said semantic items is one of a semantic classification, an action category, a specific object, a symbolic object, and an attribute of a classification.
  - 5. The method of claim 1, further comprising creating one of a semantic item, an attribute of said semantic item, and a relation of said semantic item with other said semantic items by said language processing and knowledge building system, and storing said created one of said semantic item, said attribute, and said relation in said discourse context and said system knowledge when said one of said semantic item, said attribute, and said relation for a root word under a natural language phrase object mapped to a sentence part type is not found in said one or more of said discourse context and said system knowledge, wherein said creation of said semantic item is based on predefined semantic item creation rules.
  - 6. The method of claim 1, further comprising determining whether to create one of a semantic classification, a class attribute, and a symbolic object for a mapped natural language phrase object with a common noun root word by said language processing and knowledge building system using heuristic rules on context and by processing lexical meaning definition text data for said common noun root word obtained from a vocabulary object predefined in said language object.
  - 7. The method of claim 1, wherein said language object is predefined for a natural language used in said received textual data.
  - 8. The method of claim 1, wherein said vocabulary classes comprise parts of speech comprising a noun, a pronoun, a verb, an adjective, and an adverb.
  - 9. The method of claim 1, wherein said vocabulary class features comprise a case, a gender, a number, and a tense of said each of said words.
  - 10. The method of claim 1, further comprising identifying one or more root words and semantic variations for said each of said words by said language processing and knowledge building system based on connector word rules and morphed word rules predefined in said language object for vocabulary class feature differentiation and sense differentiation.
  - 11. The method of claim 10, further comprising validating said identified vocabulary classes and each of said identified one or more root words for said each of said words by said language processing and knowledge building system by querying a vocabulary object predefined in said language object.
  - 12. The method of claim 1, wherein each of said one or more natural language phrase objects comprises said identified vocabulary classes, said identified vocabulary class features, and a root word of said each of said words.
  - 13. The method of claim 1, wherein said word to word association rules comprise rules for associating a word of an adjective vocabulary class with a word of a noun vocabulary class, a word of an adverb vocabulary class with a word of a verb vocabulary class, and a word of an adverb vocabulary class with a word of an adjective vocabulary class.
  - 14. The method of claim 1, further comprising grouping one of said grouped two or more natural language phrase objects and ungrouped natural language phrase objects in said each of said created one or more sentence phrase lists based on said identified vocabulary classes, said identified vocabulary class features, list item separators, and list terminators predefined in said language object.
  - 15. The method of claim 1, further comprising updating said discourse context with said grouped two or more natural language phrase objects to facilitate dereferencing of a word of a pronoun vocabulary class in subsequent sentences.
  - 16. The method of claim 1, wherein said sentence part type comprises an action denoter, a doer denoter, a class denoter, a behaviour denoter, and a part denoter.
  - 17. The method of claim 1, wherein said word to sentence part type association rules comprise rules specifying a mapping of a natural language phrase object to a sentence part type in a sentence type based on said vocabulary classes, said vocabulary class features in said natural language phrase object, and a position of said natural language phrase object in said each of said created one or more sentence phrase lists.
  - 18. The method of claim 1, further comprising identifying, for said each of said one or more sentences, a sentence type based on question terminators predefined in said language object by said language processing and knowledge building system, wherein said sentence type comprises one of a proposition and a question.
  - 19. The method of claim 18, further comprising updating a last question attribute in said discourse context by said language processing and knowledge building system when said identified sentence type is said question, to facilitate processing of a possible answer in a subsequent sentence even if said subsequent sentence omits one or more logical parts.
  - 20. The method of claim 1, further comprising identifying natural language phrase objects for each unmapped sentence part type by said language processing and knowledge building system using said discourse context and valid-in-interim association rules.
  - 21. The method of claim 1, further comprising identifying a sentence part type for each of unmapped natural language phrase objects by said language processing and knowledge building system based on a semantic compounding mechanism predefined in said language object, wherein said semantic compounding mechanism comprises rules for associating sentence part types with natural language phrase objects based on one or more of a plurality of semantic compounding types, wherein said semantic compounding types comprise a concurrency, a cause-effect relationship, a condition, and a shared object between two sentences.
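
The middle steps of claim 1 (per-word NLPO lists, sentence phrase lists, grouping, and choosing the sentence type with the highest number of mapped parts) can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: the `NLPO` class, the tiny `LEXICON`, the `SENTENCE_TYPES` table, and all function names are hypothetical, and vocabulary class features are omitted. Only the three association pairs in `WORD_TO_WORD_RULES` come from the text (claim 13).

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class NLPO:
    """Hypothetical natural language phrase object: root text plus one vocabulary class."""
    root: str
    vocab_class: str


# Hypothetical lexicon standing in for the vocabulary class feature differentiators.
LEXICON = {"loud": ["adjective"], "dogs": ["noun"], "bark": ["verb", "noun"]}

# (modifier class, head class) pairs, following the associations listed in claim 13.
WORD_TO_WORD_RULES = {("adjective", "noun"), ("adverb", "verb"), ("adverb", "adjective")}

# Hypothetical sentence types: the vocabulary class expected at each position.
SENTENCE_TYPES = {"doer-action": ["noun", "verb"], "class-membership": ["noun", "noun"]}


def candidate_nlpos(word):
    """One NLPO per vocabulary class the word may belong to."""
    return [NLPO(word, c) for c in LEXICON.get(word.lower(), ["unknown"])]


def sentence_phrase_lists(words):
    """Every combination that picks exactly one NLPO per word."""
    return [list(combo) for combo in product(*(candidate_nlpos(w) for w in words))]


def group(phrase_list):
    """Replace each adjacent modifier/head pair with one consolidated NLPO."""
    out = []
    for nlpo in phrase_list:
        if out and (out[-1].vocab_class, nlpo.vocab_class) in WORD_TO_WORD_RULES:
            prev = out.pop()
            nlpo = NLPO(f"{prev.root} {nlpo.root}", nlpo.vocab_class)
        out.append(nlpo)
    return out


def mapped_parts(phrase_list, expected):
    """Count positions whose vocabulary class matches the sentence type."""
    return sum(1 for n, c in zip(phrase_list, expected) if n.vocab_class == c)


def best_sentence_type(words):
    """Return (score, type name, phrase list) with the most mapped parts.

    Ties keep the first candidate found; a real system would need a tie-break rule.
    """
    best = None
    for plist in map(group, sentence_phrase_lists(words)):
        for name, expected in SENTENCE_TYPES.items():
            score = mapped_parts(plist, expected)
            if best is None or score > best[0]:
                best = (score, name, plist)
    return best
```

For the words `["loud", "dogs", "bark"]` the ambiguous word "bark" yields two sentence phrase lists, and grouping consolidates "loud dogs" into a single noun NLPO in each before the sentence types are scored.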

22. A language processing and knowledge building system for processing textual data, said language processing and knowledge building system comprising:
- a non-transitory computer readable storage medium configured to store computer program instructions defined by modules of said language processing and knowledge building system;
  
  at least one processor communicatively coupled to said non-transitory computer readable storage medium, said at least one processor configured to execute said defined computer program instructions; and
  
  said modules of said language processing and knowledge building system comprising:
  
  a data reception module configured to receive said textual data and a language object;
  
  a data segmentation module configured to segment said received textual data into one or more sentences based on a plurality of sentence terminators predefined in said language object;
  
  said data segmentation module further configured to segment each of said one or more sentences into a plurality of words based on a plurality of word separators predefined in said language object;
  
  a phrase object processing module configured to generate a list of one or more natural language phrase objects for each of said words by identifying vocabulary classes and vocabulary class features for said each of said words based on vocabulary class feature differentiators predefined in said language object;
  
  said phrase object processing module further configured to create one or more sentence phrase lists using each said generated list of one or more natural language phrase objects, wherein each of said created one or more sentence phrase lists comprises a combination of one natural language phrase object selected for said each of said words from said each said generated list of one or more natural language phrase objects;
  
  said phrase object processing module further configured to group two or more natural language phrase objects in said each of said created one or more sentence phrase lists based on word to word association rules predefined in said language object, said identified vocabulary classes, said identified vocabulary class features, and a position of each natural language phrase object in said each of said created one or more sentence phrase lists, and to replace each said grouped two or more natural language phrase objects in said each of said created one or more sentence phrase lists with a consolidated natural language phrase object;
  
  a mapping module configured to map said segmented each of said one or more sentences to a sentence type by:
  
  mapping each natural language phrase object present in said each of said created one or more sentence phrase lists at a current point in said processing of said received textual data to a sentence part type in a sentence type selected iteratively from a plurality of sentence types predefined in said language object, based on word to sentence part type association rules predefined in said language object, using said identified vocabulary classes, said identified vocabulary class features, and said position of said each natural language phrase object in said each of said created one or more sentence phrase lists at said current point in said processing of said received textual data, wherein said each natural language phrase object at said current point in said processing of said received textual data is one of:
  
  one from said generated list of one or more natural language phrase objects and said consolidated natural language phrase objects; and
  
  identifying said sentence type of said segmented each of said one or more sentences from a sentence type with a highest number of successfully mapped sentence part types;
  
  a semantic item identification module configured to identify, for said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, one or more of a plurality of semantic items corresponding to a root word of said mapped each natural language phrase object in said each of said created one or more sentence phrase lists mapped successfully to said identified sentence type, from one or more of a discourse context and system knowledge;
  
  a semantic item processing module configured to select, for said mapped each natural language phrase object, one of said identified one or more of said semantic items based on predefined semantic disambiguation rules; and
  
  said semantic item processing module further configured to identify attributes of one of created semantic items and said selected one of said identified one or more of said semantic items, and further to identify relations between said one of said created semantic items and said selected one of said identified one or more of said semantic items and said semantic items in said one or more of said discourse context and said system knowledge, and to add said identified attributes to said one of said created semantic items and said selected one of said identified one or more of said semantic items and further to add said created semantic items and said identified relations to said discourse context and said system knowledge based on said identified sentence type and semantic consequence rules of said identified sentence type, predefined in said language object.
  - 23. The language processing and knowledge building system of claim 22, wherein said discourse context is configured as a data structure comprising said semantic items, attributes of said semantic items, and relations between said semantic items, wherein said semantic items are one of associated and to be associated with said each of said words of said received textual data under process by said language processing and knowledge building system.
  - 24. The language processing and knowledge building system of claim 22, wherein said system knowledge is configured as a data structure comprising zero or more of said semantic items, attributes of said semantic items, and relations between said semantic items, wherein said semantic items are one of associated and to be associated with said each of said words of textual data previously processed by said language processing and knowledge building system.
  - 25. The language processing and knowledge building system of claim 22, wherein each of said semantic items is one of a semantic classification, an action category, a specific object, a symbolic object, and an attribute of a classification.
  - 26. The language processing and knowledge building system of claim 22, wherein said semantic item processing module is further configured to create one of a semantic item, an attribute of said semantic item, and a relation of said semantic item with other said semantic items and store said created one of said semantic item, said attribute, and said relation in said discourse context and said system knowledge when said one of said semantic item, said attribute, and said relation for a root word under a natural language phrase object mapped to a sentence part type is not found in said one or more of said discourse context and said system knowledge, wherein said creation of said semantic item is based on predefined semantic item creation rules.
  - 27. The language processing and knowledge building system of claim 22, wherein said semantic item processing module is further configured to determine whether to create one of a semantic classification, a class attribute, and a symbolic object for a mapped natural language phrase object with a common noun root word using heuristic rules on context and by processing lexical meaning definition text data for said common noun root word obtained from a vocabulary object predefined in said language object.
  - 28. The language processing and knowledge building system of claim 22, wherein said language object is predefined for a natural language used in said received textual data.
  - 29. The language processing and knowledge building system of claim 22, wherein said vocabulary classes comprise parts of speech comprising a noun, a pronoun, a verb, an adjective, and an adverb.
  - 30. The language processing and knowledge building system of claim 22, wherein said vocabulary class features comprise a case, a gender, a number, and a tense of said each of said words.
  - 31. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to identify one or more root words and semantic variations for said each of said words based on connector word rules and morphed word rules predefined in said language object for vocabulary class feature differentiation and sense differentiation.
  - 32. The language processing and knowledge building system of claim 31, wherein said phrase object processing module is further configured to validate said identified vocabulary classes and each of said identified one or more root words for said each of said words by querying a vocabulary object predefined in said language object.
  - 33. The language processing and knowledge building system of claim 22, wherein each of said one or more natural language phrase objects comprises said identified vocabulary classes, said identified vocabulary class features, and a root word of said each of said words.
  - 34. The language processing and knowledge building system of claim 22, wherein said word to word association rules comprise rules for associating a word of an adjective vocabulary class with a word of a noun vocabulary class, a word of an adverb vocabulary class with a word of a verb vocabulary class, and a word of an adverb vocabulary class with a word of an adjective vocabulary class.
  - 35. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to group one of said grouped two or more natural language phrase objects and ungrouped natural language phrase objects in said each of said created one or more sentence phrase lists based on said identified vocabulary classes, said identified vocabulary class features, list item separators, and list terminators predefined in said language object.
  - 36. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to update said discourse context with said grouped two or more natural language phrase objects to facilitate dereferencing of a word of a pronoun vocabulary class in subsequent sentences.
  - 37. The language processing and knowledge building system of claim 22, wherein said sentence part type comprises an action denoter, a doer denoter, a class denoter, a behaviour denoter, and a part denoter.
  - 38. The language processing and knowledge building system of claim 22, wherein said word to sentence part type association rules comprise rules specifying a mapping of a natural language phrase object to a sentence part type in a sentence type based on said vocabulary classes, said vocabulary class features in said natural language phrase object, and a position of said natural language phrase object in said each of said created one or more sentence phrase lists.
  - 39. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to identify, for said each of said one or more sentences, a sentence type based on question terminators predefined in said language object, wherein said sentence type comprises one of a proposition and a question.
  - 40. The language processing and knowledge building system of claim 39, wherein said phrase object processing module is further configured to update a last question attribute in said discourse context when said identified sentence type is said question, to facilitate processing of a possible answer in a subsequent sentence even if said subsequent sentence omits one or more logical parts.
  - 41. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to identify natural language phrase objects for each unmapped sentence part type using said discourse context and valid-in-interim association rules.
  - 42. The language processing and knowledge building system of claim 22, wherein said phrase object processing module is further configured to identify a sentence part type for each of unmapped natural language phrase objects based on a semantic compounding mechanism predefined in said language object, wherein said semantic compounding mechanism comprises rules for associating sentence part types with natural language phrase objects based on one or more of a plurality of semantic compounding types, wherein said semantic compounding types comprise a concurrency, a cause-effect relationship, a condition, and a shared object between two sentences.
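
The claims describe the discourse context and the system knowledge as data structures holding semantic items, their attributes, and relations between them, with a new item created when none is found for a root word. The sketch below shows one possible shape for such a structure. The class names, the `resolve` helper, and the relation tuple layout are assumptions for illustration, and the claimed semantic disambiguation rules are reduced here to taking the first candidate.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticItem:
    root: str
    kind: str  # e.g. "semantic classification" or "specific object" (claims 4 and 25)
    attributes: dict = field(default_factory=dict)


@dataclass
class KnowledgeStore:
    """Hypothetical container usable as discourse context or as system knowledge."""
    items: dict = field(default_factory=dict)      # root word -> list of SemanticItem
    relations: list = field(default_factory=list)  # (subject root, relation, object root)

    def find(self, root):
        return self.items.get(root, [])

    def add_item(self, item):
        self.items.setdefault(item.root, []).append(item)

    def add_relation(self, subject, relation, obj):
        self.relations.append((subject, relation, obj))


def resolve(root, kind, discourse, system):
    """Look up a root word in the discourse context, then in system knowledge.

    If nothing is found, create a semantic item and store it in both.
    Picking candidates[0] is a placeholder for real disambiguation rules.
    """
    candidates = discourse.find(root) or system.find(root)
    if candidates:
        return candidates[0]
    item = SemanticItem(root, kind)
    discourse.add_item(item)
    system.add_item(item)
    return item
```

A caller might resolve "dog" and "animal", set `dog.attributes["sound"] = "bark"`, and record `("dog", "is-a", "animal")` in both stores. Sharing one item object between the two stores is a design shortcut in this sketch; a real system might copy items when a discourse ends.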

Current Assignee: Satyanarayana Krishnamurthy
Original Assignee: Satyanarayana Krishnamurthy
Inventors: Krishnamurthy, Satyanarayana

Granted Patent: US 10,496,749 B2
US Class Current: 1/1
CPC Class Codes

G06F 40/211   Syntactic parsing, e.g. bas...

G06F 40/30   Semantic analysis

G06F 40/35   Discourse or dialogue repre...
