Architecture and processes for computer learning and understanding
First Claim
1. A computer-implemented method, comprising:
receiving, at a computing system, a natural language input;
performing, at the computing system, a syntactic analysis of the natural language input to produce at least a first linguistic analysis result and a second linguistic analysis result, wherein performing the syntactic analysis comprises:
generating a predicate argument structure (PAS) for the natural language input;
assigning an entity type to one or more words in the natural language input; and
determining a co-reference chain in the natural language input;
for the first linguistic analysis result, forming, at the computing system, one or more semantic structures to provide a semantic level understanding of the natural language input in part by using the first linguistic analysis result and knowledge induced from language corpora, wherein the one or more semantic structures include at least one frame structure including a set of generative semantic primitive (GSP) structures that are related under a particular context of a particular setting, wherein a GSP structure of the set of GSP structures includes a predicate and one or more roles and has at least a threshold probability of being included with the at least one frame structure that is associated with the particular context of the particular setting;
evaluating, at the computing system, the first linguistic analysis result produced by the syntactic analysis of the natural language input based at least in part on the semantic level understanding of the natural language input to determine whether to form one or more other semantic structures including at least one of a new frame structure or a new GSP structure that is relevant to the semantic level understanding of the natural language input using the second linguistic analysis result; and
engaging in multiple dialog sessions with computing devices of multiple human users in parallel, by generating and submitting one or more structured questions for presentation via computer user interface to the computing devices of the multiple human users, to receive responses from the multiple human users for use by the computing system to evaluate the one or more semantic structures as an understanding of the natural language input.
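As a rough illustration of the frame and GSP structures recited above, the following Python sketch models a GSP as a predicate with roles and admits it to a frame only when its probability meets the frame's threshold. The class names `GSP` and `Frame`, the `consider` method, the restaurant example, and all probability values are hypothetical; the claim does not specify a data layout or how the probabilities are obtained.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GSP:
    """Hypothetical generative semantic primitive: a predicate plus roles."""
    predicate: str
    roles: tuple  # pairs of (role name, filler), e.g. (("agent", "customer"),)


@dataclass
class Frame:
    """Hypothetical frame: GSPs related under one context of one setting."""
    context: str
    threshold: float
    members: dict = field(default_factory=dict)  # maps GSP -> probability

    def consider(self, gsp, probability):
        """Add the GSP only if its probability is at least the threshold."""
        if probability >= self.threshold:
            self.members[gsp] = probability
            return True
        return False


# Assumed example values, chosen only to exercise both branches.
restaurant = Frame(context="restaurant", threshold=0.6)
order = GSP("order", (("agent", "customer"), ("object", "food")))
swim = GSP("swim", (("agent", "customer"),))

added_order = restaurant.consider(order, 0.9)   # meets the threshold
added_swim = restaurant.consider(swim, 0.05)    # falls below the threshold
```

Making `GSP` frozen keeps it hashable, so it can serve directly as a dictionary key inside the frame.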
2 Assignments
0 Petitions
Abstract
An architecture and processes enable computer learning and developing an understanding of arbitrary natural language text through collaboration with humans in the context of joint problem solving. The architecture ingests the text and then syntactically and semantically processes the text to infer an initial understanding of the text. The initial understanding is captured in a story model of semantic and frame structures. The story model is then tested through computer generated questions that are posed to humans through interactive dialog sessions. The knowledge gleaned from the humans is used to update the story model as well as the computing system's current world model of understanding. The process is repeated for multiple stories over time, enabling the computing system to grow in knowledge and thereby understand stories of increasingly higher reading comprehension levels.
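The dialog-based testing described in the abstract, which the first claim recites as multiple sessions run in parallel, might be sketched as follows. This is only an assumption-laden illustration using Python's `asyncio`: the function names, the simulated users, the sample question, and the agreement measure are all hypothetical, and a real system would replace `simulated_answer` with an actual user interface.

```python
import asyncio


async def dialog_session(user, questions, answer_fn):
    """Pose each structured question to one user and collect the replies."""
    replies = []
    for question in questions:
        replies.append(await answer_fn(user, question))
    return user, replies


async def simulated_answer(user, question):
    """Stand-in for a real user interface; the answers are hard-coded."""
    await asyncio.sleep(0)  # yield control so sessions can interleave
    return "no" if user == "carol" else "yes"


async def run_sessions(users, questions):
    """Run one dialog session per user concurrently and gather the results."""
    sessions = [dialog_session(u, questions, simulated_answer) for u in users]
    return dict(await asyncio.gather(*sessions))


questions = ["Does 'order' in this story mean requesting food?"]
responses = asyncio.run(run_sessions(["alice", "bob", "carol"], questions))

# Fraction of users who confirmed the system's candidate interpretation.
agreement = sum(r[0] == "yes" for r in responses.values()) / len(responses)
```

The agreement fraction is one possible way to turn the responses into evidence for or against a semantic structure; the source does not say how responses are aggregated.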
176 Citations
21 Claims
-
1. A computer-implemented method, comprising: (full text appears under First Claim above)
Dependent claims: 2, 3, 4, 5, 6, 7, 21
-
-
8. A computer-implemented method comprising:
-
receiving, at a computing system, a natural language story composed of multiple sentences;
performing, by the computing system, a syntactic analysis of the natural language story to produce multiple linguistic analysis results;
inferring, by the computing system and from the multiple linguistic analysis results, multiple generative semantic primitive (GSP) structures corresponding to a linguistic structure of a sentence in the natural language story;
forming one or more frame structures that comprise the multiple GSP structures as inferred from the multiple linguistic analysis results, wherein forming a frame structure of the one or more frame structures includes associating a set of GSP structures with a particular context of at least one of a story setting or a story background, a GSP structure of the set of GSP structures having an above threshold probability of being included with the frame structure that is associated with the particular context of at least one of the story setting or the story background, wherein at least one frame structure of the one or more frame structures is nested within, and referred to by, at least another frame structure of the one or more frame structures;
aligning the one or more frame structures with entities in the natural language story to represent an understanding of the natural language story, where multiple of the one or more frame structures form an episode within the natural language story;
determining a co-reference chain in the natural language story; and
evaluating the multiple linguistic analysis results produced by the syntactic analysis in view of information learned from forming and aligning the one or more frame structures.
Dependent claims: 9, 10, 11, 12, 13
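Claim 8 recites frames that are nested within, and referred to by, other frames, with several frames together forming an episode. A minimal sketch of that containment, assuming a hypothetical `NestedFrame` class and made-up frame and predicate names, could look like this:

```python
from dataclasses import dataclass, field


@dataclass
class NestedFrame:
    """Hypothetical frame that may refer to other frames nested inside it."""
    name: str
    gsps: list = field(default_factory=list)       # predicate names only
    subframes: list = field(default_factory=list)  # nested NestedFrame objects

    def all_gsps(self):
        """Collect GSPs from this frame and, recursively, from nested frames."""
        collected = list(self.gsps)
        for sub in self.subframes:
            collected.extend(sub.all_gsps())
        return collected


# "paying" is nested within, and referred to by, "dining".
paying = NestedFrame("paying", gsps=["receive_bill", "give_money"])
dining = NestedFrame("dining", gsps=["order", "eat"], subframes=[paying])
travel = NestedFrame("travel", gsps=["drive"])

# An episode is modeled here simply as an ordered list of frames.
episode = [travel, dining]
episode_gsps = [g for frame in episode for g in frame.all_gsps()]
```

Representing an episode as a plain list is a simplification; the claim only requires that multiple frames form the episode, not any particular ordering or container.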
-
-
14. A computing system, comprising:
-
a datastore containing a current world model that expresses beliefs about how natural language is understood;
one or more processors to access the current world model maintained in the datastore; and
memory coupled to the one or more processors, the memory storing computer executable instructions that, when executed by the one or more processors, perform acts comprising:
processing multiple natural language stories over time, the processing including syntactically analyzing the multiple natural language stories and subsequently inferring semantic structures as representations of the multiple natural language stories, in part by using information maintained in the current world model and results from syntactic analysis, and conducting dialog sessions with one or more human users to evaluate the semantic structures as understandings of the multiple natural language stories, wherein syntactically analyzing the multiple natural language stories includes creating, for individual portions of a natural language story, multiple syntactic linguistic results and aligning one or more semantic structures of the semantic structures with one or more entities in the multiple syntactic linguistic results, wherein the processing the multiple natural language stories over time includes:
processing a first story of the multiple natural language stories over a first time period;
inferring first semantic structures as representations of the first story;
evaluating the first semantic structures by the one or more human users over the first time period;
processing a second story of the multiple natural language stories over a second time period that is subsequent to the first time period;
inferring second semantic structures as representations of the second story based at least in part on expanding at least one of the first semantic structures to include a new semantic structure; and
evaluating the second semantic structures by the one or more human users over the second time period;
expanding the current world model, in the datastore, over time to include the semantic structures inferred from the multiple natural language stories and evaluated by the one or more human users, wherein expanding the current world model over time comprises including the first semantic structures inferred over the first time period, and including the second semantic structures inferred over the second time period; and
using information in the expanding the current world model to modify the processing of the multiple natural language stories through syntactically analyzing the multiple natural language stories and subsequently inferring semantic structures.
Dependent claims: 15, 16, 17
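The two-period processing in claim 14 can be pictured as a world model that grows as each story is processed and checked by humans. The sketch below is a simplification under stated assumptions: the world model is a plain dictionary from frame name to a set of predicates, `process_story` and `always_yes` are hypothetical names, and human evaluation is reduced to a callback.

```python
def process_story(frame_name, predicates, world_model, human_confirms):
    """Infer a structure for one story, have it evaluated, and store it."""
    # Start from whatever the world model already holds for this frame.
    known = world_model.get(frame_name, frozenset())
    # Expand the earlier structure with predicates seen in the new story.
    candidate = known | frozenset(predicates)
    # Keep only predicates a human confirms (stand-in for a dialog session).
    confirmed = frozenset(
        p for p in candidate if human_confirms(frame_name, p)
    )
    # Expand the world model with the evaluated structure.
    world_model[frame_name] = confirmed
    return confirmed


def always_yes(frame_name, predicate):
    """Hypothetical evaluator that accepts every proposed predicate."""
    return True


world_model = {}
# First time period: first story.
first = process_story("dining", ["order", "eat"], world_model, always_yes)
# Second time period: second story builds on the first structure.
second = process_story("dining", ["eat", "tip"], world_model, always_yes)
```

After the second call the stored structure contains everything from the first period plus the new predicate, which mirrors the claim's language about expanding a first semantic structure to include a new one.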
-
-
18. A computing system, comprising:
-
a datastore containing a current world model that expresses beliefs about how natural language is understood;
one or more processors; and
memory coupled to the one or more processors, the memory storing computer-executable modules comprising:
a story parsing engine to syntactically analyze a natural language story to produce linguistic analysis results, the linguistic analysis results having initial scores indicative of an extent to which the linguistic analysis results syntactically represent portions of the natural language story;
a knowledge induction engine to induce information from language corpora to form induced information; and
a knowledge integration engine to form frame structures that provide a semantic representation of the natural language story, the knowledge integration engine using the linguistic analysis results, information from the current world model, and the induced information to form the frame structures, wherein a frame structure of the frame structures includes a set of semantic structures that are commonly associated with a particular context of a story setting, wherein a semantic structure of the set of semantic structures has a corresponding probability that is above a threshold for being included in the frame structure that is associated with the particular context of the story setting of the natural language story, wherein at least one frame structure of the frame structures is nested within, and referred to by, at least another semantic structure; and
the knowledge integration engine reprocessing the linguistic analysis results after formation of the frame structures to produce revised scores indicative of the extent to which the linguistic analysis results syntactically represent the portions of the natural language story, wherein the revised scores are based at least in part on the corresponding probability of the semantic structure.
Dependent claims: 19, 20
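Claim 18 describes revising the initial parse scores using the probabilities attached to semantic structures in a frame. The claim does not say how the two are combined; the sketch below assumes, purely for illustration, that the revised score is the product of the initial score and the frame probability. The function name `rescore`, the parse labels, and all numbers are hypothetical.

```python
def rescore(parses, frame_probability):
    """Revise parse scores using frame probabilities and rank the results.

    parses: list of (label, initial_score, predicate) tuples.
    frame_probability: maps a predicate to its probability within the frame.
    """
    revised = []
    for label, initial_score, predicate in parses:
        # Assumed combination rule: multiply; unknown predicates get 0.0.
        probability = frame_probability.get(predicate, 0.0)
        revised.append((label, initial_score * probability))
    # Highest revised score first.
    return sorted(revised, key=lambda item: item[1], reverse=True)


# Two competing syntactic analyses of the same sentence (made-up scores).
parses = [("parse_a", 0.6, "swim"), ("parse_b", 0.5, "order")]
frame_probability = {"order": 0.9, "swim": 0.1}

ranked = rescore(parses, frame_probability)
```

In this example the parse that started with the lower syntactic score ends up ranked first, because its predicate fits the frame far better.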
-
Specification