Architecture and processes for computer learning and understanding
First Claim
1. A computer-implemented method, comprising:
maintaining a collection of frame structures within memory of a computing system, individual frame structures of the collection of frame structures including multiple generative semantic primitive (GSP) structures defined with roles that are commonly associated with a theme or a context of a setting of natural language stories;
receiving, by the computing system, a natural language story;
retrieving, by the computing system and from the collection of frame structures in the memory, one or more frame structures that includes GSP structures defined with the roles that are relevant to a current understanding of the natural language story, wherein a frame structure of the one or more frame structures defines the theme represented within the natural language story, and the frame structure comprises a set of GSP structures that define the roles associated with the theme, an individual GSP structure of the GSP structures having at least a threshold probability of being included in the one or more frame structures that is associated with a particular context of a particular setting of the natural language story;
aligning, by the computing system, entities in the natural language story to the roles defined in the GSP structures of the one or more frame structures; and
evaluating, by the computing system, an extent to which the one or more frame structures, when aligned with the entities in the natural language story, represent the current understanding of the natural language story by computing a confidence score, the confidence score being based at least in part on the individual GSP structure of the GSP structures having a probability of being true for the one or more frame structures, wherein the confidence score is determined according to a function that produces a higher score based on a comparison of a lower number of frame structures with respect to a higher number of GSP structures included in the frame structures that match GSP structures extracted from the natural language story.
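The retrieval and scoring steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `GSP` and `Frame` classes, the `predicate` labels, the 0.5 threshold, and the scoring formula (probability-weighted matches divided by the number of frames) are all assumptions chosen only to show one function with the claimed behavior, namely a higher score for fewer frames covering more matching GSP structures.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class GSP:
    predicate: str           # hypothetical label, e.g. "order"
    roles: Tuple[str, ...]   # role names, e.g. ("customer", "food")
    probability: float       # assumed probability of belonging to the frame


@dataclass(frozen=True)
class Frame:
    theme: str
    gsps: Tuple[GSP, ...]


def retrieve_frames(collection: Iterable[Frame], story_predicates: Set[str],
                    threshold: float = 0.5) -> List[Frame]:
    """Drop GSPs below the threshold, then keep frames that still share
    at least one predicate with the story."""
    selected = []
    for frame in collection:
        kept = tuple(g for g in frame.gsps if g.probability >= threshold)
        if any(g.predicate in story_predicates for g in kept):
            selected.append(Frame(frame.theme, kept))
    return selected


def confidence_score(frames: List[Frame], story_predicates: Set[str]) -> float:
    """Assumed formula: probability-weighted count of matching GSPs divided
    by the number of frames, so fewer frames with more matches score higher."""
    if not frames:
        return 0.0
    matched = sum(g.probability for f in frames for g in f.gsps
                  if g.predicate in story_predicates)
    return matched / len(frames)


# Predicates assumed to have been extracted from a short restaurant story.
story = {"order", "eat", "pay"}
restaurant = Frame("restaurant", (
    GSP("order", ("customer", "food"), 0.9),
    GSP("eat", ("customer", "food"), 0.8),
    GSP("pay", ("customer",), 0.7),
    GSP("tip", ("customer",), 0.3),   # below threshold, filtered out
))
one_frame = retrieve_frames([restaurant], story)
# The same three GSPs spread over two frames, for comparison.
split = [Frame("ordering", restaurant.gsps[:2]),
         Frame("paying", restaurant.gsps[2:3])]
print(confidence_score(one_frame, story) > confidence_score(split, story))
```

Dividing by the frame count is only one way to reward compact explanations; the claim does not fix the formula.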
Abstract
An architecture and processes enable computer learning and developing an understanding of arbitrary natural language text through collaboration with humans in the context of joint problem solving. The architecture ingests the text and then syntactically and semantically processes the text to infer an initial understanding of the text. The initial understanding is captured in a story model of semantic and frame structures. The story model is then tested through computer generated questions that are posed to humans through interactive dialog sessions. The knowledge gleaned from the humans is used to update the story model as well as the computing system's current world model of understanding. The process is repeated for multiple stories over time, enabling the computing system to grow in knowledge and thereby understand stories of increasingly higher reading comprehension levels.
177 Citations
27 Claims
1. A computer-implemented method, comprising:
maintaining a collection of frame structures within memory of a computing system, individual frame structures of the collection of frame structures including multiple generative semantic primitive (GSP) structures defined with roles that are commonly associated with a theme or a context of a setting of natural language stories;
receiving, by the computing system, a natural language story;
retrieving, by the computing system and from the collection of frame structures in the memory, one or more frame structures that includes GSP structures defined with the roles that are relevant to a current understanding of the natural language story, wherein a frame structure of the one or more frame structures defines the theme represented within the natural language story, and the frame structure comprises a set of GSP structures that define the roles associated with the theme, an individual GSP structure of the GSP structures having at least a threshold probability of being included in the one or more frame structures that is associated with a particular context of a particular setting of the natural language story;
aligning, by the computing system, entities in the natural language story to the roles defined in the GSP structures of the one or more frame structures; and
evaluating, by the computing system, an extent to which the one or more frame structures, when aligned with the entities in the natural language story, represent the current understanding of the natural language story by computing a confidence score, the confidence score being based at least in part on the individual GSP structure of the GSP structures having a probability of being true for the one or more frame structures, wherein the confidence score is determined according to a function that produces a higher score based on a comparison of a lower number of frame structures with respect to a higher number of GSP structures included in the frame structures that match GSP structures extracted from the natural language story.
Dependent claims: 2, 3, 4, 5, 23, 24, 27
6. A computer-implemented method, comprising:
receiving, at a computing system, a natural language input;
performing, by the computing system, a syntactic analysis of the natural language input to produce one or more linguistic analysis results;
inferring, by the computing system, one or more frame structures from the one or more linguistic analysis results, the one or more frame structures expressing an understanding of the natural language input, a frame structure of the one or more frame structures including beliefs having relevance to a context of the natural language input, wherein at least one frame structure of the one or more frame structures is nested within, and referred to by, at least another semantic structure; and
calculating, by the computing system, a confidence score indicative of an extent to which the one or more frame structures express the understanding of the natural language input based at least in part on whether the confidence score satisfies a predetermined threshold and on the relevance of the beliefs to the context of the natural language input, wherein the confidence score is computed according to a function that produces a low score, when the one or more frame structures are contradictory to the beliefs extracted from the natural language input, relative to higher number of frame structures that match the beliefs extracted from the natural language input.
Dependent claims: 7, 8, 25, 26
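The contradiction-sensitive score in claim 6 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the claimed function itself: beliefs are modeled as a hypothetical mapping from proposition strings to truth values, and the formula (matches minus contradictions, normalized by the number of shared propositions) and the `THRESHOLD` value are invented for the example.

```python
from typing import Dict


def belief_confidence(frame_beliefs: Dict[str, bool],
                      story_beliefs: Dict[str, bool]) -> float:
    """Assumed formula: (matches - contradictions) / shared propositions.
    Returns a value in [-1, 1]; 0.0 when nothing is shared."""
    shared = [p for p in frame_beliefs if p in story_beliefs]
    if not shared:
        return 0.0
    matches = sum(frame_beliefs[p] == story_beliefs[p] for p in shared)
    contradictions = len(shared) - matches
    return (matches - contradictions) / len(shared)


# Beliefs assumed to have been extracted from the input text.
story_beliefs = {"hungry(Ben)": True, "ate(Ben, food)": True}
consistent = {"hungry(Ben)": True, "ate(Ben, food)": True}
conflicting = {"hungry(Ben)": False, "ate(Ben, food)": True}

THRESHOLD = 0.5  # hypothetical predetermined threshold
for name, frame in (("consistent", consistent), ("conflicting", conflicting)):
    score = belief_confidence(frame, story_beliefs)
    print(name, score, score >= THRESHOLD)
```

A frame that contradicts an extracted belief scores lower than one that matches every belief, which is the ordering the claim requires.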
9. A computer-implemented method, comprising:
receiving, at a computing system, a natural language story composed of multiple sentences;
performing, by the computing system, a syntactic analysis of the natural language story to produce one or more linguistic analysis results;
inferring, by the computing system and from the one or more linguistic analysis results, one or more generative semantic primitive (GSP) structures corresponding to a linguistic structure of a sentence in the natural language story, wherein individual GSP structures of the one or more GSP structures include a predicate and one or more roles;
forming one or more frame structures that comprise the one or more GSP structures as inferred from the one or more linguistic analysis results, wherein forming the one or more frame structures includes associating the one or more GSP structures with a particular context based at least in part on a setting or a background of the natural language story and defining a relationship between the one or more GSP structures, a GSP structure of the one or more GSP structures having at least a threshold probability of being included with the one or more frame structures that is associated with the particular context of the setting or the background of the natural language story, wherein a frame structure of the one or more frame structures defines a context represented within the natural language story, and the frame structure comprises a set of GSP structures that define the roles associated with the context; and
aligning the one or more frame structures with entities in the natural language story to represent an understanding of the natural language story, wherein aligning the one or more frame structures includes mapping at least one entity of the entities in the natural language story to at least one role of the roles, wherein the one or more frame structures form an episode within the natural language story.
Dependent claims: 10, 11, 12
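The alignment step of claim 9, mapping story entities to the roles of the GSP structures in a frame, might look like the following Python sketch. Everything here is assumed for illustration: the `GSP` and `Frame` classes, the use of an expected entity type per role, and the greedy first-match strategy are not taken from the patent, which does not specify how the mapping is computed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GSP:
    predicate: str
    role_types: Dict[str, str]   # role name -> expected entity type (assumed)
    bindings: Dict[str, str] = field(default_factory=dict)


@dataclass
class Frame:
    context: str
    gsps: List[GSP]


def align_frame(frame: Frame, entities: List[Tuple[str, str]]) -> Dict[str, str]:
    """Greedy alignment: bind each role to the first unused entity of the
    expected type; a role name shared by several GSPs keeps one binding."""
    frame_bindings: Dict[str, str] = {}
    for gsp in frame.gsps:
        for role, wanted in gsp.role_types.items():
            if role not in frame_bindings:
                used = set(frame_bindings.values())
                for name, entity_type in entities:
                    if entity_type == wanted and name not in used:
                        frame_bindings[role] = name
                        break
            if role in frame_bindings:
                gsp.bindings[role] = frame_bindings[role]
    return frame_bindings


frame = Frame("restaurant", [
    GSP("order", {"customer": "person", "waiter": "person", "food": "food"}),
    GSP("eat", {"customer": "person", "food": "food"}),
])
# (name, entity type) pairs assumed to come from the syntactic analysis.
entities = [("Ava", "person"), ("the waiter", "person"), ("pasta", "food")]
bindings = align_frame(frame, entities)
print(bindings)
```

The greedy strategy depends on entity order and would misassign roles in many stories; a realistic aligner would score candidate mappings rather than take the first match.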
13. A computer-implemented method, comprising:
receiving, at a computing system, a natural language input;
forming, by the computing system, one or more frame structures to represent an understanding of the natural language input, the one or more frame structures including one or more sets of generative semantic primitive (GSP) structures that are interrelated, the one or more sets of GSP structures being interrelated based at least in part on individual GSP structures of the one or more sets of GSP having an above threshold probability of being included with the one or more frame structures that is associated with a context of the natural language input, wherein at least one frame structure of the one or more frame structures is nested within, and referred to by, at least another semantic structure;
engaging, via communication between the computing system and a computing device used by a human user, in a dialog session with the human user to present the understanding of the natural language input for the human user to evaluate and to receive input from the human user for use by the computing system to evaluate an extent to which the one or more frame structures represent the understanding of the natural language input; and
using, by the computing system, the input from the human user to infer an additional frame structure to improve the understanding of the natural language input, wherein the additional frame structure includes a new set of GSP structures.
Dependent claims: 14, 15, 16
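The dialog loop of claim 13 can be sketched as follows. This is a toy illustration with assumed names: `dialog_session`, `infer_additional_frame`, the yes-or-correction protocol, and the dictionary used as a frame are all hypothetical, and the human user is simulated with scripted replies so the example runs without input.

```python
from typing import Callable, Dict, List


def dialog_session(statements: List[str],
                   ask: Callable[[str], str]) -> Dict[str, str]:
    """Present each inferred statement to the user and collect a
    correction for every statement the user does not confirm."""
    corrections: Dict[str, str] = {}
    for statement in statements:
        reply = ask(f"I understood: {statement!r}. Is that right? ").strip()
        if reply.lower() not in ("yes", "y"):
            corrections[statement] = reply
    return corrections


def infer_additional_frame(corrections: Dict[str, str], context: str) -> dict:
    """Toy stand-in for inference: wrap the corrected statements in a new
    frame-like record for the same context (assumed representation)."""
    return {"context": context, "gsps": list(corrections.values())}


# Statements assumed to come from the system's current frame structures.
understanding = ["Ava ordered pasta", "Ava paid the bill"]
# Scripted replies stand in for the human user in this example.
replies = iter(["yes", "Ben paid the bill"])
corrections = dialog_session(understanding, lambda prompt: next(replies))
extra_frame = infer_additional_frame(corrections, "restaurant")
print(extra_frame)
```

Passing `input` as the `ask` argument would turn the same loop into an interactive console session.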
17. A computing system, comprising:
a datastore containing a current world model that expresses beliefs about how natural language is understood;
one or more processors; and
memory coupled to the one or more processors, the memory storing computer-executable modules comprising:
a story parsing engine to syntactically analyze a natural language story to produce linguistic analysis results, wherein producing the linguistic analysis results includes:
parsing the natural language story to produce a syntactic representation of the natural language story;
performing a predicate argument structure (PAS) analysis on the syntactic representation of the natural language story;
assigning one or more entity types to one or more words in the natural language story; and
determining co-reference chains in the one or more words in the natural language story;
a knowledge induction engine to induce information from language corpora to form induced information; and
a knowledge integration engine to form frame structures that provide a semantic representation of the natural language story, the knowledge integration engine using the linguistic analysis results, information from the current world model, and the induced information to form the frame structures, wherein to form individual frame structures includes interrelating multiple generative semantic primitive (GSP) structures by associating the multiple GSP structures with a particular context of the natural language story based at least in part on a GSP structure of the multiple GSP structures having at least a threshold probability of being included with the individual frame structures that is associated with the particular context of the natural language story, wherein at least one semantic structure of the frame structures is nested within, and referred to by, at least another semantic structure.
Dependent claims: 18, 19, 20, 21, 22
Specification