Architecture and processes for computer learning and understanding
Abstract
An architecture and processes enable computer learning and developing an understanding of arbitrary natural language text through collaboration with humans in the context of joint problem solving. The architecture ingests the text and then syntactically and semantically processes the text to infer an initial understanding of the text. The initial understanding is captured in a story model of semantic and frame structures. The story model is then tested through computer generated questions that are posed to humans through interactive dialog sessions. The knowledge gleaned from the humans is used to update the story model as well as the computing system's current world model of understanding. The process is repeated for multiple stories over time, enabling the computing system to grow in knowledge and thereby understand stories of increasingly higher reading comprehension levels.
21 Claims
1. A computer-implemented method, comprising:
forming, by a computing system, a semantic representation of a natural language story, the semantic representation using knowledge stored in a current world model that expresses beliefs about how natural language is understood, wherein the semantic representation includes multiple generative semantic primitive (GSP) structures, the multiple GSP structures including one or more predicates and one or more roles that indicate one or more beliefs regarding an understanding of natural language, wherein the current world model is developed over time through processing of multiple natural language stories;
generating, by the computing system, multiple questions to evaluate the semantic representation;
maintaining a dependency structure for linear dialogs to identify questions that are independent from one another;
submitting, from the computing system to multiple user computing devices, different ones of the multiple questions for presentation to multiple different human users;
receiving, by the computing system from the multiple user computing devices, multiple responses indicative of input from the multiple different human users when answering the multiple questions;
iterating through the dependency structure based at least in part on the multiple responses to identify a set of next questions and branch conditions indicating that at least two next questions of the set of next questions are independent of one another;
learning, by the computing system, at least one new GSP structure based at least in part on aggregating the multiple responses received from the multiple user computing devices, wherein the at least one new GSP structure includes a new predicate that indicates a new belief regarding the understanding of the natural language, the new belief being based at least in part on the aggregating of the multiple responses; and
revising, by the computing system, the semantic representation of the natural language story based at least in part on the multiple responses received from the multiple user computing devices and the at least one new GSP structure.
Dependent claims: 2, 3, 4, 5, 19, 20
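The dependency structure and response aggregation recited in claim 1 could be sketched roughly as follows. This is a minimal Python illustration, not the patented implementation: the `DialogDependencies` class, the `aggregate` function, the question ids, and the majority-vote rule are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DialogDependencies:
    """Hypothetical dependency structure: question id -> ids that must be answered first."""
    prerequisites: dict[str, set[str]]
    answered: set[str] = field(default_factory=set)

    def next_questions(self) -> list[str]:
        """Return unanswered questions whose prerequisites are all answered.

        Questions returned together do not wait on one another's answers,
        so they could be sent to different users in parallel.
        """
        return sorted(
            q for q, deps in self.prerequisites.items()
            if q not in self.answered and deps <= self.answered
        )

    def record(self, question_id: str) -> None:
        """Mark a question as answered so that its dependents can become ready."""
        self.answered.add(question_id)


def aggregate(responses: list[str]) -> tuple[str, float]:
    """Majority vote (an assumed aggregation rule): winning answer and its vote share."""
    answer, votes = Counter(responses).most_common(1)[0]
    return answer, votes / len(responses)


# Illustrative dialog: q3 can only be asked once q1 and q2 have answers.
deps = DialogDependencies({"q1": set(), "q2": set(), "q3": {"q1", "q2"}})
first_round = deps.next_questions()   # q1 and q2 are independent of each other
deps.record("q1")
deps.record("q2")
second_round = deps.next_questions()  # q3 is now unblocked
consensus, share = aggregate(["bring", "bring", "take"])
```

Under these assumptions, a new belief would be adopted only when the winning answer's vote share clears some confidence cutoff; the claim itself does not specify the aggregation rule.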
6. A computer-implemented method, comprising:
receiving, at a computing system, a natural language story composed of multiple sentences;
performing, by the computing system, a syntactic analysis of the natural language story to produce one or more linguistic analysis results;
inferring, by the computing system and from the linguistic analysis results, one or more generative semantic primitive (GSP) structures corresponding to a linguistic structure of a sentence in the natural language story, wherein a GSP structure of the one or more GSP structures includes a predicate and one or more roles to express a first understanding of the linguistic structure, and wherein the first understanding of the linguistic structure is based at least in part on the linguistic analysis results;
forming, by the computing system, one or more frame structures that comprise the one or more GSP structures as inferred from the linguistic analysis results, wherein a frame structure of the one or more frame structures is associated with a particular context of a setting or a background of the natural language story, wherein at least one GSP structure of the one or more GSP structures has at least a threshold probability of being included in the frame structure that is associated with the particular context of the setting or the background of the natural language story;
aligning, by the computing system, the one or more frame structures with entities in the natural language story to represent an understanding of the natural language story;
generating, by the computing system, one or more questions to evaluate at least one of the one or more GSP structures or the one or more frame structures;
determining a number of multiple different human users and to whom the one or more questions are to be submitted using heuristics and profiles of the multiple different human users;
submitting, from the computing system to multiple user computing devices, the one or more questions for presentation to the multiple different human users;
receiving, by the computing system from the multiple user computing devices, multiple responses indicative of input from the multiple different human users when answering the one or more questions;
aggregating, by the computing system, the multiple responses from the multiple user computing devices in association with the one or more questions to which the multiple responses apply;
using, by the computing system, the multiple responses to evaluate an extent to which the one or more frame structures represent an understanding of the natural language story; and
revising, by the computing system, at least the GSP structure by redefining the one or more roles for the predicate to express a second understanding of the linguistic structure, and wherein the second understanding of the linguistic structure is based at least in part on aggregating the multiple responses.
Dependent claims: 7, 8, 9, 10, 11, 12, 13
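One way to picture the data structures named in claim 6 is the Python sketch below. It is illustrative only: the `GSP` and `Frame` classes, the `revise_roles` helper, the example sentence content, and the probability values are assumptions, since the claim states only that a GSP structure has a predicate and roles and that a frame admits GSP structures at or above a threshold probability.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GSP:
    """A generative semantic primitive: one predicate plus named roles."""
    predicate: str
    roles: tuple[tuple[str, str], ...]  # (role name, filler) pairs


@dataclass
class Frame:
    """A frame tied to a story context, with an assumed membership probability per GSP."""
    context: str
    membership: dict[GSP, float]

    def members(self, threshold: float) -> list[GSP]:
        """GSP structures with at least `threshold` probability of belonging here."""
        return [g for g, p in self.membership.items() if p >= threshold]


def revise_roles(gsp: GSP, new_roles: dict[str, str]) -> GSP:
    """Return a copy of `gsp` with its roles redefined, e.g. from aggregated responses."""
    return replace(gsp, roles=tuple(sorted(new_roles.items())))


# First understanding: who brought what (hypothetical example content).
bring = GSP("bring", (("agent", "Ben"), ("item", "food")))
pay = GSP("pay", (("agent", "Ava"),))
frame = Frame("restaurant", {bring: 0.9, pay: 0.3})
kept = frame.members(threshold=0.5)

# Second understanding: responses indicate that a recipient role was missing.
revised = revise_roles(bring, {"agent": "Ben", "item": "food", "recipient": "Ava"})
```

The GSP is kept immutable here so that the first and second understandings can coexist for comparison; that is a design choice of the sketch, not something the claim requires.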
14. A computing system, comprising:
one or more processors; and
memory coupled to the one or more processors, the memory storing computer-executable modules comprising:
a knowledge integration engine to form semantic structures that provide a semantic representation of a natural language story, wherein to form the semantic structures includes creating multiple generative semantic primitive (GSP) structures by defining a predicate and one or more roles for a GSP structure of the multiple GSP structures;
a dialog engine to facilitate multiple dialog sessions with multiple human users in parallel and to aggregate responses from the human users to evaluate an extent to which the semantic structures represent the natural language story, wherein the dialog engine includes a distributed dialog dispatcher to break up the dialog sessions into sub-dialog sessions, and the distributed dialog dispatcher includes a user selector to choose the human users to answer independent questions in parallel, wherein the user selector employs heuristics and profiles of the human users to determine a quantity and to whom the independent questions are to be submitted; and
wherein the knowledge integration engine updates the semantic structures based in part on the responses aggregated from the multiple human users and learns a new GSP structure including a new predicate discovered by the responses aggregated.
Dependent claims: 15, 16, 17, 18, 21
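The user selector in claim 14 decides both how many users receive an independent question and which users those are. The Python sketch below shows one possible shape for that decision. Everything specific in it is hypothetical: the claim does not say what a profile contains or which heuristics apply, so the `UserProfile` fields, the `select_users` function, the 3-to-7 respondent range, and the busy-user cutoff are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical profile fields; the claim does not say what a profile holds."""
    name: str
    reliability: float  # assumed: share of past answers that matched the consensus
    open_sessions: int  # assumed: sub-dialog sessions the user is already in


def select_users(profiles: list[UserProfile], difficulty: float) -> list[UserProfile]:
    """Choose how many users to ask, and which ones, for one independent question.

    Assumed heuristics: harder questions get more respondents (3 to 7),
    busy users are skipped, and the most reliable free users are asked first.
    """
    quantity = 3 + round(4 * min(max(difficulty, 0.0), 1.0))
    free = [p for p in profiles if p.open_sessions < 2]
    ranked = sorted(free, key=lambda p: p.reliability, reverse=True)
    return ranked[:quantity]


profiles = [
    UserProfile("ana", 0.9, 0),
    UserProfile("raj", 0.6, 0),
    UserProfile("li", 0.8, 3),   # skipped: already in three sessions
    UserProfile("sam", 0.7, 1),
    UserProfile("kim", 0.5, 0),
]
chosen = select_users(profiles, difficulty=0.0)
```

With an easy question (`difficulty=0.0`) the sketch asks three users; raising the difficulty widens the pool up to the number of free users available.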
Specification