Architecture and processes for computer learning and understanding
First Claim
1. A computer-implemented method, comprising:
- for a natural language input, performing, by a computing system, a process comprising:
receiving the natural language input;
performing a syntactic analysis of the natural language input to produce one or more linguistic analysis results;
creating multiple semantic structures to represent the natural language input in part by using the one or more linguistic analysis results and knowledge induced from a large language corpora, wherein creating the multiple semantic structures includes creating multiple generative semantic primitive (GSP) structures by defining a predicate and one or more roles for a GSP structure of the multiple GSP structures to express a first understanding of the natural language input, the first understanding of the natural language input being based at least in part on the one or more linguistic analysis results, wherein defining the one or more roles includes mapping one or more entities in the natural language input to the one or more roles;
associating a semantic structure of the multiple semantic structures with a particular theme or a particular context of the natural language input;
engaging in a dialog session with a human user to receive input from the human user to use by the computing system to evaluate the multiple semantic structures as an understanding of the natural language input; and
revising the multiple semantic structures based on one or more responses from the human user to improve the understanding of the natural language input, wherein revising the multiple semantic structures includes defining at least one of a new predicate or one or more new roles for at least one new GSP structure associated with the semantic structure, the at least one new GSP structure expressing a second understanding of the natural language input based at least in part on the one or more responses, wherein defining the at least one of the new predicate or the one or more new roles is based at least in part on the at least one new GSP structure having an above threshold probability of being included in the semantic structure associated with the particular theme or the particular context of the natural language input; and
- repeating, by the computing system, the process with the natural language input at least once to form one or more additional GSP structures for subsequent natural language inputs, wherein the subsequent natural language inputs have similar or increasingly higher reading comprehension levels.
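As an illustration only, the GSP structure recited in this claim (a predicate, roles mapped to entities, and an above-threshold probability test applied during revision) might be sketched as follows. The class name `GSPStructure`, the helpers `make_gsp` and `revise`, the example sentence, and all probability values are hypothetical and do not come from the patent; the sketch assumes each candidate structure already carries a probability score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GSPStructure:
    """Hypothetical GSP record: a predicate plus role-to-entity bindings."""
    predicate: str
    roles: tuple        # tuple of (role_name, entity) pairs
    probability: float  # assumed score for inclusion in the themed semantic structure


def make_gsp(predicate, role_bindings, probability):
    """Map entities from the input onto named roles of a predicate."""
    return GSPStructure(predicate, tuple(sorted(role_bindings.items())), probability)


def revise(structure, candidates, threshold=0.5):
    """Add only candidates whose probability is above the threshold."""
    accepted = [c for c in candidates if c.probability > threshold]
    return structure + accepted


# First understanding of a made-up input: "Ben brought the food to Ava."
first = [make_gsp("bring", {"agent": "Ben", "thing": "food", "recipient": "Ava"}, 0.9)]

# Candidate structures proposed after a (simulated) user response.
candidates = [
    make_gsp("possess", {"owner": "Ava", "thing": "food"}, 0.8),
    make_gsp("eat", {"agent": "Ben", "thing": "food"}, 0.2),
]

revised = revise(first, candidates, threshold=0.5)
print([g.predicate for g in revised])  # prints ['bring', 'possess']
```

The low-scoring "eat" candidate is dropped, so the second understanding extends the first with one new predicate and its roles.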
2 Assignments
0 Petitions
Abstract
An architecture and processes enable computer learning and developing an understanding of arbitrary natural language text through collaboration with humans in the context of joint problem solving. The architecture ingests the text and then syntactically and semantically processes the text to infer an initial understanding of the text. The initial understanding is captured in a story model of semantic and frame structures. The story model is then tested through computer generated questions that are posed to humans through interactive dialog sessions. The knowledge gleaned from the humans is used to update the story model as well as the computing system's current world model of understanding. The process is repeated for multiple stories over time, enabling the computing system to grow in knowledge and thereby understand stories of increasingly higher reading comprehension levels.
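The loop described in the abstract (infer an initial story model, question a human, then update both the story model and the world model) can be pictured with a deliberately small sketch. Everything here is an assumption made for illustration: the function `process_story`, the dictionary-based world model, the question strings, and the simulated human are hypothetical, and the sketch asks only about gaps rather than testing every inferred belief.

```python
def process_story(questions, world_model, ask_human):
    """Build a story model, fill gaps through dialog, then update the world model."""
    # Initial understanding: reuse whatever the world model already believes.
    story_model = {q: world_model.get(q) for q in questions}
    asked = 0
    for q in list(questions):
        if story_model[q] is None:       # gap in understanding
            story_model[q] = ask_human(q)  # dialog session with a human
            asked += 1
    world_model.update(story_model)      # knowledge persists across stories
    return story_model, asked


# Simulated human collaborator with fixed answers (hypothetical data).
answers = {
    "Who brings food in a restaurant?": "waiter",
    "Who pays the bill?": "customer",
}
human = answers.get

world_model = {}
_, asked_1 = process_story(["Who brings food in a restaurant?"], world_model, human)
_, asked_2 = process_story(
    ["Who brings food in a restaurant?", "Who pays the bill?"], world_model, human
)
print(asked_1, asked_2)  # prints 1 1
```

The second story raises two questions but needs only one human answer, because the first answer is already in the world model; this is the sense in which repeated processing lets the system take on harder stories.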
26 Claims
1. A computer-implemented method, comprising: (recited in full under First Claim above)
Dependent claims: 2, 3, 4, 5, 6, 7, 25
8. A computer-implemented method, comprising:
- receiving, by a computing system, multiple first natural language stories of a first reading comprehension level over a first period of time;
- for a story of the multiple first natural language stories, developing a story model representation of the story by conducting an understanding process comprising:
parsing, by the computer system, the story to produce a syntactic representation of the story;
performing, by the computer system, a predicate argument structure (PAS) analysis on the syntactic representation of the story;
assigning, by the computer system, one or more entity types to one or more words in the story;
determining, by the computer system, co-reference chains in the one or more words in the story;
inferring, by the computing system, one or more semantic structures as a semantic representation of the story using, at least in part, the syntactic representation of the story;
submitting, by the computing system to a user computing device, one or more questions for a human user to answer to evaluate the one or more semantic structures representing the story;
responsive to one or more responses from the human user, revising the one or more semantic structures; and
iterating the understanding process until one or more final semantic structures are defined, wherein a final version of the story model includes the one or more final semantic structures;
- storing, by the computing system, multiple first story models that were defined by iterating the understanding process over the first period of time for the multiple first natural language stories of the first reading comprehension level, wherein a first story model of the multiple first story models includes one or more first semantic structures;
- receiving, by the computing system, multiple second natural language stories of a second reading comprehension level over a second period of time, wherein the second period of time is after the first period of time; and
- for a second story of the multiple second natural language stories, developing an associated story model representation of the second story by conducting the understanding process for the second story over the second period of time and using, in part, information learned from conducting the understanding process of the multiple first story models for the multiple first natural language stories, wherein the associated story model representation includes at least one second semantic structure based at least in part on combining at least one of the one or more first semantic structures with another semantic structure.
Dependent claims: 9, 10, 11, 12, 13, 14, 15
16. A computing system, comprising:
- a current world model maintained in a database;
- one or more processors to access the current world model maintained in the database; and
- memory coupled to the one or more processors, the memory storing computer executable instructions that, when executed by the one or more processors, perform acts comprising:
processing multiple natural language stories of varying reading comprehension levels over time, the processing including inferring semantic structures as representations of the multiple natural language stories, in part by using information maintained in the current world model, and conducting dialog sessions with one or more human users to evaluate the semantic structures as understandings of the multiple natural language stories, wherein the processing includes:
processing, over a first time period, a first story of the multiple natural language stories having a first reading comprehension level;
performing analysis on the first story to produce a syntactic representation of the first story;
performing predicate argument structure (PAS) analysis on the syntactic representation of the first story;
assigning one or more entity types to one or more words in the first story;
determining co-reference chains in the one or more words in the first story;
inferring one or more first semantic structures as representations of the first story;
processing, over a second time period after the first time period, a second story of the multiple natural language stories having a second reading comprehension level that is more difficult than the first reading comprehension level; and
inferring at least one second semantic structure as a representation of the second story based at least in part on expanding at least one of the one or more first semantic structures to include a new semantic structure; and
expanding the current world model in the database over time to include the semantic structures inferred from the multiple natural language stories and evaluated by the one or more human users.
Dependent claims: 17, 18, 19, 20, 21
22. A computing system, comprising:
- a datastore containing a current world model that expresses beliefs about how natural language is understood;
- one or more processors; and
- memory coupled to the one or more processors, the memory storing computer-executable modules comprising:
a story parsing engine to syntactically analyze a natural language story to produce linguistic analysis results;
a knowledge induction engine to induce information from a large language corpora to form induced information, wherein the knowledge induction engine comprises a word sense disambiguator to disambiguate word senses;
a knowledge integration engine to form semantic structures that provide a semantic representation of the natural language story, the knowledge integration engine using the linguistic analysis results, information from the current world model, and the induced information to form the semantic structures, and to associate at least one semantic structure of the semantic structures with a particular context of the natural language story, wherein forming the semantic structures includes defining multiple generative semantic primitive (GSP) structures with one or more sets of roles; and
a dialog engine to facilitate a dialog session with a human user by generating one or more questions, and submitting the one or more questions for presentation via a computer user interface to a computing device used by the human user and collecting one or more responses from the computing device indicative of input from the human user;
wherein the knowledge integration engine revises the semantic structures based on the one or more responses from the human user and updates the current world model in the datastore, wherein revising the semantic structures includes defining at least one new GSP structure with a new set of roles for the semantic structure, the at least one new GSP structure having at least a threshold probability of being included with the semantic structure associated with the particular context of the natural language story; and
wherein the story parsing engine, the knowledge integration engine, and the dialog engine process multiple stories over time to build the information in the current world model.
Dependent claims: 23, 24, 26
Specification