Method and system for extracting information from unstructured text using symbolic machine learning
Abstract
A method (and structure) of extracting information from text includes parsing an input sample of text to form a parse tree and using user inputs to define a machine-labeled learning pattern from the parse tree.
16 Claims
1. A method of preparing a learning pattern for extracting information from text, said method comprising:
receiving an input sample of text as an input into a computer tool executed by a processor on a computer;
receiving inputs from a user to name entities within said sample of text;
parsing said input sample of text to form a parse tree, using a processor on a computer executing a parser that respects named entities of a Named Entity (NE) Annotator, meaning that the parser treats a named entity as a single token;
presenting said parse tree to a user; and
receiving user inputs to:
specify relation arguments and names of components of said parse tree;
define a machine-labeled learning pattern from said parse tree and its associated user inputs, said machine-labeled learning pattern comprising a precedence inclusion pattern wherein elements in said learning pattern are defined in a precedence relation and in an inclusion relation (PI pattern), based on said user's inputs; and
store said machine-labeled learning pattern in a memory, said stored learning pattern being available as a query for searching for relation instances in unseen text that matches said PI pattern,
wherein said user interfaces with said computer tool using:
a first menu to permit the user to input a sample text, to select and designate argument names for linguistic elements from a selected sample text, and to construct a relation instance of said linguistic elements;
a second menu to permit the user to generate a PI pattern from one or more relation instances generated using said first menu; and
a third menu to permit the user to use a PI pattern generated by said second menu to search for undiscovered instances of a relation instance.
Dependent claims: 2-11
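The requirement in claim 1 that the parser treat a named entity as a single token can be illustrated with a small pre-parsing step. The Python sketch below is not taken from the patent. The function name merge_named_entities, the (start, end, label) span format, and the sample sentence are all assumptions made for illustration only.

```python
def merge_named_entities(tokens, entity_spans):
    """Collapse each annotated entity span into one token.

    tokens: list of word strings.
    entity_spans: non-overlapping (start, end, label) triples, end exclusive.
    Returns a list of (text, label) pairs; label is None for ordinary words.
    This span format is a hypothetical stand-in for an NE annotator's output.
    """
    starts = {start: (end, label) for start, end, label in entity_spans}
    merged = []
    i = 0
    while i < len(tokens):
        if i in starts:
            end, label = starts[i]
            # The whole entity becomes a single token for the parser.
            merged.append((" ".join(tokens[i:end]), label))
            i = end
        else:
            merged.append((tokens[i], None))
            i += 1
    return merged


tokens = ["International", "Business", "Machines", "acquired", "Lotus", "in", "1995"]
spans = [(0, 3, "ORG"), (4, 5, "ORG")]
merged = merge_named_entities(tokens, spans)
print(merged)
```

A parser fed the merged sequence sees five tokens instead of seven, so a multi-word entity cannot be split across constituents of the parse tree.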
12. An apparatus for relational learning, said apparatus comprising:
a generator for developing a precedence inclusion (PI) pattern of a learning sample, as executed by a processor on said apparatus, wherein elements in said learning sample are machine-labeled to define a precedence relation and an inclusion relation, based on user inputs, said PI pattern comprising a set equipped with two strict partial orders for said precedence and inclusion that interact with one another through laws of interactive transitivity and interactive irreflexivity; and
a graphical user interface (GUI) to permit a user to provide inputs used for said developing said PI pattern, wherein said user inputs are used to define a learning pattern, wherein said GUI comprises:
a first menu to permit a user to input a sample text, to select and designate argument names for linguistic elements from a selected sample text, and to construct a relation instance of said linguistic elements;
a second menu to permit a user to generate a PI pattern from one or more relation instances generated using said first menu; and
a third menu to permit a user to use a PI pattern generated by said second menu to search for undiscovered instances of a relation instance.
Dependent claims: 13, 14
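Claim 12 describes a PI pattern as a set with two strict partial orders linked by laws of interactive transitivity and interactive irreflexivity, but the claim text does not spell those laws out. The Python sketch below assumes one plausible reading: if x includes y, then x and y are not related by precedence in either direction (interactive irreflexivity), and anything that precedes or follows x also precedes or follows y (interactive transitivity). These formulations, the function names, and the toy parse-tree data are assumptions for illustration, not the patent's definitions.

```python
def is_strict_partial_order(elements, rel):
    """True if rel (a set of ordered pairs) is irreflexive and transitive."""
    irreflexive = all((x, x) not in rel for x in elements)
    transitive = all(
        (x, z) in rel
        for (x, y) in rel
        for (y2, z) in rel
        if y == y2
    )
    return irreflexive and transitive


def is_pi_pattern(elements, precedes, includes):
    """Check the assumed PI-pattern laws on two relations over elements."""
    if not is_strict_partial_order(elements, precedes):
        return False
    if not is_strict_partial_order(elements, includes):
        return False
    for (x, y) in includes:  # x includes y
        # Assumed interactive irreflexivity: no precedence between x and y.
        if (x, y) in precedes or (y, x) in precedes:
            return False
        # Assumed interactive transitivity: y inherits x's precedence links.
        for z in elements:
            if (z, x) in precedes and (z, y) not in precedes:
                return False
            if (x, z) in precedes and (y, z) not in precedes:
                return False
    return True


# Toy pattern: sentence S containing NP ("IBM") followed by VP ("acquired").
elements = {"S", "NP", "VP", "IBM", "acquired"}
includes = {("S", "NP"), ("S", "VP"), ("NP", "IBM"), ("VP", "acquired"),
            ("S", "IBM"), ("S", "acquired")}
precedes = {("NP", "VP"), ("IBM", "VP"), ("NP", "acquired"), ("IBM", "acquired")}

print(is_pi_pattern(elements, precedes, includes))
```

Under this reading, dropping the pair ("IBM", "VP") from the precedence relation makes the check fail, because NP includes "IBM" and NP precedes VP.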
15. A non-transitory, signal-bearing storage medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method of relational learning, said machine-readable instructions comprising:
a precedence inclusion (PI) pattern learning module for generating a PI pattern of a learning sample wherein elements in said learning sample are machine-labeled to define a precedence relation and an inclusion relation; and
a graphical user interface (GUI) to permit a user to provide inputs to define said PI pattern for each said learning sample;
wherein said GUI comprises:
a first menu to permit the user to input a sample text, to select and designate argument names for linguistic elements from a selected sample text, and to construct a relation instance of said linguistic elements;
a second menu to permit the user to generate a PI pattern from one or more relation instances generated using said first menu; and
a third menu to permit the user to use a PI pattern generated by said second menu to search for undiscovered instances of a relation instance.
Dependent claims: 16
Specification