Applying a structured language model to information extraction
Abstract
One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data while enforcing the annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained again by generating parses on the semantically annotated training data, this time enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses it generates.
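The two constraints the abstract describes (matching annotated constituent boundaries, then matching annotated semantic labels) can be illustrated with a minimal Python sketch. This is not the patented parser: the span representation, the function names, the `(syntactic, semantic)` label pairs, and the example sentence are all assumptions made for illustration. A candidate parse is treated as consistent with the boundaries when none of its constituents crosses an annotated span.

```python
from typing import Dict, Iterable, Optional, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive (assumed convention)


def crosses(a: Span, b: Span) -> bool:
    """True when the two spans overlap without one containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def respects_boundaries(parse_spans: Iterable[Span], annotated: Iterable[Span]) -> bool:
    """A parse is kept only if no constituent crosses an annotated boundary."""
    annotated = list(annotated)
    return not any(crosses(p, a) for p in parse_spans for a in annotated)


def respects_labels(
    parse_labels: Dict[Span, Tuple[str, Optional[str]]],
    annotated: Dict[Span, str],
) -> bool:
    """Every annotated span must appear in the parse with the same semantic label.

    Parse labels are hypothetical joint (syntactic, semantic) pairs.
    """
    return all(
        span in parse_labels and parse_labels[span][1] == sem
        for span, sem in annotated.items()
    )


# Hypothetical sentence: "schedule meeting with John Smith at noon"
annotated = {(0, 7): "ScheduleMeeting", (3, 5): "Attendee", (6, 7): "Time"}

good_spans = [(0, 7), (0, 2), (2, 5), (3, 5), (5, 7), (6, 7)]
bad_spans = [(0, 7), (2, 4), (4, 7)]  # (2, 4) "with John" splits the attendee span

good_labels = {
    (0, 7): ("S", "ScheduleMeeting"),
    (2, 5): ("PP", None),
    (3, 5): ("NP", "Attendee"),
    (6, 7): ("NP", "Time"),
}
wrong_labels = dict(good_labels)
wrong_labels[(6, 7)] = ("NP", "Attendee")

boundaries_ok = respects_boundaries(good_spans, annotated)
boundaries_bad = respects_boundaries(bad_spans, annotated)
labels_ok = respects_labels(good_labels, annotated)
labels_bad = respects_labels(wrong_labels, annotated)
```

In this sketch the constraints are applied as filters on finished candidate parses. A parser could instead prune partial parses as soon as they violate a constraint, which is closer to constraining "while generating parses".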
22 Claims
1. A method of training an information extraction system to extract information from a natural language input, comprising:
generating parses with a structured language model using annotated training data that has semantic constituent labels with semantic constituent boundaries identified;
while generating parses, constraining parses to match the semantic constituent boundaries; and
while generating parses, constraining the parses to match the semantic constituent labels.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of extracting information from a natural language input, comprising:
parsing the natural language input with a structured language model to obtain a parse having a semantic frame label and one or more constituents of the natural language input each having a semantic slot label; and
identifying an information extraction frame corresponding to the natural language input based on the frame label and filling in slots in the frame with the one or more constituents labeled by the slot labels.
Dependent claims: 10, 11, 12, 13
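The frame identification and slot filling recited in claim 9 can be sketched as a walk over a labeled parse tree. The `Node` class, the `extract_frame` helper, the placement of the frame label on the root node, and the example sentence are assumptions for illustration only, not details taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical parse node covering words[start:end]."""
    start: int
    end: int
    frame: Optional[str] = None  # semantic frame label, if any
    slot: Optional[str] = None   # semantic slot label, if any
    children: List["Node"] = field(default_factory=list)


def extract_frame(root: Node, words: List[str]) -> Dict[str, object]:
    """Read the frame label off the root and fill slots from labeled constituents."""
    if root.frame is None:
        raise ValueError("root node carries no frame label")
    slots: Dict[str, str] = {}
    stack = list(root.children)
    while stack:
        node = stack.pop()
        if node.slot is not None:
            # The constituent's words become the slot value.
            slots[node.slot] = " ".join(words[node.start:node.end])
        stack.extend(node.children)
    return {"frame": root.frame, "slots": slots}


words = "schedule meeting with John Smith at noon".split()
parse = Node(0, 7, frame="ScheduleMeeting", children=[
    Node(0, 2),
    Node(2, 5, children=[Node(3, 5, slot="Attendee")]),
    Node(5, 7, children=[Node(6, 7, slot="Time")]),
])

result = extract_frame(parse, words)
```

If a slot label occurs more than once in a parse, this sketch keeps only one value per slot. A fuller version would need a policy for repeated slots.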
14. An information extraction system for extracting information from a natural language speech input, comprising:
a speech recognizer, including a structured language model, receiving the natural language speech input and generating a textual representation of the natural language speech input based on language modeling by the structured language model, the structured language model parsing the textual representation to obtain a parse having a semantic frame label and one or more semantic slot labels corresponding to constituents of the textual representation, the semantic frame and slot labels identifying the information to be extracted.
Dependent claims: 15, 16
17. A method of extracting information from a natural language (NL) input, comprising:
accessing a schema associated with an application program for which the information is extracted, the schema having frames with a frame structure; and
parsing the NL input to obtain a parse having a semantic frame label and one or more semantic slot labels corresponding to constituents of the NL input, the parse being constrained to the frame structure in the schema.
Dependent claims: 18, 19, 20, 21, 22
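One way to picture a parse "constrained to the frame structure in the schema" is to admit only frame and slot combinations that the application schema declares. The sketch below makes several assumptions: the schema is a plain mapping from frame name to allowed slot names, candidates are `(frame, slots, score)` tuples, and the constraint is applied by filtering scored candidates after the fact rather than inside the parser, which is a simplification of what the claim recites.

```python
from typing import Dict, List, Optional, Set, Tuple

Schema = Dict[str, Set[str]]              # frame name -> allowed slot names (assumed shape)
Candidate = Tuple[str, List[str], float]  # (frame label, slot labels, parse score)


def fits_schema(frame: str, slots: List[str], schema: Schema) -> bool:
    """True when the frame exists in the schema and uses only its declared slots."""
    return frame in schema and set(slots) <= schema[frame]


def best_schema_parse(candidates: List[Candidate], schema: Schema) -> Optional[Candidate]:
    """Return the highest-scoring candidate that fits the schema, or None."""
    valid = [c for c in candidates if fits_schema(c[0], c[1], schema)]
    return max(valid, key=lambda c: c[2]) if valid else None


# Hypothetical schema for a calendar application.
schema: Schema = {
    "ScheduleMeeting": {"Attendee", "Time", "Subject"},
    "CancelMeeting": {"Time"},
}

candidates: List[Candidate] = [
    ("ScheduleMeeting", ["Attendee", "Location"], 0.9),  # slot not in schema
    ("BookFlight", ["Time"], 0.8),                       # frame not in schema
    ("ScheduleMeeting", ["Attendee", "Time"], 0.7),      # fits the schema
]

chosen = best_schema_parse(candidates, schema)
```

Here the highest-scoring candidate is rejected because it uses a slot the schema does not define, so the lower-scoring but schema-conformant parse is selected.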
Specification