Exploiting structured content for unsupervised natural language semantic parsing
Abstract
Structured web pages are accessed and parsed to obtain implicit annotation for natural language understanding tasks. Search queries that hit these structured web pages are automatically mined for information that is used to semantically annotate the queries. The automatically annotated queries may be used for automatically building statistical unsupervised slot filling models without using a semantic annotation guideline. For example, tags that are located on a structured web page that are associated with the search query may be used to annotate the query. The mined search queries may be filtered to create a set of queries that is in a form of a natural language query and/or remove queries that are difficult to parse. A natural language model may be trained using the resulting mined queries. Some queries may be set aside for testing and the model may be adapted using in-domain sentences that are not annotated. The models may be tested using these implicitly annotated natural-language-like queries in an unsupervised fashion.
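The annotation and filtering steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the entity table `PAGE_ENTITIES`, the IOB labeling scheme, the exact-match strategy, and the length-based `looks_natural` filter are all assumptions introduced here for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical entity -> tag pairs, as if parsed from one structured movie page.
PAGE_ENTITIES: Dict[str, str] = {
    "avatar": "movie_name",
    "james cameron": "director",
}


def annotate(query: str, entities: Dict[str, str]) -> List[Tuple[str, str]]:
    """Label each token with an IOB tag by exact match against page entities."""
    tokens = query.lower().split()
    labels = ["O"] * len(tokens)
    # Try longer entities first so multi-word names win over shorter ones.
    ordered = sorted(entities.items(), key=lambda kv: -len(kv[0].split()))
    for entity, tag in ordered:
        words = entity.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            span_free = all(label == "O" for label in labels[i:i + n])
            if tokens[i:i + n] == words and span_free:
                labels[i] = "B-" + tag
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + tag
    return list(zip(tokens, labels))


def looks_natural(query: str, min_tokens: int = 3) -> bool:
    """Crude stand-in for the natural-language filter (an assumption):
    keep only queries with at least min_tokens tokens."""
    return len(query.split()) >= min_tokens


def mine(queries: List[str], entities: Dict[str, str]) -> List[List[Tuple[str, str]]]:
    """Keep queries that mention a page entity and pass the filter."""
    annotated = []
    for query in queries:
        pairs = annotate(query, entities)
        has_entity = any(label != "O" for _, label in pairs)
        if has_entity and looks_natural(query):
            annotated.append(pairs)
    return annotated


if __name__ == "__main__":
    clicked_queries = [
        "avatar",                           # keyword query, filtered out
        "who directed avatar",              # kept
        "movies by james cameron in 2009",  # kept
        "weather today in seattle",         # no page entity, dropped
    ]
    for pairs in mine(clicked_queries, PAGE_ENTITIES):
        print(pairs)
```

The token and label pairs produced this way are the kind of implicitly annotated data that could then be fed to a slot filling model. A real system would likely need fuzzier matching and a stronger natural-language filter than the token-count heuristic used here.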
19 Claims
1. A method for natural language semantic parsing, comprising:
- accessing structured content including structured web pages;
- parsing a semantic structure identified in the structured content to identify entities linked by a relationship, wherein each entity has a respective tag;
- mining a plurality of natural language search queries that can access the structured content to identify, from the plurality of natural language search queries, at least one natural language search query that includes at least one of the entities; and
- automatically annotating the at least one natural language search query using the respective tag.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable storage device storing computer-executable instructions that perform a method when executed, the method comprising:
- accessing structured content including structured web pages;
- parsing the structured content to identify two entities linked by a relationship, wherein each entity has a respective tag;
- mining a plurality of natural language search queries that can access the structured content to identify, from the plurality of natural language search queries, at least one natural language search query that includes at least one of the two entities;
- automatically annotating the at least one natural language search query to form at least one annotated natural language search query; and
- creating an understanding model including slots using the at least one natural language search query annotated in the automatically annotating.
Dependent claims: 9, 10, 11, 12
13. A system for natural language semantic parsing, comprising:
- a processor and memory;
- an operating environment executing using the processor; and
- a knowledge manager that is configured to perform actions comprising:
  - accessing structured content including structured web pages;
  - parsing the structured content to identify two entities linked by a relationship, wherein each of the entities has a respective tag;
  - mining a plurality of natural language search queries that can access the structured content to identify, from the plurality of natural language search queries, at least one natural language search query that includes at least one of the two entities;
  - automatically annotating the at least one natural language search query using the respective tags; and
  - creating an understanding model including slots using the at least one natural language search query annotated in the automatically annotating.
Dependent claims: 14, 15, 16, 17, 18, 19
Specification