TEMPLATE BOOTSTRAPPING FOR DOMAIN-ADAPTABLE NATURAL LANGUAGE GENERATION
First Claim
1. A computer implemented method comprising:
a) receiving by a computer comprising a processor and a memory a set of original templates and storing the set of original templates in the memory;
b) accessing by a computer a set of databases comprising a large corpus of documents and searching by a search engine the set of databases based on the set of original templates;
c) identifying by the search engine a set of candidate sentences from a set of documents in the corpus by using a similarity measure to determine a similarity score; and
d) processing the set of candidate sentences to generate a set of natural language generation templates.
Abstract
The present invention relates to a system and method for bootstrapping templates for use in natural language sentence generation. More specifically, the present invention relates to identifying a set of candidate sentences from a large corpus based on a set of original templates by using a similarity measure. The set of candidate sentences is then processed or cleaned to generate a set of templates for use in natural language sentence generation.
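As a rough illustration of the flow the abstract describes, the following Python sketch scores corpus sentences against an original template and turns the closest matches into new templates. Everything specific in it is an assumption made for illustration and is not taken from the patent: the bracketed slot syntax, the Jaccard token-overlap measure, the 0.3 threshold, the number-to-slot cleaning rule, and the helper names `tokens`, `jaccard`, and `bootstrap`. The abstract itself only calls for "a similarity measure" and a processing or cleaning step.

```python
import re

SLOT = re.compile(r"\[[a-z_]+\]")                # assumed slot syntax, e.g. [company]
NUMBER = re.compile(r"\$?\d+(?:[.,]\d+)*%?")     # crude stand-in for entity detection


def tokens(text):
    """Lowercased word set with slots and numeric values removed."""
    text = SLOT.sub(" ", text.lower())
    text = NUMBER.sub(" ", text)
    return set(re.findall(r"[a-z]+", text))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; one possible similarity measure."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def bootstrap(original_templates, corpus_sentences, threshold=0.3):
    """Return (score, template) pairs for sentences similar to any original."""
    results = []
    for sentence in corpus_sentences:
        score = max(
            (jaccard(tokens(t), tokens(sentence)) for t in original_templates),
            default=0.0,
        )
        if score >= threshold:
            # Hypothetical "cleaning" step: generalize numeric values into a slot.
            results.append((score, NUMBER.sub("[value]", sentence)))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    originals = ["[company] reported revenue of [amount] for the quarter"]
    corpus = [
        "Acme reported revenue of $4.5 million for the quarter.",
        "The weather was sunny in Paris.",
    ]
    for score, template in bootstrap(originals, corpus):
        print(f"{score:.2f}  {template}")
```

Token overlap was chosen here only because it is short and easy to inspect; any measure that yields a comparable score could be swapped in without changing the surrounding loop.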
20 Claims
1. A computer implemented method comprising:
a) receiving by a computer comprising a processor and a memory a set of original templates and storing the set of original templates in the memory;
b) accessing by a computer a set of databases comprising a large corpus of documents and searching by a search engine the set of databases based on the set of original templates;
c) identifying by the search engine a set of candidate sentences from a set of documents in the corpus by using a similarity measure to determine a similarity score; and
d) processing the set of candidate sentences to generate a set of natural language generation templates.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for bootstrapping a set of templates for generating natural language sentences, the system comprising:
a) at least one database comprising a corpus of documents;
b) a computer comprising a processor and a memory, the memory containing a set of executable code executable by the processor;
c) a search controller configured to receive a set of original templates and generate a query based on the set of original templates;
d) a search engine adapted to receive the query from the search controller and search the corpus of documents using the query based on the set of original templates to identify a set of candidate sentences from the corpus of documents;
e) a template analyzer adapted to;
i) selecting a set of similar sentences from the identified set of candidate sentences by using a similarity measure to determine a similarity score for each selected sentence; and
ii) generating a set of natural language generation templates based at least in part on the similarity scores.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
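The division of labor among the components recited in claim 11 can be pictured with a small Python sketch. It is a hypothetical arrangement, not the claimed implementation: the class and method names, the keyword-set query, the in-memory sentence scan standing in for a search engine, the character-level `difflib.SequenceMatcher` ratio used as the similarity measure, and the 0.5 cutoff are all assumptions. The analyzer here stops at returning scored sentences; deriving finished templates from them is left out.

```python
import re
from difflib import SequenceMatcher

STOPWORDS = {"a", "an", "the", "of", "for", "in", "to", "and"}


class SearchController:
    """Turns original templates into a keyword query (assumed query form)."""

    def build_query(self, templates):
        words = set()
        for template in templates:
            literal = re.sub(r"\[[^\]]*\]", " ", template.lower())  # drop slots
            words.update(re.findall(r"[a-z]+", literal))
        return words - STOPWORDS


class SearchEngine:
    """Naive in-memory scan over documents; a stand-in for a real engine."""

    def __init__(self, documents):
        self.documents = documents

    def search(self, query):
        hits = []
        for doc in self.documents:
            # Split each document into sentences at ., ! or ? plus whitespace.
            for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
                if query & set(re.findall(r"[a-z]+", sentence.lower())):
                    hits.append(sentence)
        return hits


class TemplateAnalyzer:
    """Scores candidates against the originals and keeps the similar ones."""

    def __init__(self, min_score=0.5):
        self.min_score = min_score

    def analyze(self, templates, candidates):
        scored = []
        for sentence in candidates:
            score = max(
                (SequenceMatcher(None, t.lower(), sentence.lower()).ratio()
                 for t in templates),
                default=0.0,
            )
            if score >= self.min_score:
                scored.append((round(score, 3), sentence))
        return sorted(scored, reverse=True)


def run_pipeline(templates, documents, min_score=0.5):
    """Controller -> engine -> analyzer, in the order the claim lists them."""
    query = SearchController().build_query(templates)
    candidates = SearchEngine(documents).search(query)
    return TemplateAnalyzer(min_score).analyze(templates, candidates)
```

Keeping the query builder, the retrieval step, and the scoring step in separate classes mirrors the claim's separation of search controller, search engine, and template analyzer, so each could be replaced independently.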
Specification