Retrieval of records using phrase chunking
Abstract
Methods are provided for generating phrase chunking rules for titles of records in a database. According to one method, the title of each record in a first set of records is part-of-speech tagged, and a plurality of phrase chunking rules are created based on patterns of part-of-speech tags in the tagged titles. The phrase chunking rules are applied to the titles of records in a second set of records so as to generate indexes for the records in the second set of records. In a preferred embodiment, the phrase chunking rules are modified if coverage of the second set of records by the phrase chunking rules does not reach a predetermined threshold. Also provided are methods for retrieving records from a database and systems for generating phrase chunking rules.
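The coverage check mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patent's method: a rule is modeled as a tuple of part-of-speech tags, a title counts as "covered" when its tag sequence can be split entirely into rule patterns, and the "modification" step simply adds the full tag sequence of an uncovered title as a new rule. The names `covered`, `coverage`, and `refine_rules` are hypothetical.

```python
def covered(tags, rules):
    """Return True if the tag sequence can be split entirely into rule patterns."""
    reachable = [True] + [False] * len(tags)  # reachable[i]: tags[:i] is fully chunked
    for end in range(1, len(tags) + 1):
        for rule in rules:
            start = end - len(rule)
            if start >= 0 and reachable[start] and tuple(tags[start:end]) == rule:
                reachable[end] = True
                break
    return reachable[-1]


def coverage(tagged_titles, rules):
    """Fraction of titles whose tag sequence is fully covered by the rules."""
    if not tagged_titles:
        return 1.0  # assumption: an empty set is treated as trivially covered
    return sum(covered(tags, rules) for tags in tagged_titles) / len(tagged_titles)


def refine_rules(rules, tagged_titles, threshold=0.9):
    """Add rules until coverage reaches the threshold.

    Assumed modification strategy: adopt the whole tag sequence of each
    uncovered title as a new rule. The source does not specify how rules
    are modified.
    """
    rules = set(rules)
    for tags in tagged_titles:
        if coverage(tagged_titles, rules) >= threshold:
            break
        if not covered(tags, rules):
            rules.add(tuple(tags))
    return rules
```

With `rules = {("DT", "NN")}` and three tagged titles of which one is `["JJ", "NN"]`, coverage starts at two thirds, so `refine_rules` adds `("JJ", "NN")` and stops once the threshold is met.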
32 Claims
1. A method for generating phrase chunking rules for titles of records in a database, said method comprising the steps of:
part-of-speech tagging the title of each record in a first set of records;
creating a plurality of phrase chunking rules based on patterns of part-of-speech tags in the tagged titles; and
applying the phrase chunking rules to the titles of records in a second set of records so as to generate indexes for the records in the second set of records.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
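The three steps of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon stands in for a real part-of-speech tagger, a "rule" is taken to be a tag pattern observed between function words in the first set of titles, and rules are applied by greedy longest match. The names `LEXICON`, `BREAK_TAGS`, `tag_title`, `create_rules`, and `index_title` are hypothetical, and the tag labels follow Penn Treebank conventions.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
LEXICON = {
    "the": "DT", "a": "DT", "of": "IN", "in": "IN", "and": "CC",
    "old": "JJ", "red": "JJ", "great": "JJ",
    "man": "NN", "sea": "NN", "badge": "NN", "courage": "NN", "gatsby": "NN",
}
BREAK_TAGS = {"IN", "CC"}  # assumption: chunks end at prepositions and conjunctions


def tag_title(title):
    """Step 1: tag each word; unknown words default to NN (an assumption)."""
    return [(word, LEXICON.get(word, "NN")) for word in title.lower().split()]


def create_rules(first_set, min_count=1):
    """Step 2: collect tag patterns that occur between break tags."""
    counts = Counter()
    for title in first_set:
        segment = []
        # A sentinel break tag flushes the final segment of each title.
        for _, tag in tag_title(title) + [("", "IN")]:
            if tag in BREAK_TAGS:
                if segment:
                    counts[tuple(segment)] += 1
                segment = []
            else:
                segment.append(tag)
    return {pattern for pattern, n in counts.items() if n >= min_count}


def index_title(title, rules):
    """Step 3: emit index phrases by greedy longest match against the rules."""
    tagged = tag_title(title)
    max_len = max((len(rule) for rule in rules), default=0)
    phrases, i = [], 0
    while i < len(tagged):
        for n in range(min(max_len, len(tagged) - i), 0, -1):
            window = tagged[i:i + n]
            if tuple(tag for _, tag in window) in rules:
                phrases.append(" ".join(word for word, _ in window))
                i += n
                break
        else:
            i += 1  # no rule starts here; skip this word
    return phrases
```

For example, rules learned from "the old man and the sea", "the red badge of courage", and "the great gatsby" split a new title such as "the old sea of the great man" into the index phrases "the old sea" and "the great man".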
11. A method for retrieving records from a database, said method comprising the steps of:
applying a plurality of phrase chunking rules to titles of the records in the database so as to generate indexes for the records in the database;
receiving a request for one of the records in the database, the request including at least part of the title of one of the records in the database;
comparing the at least part of the title that is received with the indexes that were generated; and
if the at least part of the title that is received matches one of the indexes, retrieving the record corresponding to the one index.
Dependent claims: 12, 13, 14
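The retrieval flow of claim 11 can be sketched as a dictionary lookup. This is a minimal illustration, not the claimed implementation: `toy_chunker` is a hypothetical stand-in for applying phrase chunking rules (it just splits titles at a few function words), and matching is assumed to be exact after lowercasing and whitespace normalization.

```python
import re


def toy_chunker(title):
    """Hypothetical stand-in for rule-based chunking: split at function words."""
    parts = re.split(r"\b(?:of|and|in)\b", title.lower())
    return [" ".join(part.split()) for part in parts if part.strip()]


def build_index(records, chunker=toy_chunker):
    """Map each index phrase to the id of the first record that produced it."""
    index = {}
    for record_id, title in records.items():
        for phrase in chunker(title):
            index.setdefault(phrase, record_id)
    return index


def retrieve(request, index, records):
    """Compare the requested partial title with the indexes; return the record or None."""
    key = " ".join(request.lower().split())  # normalize case and spacing
    record_id = index.get(key)
    return records[record_id] if record_id is not None else None
```

A request such as "the sea" would match the index phrase generated from "The Old Man and the Sea" and return that record, while a request that matches no index returns `None`.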
15. A machine-readable medium encoded with a program for generating phrase chunking rules for titles of records in a database, said program containing instructions for performing the steps of:
part-of-speech tagging the title of each record in a first set of records;
creating a plurality of phrase chunking rules based on patterns of part-of-speech tags in the tagged titles; and
applying the phrase chunking rules to the titles of records in a second set of records so as to generate indexes for the records in the second set of records.
Dependent claims: 16, 17, 18, 19, 20, 21
22. A machine-readable medium encoded with a program for retrieving records from a database, said program containing instructions for performing the steps of:
applying a plurality of phrase chunking rules to titles of the records in the database so as to generate indexes for the records in the database;
receiving a request for one of the records in the database, the request including at least part of the title of one of the records in the database;
comparing the at least part of the title that is received with the indexes that were generated; and
if the at least part of the title that is received matches one of the indexes, retrieving the record corresponding to the one index.
Dependent claims: 23, 24, 25
26. A system for generating phrase chunking rules for titles of records in a database, said system comprising:
a part-of-speech tagger for part-of-speech tagging the title of each record in a first set of records;
first means for creating a plurality of phrase chunking rules based on patterns of part-of-speech tags in the tagged titles; and
an indexer for applying the phrase chunking rules to the titles of records in a second set of records so as to generate indexes for the records in the second set of records.
Dependent claims: 27, 28, 29, 30, 31, 32
Specification