Methods for prefix indexing
Abstract
According to one aspect of the invention, in response to one or more terms to be indexed, each of the terms is indexed in a regular index. In addition, for each of the terms having multiple characters, at least one prefix portion of the term is indexed in a prefix index, where the regular index is used for regular searches and the prefix index is used for prefix searches without having to combine a plurality of postings lists of the regular index at the point in time.
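The indexing step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `build_indexes`, the whitespace tokenizer, and the choice to index every leading substring of a term (including the full term) are assumptions made for illustration. The source only requires "at least one prefix portion" per multi-character term.

```python
from collections import defaultdict


def build_indexes(documents):
    """Build a regular index and a separate prefix index.

    `documents` maps an item identifier to its text. Each index maps a
    string to a postings list (here, a sorted list of item identifiers).
    """
    regular = defaultdict(set)
    prefix = defaultdict(set)
    for item_id, text in documents.items():
        # Assumption: lowercase, whitespace-separated tokens are the terms.
        for term in text.lower().split():
            regular[term].add(item_id)
            # Only terms with more than one character get prefix entries.
            if len(term) > 1:
                # Assumption: index every leading substring of the term.
                for end in range(1, len(term) + 1):
                    prefix[term[:end]].add(item_id)
    return (
        {term: sorted(ids) for term, ids in regular.items()},
        {pre: sorted(ids) for pre, ids in prefix.items()},
    )


docs = {1: "apple pie", 2: "application form", 3: "banana"}
regular_index, prefix_index = build_indexes(docs)
print(prefix_index["app"])  # one lookup, no merging of regular postings lists
```

In this sketch a prefix query such as "app" is answered by a single postings list that was assembled at indexing time, so the postings lists for "apple" and "application" in the regular index do not have to be combined when the query arrives.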
17 Claims
1. A computer-implemented method for indexing one or more terms, comprising:
in response to one or more terms to be indexed, indexing each of the one or more terms in a regular index, the regular index having a plurality of postings lists and each postings list corresponding to a string containing one of the one or more terms; and
for each of the one or more terms having a plurality of characters, indexing at least one prefix portion of each term in a prefix index that is separated from the regular index, wherein the regular index is used for regular searches, and wherein the prefix index is used for prefix searches without having to combine the plurality of postings lists of the regular index at the point in time;
receiving a search query having a search term for searching files that contain the search term from a client;
determining whether a search to be performed is a regular search or a prefix search;
if the search is a regular search, searching in the regular index to identify a plurality of postings lists, each containing the search term, and returning to the client, a list of item identifiers from the plurality of postings lists obtained from the regular index; and
if the search is a prefix search, searching in the prefix index to identify a single postings list that exactly matches the search term, and returning to the client, a list of item identifiers from the single postings list;
wherein the regular index and the prefix index are stored in the same file, and wherein each prefix entry of the prefix index is tagged with a marker indicating that the corresponding prefix entry is part of prefix index.
Dependent claims: 2, 3, 4, 5, 6
7. A non-transitory computer-readable storage medium having instructions stored therein, which when executed by a computer, cause the computer to perform a method for indexing one or more terms, the method comprising:
in response to one or more terms to be indexed, indexing each of the one or more terms in a regular index, the regular index having a plurality of postings lists and each postings list corresponding to a string containing one of the one or more terms;
for each of the one or more terms having a plurality of characters, indexing at least one prefix portion of each term in a prefix index that is separated from the regular index, wherein the regular index is used for regular searches, and wherein the prefix index is used for prefix searches without having to combine a plurality of postings lists of the regular index at the point in time;
receiving a search query having a search term for searching files that contain the search term from a client;
determining whether a search to be performed is a regular search or a prefix search;
if the search is a regular search, searching in the regular index to identify a plurality of postings lists, each containing the search term, and returning to the client, a list of item identifiers from the plurality of postings lists obtained from the regular index; and
if the search is a prefix search, searching in the prefix index to identify a single postings list that exactly matches the search term, and returning to the client, a list of item identifiers from the single postings list;
wherein the regular index and the prefix index are stored in the same file, and wherein each prefix entry of the prefix index is tagged with a marker indicating that the corresponding prefix entry is part of prefix index.
Dependent claims: 8, 9, 10, 11, 12
13. A data processing system comprising:
a processor; and
a memory coupled to the processor for storing instructions, which when executed by the processor, cause the processor to perform a method, the method including
in response to one or more terms to be indexed, indexing each of the one or more terms in a regular index, the regular index having a plurality of postings lists and each postings list corresponding to a string containing one of the one or more terms,
for each of the one or more terms having a plurality of characters, indexing at least one prefix portion of each term in a prefix index that is separated from the regular index, wherein the regular index is used for regular searches, and wherein the prefix index is used for prefix searches without having to combine the plurality of postings lists of the regular index at the point in time,
receiving a search query having a search term for searching files that contain the search term from a client,
determining whether a search to be performed is a regular search or a prefix search,
if the search is a regular search, searching in the regular index to identify a plurality of postings lists, each containing the search term, and returning to the client, a list of item identifiers from the plurality of postings lists obtained from the regular index, and
if the search is a prefix search, searching in the prefix index to identify a single postings list that exactly matches the search term, and returning to the client, a list of item identifiers from the single postings list;
wherein the regular index and the prefix index are stored in the same file, and wherein each prefix entry of the prefix index is tagged with a marker indicating that the corresponding prefix entry is part of prefix index.
Dependent claims: 14, 15, 16, 17
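The search steps and the shared-file limitation recited in the independent claims can be sketched as follows. This is one possible reading rather than the claimed implementation, and several details are assumptions: a single Python dict stands in for the one file holding both indexes, `PREFIX_MARKER` is a hypothetical tag value (the claims do not name one), a trailing `*` is a hypothetical way for the client to request a prefix search, and the regular search is read as collecting every untagged entry whose string contains the search term.

```python
PREFIX_MARKER = "\x00"  # hypothetical tag; the claims do not specify its form


def index_terms(store, item_id, terms):
    """Add the terms of one item to a single store holding both indexes."""
    for term in terms:
        # Regular entry: keyed by the term itself, no marker.
        store.setdefault(term, set()).add(item_id)
        if len(term) > 1:
            # Prefix entries: tagged with the marker so they can share the
            # same store as the regular entries without colliding.
            for end in range(1, len(term) + 1):
                key = PREFIX_MARKER + term[:end]
                store.setdefault(key, set()).add(item_id)


def search(store, query):
    """Return sorted item identifiers for a regular or a prefix search."""
    # Assumption: a trailing "*" marks the query as a prefix search.
    if query.endswith("*"):
        # Prefix search: one exact-match lookup, one postings list.
        return sorted(store.get(PREFIX_MARKER + query[:-1], set()))
    # Regular search: gather every untagged postings list whose string
    # contains the search term, then combine them.
    ids = set()
    for key, postings in store.items():
        if not key.startswith(PREFIX_MARKER) and query in key:
            ids |= postings
    return sorted(ids)


store = {}
index_terms(store, 1, ["apple", "pie"])
index_terms(store, 2, ["application"])
index_terms(store, 3, ["pineapple"])
print(search(store, "app*"))   # prefix search
print(search(store, "apple"))  # regular search
```

The marker keeps the two kinds of entry distinguishable inside one container, which is the role the claims assign to the tag on each prefix entry. The prefix branch touches exactly one postings list, while the regular branch may merge several.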
Specification