Context snippet generation for book search system
Abstract
A book search system and media for generating a book index corresponding to a collection of books and for providing context snippets related to a search string formulated by a user based on the book index are provided. The book index includes a word hash that represents unique words and an offset to a location list that stores locations for each instance of the unique word. The book search system receives the search string from the user, parses the search string to locate phrases and words, and traverses the book index to generate a list of locations for each word or phrase included in the search string. The book search system utilizes a variable-sized container having a maximum size to store subsets of each word or phrase included in the list of locations to generate the context snippets for the search string.
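The variable-sized container with a maximum size can be pictured with a short Python sketch. The class name `SnippetContainer`, the helper `make_snippet`, the `context` width, and the policy of refusing snippets once the cap is reached are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class SnippetContainer:
    """Grows as snippets are added, but never beyond max_size entries."""
    max_size: int
    snippets: list = field(default_factory=list)

    def add(self, snippet: str) -> bool:
        # Assumed policy: once the cap is reached, further snippets are refused.
        if len(self.snippets) >= self.max_size:
            return False
        self.snippets.append(snippet)
        return True


def make_snippet(text: str, start: int, length: int, context: int = 20) -> str:
    """Cut the match plus `context` characters on each side out of `text`."""
    lo = max(0, start - context)
    hi = min(len(text), start + length + context)
    return text[lo:hi]
```

A caller would compute a character location for each match, build a snippet around it with `make_snippet`, and stop adding once `add` returns `False`.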
12 Claims
1. One or more computer-readable storage media having computer-executable instructions embodied thereon that perform a method for extracting a phrase related to a book in a collection of books, the method comprising:
receiving a search string having one or more phrases specified by a user, wherein the phrases include one or more words;
parsing the search string to extract each word and word sequence for the one or more phrases;
accessing a book index corresponding to a collection of books to obtain a list of locations for each word in the one or more phrases, wherein accessing a book index comprises locating a hash corresponding to each word and loading an offset associated with the hash to obtain the list of locations;
traversing the list of locations to find each word in the word sequence specified by the one or more phrases of the search string;
generating a phrase list having the location of the one or more phrases of the search string based on the locations included in the list of locations for each word; and
generating a variable-sized container, associated with a maximum size, to store snippets corresponding to the collection of books based on the search string.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
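The parsing and traversal steps of claim 1 can be sketched as follows. This is one possible reading, not the patented implementation: the quoting syntax in `parse_phrases`, the in-memory dictionary built by `word_locations`, and the set-based check in `find_phrase` are illustrative assumptions.

```python
from collections import defaultdict


def parse_phrases(search_string):
    """Split a search string into phrases (lists of lowercase words).

    Assumed syntax: text inside double quotes is one phrase; every other
    word is treated as a one-word phrase.
    """
    phrases = []
    for i, chunk in enumerate(search_string.split('"')):
        words = chunk.lower().split()
        if not words:
            continue
        if i % 2 == 1:              # odd chunks sit between quotes
            phrases.append(words)
        else:
            phrases.extend([w] for w in words)
    return phrases


def word_locations(book_words):
    """Map each word to the ascending word positions where it occurs."""
    locations = defaultdict(list)
    for pos, word in enumerate(book_words):
        locations[word.lower()].append(pos)
    return locations


def find_phrase(phrase, locations):
    """Return the word positions at which the whole phrase starts."""
    first = locations.get(phrase[0], [])
    rest = [set(locations.get(w, [])) for w in phrase[1:]]
    # The k-th following word must occur exactly k positions after the start.
    return [p for p in first
            if all(p + k in s for k, s in enumerate(rest, start=1))]
```

For the word sequence "Call me Ishmael some years ago call me", `find_phrase(["call", "me"], ...)` yields the two starting positions of the phrase, which corresponds to the phrase list of the claim.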
9. A book search system having one or more computer-readable storage media configured to provide a data structure for processing searches related to a collection of books, the data structure comprising:
one or more headers to store a hash for each unique word in the collection of books and an offset to a location list that provides the location of each instance of the unique word;
the location list is accessible via the offset and includes word location and character location pairs for each instance of each unique word in the collection of books, wherein word locations and character locations of each subsequent instance is represented as delta to a previous instance of the unique word;
the location list is searchable to obtain all locations for each word included in a search string; and
generating a variable-sized container associated with a maximum size, to store snippets corresponding to the collection of books based on the search string.
Dependent claims: 10, 11
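The header and delta-encoded location list of claim 9 can be illustrated with a small Python sketch. Several details are assumptions: CRC32 stands in for the unspecified hash, the flat integer list with a leading count per word is a hypothetical layout, and hash collisions are ignored for brevity.

```python
import zlib


def word_hash(word):
    # CRC32 is a stand-in; the claim does not name a hash function.
    return zlib.crc32(word.encode("utf-8"))


def build_index(instances):
    """Build (headers, flat) from {word: [(word_loc, char_loc), ...]}.

    headers maps hash -> offset into flat. At each offset, flat holds a
    count followed by (word delta, char delta) pairs; the first pair is
    a delta from zero. Input pairs must be in ascending order.
    """
    headers, flat = {}, []
    for word, pairs in instances.items():
        headers[word_hash(word)] = len(flat)
        flat.append(len(pairs))
        prev_w = prev_c = 0
        for w, c in pairs:
            flat.extend((w - prev_w, c - prev_c))
            prev_w, prev_c = w, c
    return headers, flat


def lookup(word, headers, flat):
    """Locate the hash, load the offset, and decode the location list."""
    offset = headers.get(word_hash(word))
    if offset is None:
        return []
    count = flat[offset]
    out, w, c = [], 0, 0
    for i in range(count):
        w += flat[offset + 1 + 2 * i]   # running sum undoes the delta
        c += flat[offset + 2 + 2 * i]
        out.append((w, c))
    return out
```

Storing deltas keeps the numbers small when a word recurs close to its previous instance, which is what makes a compact integer encoding worthwhile.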
12. One or more computer-readable storage media having computer-executable instructions embodied thereon that perform a method for extracting a phrase related to a book in a collection of books, the method comprising:
receiving a search string having one or more phrases specified by a user, wherein the phrases include one or more words;
parsing the search string to extract each word and word sequence for the one or more phrases;
accessing a book index corresponding to a collection of books that includes unstructured books, which lack markup language tags, to obtain a list of locations for each word in the one or more phrases, wherein accessing a book index comprises locating a hash corresponding to each word and loading an offset associated with the hash to obtain the list of locations;
traversing the list of locations to find each word in the word sequence specified by the one or more phrases of the search string;
generating a phrase list having the location of the one or more phrases of the search string based on the locations included in the list of locations for each word; and
generating a variable-sized container, associated with a maximum size, to store snippets corresponding to the collection of books based on the search string.
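For unstructured books that lack markup language tags, word and character locations have to come from the raw text itself. A minimal sketch, assuming that a word is any run of `\w` characters and that locations are zero-based (neither is stated in the claim; `tokenize_with_locations` is a hypothetical helper):

```python
import re


def tokenize_with_locations(text):
    """Yield (word, word_location, character_location) from plain text."""
    for word_loc, match in enumerate(re.finditer(r"\w+", text)):
        yield match.group().lower(), word_loc, match.start()
```

Each yielded triple supplies one word location and character location pair of the kind the book index stores for every instance of a word.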
Specification