CONTEXT SNIPPET GENERATION FOR BOOK SEARCH SYSTEM
Abstract
A book search system and media for generating a book index corresponding to a collection of books and for providing context snippets related to a search string formulated by a user based on the book index are provided. The book index includes a word hash that represents unique words and an offset to a location list that stores locations for each instance of the unique word. The book search system receives the search string from the user, parses the search string to locate phrases and words, and traverses the book index to generate a list of locations for each word or phrase included in the search string. The book search system utilizes a variable-sized container having a maximum size to store subsets of each word or phrase included in the list of locations to generate the context snippets for the search string.
20 Claims
1. One or more computer-readable media having computer-executable instructions embodied thereon that perform a method for extracting a phrase related to a book in a collection of books, the method comprising:
receiving a search string having one or more phrases specified by a user, wherein the phrases include one or more words;
parsing the search string to extract each word and word sequence for the one or more phrases;
accessing a book index corresponding to a collection of books to obtain a list of locations for each word in the one or more phrases;
traversing the list of locations to find each word in the word sequence specified by the one or more phrases of the search string; and
generating a phrase list having the location of the one or more phrases of the search string based on the locations included in the list of locations for each word.
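The steps of claim 1 can be illustrated with a small positional index. The following is only a sketch under stated assumptions, not the patented implementation: the function names (`build_index`, `parse_search_string`, `phrase_locations`, `generate_phrase_list`), the use of double quotes to mark multi-word phrases, and the use of word positions as "locations" are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict


def build_index(text):
    """Hypothetical book index: map each lowercased word to its word positions."""
    index = defaultdict(list)
    for position, match in enumerate(re.finditer(r"\w+", text.lower())):
        index[match.group()].append(position)
    return index


def parse_search_string(search):
    """Split a search string into phrases (lists of words).

    Assumption: a double-quoted run is one multi-word phrase and every
    other token is a single-word phrase.
    """
    phrases = []
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', search.lower()):
        words = re.findall(r"\w+", quoted or bare)
        if words:
            phrases.append(words)
    return phrases


def phrase_locations(index, words):
    """Return the start positions where `words` occur consecutively."""
    first = index.get(words[0], [])
    rest = [set(index.get(word, [])) for word in words[1:]]
    # A phrase starts at p if word i+1 of the phrase sits at position p+i+1.
    return [p for p in first
            if all(p + i + 1 in locs for i, locs in enumerate(rest))]


def generate_phrase_list(index, search):
    """Build a phrase list: each phrase of the search string with its locations."""
    return {" ".join(words): phrase_locations(index, words)
            for words in parse_search_string(search)}
```

For example, indexing the text "the quick brown fox jumps over the lazy dog and the quick cat" and searching for `"the quick" dog` yields the phrase "the quick" at word positions 0 and 10 and "dog" at position 8.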
10. One or more computer-readable media having computer-executable instructions embodied thereon that perform a method for generating a context snippet from a collection of books, the method comprising:
generating a variable-sized container, associated with a maximum size, to store snippets corresponding to the collection of books based on a search string;
traversing a book index to search the collection of books and to generate a list of locations for words or phrases included in the search string;
checking the list of locations for words or phrases to determine that the list of locations for words or phrases is not empty;
generating an empty variable-sized container, when the list of locations for words or phrases is empty;
checking the variable-sized container to determine that a current size of the variable-sized container is less than the maximum size, when the list of locations for words or phrases is not empty;
moving one or more words or phrases from the list of locations for words or phrases to the variable-sized container and increasing a match count, when the current size of the variable-sized container is less than the maximum size;
removing one or more words from the variable-sized container and decreasing the match count, when the current size of the variable-sized container is greater than the maximum size; and
communicating the variable-sized container, in response to the search string.
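The container logic of claim 10 can be sketched as a bounded fill loop. This is a minimal illustration under assumptions that are not in the claim: the container size is measured as a number of entries, entries are moved in groups of `batch` (so a move can overshoot the maximum and trigger the removal step), and the name `fill_snippet_container` is hypothetical.

```python
from collections import deque


def fill_snippet_container(locations, max_size, batch=1):
    """Move matches from a location list into a container bounded by max_size.

    Returns (container, match_count). Size is counted in entries, which is
    an assumption; the claim does not say how size is measured.
    """
    container = []
    match_count = 0
    pending = deque(locations)

    # An empty location list leaves the container empty.
    # While there is room, move up to `batch` entries and count each match.
    while pending and len(container) < max_size:
        for _ in range(min(batch, len(pending))):
            container.append(pending.popleft())
            match_count += 1

    # If the last move overshot the maximum, remove entries and
    # decrease the match count until the container fits again.
    while len(container) > max_size:
        container.pop()
        match_count -= 1

    return container, match_count
```

With `batch=2` and `max_size=3`, the location list `[5, 9, 14, 20, 31]` first grows the container to four entries and is then trimmed back to `[5, 9, 14]` with a match count of 3.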
18. A book search system having one or more computer-readable media configured to provide a data structure for processing searches related to a collection of books, the data structure comprising:
one or more headers to store a hash for each unique word in the collection of books and an offset to a location list that provides the location of each instance of the unique word;
the location list is accessible via the offset and includes word location and character location pairs for each instance of each unique word in the collection of books, wherein word locations and character locations of each subsequent instance is represented as delta to a previous instance of the unique word; and
the location list is searchable to obtain all locations for each word included in a search string.
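The data structure of claim 18 can be approximated with a header table and a flat, delta-encoded location list. The sketch below rests on several assumptions not found in the claim: CRC-32 (via `zlib.crc32`) stands in for the unspecified word hash, each header also stores an entry count next to the offset, hash collisions are ignored, and the names `build_location_index` and `lookup` are hypothetical.

```python
import zlib


def word_hash(word):
    """Stand-in word hash (assumption: the claim does not name a hash function)."""
    return zlib.crc32(word.encode("utf-8"))


def build_location_index(occurrences):
    """Build (headers, locations) from {word: [(word_loc, char_loc), ...]}.

    Each word's pairs must be sorted ascending. The first pair is stored
    as-is (a delta from zero); each later pair is stored as a delta to
    the previous instance of the same word.
    """
    headers = {}    # hash -> (offset into locations, number of entries)
    locations = []  # flat list of (word delta, char delta) pairs
    for word, pairs in occurrences.items():
        headers[word_hash(word)] = (len(locations), len(pairs))
        prev_word, prev_char = 0, 0
        for word_loc, char_loc in pairs:
            locations.append((word_loc - prev_word, char_loc - prev_char))
            prev_word, prev_char = word_loc, char_loc
    return headers, locations


def lookup(headers, locations, word):
    """Decode all absolute (word_loc, char_loc) pairs for `word`."""
    entry = headers.get(word_hash(word))
    if entry is None:
        return []
    offset, count = entry
    result = []
    word_loc = char_loc = 0
    for word_delta, char_delta in locations[offset:offset + count]:
        word_loc += word_delta
        char_loc += char_delta
        result.append((word_loc, char_loc))
    return result
```

Storing deltas keeps the numbers in the location list small when a word recurs often, which is what makes the list amenable to compact encodings; a production index would also need a strategy for hash collisions.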
Specification