Behavioral word segmentation for use in processing search queries
1 Assignment
0 Petitions
Abstract
Substrings within strings, such as words within words, are identified based at least in part on recorded behavior of users that have submitted the strings or substrings as search queries. The behavior may relate to actions taken by the users upon having submitted the search queries. The actions may be actions taken in connection with an electronic marketplace, such as actions related to the consumption of items offered in the electronic marketplace. The identified strings and corresponding substrings are used in connection with processing search queries. The strings and substrings may be used to update a search index and/or to modify received search queries for processing.
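As an illustration of how recorded behavior might link a string to its substrings, the following minimal sketch groups logged actions by query and compares the items acted upon. The log record layout, the function names, and the use of Jaccard overlap as the similarity measure are all assumptions made for this example; the abstract does not specify any of them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical log record: (query, action, item_id).
LogRecord = Tuple[str, str, str]


def items_by_query(log: Iterable[LogRecord]) -> Dict[str, Set[str]]:
    """Collect, per query, the items that searchers acted on after submitting it."""
    acted: Dict[str, Set[str]] = defaultdict(set)
    for query, _action, item_id in log:
        acted[query].add(item_id)
    return dict(acted)


def behavioral_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two item sets (an assumed metric, not from the source)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Illustrative data only; item identifiers are made up.
log = [
    ("bluray", "purchase", "B001"),
    ("bluray", "view", "B002"),
    ("blu ray", "purchase", "B001"),
    ("blu ray", "add_to_cart", "B002"),
    ("blu ray", "view", "B003"),
]
acted = items_by_query(log)
score = behavioral_similarity(acted["bluray"], acted["blu ray"])
```

A high score suggests that searchers treat the connected and separated forms as the same request, which is the kind of signal the abstract describes.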
37 Citations
28 Claims
-
1. A computer-implemented method of processing search queries, comprising:
under control of one or more computer systems configured with executable instructions, obtaining, using at least one computing device, behavioral information associated with a plurality of previously-submitted queries, the behavioral information associated with each previously-submitted query indicative of one or more actions taken by one or more of the corresponding searchers in connection with the previously-submitted query;
identifying, from the obtained previously-submitted queries, a set of candidate pairs, each candidate pair including a first query and a second query, the first query including a set of separated words and the second query including a single word composed of a connected combination of at least a subset of the set of separated words, wherein the subset includes at least two words;
refining, using at least one computing device, the set of candidate pairs by, for each member pair of at least a subset of the set of candidate pairs, at least:
obtaining first search results corresponding to the first query of the member pair;
obtaining second search results corresponding to the second query of the member pair;
based at least in part on the first search results, the second search results, the obtained behavioral information associated with the first query of the member pair, and obtained behavioral information associated with the second query of the member pair, removing the member pair from the set of candidate pairs;
updating, based at least in part on the refined set of candidate pairs, a segmentation database that includes a plurality of member pairs, wherein each member pair includes a first member comprising a set of separated words and a second member comprising a single word composed of a connected combination of at least a subset of the set of separated words of the first member;
upon receiving a search query, comparing the search query against the plurality of member pairs in the segmentation database;
upon identifying a corresponding member pair for the search query in the segmentation database, substituting the search query with the corresponding member pair; and
processing the search query using the corresponding member pair.
Dependent claims: 2, 3, 4, 5
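The identifying and refining steps of claim 1 can be sketched roughly as follows. This is only one possible reading: joining adjacent words only, measuring overlap with Jaccard similarity, the 0.5 threshold, and the rule "keep a pair if either signal agrees" are assumptions chosen for the example, and the names `candidate_pairs` and `refine` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (separated form, connected form)


def candidate_pairs(queries: Set[str]) -> List[Pair]:
    """Pair each query with a logged query that joins two or more of its adjacent words."""
    pairs = set()
    for q in queries:
        words = q.split()
        for i in range(len(words)):
            for j in range(i + 2, len(words) + 1):
                connected = " ".join(words[:i] + ["".join(words[i:j])] + words[j:])
                if connected in queries:
                    pairs.add((q, connected))
    return sorted(pairs)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets; 0.0 when both are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def refine(pairs: List[Pair], results: Dict[str, Set[str]],
           actions: Dict[str, Set[str]], min_overlap: float = 0.5) -> List[Pair]:
    """Remove pairs whose search results and searcher actions both diverge."""
    kept = []
    for sep, con in pairs:
        result_overlap = jaccard(results.get(sep, set()), results.get(con, set()))
        action_overlap = jaccard(actions.get(sep, set()), actions.get(con, set()))
        if result_overlap >= min_overlap or action_overlap >= min_overlap:
            kept.append((sep, con))
    return kept


# Illustrative data only: result and item identifiers are made up.
queries = {"blue ray", "blueray", "note book", "notebook"}
results = {"blue ray": {"r1", "r2", "r3"}, "blueray": {"r2", "r3", "r4"},
           "note book": {"r10", "r11"}, "notebook": {"r20", "r21"}}
actions = {"blue ray": {"i2"}, "blueray": {"i2"},
           "note book": {"i10"}, "notebook": {"i20"}}
pairs = candidate_pairs(queries)
kept = refine(pairs, results, actions)
```

In this made-up data, "note book" and "notebook" lead to disjoint results and disjoint purchases, so that pair is removed, while "blue ray" and "blueray" survive refinement.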
-
6. A computer-implemented method of processing search queries, comprising:
under control of one or more computer systems configured with executable instructions, obtaining, using at least one computing device, behavioral information associated with a plurality of previously-submitted queries, the behavioral information associated with each previously-submitted query indicative of one or more actions taken by one or more of the corresponding searchers in connection with the previously-submitted query;
identifying, based at least in part on the obtained behavioral information, a set of query pairs, each query pair including a first previously-submitted query composed of a first separated element and a second separated element and a second previously-submitted query composed of a single element that is a combination of the first separated element and the second separated element;
providing, using the at least one processor, the set of query pairs to a segmentation data store for use in processing subsequently received search queries, the segmentation data store including a plurality of query pairs each including a first member comprising a set of separated elements and a second member comprising a single element composed of a connected combination of at least a subset of the set of separated elements of the first member;
upon receiving a search query, comparing the search query against the plurality of query pairs in the segmentation data store;
upon identifying a corresponding query pair for the search query in the segmentation data store, substituting the search query with the corresponding query pair; and
processing the search query using the corresponding query pair.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
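The query-time steps of claim 6 (compare, substitute, process) might look like the sketch below. The `SegmentationStore` class, its in-memory dictionary, whole-query matching, and the whitespace and case normalization are assumptions for illustration; the claim does not say how the data store is organized or how a match is determined.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (separated form, connected form)


class SegmentationStore:
    """Hypothetical in-memory stand-in for the segmentation data store."""

    def __init__(self) -> None:
        self._by_form: Dict[str, Pair] = {}

    def add(self, separated: str, connected: str) -> None:
        # Index the pair under both of its forms so either one can be looked up.
        pair = (separated, connected)
        self._by_form[separated] = pair
        self._by_form[connected] = pair

    def lookup(self, query: str) -> Optional[Pair]:
        # Normalize case and whitespace before comparing (an assumption).
        return self._by_form.get(" ".join(query.lower().split()))


def expand_query(query: str, store: SegmentationStore) -> List[str]:
    """Substitute the query with both members of its pair, if one is stored."""
    pair = store.lookup(query)
    return list(pair) if pair else [query]


store = SegmentationStore()
store.add("blue ray", "blueray")
variants = expand_query("BlueRay", store)
```

A search backend could then process every string in `variants`, for example by merging the results for each form, so the searcher sees the same results whichever spelling was typed.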
-
15. A computer-implemented method of processing search queries, comprising:
under control of one or more computer systems configured with executable instructions, obtaining, using at least one computing device, a plurality of character strings forming at least part of queries previously submitted by corresponding searchers;
obtaining, using the at least one processor, behavioral information associated with the plurality of obtained character strings, the behavioral information associated with each character string indicative of one or more actions taken by one or more of the corresponding searchers in connection with the character string;
identifying, based at least in part on the obtained behavioral information, a plurality of pairs, each pair comprising a single character string from the obtained character strings and one or more substrings of the character string, the single character string being a connected combination of the one or more substrings; and
providing, using the at least one processor, the plurality of pairs to a segmentation data store for use in processing subsequently received search queries, the segmentation data store including a plurality of pairs each including a first member composed of one or more substrings of a character string and a second member composed of a single connected combination of at least a subset of the one or more substrings of the character string of the first member;
upon receiving a search query, comparing the search query against the plurality of pairs in the segmentation database;
upon identifying a corresponding pair for the search query in the segmentation database, substituting the search query with the corresponding pair; and
processing the search query using the corresponding pair.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
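Claim 15 works on character strings and their substrings rather than on whole queries. A minimal sketch of the purely string-based part, assuming splits into exactly two substrings that each appear among the logged strings (the claim allows "one or more substrings"; the function name `two_way_splits` is hypothetical):

```python
from typing import List, Set, Tuple

Split = Tuple[str, Tuple[str, str]]  # (connected string, (left substring, right substring))


def two_way_splits(string: str, known: Set[str]) -> List[Split]:
    """Return every split of `string` whose two halves are both known strings."""
    splits = []
    for cut in range(1, len(string)):
        left, right = string[:cut], string[cut:]
        if left in known and right in known:
            splits.append((string, (left, right)))
    return splits


# Illustrative data only; "wallpa" stands in for a misspelled logged string.
known = {"wall", "paper", "wallpa", "per"}
splits = two_way_splits("wallpaper", known)
```

Here the string check alone yields two candidate segmentations, "wall" + "paper" and "wallpa" + "per". This is the kind of ambiguity that string matching cannot settle and that the behavioral information in the claim is meant to help resolve.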
-
23. A computer system for processing queries, comprising:
one or more processors; and
memory in communication with the one or more processors that includes executable instructions that, when executed by the one or more processors, cause the computer system to at least:
obtain behavioral information associated with a plurality of previously-submitted queries, the behavioral information associated with each previously-submitted query indicative of one or more actions taken by one or more corresponding searchers in connection with the previously-submitted query;
identify, based at least in part on the obtained behavioral information, a set of query pairs, each query pair including a first previously-submitted query composed of a first separated element and a second separated element and a second previously-submitted query composed of a single element that is a combination of the first separated element and the second separated element;
provide the identified set of query pairs to a segmentation data store for use in processing subsequently received queries, the segmentation data store including a plurality of query pairs each including a first member comprising a set of separated elements and a second member comprising a single element composed of a connected combination of at least a subset of the set of separated elements of the first member;
upon receiving a search query, compare the search query against the plurality of query pairs in the segmentation data store;
upon identifying a corresponding query pair for the search query in the segmentation database, substitute the search query with the corresponding query pair; and
process the search query using the corresponding query pair.
Dependent claims: 24, 25
-
26. A computer-readable storage medium having stored thereon instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
obtain a plurality of character strings forming at least part of queries previously submitted by corresponding searchers;
obtain behavioral information associated with the character strings, the behavioral information associated with each character string indicative of one or more actions taken by one or more of the corresponding searchers in connection with the character string;
identify, based at least in part on the obtained behavioral information, a plurality of pairs, each pair comprising a single character string from the obtained character strings and one or more substrings of the character string, the single character string being a connected combination of the one or more substrings;
provide the identified plurality of pairs to a segmentation data store for use in processing subsequently received queries, the segmentation data store including a plurality of pairs each including a first member composed of one or more substrings of a character string and a second member composed of a single connected combination of at least a subset of the one or more substrings of the character string of the first member;
upon receiving a search query, compare the search query against the plurality of pairs in the segmentation data store;
upon identifying a corresponding pair for the search query in the segmentation data store, substitute the search query with the corresponding pair; and
process the search query using the corresponding pair.
Dependent claims: 27, 28
-
Specification