Search term clustering
First Claim
1. A computer-implemented search method, comprising:
- providing a processor executing instructions for;
receiving, from a user, a search request including any of search terms, and phrases;
using syntactic and semantic measures to determine a similarity metric between the any of search terms, and phrases received from the user and any of search terms, and phrases entered by others, wherein said syntactic measures analyze lexical aspects of said any of search terms, and phrases, and said semantic measures consider user activity;
using a clustering technique to cluster said any of search terms, and phrases received from the user within said any of search terms, and phrases entered by others in view of said similarity metric based on a pair-wise distance of said any of search terms, and phrases;
generating a weighted sub-graph of a website based on the similarity metric of said any of search terms, and phrases, said sub-graph assigning weights to Web journey choices based on Web journeys made by others who conduct searches with similar any of search terms, and phrases, wherein a graph is a linkage structure of nodes, wherein in a website, each page is a node in a website graph and an edge is a hyperlink from one page to another page, wherein at least one edge has weight, wherein if the graph has weighted edges, then the graph is a weighted graph, wherein a path is a sequence of nodes in the graph, wherein said weighted sub-graph comprises the weighted graph with sub-structure of a given website, and wherein said weights represent how many times all users visit from one page to another page after searching a particular any of search term, and phrase either internally or externally, on the website;
determining search results, wherein said determination includes determining the cluster associated the search request, and determining a highest weight of the weighted sub-graph weights associated with the any of search terms, and phrases of the cluster associated with the search request; and
generating a list of search results, the list of search results being based on an increased quantity of unique search terms or phrases associated with the cluster generated using the search request, and being ordered by the highest weight of the weighted sub-graph weights; and
displaying the list of search results representing optimized search results in response to the search request.
Abstract
When conducting the same or similar search, different users can use different search terms and phrases, resulting in an increase in the quantity of unique search terms and phrases. The intent of the various search terms and phrases is determined based on clustering of the terms and phrases of the various users. User search terms are clustered using semantic and syntactic distances. Thus, the search engine receives a search query from a user and computes a similarity between and among user search terms. The computation uses syntactic techniques to analyze lexical aspects of linguistic terms, and semantic techniques to consider activity of the user in the particular field of interest. A similarity metric is used to determine the similarity between two search terms by computing their syntactic and semantic distances. A clustering technique is then used to cluster search terms based on their pair-wise distance.
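The abstract describes a two-part distance (syntactic plus semantic) followed by clustering on pair-wise distance. The following is a minimal sketch of that idea, not the patented implementation. Everything concrete in it is an assumption made for illustration: the syntactic distance is taken from `difflib.SequenceMatcher`, the semantic distance is a Jaccard distance over the pages users visited after each term, the two are blended with a hypothetical weight `alpha`, and the clustering is simple threshold-based single linkage.

```python
from difflib import SequenceMatcher
from itertools import combinations


def syntactic_distance(a: str, b: str) -> float:
    """Lexical distance in [0, 1]; 0 means identical strings (assumed measure)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_distance(pages_a: set, pages_b: set) -> float:
    """Jaccard distance between the page sets visited after each term (assumed measure)."""
    union = pages_a | pages_b
    if not union:
        return 1.0  # no recorded activity: treat the terms as unrelated
    return 1.0 - len(pages_a & pages_b) / len(union)


def combined_distance(a, b, activity, alpha=0.5):
    """Blend both distances; alpha is a hypothetical tuning weight."""
    syn = syntactic_distance(a, b)
    sem = semantic_distance(activity.get(a, set()), activity.get(b, set()))
    return alpha * syn + (1.0 - alpha) * sem


def cluster_terms(terms, activity, threshold=0.5, alpha=0.5):
    """Single-linkage clustering: merge any pair whose distance is <= threshold."""
    parent = {t: t for t in terms}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(terms, 2):
        if combined_distance(a, b, activity, alpha) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), set()).add(t)
    return list(clusters.values())
```

With activity data in which "reset password", "password reset", and "forgot my password" all lead to the same help page, the three terms fall into one cluster while an unrelated term such as "shipping cost" stays on its own. The threshold and the blending weight would need tuning on real search logs.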
19 Claims
1. A computer-implemented search method, comprising:
providing a processor executing instructions for;
receiving, from a user, a search request including any of search terms, and phrases;
using syntactic and semantic measures to determine a similarity metric between the any of search terms, and phrases received from the user and any of search terms, and phrases entered by others, wherein said syntactic measures analyze lexical aspects of said any of search terms, and phrases, and said semantic measures consider user activity;
using a clustering technique to cluster said any of search terms, and phrases received from the user within said any of search terms, and phrases entered by others in view of said similarity metric based on a pair-wise distance of said any of search terms, and phrases;
generating a weighted sub-graph of a website based on the similarity metric of said any of search terms, and phrases, said sub-graph assigning weights to Web journey choices based on Web journeys made by others who conduct searches with similar any of search terms, and phrases, wherein a graph is a linkage structure of nodes, wherein in a website, each page is a node in a website graph and an edge is a hyperlink from one page to another page, wherein at least one edge has weight, wherein if the graph has weighted edges, then the graph is a weighted graph, wherein a path is a sequence of nodes in the graph, wherein said weighted sub-graph comprises the weighted graph with sub-structure of a given website, and wherein said weights represent how many times all users visit from one page to another page after searching a particular any of search term, and phrase either internally or externally, on the website;
determining search results, wherein said determination includes determining the cluster associated the search request, and determining a highest weight of the weighted sub-graph weights associated with the any of search terms, and phrases of the cluster associated with the search request; and
generating a list of search results, the list of search results being based on an increased quantity of unique search terms or phrases associated with the cluster generated using the search request, and being ordered by the highest weight of the weighted sub-graph weights; and
displaying the list of search results representing optimized search results in response to the search request.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
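The weighted sub-graph recited in claim 1 can be illustrated with a short sketch. It assumes a hypothetical log format in which each journey is a pair of a search term and the ordered list of pages visited afterwards. Edge weights count page-to-page visits for terms in the cluster, as the claim describes; ranking each destination page by its highest incoming edge weight is only one possible reading of "ordered by the highest weight", chosen here for illustration.

```python
from collections import Counter


def build_weighted_subgraph(journeys, cluster):
    """Count page-to-page transitions for journeys that began with a term in the cluster.

    journeys: iterable of (search_term, [page, page, ...]) pairs (assumed log format).
    Returns a Counter mapping (source_page, destination_page) -> visit count.
    """
    weights = Counter()
    for term, path in journeys:
        if term not in cluster:
            continue
        # Each consecutive pair of pages is one traversal of a hyperlink edge.
        for src, dst in zip(path, path[1:]):
            weights[(src, dst)] += 1
    return weights


def rank_results(weights):
    """Order destination pages by their highest incoming edge weight (assumed ranking rule)."""
    best = {}
    for (_src, dst), w in weights.items():
        best[dst] = max(best.get(dst, 0), w)
    # Highest weight first; ties are broken alphabetically for a stable order.
    return sorted(best, key=lambda page: (-best[page], page))
```

Because the weights are pooled over every term in the cluster, a user who types a rare phrasing still benefits from journeys recorded for more common phrasings of the same intent.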
11. A computer-implemented search method, comprising:
providing a processor executing instructions for;
receiving, from a user, a search request including any of the search terms, and phrases;
pre-processing the any of search terms and phrases, in a two-step process in which any of search terms, and phrases normalization is followed by finger-keying, wherein said finger-keying comprises;
taking each of the any of search terms, and phrases, and splitting it by space to obtain search words;
lemmatizing each of the any of search terms, and phrases for each search word to obtain a list of lemmatized search words;
sorting the above list in alphabetic order;
clubbing the search words separated by space to obtain finger key form; and
taking a Levenshtein distance between finger-keys of the any of search terms, and phrases to get a pair-wise distance between the any of search terms, and phrases;
using syntactic measures, and semantic measures to determine a similarity metric between the any of search terms, and phrases, received from the user and the any of search terms, and phrases entered by others, wherein said syntactic measures analyze lexical aspects of the any of search terms, and phrases, and said semantic measures consider user activity;
using a clustering technique to cluster said any of search terms, and phrases received from the user within the any of search terms, and phrases entered by others in view of the Levenshtein distance and the similarity metric based on the pair-wise distance of said any of search terms, and phrases;
determining a search result, wherein said determination includes determining the cluster associated the search request;
generating a list of search results, the list of search results being based on an increased quantity of unique search terms or phrases associated with the cluster generated using the search request, and representing an ordered list of search results in response to the search request; and
displaying the search results representing optimized search results in response to the search request.
View Dependent Claims (17)
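The finger-keying steps in claim 11 (split by space, lemmatize, sort alphabetically, rejoin, then take a Levenshtein distance) can be sketched as follows. Two parts are assumptions: the claim does not specify the normalization step, so lowercasing and punctuation stripping are used here, and `TOY_LEMMAS` is a hypothetical lookup table standing in for a real lemmatizer.

```python
import re

# Hypothetical stand-in for a real lemmatizer: unknown words map to themselves.
TOY_LEMMAS = {"passwords": "password", "resetting": "reset", "changed": "change"}


def normalize(term: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse whitespace."""
    term = re.sub(r"[^a-z0-9\s]", " ", term.lower())
    return " ".join(term.split())


def finger_key(term: str) -> str:
    """Split by space, lemmatize each word, sort alphabetically, rejoin with spaces."""
    words = normalize(term).split(" ")
    lemmas = [TOY_LEMMAS.get(w, w) for w in words]
    return " ".join(sorted(lemmas))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one previous row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def pairwise_distance(term_a: str, term_b: str) -> int:
    """Levenshtein distance between the finger keys of two search terms."""
    return levenshtein(finger_key(term_a), finger_key(term_b))
```

Under these assumptions, "Resetting passwords!" and "password reset" share the finger key "password reset", so their pair-wise distance is 0 even though the raw strings differ in word order, inflection, and punctuation.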
12. A search apparatus, comprising:
a processor executing instructions for;
receiving, from a user, a search request including any of search terms, and phrases;
using syntactic and semantic measures to determine a similarity metric between the any of search terms, and phrases received from the user and the any of search terms, and phrases entered by others, wherein said syntactic measures analyze lexical aspects of said any of search terms, and phrases, and said semantic measures consider user activity;
using a clustering technique to cluster said any of search terms, and phrases received from the user within the any of search terms, and phrases entered by others in view of said similarity metric based on a pair-wise distance of said any of search terms, and phrases;
generating a weighted sub-graph of a website based on the similarity of said any of search terms, and phrases, said sub-graph assigning weights to Web journey choices based on Web journeys made by others who conduct searches with similar any of search terms, and phrases, wherein a graph is a linkage structure of nodes, wherein in a website, each page is a node in a website graph and an edge is a hyperlink from one page to another page, wherein at least one edge has weight, wherein if the graph has weighted edges, then the graph is a weighted graph, wherein a path is a sequence of nodes in the graph, wherein said weighted sub-graph comprises the weighted graph with sub-structure of a given website, wherein said weights represent how many times all users visit from one page to another page after searching a particular any of search terms, and phrases either internally or externally, on the website;
determining search results, wherein said determination includes determining the cluster associated the search request, and determining a highest weight of the weighted sub-graph weights associated with the any of search terms, and phrases of the cluster associated with the search request;
generating a list of search results, the list of search results being based on an increased quantity of unique search terms or phrases associated with the cluster generated using the search request, and being ordered by the highest weight of the weighted sub-graph weights; and
displaying the list of search results representing optimized search results in response to the search request.
View Dependent Claims (14, 15, 16, 18)
13. A search apparatus, comprising:
a processor executing instructions for;
receiving, from a user, a search request including any of search terms, and phrases;
pre-processing said any of search terms and phrases in a two-step process in which any of search term and phrase normalization is followed by finger-keying, wherein said finger-keying comprises;
taking each any of search terms and phrases, and splitting it by space to obtain search words;
lemmatizing each of any of search terms, and phrase for each search word to obtain a list of lemmatized search words;
sorting the above list in alphabetic order;
clubbing the search words separated by space to obtain finger key form; and
taking a Levenshtein distance between finger-keys of any of the search terms and phrases to get a pair-wise distance between the any of the search terms and phrases;
using syntactic measures, and semantic measures to determine a similarity metric between the any of search terms, and phrases received from the user and the any of the search terms, and phrases entered by others, wherein said syntactic measures analyze lexical aspects of said search terms, and phrases, and said semantic measures consider user activity;
using a clustering technique to cluster the any of search terms, and phrases received from the user within the any of search terms, and phrases entered by others in view of the Levenshtein distance and the similarity metric based on the pair-wise distance of said any of search terms, and phrases;
determining a search result, wherein said determination includes determining the cluster associated the search request;
generating a list of search results, the list of search results being based on an increased quantity of unique search terms or phrases associated with the cluster generated using the search request, and results representing an ordered list of search results in response to the search request; and
displaying the search results representing optimized search results in response to the search request.
View Dependent Claims (19)
Specification