Method and apparatus for concept-based searching across a network
Abstract
According to one embodiment of the invention, a method includes receiving a search term for a query. The method also includes searching a network of concept terms for terms related to the search term. Additionally, the query is reformulated using the search term and the related terms. A local database is searched for data terms that match the search term and the related terms. The data terms are from documents residing on websites located on servers across a network. Moreover, the method includes retrieving the documents from the websites whose data terms match the search term and the related terms.
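The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `CONCEPT_NETWORK`, `LOCAL_INDEX`, `reformulate`, and `match_documents`, the example terms and URLs, and the ranking by number of matched terms are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical concept network: term -> related terms (illustrative data only).
CONCEPT_NETWORK = {
    "jaguar": ["cat", "car"],
}

# Hypothetical local database: data term -> URLs of documents containing it.
LOCAL_INDEX = {
    "jaguar": {"http://example.com/a", "http://example.com/b"},
    "cat": {"http://example.com/a"},
    "car": {"http://example.com/c"},
}


def reformulate(search_term, network):
    """Return the search term followed by its related concept terms."""
    return [search_term] + list(network.get(search_term, []))


def match_documents(query_terms, index):
    """Return URLs of documents whose data terms match any query term.

    Ordering by number of matched terms is an assumption of this sketch;
    the abstract only says that matching documents are retrieved.
    """
    hits = defaultdict(int)
    for term in query_terms:
        for url in index.get(term, ()):
            hits[url] += 1
    # Most matched terms first; ties are broken alphabetically by URL.
    return sorted(hits, key=lambda url: (-hits[url], url))


query = reformulate("jaguar", CONCEPT_NETWORK)
results = match_documents(query, LOCAL_INDEX)
```

Here the query is expanded before any document lookup takes place, mirroring the order of steps in the abstract; fetching the matched documents from their websites is left out of the sketch.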
26 Claims
1. A method comprising:
receiving a search term for a query;
searching a network of concept terms for terms related to the search term, wherein each related term and the search term appear together in at least one sentence in a web page;
reformulating the query using the search term and the related terms before performing a search for documents based on the search term;
searching a local database for data terms that match the search term and the related terms, wherein the data terms are generated based on occurrence frequencies within a document residing on the web sites located on server connected to, wherein the occurrence frequencies include mutual information associated with a first term and a second term within a given web page using a predetermined algorithm, wherein the mutual information is determined based on one or more weight factors of the first and second terms, the one or more weight factors representing occurrence frequencies of the respective term, and wherein the mutual information (MI) of the first term x and the second term y is determined by MI(x, y) = f(x, y) / (f(x) + f(y) − f(x, y)), wherein f(x, y) corresponds to an occurrence frequency of both the first term and the second term, wherein f(x) corresponds to an occurrence frequency of the first term, and wherein f(y) corresponds to an occurrence frequency of the second term; and
in response to matching data terms with the search terms and related terms corresponding to the data terms, retrieving the documents from the respective websites.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A machine-readable storage medium having instructions, when executed by a machine, causes the machine to perform a method, the method comprising;
receiving a search term for a query;
searching a network of concept terms for terms related to the search term, wherein each related term and the search term appear together in at least one sentence in a web page;
reformulating the query using the search term and the related terms before performing a search for documents based on the search term;
searching a local database for data terms that match the search term and the related terms, wherein the data terms are generated based on occurrence frequencies within a document residing on web sites located on server connected to, wherein the occurrence frequencies include mutual information associated with a first term and a second term within a given web page using a predetermined algorithm, wherein the mutual information is determined based on one or more weight factors of the first and second terms, the one or more weight factors representing occurrence frequencies of the respective term, and wherein the mutual information (MI) of the first term x and the second term y is determined by MI(x, y) = f(x, y) / (f(x) + f(y) − f(x, y)), wherein f(x, y) corresponds to an occurrence frequency of both the first term and the second term, wherein f(x) corresponds to an occurrence frequency of the first term, and wherein f(y) corresponds to an occurrence frequency of the second term; and
in response to matching data terms with the search terms and related terms corresponding to the data terms, retrieving the documents from the respective websites.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
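The MI measure recited in claims 1 and 14 can be illustrated with a short sketch. It reads the formula as f(x, y) divided by the whole quantity f(x) + f(y) − f(x, y), and it assumes that an "occurrence frequency" is the number of sentences in a page that contain the term; the claims do not fix the counting unit, so that choice, the naive sentence splitter, and the function names `sentence_term_sets` and `mutual_information` are assumptions of this example.

```python
import re


def sentence_term_sets(page_text):
    """Split page text into sentences and return the set of terms in each.

    Splitting on '.', '!' and '?' is a simplification made for this sketch.
    """
    sentences = re.split(r"[.!?]+", page_text.lower())
    return [set(re.findall(r"[a-z0-9]+", s)) for s in sentences if s.strip()]


def mutual_information(x, y, sentences):
    """Compute MI(x, y) = f(x, y) / (f(x) + f(y) - f(x, y)).

    f(x) and f(y) count sentences containing each term; f(x, y) counts
    sentences containing both (an assumed reading of the claim language).
    """
    f_x = sum(1 for s in sentences if x in s)
    f_y = sum(1 for s in sentences if y in s)
    f_xy = sum(1 for s in sentences if x in s and y in s)
    denominator = f_x + f_y - f_xy
    # Neither term occurs anywhere: treat the pair as unrelated.
    return f_xy / denominator if denominator else 0.0


page = (
    "Search engines index the web. "
    "A search query returns documents. "
    "Engines rank documents."
)
sentences = sentence_term_sets(page)
score = mutual_information("search", "engines", sentences)
```

Under this reading the denominator counts the sentences containing either term, so the score lies between 0 (the terms never share a sentence) and 1 (the terms always appear together).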
Specification