System and method for matching search requests and relevant data
First Claim
1. A computerized method of arrangement and representation of terms in context, comprising:
providing terms and relations from at least one source;
arranging said terms in a Human Knowledge Structure (HKS), said structure being a directed acyclic graph, wherein each one of said graph's nodes consists of one term and the graph's arcs comprise at least one of “essence” and “contain” relations between adjacent terms, wherein term ‘A’ is the essence of term ‘B’, or term ‘A’ contains term ‘B’;
providing a document;
identifying in the document frequent terms and marking them with associated respective initial weights on the HKS, said marked terms defining an initial weighted terms set for the document;
creating a final weighted terms set for the document by;
i. marking on the HKS all the terms connecting between said marked terms;
ii. calculating weights for said initial and connecting marked terms, based on said at least two relations in the HKS, whereby the marked terms define a weighted set of terms for the document; and
iii. selecting from said weighted marked terms set terms having weights greater than a predefined threshold, thereby defining a final weighted set of terms for the document; and
iv. linking said final weighted set of terms to the document.
Abstract
A system and methods for matching between search requests and relevant data (web pages, online documents, essays, online text in general, images, video, footage, etc.). The system comprises three components that can work separately or together and can be integrated with other search engine methods in order to further improve the relevancy of search results. The system can find similarity between different documents and measure the distance (in similarity) between documents. The three components are: Context based understanding, comprising putting the documents in the context of aspects of the human knowledge external to the documents, Partial Sentence analysis and 100 percentage points to keyword/tag sets.
28 Citations
21 Claims
1. A computerized method of arrangement and representation of terms in context, comprising:
providing terms and relations from at least one source;
arranging said terms in a Human Knowledge Structure (HKS), said structure being a directed acyclic graph, wherein each one of said graph's nodes consists of one term and the graph's arcs comprise at least one of “essence” and “contain” relations between adjacent terms, wherein term ‘A’ is the essence of term ‘B’, or term ‘A’ contains term ‘B’;
providing a document;
identifying in the document frequent terms and marking them with associated respective initial weights on the HKS, said marked terms defining an initial weighted terms set for the document;
creating a final weighted terms set for the document by;
i. marking on the HKS all the terms connecting between said marked terms;
ii. calculating weights for said initial and connecting marked terms, based on said at least two relations in the HKS, whereby the marked terms define a weighted set of terms for the document; and
iii. selecting from said weighted marked terms set terms having weights greater than a predefined threshold, thereby defining a final weighted set of terms for the document; and
iv. linking said final weighted set of terms to the document.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
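The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy HKS arcs, the `RELATION_FACTOR` damping values, the frequency cutoff `min_count`, the weight formula, and the reading of "connecting" terms as terms lying on a directed path between two marked terms are all assumptions made for illustration, since the claim does not specify them.

```python
import re
from collections import Counter
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical toy HKS: (parent, child, relation). "essence" means the parent
# is the essence of the child; "contain" means the parent contains the child.
HKS_ARCS = [
    ("vehicle", "car", "essence"),
    ("vehicle", "bicycle", "essence"),
    ("car", "engine", "contain"),
    ("car", "wheel", "contain"),
    ("bicycle", "wheel", "contain"),
]

# Assumed per-relation damping factors; the claim gives no formula.
RELATION_FACTOR = {"essence": 0.8, "contain": 0.5}


def reachable(start, neighbours):
    """Return all nodes reachable from any node in `start` by following `neighbours`."""
    seen, stack = set(), list(start)
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def final_weighted_terms(text, arcs=HKS_ARCS, min_count=2, threshold=0.2):
    children, parents, terms = {}, {}, set()
    for parent, child, relation in arcs:
        children.setdefault(parent, []).append(child)
        parents.setdefault(child, []).append((parent, relation))
        terms.update((parent, child))

    # Initial weighted set: HKS terms seen at least `min_count` times,
    # weighted by their share of the frequent-term occurrences (assumed formula).
    counts = Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w in terms)
    frequent = {t: c for t, c in counts.items() if c >= min_count}
    total = sum(frequent.values())
    initial = {t: c / total for t, c in frequent.items()}

    # Step i: mark terms on a directed path between two initially marked terms.
    below = reachable(initial, children)
    above = reachable(initial, {c: [p for p, _ in ps] for c, ps in parents.items()})
    marked = set(initial) | (below & above)

    # Step ii: visit marked terms parents-first; each term keeps its initial
    # weight and inherits a damped share of each marked parent's weight.
    order = TopologicalSorter(
        {t: [p for p, _ in parents.get(t, []) if p in marked] for t in marked}
    ).static_order()
    weights = {}
    for term in order:
        inherited = sum(
            RELATION_FACTOR[rel] * weights[p]
            for p, rel in parents.get(term, [])
            if p in marked
        )
        weights[term] = initial.get(term, 0.0) + inherited

    # Step iii: keep only terms whose weight exceeds the threshold.
    return {t: w for t, w in weights.items() if w > threshold}


# Step iv: link the final weighted set to the document (here, a plain dict).
document_index = {"doc-1": final_weighted_terms("vehicle vehicle wheel wheel car")}
```

In this toy run, "vehicle" and "wheel" are the frequent terms, and "car" and "bicycle" are marked as connecting terms because they lie on directed paths from "vehicle" to "wheel"; "engine" is not marked because it does not lead to any marked term.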
13. A computerized method of matching between a search string and documents, comprising:
providing a search string comprising terms that form a partial sentence;
retrieving a set of documents comprising at least part of said partial sentence;
counting the number of exact occurrences of the partial sentence in each of the retrieved documents;
counting the number of occurrences of equivalent permutations of the partial sentence that do not conflict with the meaning of the partial sentence in each of the retrieved document;
counting the number of occurrences of close permutations of the partial sentence that do not conflict with the meaning of the partial sentence in each of the retrieved document;
for each retrieved document, associating scores to all said counted occurrences, based on the similarity between the occurrence and the partial sentence;
summing up the scores to a final score for each document; and
ranking the documents in a result list according to said final scores.
View Dependent Claims (14, 15, 16)
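The counting and ranking steps of claim 13 can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: an "equivalent permutation" is taken to be the same words adjacent in a different order, a "close permutation" is taken to be the query words separated by at most `slack` other words, the `SCORES` values are invented, and the check that a permutation does "not conflict with the meaning" of the partial sentence is not modelled at all.

```python
import re
from collections import Counter

# Assumed similarity scores per occurrence type; the claim gives no values.
SCORES = {"exact": 1.0, "equivalent": 0.7, "close": 0.4}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def count_occurrences(query, document, slack=1):
    """Count exact, equivalent and close occurrences of `query` in `document`."""
    q, d = tokenize(query), tokenize(document)
    counts = Counter()
    if not q:
        return counts
    n, q_sorted, q_set = len(q), sorted(q), set(q)

    # Windows of exactly n tokens: same order is exact, any other order
    # of the same words is treated as an equivalent permutation.
    for i in range(len(d) - n + 1):
        window = d[i:i + n]
        if window == q:
            counts["exact"] += 1
        elif sorted(window) == q_sorted:
            counts["equivalent"] += 1

    # Wider windows: the query words, in any order, interleaved with up to
    # `slack` non-query words, starting and ending on a query word.
    for extra in range(1, slack + 1):
        size = n + extra
        for i in range(len(d) - size + 1):
            window = d[i:i + size]
            kept = [t for t in window if t in q_set]
            if window[0] in q_set and window[-1] in q_set and sorted(kept) == q_sorted:
                counts["close"] += 1
    return counts


def rank(query, documents):
    """Return (doc_id, final_score) pairs, best first."""
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        # Retrieval step: keep documents sharing at least one query term.
        if not query_terms & set(tokenize(text)):
            continue
        counts = count_occurrences(query, text)
        scored.append((doc_id, sum(SCORES[kind] * c for kind, c in counts.items())))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


docs = {
    "a": "red car and a red car",
    "b": "car red",
    "c": "a red sports car",
    "d": "blue boat",
}
ranking = rank("red car", docs)
```

Here document "a" holds two exact occurrences, "b" one equivalent permutation, and "c" one close permutation, so they rank in that order; "d" shares no term with the query and is never retrieved.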
17. A computerized system for arrangement and representation of terms in context, comprising:
a server;
at least one source of terms and relations;
communication means between said at least one source of terms and relations and said server;
a first storage device connected with said server;
means for storing said terms in said first storage device in a Human Knowledge Structure (HKS) being a directed acyclic graph, wherein each one of said graph's nodes comprises one term and the graph's arcs consist of at least one of “essence” and “contain” relations between adjacent terms, wherein term ‘A’ is the essence of term ‘B’, or term ‘A’ contains term ‘B’;
at least one source of documents;
communication means between said at least one source of documents and said server;
a second storage device connected with said server;
wherein said server comprises computerized means for;
receiving a document from said at least one source of documents;
identifying in the document frequent terms and marking them with associated respective initial weights on the HKS, said marked terms defining an initial weighted terms set for the document;
creating a final weighted terms set for the document by;
marking on the HKS all the terms connecting between said marked terms;
calculating weights for said initial and connecting marked, based on said at least two relations in the HKS, whereby the marked terms define a weighted set of terms for the document;
selecting from said weighted marked terms set terms having weights greater than a predefined threshold, thereby defining a final weighted set of terms for the document; and
linking said final weighted set of terms to the document; and
means for storing said linked final weighted set of terms in said second storage device.
View Dependent Claims (18)
19. A computerized system for matching between a search string and documents, comprising:
a server;
at least one source of search strings;
at least one source of documents;
communication means between said at least one source of search strings and said server; and
communication means between said at least one source of documents and said server, said server comprising computerized means for;
receiving from said at least one source of search strings a search string comprising terms that form a partial sentence;
retrieving from said at least one source of documents a set of documents comprising at least part of said partial sentence;
counting the number of exact occurrences of the partial sentence in each of the retrieved documents;
counting the number of occurrences of equivalent permutations of the partial sentence that do not conflict with the meaning of the partial sentence in each of the retrieved document;
counting the number of occurrences of close permutations of the partial sentence that do not conflict with the meaning of the partial sentence in each of the retrieved document;
for each retrieved document, associating scores to all said counted occurrences, based on the similarity between the occurrence and the partial sentence;
summing up the scores to a final score for each document; and
ranking the documents in a result list according to said final scores.
View Dependent Claims (20, 21)
Specification