Using a weighted tree to determine document relevance
First Claim
1. A computer implemented method for automatically determining document relevance, the method comprising:
representing a plurality of terms as a tree data structure comprising internal nodes and leaf nodes interconnected by respective node connections;
assigning to each of the leaf nodes of the tree data structure a respective one of the terms;
assigning to at least one of the leaf nodes a respective location specifying a designated location for the term assigned to the leaf node to appear within a document;
assigning a respective operator to each of the internal nodes of the tree data structure;
assigning a respective weight to each of the node connections in the tree data structure; and
calculating a respective relevance value for at least one document as a function of occurrence in the at least one document of the terms respectively assigned to the leaf nodes of the tree data structure, the operators assigned to internal nodes of the tree data structure, and the weights assigned to the associated node connections and, for each of the leaf nodes assigned a respective location, occurrence of the term assigned to the leaf node at the location in the at least one document specified by the respective location assigned to the leaf node.
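The steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how weights and operators combine, so the sketch assumes that each connection weight multiplies its child's value, that occurrence is binary (1.0 or 0.0), and that a document is a mapping from location names to text. The names `Leaf`, `Internal`, `occurs`, and `relevance` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Union

# Hypothetical document model: location name -> text found at that location.
Document = Dict[str, str]


@dataclass
class Leaf:
    term: str
    location: Optional[str] = None  # None means "anywhere in the document"


@dataclass
class Internal:
    operator: Callable[[List[float]], float]  # e.g. sum, max, min
    # Each node connection is a (weight, child) pair; the weight sits on the edge.
    children: List[Tuple[float, "Node"]] = field(default_factory=list)


Node = Union[Leaf, Internal]


def occurs(leaf: Leaf, doc: Document) -> float:
    """Return 1.0 if the leaf's term occurs at its designated location, else 0.0."""
    if leaf.location is not None:
        texts = [doc.get(leaf.location, "")]
    else:
        texts = list(doc.values())
    return 1.0 if any(leaf.term in t.lower().split() for t in texts) else 0.0


def relevance(node: Node, doc: Document) -> float:
    """Evaluate the tree bottom-up (assumption: weight multiplies the child value)."""
    if isinstance(node, Leaf):
        return occurs(node, doc)
    return node.operator([w * relevance(child, doc) for w, child in node.children])


# "tree" must appear in the title; "weight" or "score" may appear anywhere.
tree = Internal(sum, [
    (2.0, Leaf("tree", location="title")),
    (1.0, Internal(max, [(1.0, Leaf("weight")), (0.5, Leaf("score"))])),
])

doc_a = {"title": "Weighted tree search", "body": "each edge has a weight"}
doc_b = {"title": "Search basics", "body": "a tree of terms"}

print(relevance(tree, doc_a))  # 3.0
print(relevance(tree, doc_b))  # 0.0
```

In `doc_b` the term "tree" occurs only in the body, so the leaf restricted to the title contributes nothing; this is the location condition in the final step of the claim.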
Abstract
The relevance of documents is automatically determined based upon a weighted tree. Terms considered to be relevant are assigned to the leaf nodes of a tree data structure. A location can also be specified in a leaf node, indicating where in a document the term must appear to be considered relevant. Internal nodes of the tree are assigned operators (e.g., add, maximum or minimum). The connections between nodes are assigned weights. A relevance value for a given document is calculated as a function of occurrence in the document of terms assigned to leaves, operators assigned to internal nodes, and weights assigned to the associated node connections. Weighted trees can be used to process search queries. Documents with high relevance scores calculated against the tree can be returned to a user as the results to a query.
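As a rough illustration of how such a tree might be used to answer a query, the Python sketch below scores a few documents and returns the highest-scoring ones. It makes assumptions the abstract does not state: documents are plain strings, occurrence is binary, each weight multiplies its child's value, and documents scoring zero are dropped. The nested-tuple tree encoding and the names `OPERATORS`, `score`, and `search` are hypothetical.

```python
from typing import List, Tuple, Union

# The three operators the abstract gives as examples.
OPERATORS = {"add": sum, "max": max, "min": min}

# Hypothetical encoding: a leaf is a term string; an internal node is
# (operator_name, [(weight, child), ...]).
Tree = Union[str, Tuple[str, list]]


def score(tree: Tree, text: str) -> float:
    """Relevance of one document (assumption: weight multiplies the child value)."""
    words = set(text.lower().split())

    def walk(node: Tree) -> float:
        if isinstance(node, str):
            return 1.0 if node in words else 0.0
        op, edges = node
        return OPERATORS[op]([w * walk(child) for w, child in edges])

    return walk(tree)


def search(tree: Tree, docs: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Return up to top_k documents with a non-zero score, highest first."""
    scored = [(score(tree, d), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# "python" is weighted heavily; "web" and "framework" only count together (min).
query_tree = ("add", [
    (3.0, "python"),
    (1.0, ("min", [(1.0, "web"), (1.0, "framework")])),
])

docs = [
    "python web framework tutorial",
    "java web framework",
    "python snake facts",
]

for value, doc in search(query_tree, docs):
    print(value, doc)
# 4.0 python web framework tutorial
# 3.0 python snake facts
```

With these weights, the `min` node acts like a soft "and" and the `add` root lets a strongly weighted term outrank a document that only matches the weaker branch.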
18 Claims
1. A computer implemented method for automatically determining document relevance, the method comprising:
representing a plurality of terms as a tree data structure comprising internal nodes and leaf nodes interconnected by respective node connections;
assigning to each of the leaf nodes of the tree data structure a respective one of the terms;
assigning to at least one of the leaf nodes a respective location specifying a designated location for the term assigned to the leaf node to appear within a document;
assigning a respective operator to each of the internal nodes of the tree data structure;
assigning a respective weight to each of the node connections in the tree data structure; and
calculating a respective relevance value for at least one document as a function of occurrence in the at least one document of the terms respectively assigned to the leaf nodes of the tree data structure, the operators assigned to internal nodes of the tree data structure, and the weights assigned to the associated node connections and, for each of the leaf nodes assigned a respective location, occurrence of the term assigned to the leaf node at the location in the at least one document specified by the respective location assigned to the leaf node.
Dependent claims: 2, 3, 4, 5, 6
7. At least one computer readable medium containing a computer program product for automatically determining document relevance, the computer program product comprising program code that, when executed by a computer, causes the computer to perform operations comprising:
representing a plurality of terms as a tree data structure comprising internal nodes and leaf nodes interconnected by respective node connections;
assigning to each of the leaf nodes of the tree data structure a respective one of the terms;
assigning to at least one of the leaf nodes a respective location specifying a designated location for the term assigned to the leaf node to appear within a document;
assigning a respective operator to each of the internal nodes of the tree data structure;
assigning a respective weight to each of the node connections in the tree data structure; and
calculating a respective relevance value for at least one document as a function of occurrence in the at least one document of the terms respectively assigned to the leaf nodes of the tree data structure, the operators assigned to parent internal nodes of the corresponding leaf nodes, and the weights assigned to the associated node connections and, for each of the leaf nodes assigned a respective location, occurrence of the term assigned to the leaf node at the location in the at least one document specified by the respective location assigned to the leaf node.
Dependent claims: 8, 9, 10, 11, 12
13. A computer system for automatically determining document relevance, the computer system comprising computer hardware configured to perform operations comprising:
representing a plurality of terms as a tree data structure comprising internal nodes and leaf nodes interconnected by respective node connections;
assigning to each of the leaf nodes of the tree data structure a respective one of the terms;
assigning to at least one of the leaf nodes a respective location specifying a designated location for the term assigned to the leaf node to appear within a document;
assigning a respective operator to each of the internal nodes of the tree data structure;
assigning a respective weight to each of the node connections in the tree data structure; and
calculating a respective relevance value for at least one document as a function of occurrence in the at least one document of the terms respectively assigned to the leaf nodes of the tree data structure, the operators assigned to parent internal nodes of the corresponding leaf nodes, and the weights assigned to the associated node connections and, for each of the leaf nodes assigned a respective location, occurrence of the term assigned to the leaf node at the location in the at least one document specified by the respective location assigned to the leaf node.
Dependent claims: 14, 15, 16, 17
18. A computer implemented method for automatically determining document relevance, the method comprising:
accessing a tree data structure representing a plurality of terms, the tree data structure comprising internal nodes and leaf nodes interconnected by respective node connections, each of the leaf nodes of the tree data structure being assigned to a respective one of the terms, at least one of the leaf nodes being assigned to a respective location specifying a designated location for the term assigned to the leaf node to appear within a document, a respective operator being assigned to each of the internal nodes of the tree data structure, and a respective weight being assigned to each of the node connections in the tree data structure; and
calculating a respective relevance value for at least one document as a function of occurrence in the at least one document of the terms respectively assigned to the leaf nodes of the tree data structure, the operators assigned to internal nodes of the tree data structure, and the weights assigned to the associated node connections and, for each of the leaf nodes assigned a respective location, occurrence of the term assigned to the leaf node at the location in the at least one document specified by the respective location assigned to the leaf node.
Specification