System for generating a search formula by accessing search terms on the basis of a training set of pertinent and non-pertinent objects
Abstract
An apparatus automatically generates a search formula from pertinent data and non-pertinent data supplied by the user. Suitable search terms are selected from the pertinent data on the basis of two term appearance ratios: the ratio within the pertinent data alone and the ratio within the given data as a whole. The effectiveness value of each search term is calculated from the number of pieces of pertinent data containing the term and the number of pieces of given data containing the term. For each piece of pertinent data, the effectiveness values of the search terms it contains are summed, and the smallest of these sums is taken as the threshold of effectiveness. Finally, the search formula, consisting of the search terms combined by Boolean operators, is generated on the basis of the threshold of effectiveness and the respective effectiveness values of the search terms.
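The effectiveness and threshold steps described in the abstract can be illustrated with a minimal Python sketch. The abstract names the inputs to the effectiveness value but not the formula, so the ratio used here (pertinent pieces containing the term divided by given pieces containing the term) is an assumption, as are the function names and the sample data. Each piece of data is modelled as a set of terms.

```python
from typing import Dict, List, Set


def effectiveness(term: str,
                  pertinent: List[Set[str]],
                  non_pertinent: List[Set[str]]) -> float:
    """ASSUMED formula: share of the given pieces containing `term` that are pertinent."""
    in_pertinent = sum(term in piece for piece in pertinent)
    in_given = in_pertinent + sum(term in piece for piece in non_pertinent)
    return in_pertinent / in_given if in_given else 0.0


def effectiveness_threshold(pertinent: List[Set[str]],
                            eff: Dict[str, float]) -> float:
    """Smallest per-piece sum of effectiveness values over the pertinent pieces."""
    # Terms that were not selected as search terms contribute nothing to a sum.
    return min(sum(eff.get(term, 0.0) for term in piece) for piece in pertinent)


# Hypothetical training set supplied by the user.
pertinent = [{"neural", "network", "training"}, {"neural", "retrieval"}]
non_pertinent = [{"network", "cable"}, {"retrieval", "dog"}, {"retrieval", "cat"}]

selected_terms = ["neural", "network", "retrieval"]
eff = {t: effectiveness(t, pertinent, non_pertinent) for t in selected_terms}
threshold = effectiveness_threshold(pertinent, eff)
```

Under these assumptions the second pertinent piece has the smaller sum, so it sets the threshold; by construction every pertinent piece then contains selected terms whose effectiveness values add up to at least the threshold.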
20 Claims
1. Apparatus for generating a search formula in an information retrieval system comprising:
input means for inputting given data consisting of pertinent data and non-pertinent data, said pertinent data and said non-pertinent data being designated by a user, said pertinent data and said non-pertinent data comprising pieces of data, each piece of said pertinent data satisfying a need of said user, each piece of said non-pertinent data not satisfying the need of said user;
search term selection means for selecting search terms among the terms included in the pertinent data on the basis of first and second term appearance ratios, said first term appearance ratio being a ratio of a number of pieces of said given data containing said particular term to a total number of pieces of said given data, said second term appearance ratio being a ratio of a number of pieces of pertinent data containing said particular term to a total number of pieces of said pertinent data;
effectiveness calculation means for calculating respective effectiveness values of the search terms selected by the search term selection means, said effectiveness calculation means calculating said effectiveness of each selected search term based on a number of pieces of said pertinent data containing the search term and the total number of pieces of said given data;
threshold determination means for determining a threshold of effectiveness by using the effectiveness values of the search terms included in respective pieces of pertinent data; and
search formula generation means for generating a search formula on the basis of the respective effectiveness values of the search terms and the threshold determined by the threshold determination means, said search formula comprising the selected search terms combined by boolean operators.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
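The two term appearance ratios defined in claim 1 can be computed directly from the training set. The following Python sketch follows those definitions; the selection rule (keep a term when its second ratio exceeds its first) is a hypothetical criterion chosen for illustration, since the claim states only that selection is made on the basis of the two ratios. Function names and sample data are likewise assumptions.

```python
from typing import List, Set, Tuple


def appearance_ratios(term: str,
                      pertinent: List[Set[str]],
                      non_pertinent: List[Set[str]]) -> Tuple[float, float]:
    """Return (first ratio over all given data, second ratio over pertinent data)."""
    given = pertinent + non_pertinent
    first = sum(term in piece for piece in given) / len(given)
    second = sum(term in piece for piece in pertinent) / len(pertinent)
    return first, second


def select_terms(pertinent: List[Set[str]],
                 non_pertinent: List[Set[str]]) -> List[str]:
    """Pick candidate terms from the pertinent data only."""
    candidates = sorted(set().union(*pertinent))
    selected = []
    for term in candidates:
        first, second = appearance_ratios(term, pertinent, non_pertinent)
        # ASSUMED rule: keep terms that are relatively more frequent
        # in the pertinent data than in the given data as a whole.
        if second > first:
            selected.append(term)
    return selected


# Hypothetical training set: each piece of data is a set of terms.
pertinent = [{"search", "formula"}, {"search", "boolean"}]
non_pertinent = [{"formula", "milk"}, {"milk", "recipe"}]

selected = select_terms(pertinent, non_pertinent)
```

In this sample, "formula" appears in half of the pertinent pieces and half of all given pieces, so the assumed rule drops it, while "search" and "boolean" are kept.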
11. A method of generating a search formula in an information retrieval system comprising the steps of:
receiving from a user given data containing pieces of pertinent data and pieces of non-pertinent data, each piece of pertinent data satisfying a user need, each piece of non-pertinent data not satisfying the user need, each piece of given data containing at least one term;
selecting search terms from the terms in the pertinent data on the basis of first and second term appearance ratios for each term, the first term appearance ratio being a ratio of a number of pieces of given data containing the term to a total number of pieces of given data, the second term appearance ratio being a ratio of a number of pieces of pertinent data containing the term to a total number of pieces of pertinent data;
calculating effectiveness values for each of the selected search terms based on a number of pieces of pertinent data containing the selected search term and the total number of pieces of given data;
determining an effectiveness threshold using the calculated effectiveness values; and
generating the search formula based on the effectiveness values and the threshold.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
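The final step of the method, generating the formula from the effectiveness values and the threshold, is not spelled out in the independent claims. One possible reading, sketched below in Python, is to AND together each minimal combination of search terms whose effectiveness values sum to at least the threshold, and to OR those combinations. This interpretation, the function name, and the sample values are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict


def generate_formula(eff: Dict[str, float], threshold: float) -> str:
    """Build an OR of AND-groups; each group's effectiveness sum meets the threshold."""
    terms = sorted(eff)
    minimal = []  # qualifying combinations with no qualifying proper subset
    for size in range(1, len(terms) + 1):
        for combo in combinations(terms, size):
            if any(set(m) <= set(combo) for m in minimal):
                continue  # a smaller qualifying combination already covers this one
            # Small tolerance guards against floating-point rounding in the sum.
            if sum(eff[t] for t in combo) >= threshold - 1e-12:
                minimal.append(combo)
    return " OR ".join("(" + " AND ".join(combo) + ")" for combo in minimal)


# Hypothetical effectiveness values and threshold.
eff = {"neural": 1.0, "network": 0.5, "retrieval": 0.25}
formula = generate_formula(eff, threshold=1.25)
print(formula)  # (network AND neural) OR (neural AND retrieval)
```

The enumeration is brute force and grows exponentially with the number of terms, so it is suitable only for the small term sets used in an illustration.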
Specification