Document retrieval system
Abstract
A document retrieval system that searches for documents matching a retrieval request input by the user and ranks each document according to the degree of coincidence between the document and the retrieval request. In the document retrieval system, a word frequency calculating section determines the number of documents in which a word appears and the frequency of occurrence of the word in a document, and obtains a weighting parameter for the word; a frequency score calculating section then obtains a frequency score on the basis of the output of the word frequency calculating section. In addition, a word cooccurrence relation checking section checks the word cooccurrence relations of the retrieval request and the document, and a cooccurrence score calculating section calculates a cooccurrence score from the degree of coincidence between them. A document score calculating section calculates a document score on the basis of the frequency score and the cooccurrence score. The documents are ranked in order of document score and displayed to the user.
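The scoring flow summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: whitespace tokenization, a logarithmic weighting parameter derived from the number of documents containing a word, a cooccurrence relation modeled as an unordered word pair within a small window, and the hypothetical names `tokenize`, `cooccurrences`, `rank`, and `cooc_weight`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for dictionary lookup.
    return text.lower().split()


def cooccurrences(tokens, window=2):
    # Assumption: a "cooccurrence relation" is an unordered pair of distinct
    # words that appear within `window` tokens of each other.
    pairs = set()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if v != w:
                pairs.add(frozenset((w, v)))
    return pairs


def rank(query, docs, cooc_weight=1.0):
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Number of documents in which each word appears.
    df = Counter()
    for toks in doc_tokens:
        df.update(set(toks))
    q_tokens = tokenize(query)
    q_pairs = cooccurrences(q_tokens)
    scored = []
    for idx, toks in enumerate(doc_tokens):
        tf = Counter(toks)
        # Frequency score: in-document frequency times a per-word weighting
        # parameter (illustrative choice: log(1 + N / df)).
        freq_score = sum(
            tf[w] * math.log(1 + n_docs / df[w]) for w in set(q_tokens) if df[w]
        )
        # Cooccurrence score: relations shared by the request and the document.
        cooc_score = len(q_pairs & cooccurrences(toks))
        scored.append((freq_score + cooc_weight * cooc_score, idx))
    # Highest document score first; ties broken by original order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored
```

With this sketch, two documents that contain the same request words equally often are separated by whether the words also occur near each other, which is the role the abstract assigns to the cooccurrence score.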
236 Citations
11 Claims
1. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks a retrieval result, said system retaining index information for each of a plurality of fields of said target document and comprising:
- a field rate inputting means for allowing the user to specify a rate of influence of a field on a ranking of said retrieval result, and for allowing the user to specify said rate of influence of any field on the ranking of said retrieval result.
2. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- a word cooccurrence index for storing word cooccurrence information appearing in said target document;
- word frequency information extracting means for extracting word frequency information from document data to be retrieved to store it in said word frequency index;
- word cooccurrence information extracting means for extracting word cooccurrence information from said document data to store it in said word cooccurrence index;
- retrieval request inputting means through which the user inputs said retrieval request;
- word frequency calculating means for consulting said word frequency index to obtain an occurrence frequency of a dictionary word, included in said retrieval request inputted through said retrieval request inputting means, in a document of said document data;
- frequency score calculating means for calculating a frequency score of said document indicative of a degree of coincidence between said retrieval request and said document on the basis of said word occurrence frequency obtained through said word frequency calculating means;
- word cooccurrence information extracting means for extracting word cooccurrence information from said retrieval request;
- word cooccurrence relation checking means for referring to said word cooccurrence index to find out how many word cooccurrence relations included in said retrieval request and outputted from said word cooccurrence information extracting means appear in said document;
- cooccurrence score calculating means for calculating a cooccurrence score of said document on the basis of a quantity of said word cooccurrence relation appearing in common in said retrieval request and said document;
- document score calculating means for calculating a document score on the basis of the output of said frequency score calculating means and the output of said cooccurrence score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
3. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- word frequency information extracting means for extracting word frequency information from document data to be retrieved to store it in said word frequency index;
- primary retrieval request inputting means for allowing the user to input a first retrieval request to be dealt with preferentially;
- secondary retrieval request inputting means for allowing the user to input a second retrieval request having a lower precedence than that of said first retrieval request;
- word frequency calculating means for consulting said word frequency index to obtain a frequency of occurrence of a dictionary word, included in said first and second retrieval requests, in a document of said document data;
- frequency score calculating means for calculating a frequency score of said document indicative of a degree of coincidence between said document and one of said first and second retrieval requests on the basis of said word occurrence frequency obtained in said word frequency calculating means;
- document score calculating means for calculating a document score of said document indicative of a degree of coincidence between said document and one of said first and second retrieval requests on the basis of said frequency score outputted from said frequency score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
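One way to picture the primary and secondary retrieval requests of claim 3 is the following minimal Python sketch. The claim does not say how the two requests are combined, so the weighted sum, the `secondary_weight` parameter, and the function names are hypothetical choices made only for illustration.

```python
from collections import Counter


def frequency_score(request, doc_tokens):
    # Sum of in-document frequencies of the distinct words of the request.
    tf = Counter(doc_tokens)
    return sum(tf[w] for w in set(request.lower().split()))


def document_score(primary, secondary, doc, secondary_weight=0.25):
    # Assumption: the lower-precedence request contributes with a smaller weight.
    toks = doc.lower().split()
    return (
        frequency_score(primary, toks)
        + secondary_weight * frequency_score(secondary, toks)
    )


def rank(primary, secondary, docs, secondary_weight=0.25):
    scores = [document_score(primary, secondary, d, secondary_weight) for d in docs]
    # Indices of documents, highest document score first.
    return sorted(range(len(docs)), key=lambda i: (-scores[i], i))
```

Under these assumptions a document matching only the secondary request can still be retrieved, but it ranks below documents that match the primary request to a comparable degree.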
4. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a field word frequency index for storing a frequency of occurrence of a dictionary word in said target document at every field;
- word frequency information extracting means for extracting word frequency information from document data to be retrieved and for putting it in said field word frequency index;
- retrieval request inputting means for allowing the user to input said retrieval request;
- field rate inputting means for allowing the user to input a rate indicative of a degree of influence of a score of a field of a document on a document score;
- field word frequency calculating means for consulting said field word frequency index in terms of a dictionary word included in said retrieval request to obtain a frequency of occurrence of said dictionary word in said document;
- field frequency score calculating means for calculating a frequency score indicative of a degree of coincidence between a field of each document and said retrieval request on the basis of said word occurrence frequency acquired in said field word frequency calculating means;
- document score calculating means for calculating a document score indicative of a degree of coincidence between said document and said retrieval request on the basis of said word occurrence frequency of said field outputted from said field frequency score calculating means and said rate inputted to said field rate inputting means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
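The field rate of claim 4 can be sketched in Python as a per-field multiplier on the field frequency score. The dictionary-based document layout, the field names `title` and `body`, and the linear combination are assumptions for illustration; the claim only requires that the user-supplied rate influence the document score.

```python
from collections import Counter


def field_frequency_score(query_words, field_text):
    # Frequency of the request words inside a single field.
    tf = Counter(field_text.lower().split())
    return sum(tf[w] for w in query_words)


def document_score(query, doc, field_rates):
    # `doc` maps field name -> text; `field_rates` maps field name -> user rate.
    words = set(query.lower().split())
    return sum(
        rate * field_frequency_score(words, doc.get(field, ""))
        for field, rate in field_rates.items()
    )


def rank(query, docs, field_rates):
    scores = [document_score(query, d, field_rates) for d in docs]
    # Indices of documents, highest document score first.
    return sorted(range(len(docs)), key=lambda i: (-scores[i], i))
```

For example, raising the rate of a hypothetical `title` field moves a document with a single title hit above a document with several body hits, and setting both rates equal reverses that order.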
5. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- an occurrence word index for storing a list of words which emerge in said target document;
- word frequency information extracting means for deriving word frequency information from document data to be retrieved and further for storing it in said word frequency index;
- occurrence word information extracting means for deriving occurrence word information from said document data and further for retaining it in said occurrence word index;
- retrieval request inputting means through which the user inputs said retrieval request;
- word frequency calculating means for consulting said word frequency index to calculate a frequency of occurrence of a dictionary word, included in said retrieval request, in a document of said document data;
- frequency score calculating means for calculating a score of said document indicative of a degree of coincidence between said document and said retrieval request on the basis of said word occurrence frequency attained in said word frequency calculating means;
- occurrence word number calculating means for referring to said occurrence word index to find out how many of the words included in said retrieval request appear in said document;
- occurrence word score calculating means for obtaining an occurrence word score to be added to said document on the basis of the number of occurrence words attained in said occurrence word number calculating means;
- document score calculating means for calculating a document score of said document indicative of a degree of coincidence between said retrieval request and said document on the basis of said frequency score outputted from said frequency score calculating means and said occurrence word score outputted from said occurrence word score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
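The occurrence word score of claim 5 rewards documents that contain many different words of the request, independently of how often each word is repeated. The Python sketch below is only an illustration: the additive combination, the `occurrence_weight` parameter, and the function name are hypothetical, and a `Counter` stands in for both the word frequency index and the occurrence word index.

```python
from collections import Counter


def document_score(query, doc, occurrence_weight=3.0):
    q_words = set(query.lower().split())
    tf = Counter(doc.lower().split())
    # Frequency score: total occurrences of the request words in the document.
    frequency_score = sum(tf[w] for w in q_words)
    # Number of distinct request words that appear at least once.
    occurrence_word_number = sum(1 for w in q_words if tf[w] > 0)
    # Assumption: the occurrence word score is proportional to that number.
    occurrence_word_score = occurrence_weight * occurrence_word_number
    return frequency_score + occurrence_word_score
```

With `occurrence_weight=0` a document repeating one request word four times outranks a document containing both request words once; with the default weight the order is reversed.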
6. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- a word occurrence position index for storing a position of a word appearing in said target document;
- word frequency information extracting means for extracting word frequency information from document data to be retrieved and further for storing it in said word frequency index;
- word occurrence position information extracting means for acquiring word position information from said document data and further for retaining it in said word occurrence position index;
- retrieval request inputting means through which the user inputs said retrieval request;
- word frequency calculating means for consulting said word frequency index to calculate an occurrence frequency of a dictionary word, included in said retrieval request, in a document of said document data;
- frequency score calculating means for obtaining a score of said document indicative of a degree of coincidence between said document and said retrieval request on the basis of said word occurrence frequency attained in said word frequency calculating means;
- occurrence position calculating means for referring to said word occurrence position index to obtain an occurrence position of a word, included in said retrieval request, in said document;
- word proximity calculating means for calculating a degree of proximity between words of said document on the basis of said word occurrence positions outputted from said occurrence position calculating means;
- proximity score calculating means for attaining a proximity score to be given to said document, on the basis of the degree of proximity outputted from said word proximity calculating means;
- document score calculating means for calculating a score of said document indicative of a degree of coincidence between said document and said retrieval request on the basis of said frequency score outputted from said frequency score calculating means and said proximity score outputted from said proximity score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
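The position index and proximity score of claim 6 can be sketched in Python as follows. The claim does not define the degree of proximity, so the inverse of the smallest token distance between each pair of request words is an assumption made for illustration, as are the function names and the `proximity_weight` parameter.

```python
from collections import defaultdict
from itertools import combinations


def position_index(doc):
    # Word -> list of token positions at which the word occurs.
    positions = defaultdict(list)
    for i, w in enumerate(doc.lower().split()):
        positions[w].append(i)
    return positions


def proximity_score(query, positions):
    # Distinct request words that occur in the document, in request order.
    words = [w for w in dict.fromkeys(query.lower().split()) if w in positions]
    score = 0.0
    for a, b in combinations(words, 2):
        # Smallest distance between any occurrence of a and any occurrence of b.
        gap = min(abs(i - j) for i in positions[a] for j in positions[b])
        score += 1.0 / gap
    return score


def document_score(query, doc, proximity_weight=1.0):
    positions = position_index(doc)
    frequency_score = sum(len(positions[w]) for w in set(query.lower().split())
                          if w in positions)
    return frequency_score + proximity_weight * proximity_score(query, positions)
```

Two documents with identical word frequencies then differ in score according to how close together the request words appear.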
7. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- an index including a frequency of word occurrence and word cooccurrence information in said target document at every field;
- field rate inputting means through which the user specifies a field rate of influence on the ranking of said target document at every field; and
- field word cooccurrence relation checking means for checking whether or not a word cooccurrence relation included in said retrieval request appears in said target document,
- wherein a score to be given to said target document where said cooccurrence relation appears is increased so that said target document is displayed preferentially.
8. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- an index including a word occurrence frequency and word cooccurrence information in said target document;
- occurrence word calculating means for calculating the number of words of a plurality of words of said retrieval request which also appear in said target document; and
- word cooccurrence relation checking means for checking whether or not a word cooccurrence relation included in said retrieval request appears in said target document,
- wherein in cases where said plurality of words included in said retrieval request simultaneously appear in said target document and said word cooccurrence relation appears in said target document, a score to be given to said target document is increased so that said target document is displayed preferentially.
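The conditional score increase of claim 8 can be illustrated with a short Python sketch. How a cooccurrence relation is detected and by how much the score is increased are not stated in the claim; the window-based word pairs, the additive `bonus`, and the function names below are hypothetical.

```python
def word_pairs(tokens, window=2):
    # Assumption: a cooccurrence relation is an unordered pair of distinct
    # words found within `window` tokens of each other.
    pairs = set()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if v != w:
                pairs.add(frozenset((w, v)))
    return pairs


def boosted_score(base_score, query, doc, bonus=1.0):
    q = query.lower().split()
    d = doc.lower().split()
    # Condition 1: every word of the request appears in the document.
    all_words_present = set(q) <= set(d)
    # Condition 2: at least one cooccurrence relation of the request also
    # appears in the document.
    relation_present = bool(word_pairs(q) & word_pairs(d))
    if all_words_present and relation_present:
        return base_score + bonus
    return base_score
```

Only documents satisfying both conditions receive the increase, so a document that merely contains the request words far apart keeps its base score.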
9. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- a word cooccurrence index for storing word cooccurrence information appearing in said target document;
- word frequency information extracting means for extracting word frequency information from document data prepared and further for putting the extracted word frequency information in said word frequency index;
- word cooccurrence information extracting means for extracting word cooccurrence information from said document data and further for putting the extracted word cooccurrence information in said word cooccurrence index;
- primary retrieval request inputting means for allowing the user to input a primary retrieval request the user attaches importance to;
- secondary retrieval request inputting means for allowing the user to input a secondary retrieval request the user attaches lower importance to as compared with said primary retrieval request;
- word frequency calculating means for consulting said word frequency index to attain a frequency of occurrence of a dictionary word, included in said retrieval requests inputted through said primary retrieval request inputting means and said secondary retrieval request inputting means, in a document;
- frequency score calculating means for calculating a frequency score of each document on the basis of the word occurrence frequency attained in said word frequency calculating means;
- word cooccurrence information extracting means for extracting word cooccurrence information from said retrieval requests inputted through said primary retrieval request inputting means and said secondary retrieval request inputting means;
- word cooccurrence relation checking means for referring to the contents of said word cooccurrence index to obtain the number of word cooccurrence relations included in said retrieval requests outputted from said word cooccurrence information extracting means and appearing in said document;
- cooccurrence score calculating means for obtaining a cooccurrence score of said document on the basis of the number of word cooccurrence relations attained by said word cooccurrence relation checking means and appearing in common in said retrieval requests and said document;
- document score calculating means for calculating a final score for said document on the basis of the frequency score outputted from said frequency score calculating means and the cooccurrence score outputted from said cooccurrence score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
10. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a field word frequency index for storing a frequency of occurrence of a dictionary word in said target document at every field;
- a field word cooccurrence index for storing word cooccurrence information appearing in said target document at every field;
- word frequency information extracting means for extracting word frequency information from document data prepared and for putting the word frequency information in said field word frequency index;
- word cooccurrence information extracting means for extracting word cooccurrence information from said document data and for putting the word cooccurrence information in said field word cooccurrence index;
- retrieval request inputting means through which the user inputs said retrieval request;
- field word frequency calculating means for consulting said field word frequency index to find out a frequency of occurrence of a dictionary word included in said retrieval request inputted through said retrieval request inputting means at every field in a document;
- field frequency score calculating means for obtaining a frequency score at every field of said document on the basis of the word occurrence frequency obtained in said field word frequency calculating means;
- word cooccurrence information extracting means for extracting word cooccurrence information from the retrieval request inputted through the retrieval request inputting means;
- field word cooccurrence relation checking means for referring to the contents of said field word cooccurrence index to find out the number of word cooccurrence relations included in said retrieval request outputted from said word cooccurrence information extracting means and appearing in a field of said document;
- field cooccurrence score calculating means for calculating a cooccurrence score at every field of said document on the basis of the number of word cooccurrence relations appearing in common in said field of said document and said retrieval request which is obtained in said field word cooccurrence relation checking means;
- field rate inputting means through which the user inputs a rate representative of the degree of influence of a score of said field on the ranking of said document;
- document score calculating means for calculating a final score for said document on the basis of the frequency score outputted from said field frequency score calculating means, the cooccurrence score outputted from said field cooccurrence score calculating means and the rate outputted from said field rate inputting means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
11. A document retrieval system which searches a target document to be retrieved in response to a retrieval request and ranks retrieval results, comprising:
- a word frequency index for storing a frequency of occurrence of a dictionary word in said target document;
- a word cooccurrence index for storing word cooccurrence information occurring in said target document;
- word frequency information extracting means for extracting word frequency information from document data prepared and for storing said word frequency information in said word frequency index;
- word cooccurrence information extracting means for extracting word cooccurrence information from said document data to put said word cooccurrence information in said word cooccurrence index;
- retrieval request inputting means through which the user inputs said retrieval request;
- word frequency calculating means for consulting said word frequency index to calculate a frequency of occurrence of a dictionary word, included in said retrieval request inputted through said retrieval request inputting means, in a document;
- frequency score calculating means for obtaining a frequency score of said document on the basis of the word frequency obtained by said word frequency calculating means;
- occurrence word number calculating means for consulting said word frequency index to obtain the number of dictionary words included in said retrieval request inputted through said retrieval request inputting means and appearing in said document;
- occurrence word number score calculating means for calculating an occurrence word number score on the basis of the number of occurrence words obtained by said occurrence word number calculating means;
- word cooccurrence information extracting means for extracting word cooccurrence information from said retrieval request inputted through said retrieval request inputting means;
- word cooccurrence relation checking means for referring to the contents of said word cooccurrence index to calculate the number of word cooccurrence relations of the word cooccurrence relations of said retrieval request outputted from said word cooccurrence information extracting means which appear in said document;
- cooccurrence score calculating means for obtaining a cooccurrence score of said document on the basis of the number of word cooccurrence relations occurring in common in said retrieval request and said document which is obtained by said word cooccurrence relation checking means;
- document score calculating means for calculating a final score of said document on the basis of the frequency score outputted from said frequency score calculating means, the occurrence word number score outputted from said occurrence word number score calculating means and the cooccurrence score outputted from said cooccurrence score calculating means;
- document ranking means for rearranging said target documents being retrieval results in the order of document score obtained by said document score calculating means; and
- retrieval result displaying means for displaying said retrieval results ranked.
Specification