System of document representation retrieval by successive iterated probability sampling

  • US 5,488,725 A
  • Filed: 03/30/1993
  • Issued: 01/30/1996
  • Est. Priority Date: 10/08/1991
  • Status: Expired due to Term
First Claim

1. In a computer system for identifying a predetermined number of documents of a document collection containing representations that have high probabilities of matching a query containing a plurality of concepts, in which the system has a database containing identifications of documents in the document collection and defining a plurality of representations representing the contents of the documents, the collection comprising a plurality of documents, and query means for defining the query, apparatus comprising:

  • sample selection means for iteratively selecting successive samples of a plurality of documents from the collection, each sample containing fewer documents than the entire collection and each successive sample containing documents different from each previous sample;

  • processing means responsive to the sample selection means for calculating, during each iteration, probabilities that documents contained in the sample contain representations that match the query and for identifying a preselected number of documents having the highest probabilities, the documents being identified during an iteration from a group consisting of the respective sample of documents and the documents identified during the next previous iteration, the preselected number being different for each iteration and no greater than the predetermined number; and

  • output means outputting the identifications of the predetermined number of documents identified by the processing means.
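
The three claimed elements (sample selection, per-iteration scoring and re-ranking, output) can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patented implementation. The names `match_probability`, `iterated_sample_retrieval`, `keep_counts`, and `sample_size` are hypothetical, the scorer is a stand-in (fraction of query concepts present in a document), and the sketch assumes that the last entry of `keep_counts` equals the predetermined number of documents to output.

```python
import heapq
import random


def match_probability(doc_terms, query_concepts):
    # Hypothetical stand-in scorer (not from the patent): the fraction of
    # query concepts that appear among the document's representations.
    return len(query_concepts & doc_terms) / len(query_concepts)


def iterated_sample_retrieval(collection, query_concepts, keep_counts,
                              sample_size, seed=0):
    """Return ids of the best documents found by successive sampling.

    collection:   dict mapping document id -> set of representations
    keep_counts:  the "preselected number" for each iteration; the values
                  must be distinct, and the last one is assumed to be the
                  predetermined number of documents to output
    sample_size:  documents drawn per iteration (fewer than the collection)
    """
    predetermined = keep_counts[-1]
    if len(set(keep_counts)) != len(keep_counts):
        raise ValueError("keep counts must differ between iterations")
    if max(keep_counts) > predetermined:
        raise ValueError("no keep count may exceed the predetermined number")
    if sample_size >= len(collection):
        raise ValueError("each sample must be smaller than the collection")
    if sample_size * len(keep_counts) > len(collection):
        raise ValueError("not enough documents for disjoint samples")

    # Shuffle once, then slice, so successive samples never overlap.
    rng = random.Random(seed)
    unsampled = list(collection)
    rng.shuffle(unsampled)

    kept = {}  # document id -> probability, carried over between iterations
    for i, keep_n in enumerate(keep_counts):
        sample = unsampled[i * sample_size:(i + 1) * sample_size]

        # Candidate group: this sample plus the previous iteration's picks.
        candidates = dict(kept)
        for doc_id in sample:
            candidates[doc_id] = match_probability(collection[doc_id],
                                                   query_concepts)

        # Keep only the keep_n highest-probability candidates.
        best = heapq.nlargest(keep_n, candidates.items(),
                              key=lambda item: item[1])
        kept = dict(best)

    # Output the identifications, highest probability first.
    return sorted(kept, key=kept.get, reverse=True)
```

Because each iteration scores only the new sample and re-ranks it against the documents carried over, the full collection never has to be scored in one pass. The trade-off is that a document dropped in an early iteration cannot return, so the result approximates, rather than guarantees, the overall highest-probability set.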
