System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND)
First Claim
1. A method for processing stored data, comprising:
storing a collection of data units, said data units comprising documents;
retrieving at least one data unit in response to a query, the query comprising a search operator comprised of a plurality of search sub-expressions each having an associated weight value; and
outputting as a result to the query the retrieved at least one data unit, where said retrieved at least one data unit has a weight value sum that exceeds a threshold weight value sum, where said search operator comprises a weighted AND function, where varying the threshold weight value sum varies the operation of the weighted AND function from being substantially a logical OR function to being substantially a logical AND function, where at least one of the weight value sum and the threshold weight value sum is variable during a search.
Abstract
Disclosed are a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The search uses a two-level technique. A search query includes a search operator containing a plurality of search sub-expressions, each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
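The Boolean predicate described in the abstract and in the first claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `wand`, `has_term`, `search`, `DOCS` and `QUERY`, the whitespace tokenization, and the unit weights are all assumptions chosen for illustration. The strict comparison follows the claim wording that the weight value sum "exceeds" the threshold.

```python
from typing import Callable, List, Sequence, Tuple

# A sub-expression is modeled (hypothetically) as a predicate over a document
# plus its associated weight value.
SubExpr = Tuple[Callable[[str], bool], float]


def wand(doc: str, subexprs: Sequence[SubExpr], threshold: float) -> bool:
    """True when the weights of the satisfied sub-expressions sum past the threshold."""
    weight_sum = sum(weight for pred, weight in subexprs if pred(doc))
    return weight_sum > threshold  # "exceeds", per the claim wording


def has_term(term: str) -> Callable[[str], bool]:
    """Illustrative sub-expression: the document contains the given term."""
    return lambda doc: term in doc.lower().split()


def search(docs: Sequence[str], subexprs: Sequence[SubExpr], threshold: float) -> List[str]:
    """Return every document for which the WAND predicate holds."""
    return [d for d in docs if wand(d, subexprs, threshold)]


# Toy collection and a three-term query with unit weights (assumed values).
DOCS = ["the quick brown fox", "quick fox", "lazy dog"]
QUERY = [(has_term("quick"), 1), (has_term("brown"), 1), (has_term("fox"), 1)]

# Threshold 0: any single match passes, so the operator behaves like OR.
or_like = search(DOCS, QUERY, 0)
# Threshold 2 (one less than the number of unit-weight terms): all three
# terms must match, so the operator behaves like AND.
and_like = search(DOCS, QUERY, 2)
```

With unit weights over k sub-expressions, a threshold of 0 admits any document that matches at least one term, while a threshold of k - 1 admits only documents that match all k. Intermediate thresholds give behavior between the two, which is the variation the claim describes.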
Specification