System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND)
Abstract
Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators, and various adapters. The searching technique is a two-level technique. A search query includes a search operator containing a plurality of search sub-expressions, each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
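The Boolean predicate described in the abstract can be illustrated with a minimal Python sketch. The function name `wand` and its parameters are hypothetical, chosen for this illustration only; the strict "exceeds" comparison follows the wording of the abstract, and the sketch is not the patented implementation.

```python
from typing import Sequence


def wand(matches: Sequence[bool], weights: Sequence[float], threshold: float) -> bool:
    """Illustrative Weighted AND: true when the weights of the matching
    sub-expressions sum to more than the threshold."""
    if len(matches) != len(weights):
        raise ValueError("each sub-expression needs exactly one weight")
    # Only sub-expressions that matched the data unit contribute their weight.
    weight_sum = sum(w for matched, w in zip(matches, weights) if matched)
    return weight_sum > threshold


# With unit weights, a threshold just below the number of sub-expressions
# behaves like AND, and a threshold just below 1 behaves like OR.
all_three = wand([True, True, True], [1, 1, 1], 2.5)
only_two = wand([True, True, False], [1, 1, 1], 2.5)
any_one = wand([False, True, False], [1, 1, 1], 0.5)
```

Under these assumptions, moving the threshold between the two extremes shifts the operator's behavior gradually from a disjunction toward a conjunction.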
69 Citations
15 Claims
1. A data processing system for processing stored data, comprising:
data storage for storing a collection of data units; and
coupled to the data storage, a search engine responsive to a query for retrieving at least one data unit from said data storage;
where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, and where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum, where at least one of the weight values and threshold weight value sum are variable during a search.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product embodied on a computer-readable medium and comprising program code for directing operation of a text intelligence system in cooperation with at least one application, comprising:
a computer program segment for storing a collection of data units; and
a computer program segment implementing a search engine that is responsive to a query for retrieving at least one stored data unit;
where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, and where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum, where at least one of the weight values and threshold weight value sum are variable during a search.
Dependent claims: 10
11. A computer program product embodied on a computer-readable medium and comprising program code for directing operation of a text intelligence system in cooperation with at least one application, comprising:
a computer program segment for storing a collection of data units; and
a computer program segment implementing a search engine that is responsive to a query for retrieving at least one stored data unit;
where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, and where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum, where the data unit having a weight value sum that exceeds a threshold weight value sum undergoes a detailed evaluation and where a value associated with the detailed evaluation is compared to a minimum value.
Dependent claims: 12, 13, 14
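The two-level evaluation recited in claim 11 above can be sketched as follows. This is a minimal illustration under stated assumptions: the names `two_level_search` and `count_terms` are hypothetical, the first level treats each sub-expression as a single-word match, the detailed evaluation is a simple term count, and "compared to a minimum value" is read as "at least the minimum". None of these choices comes from the claim itself.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def count_terms(text: str, terms: Sequence[str]) -> float:
    """Hypothetical detailed scorer: total occurrences of the query terms."""
    words = text.lower().split()
    return float(sum(words.count(term) for term in terms))


def two_level_search(
    docs: Dict[str, str],
    terms: Sequence[str],
    weights: Sequence[float],
    threshold: float,
    min_score: float,
    detailed_score: Callable[[str, Sequence[str]], float] = count_terms,
) -> List[Tuple[str, float]]:
    results: List[Tuple[str, float]] = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        # Level 1: cheap weighted sum over the sub-expressions that match.
        weight_sum = sum(w for term, w in zip(terms, weights) if term in words)
        if weight_sum <= threshold:
            continue  # does not exceed the threshold, so skip level 2
        # Level 2: detailed evaluation, compared to a minimum value
        # (assumed here to mean "greater than or equal to").
        score = detailed_score(text, terms)
        if score >= min_score:
            results.append((doc_id, score))
    return results


docs = {
    "d1": "weighted and search operator",
    "d2": "search search search",
    "d3": "unrelated text",
}
loose = two_level_search(docs, ["weighted", "search"], [2.0, 1.0], 0.5, 2.0)
strict = two_level_search(docs, ["weighted", "search"], [2.0, 1.0], 1.5, 2.0)
```

In this sketch, raising the threshold from 0.5 to 1.5 means that `d2` is rejected at the first level and never reaches the more expensive detailed evaluation.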
15. A computer program product embodied on a computer-readable medium and comprising program code for directing operation of a text intelligence system in cooperation with at least one application, comprising:
a computer program segment for storing a collection of data units; and
a computer program segment implementing a search engine that is responsive to a query for retrieving at least one stored data unit;
where the at least one data unit is stored in a heap if the heap is not full and where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum, where, if the heap is full, the data unit having a weight value sum that exceeds the threshold weight value sum replaces a data unit with the least weight value sum from the heap.
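The heap behavior recited in claim 15 can be sketched with Python's standard `heapq` module. The function name `top_k` and its parameters are hypothetical. Raising the threshold to the smallest weight value sum in a full heap is an assumption of this sketch, shown as one way a threshold could vary during a search; the claim does not state how the threshold is set.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k(
    scored: Iterable[Tuple[str, float]], k: int, threshold: float = 0.0
) -> List[Tuple[str, float]]:
    """Keep the k data units with the largest weight value sums."""
    if k <= 0:
        raise ValueError("k must be positive")
    heap: List[Tuple[float, str]] = []  # min-heap keyed on weight value sum
    for doc_id, weight_sum in scored:
        if weight_sum <= threshold:
            continue  # does not exceed the current threshold
        if len(heap) < k:
            # Heap not full: store the data unit.
            heapq.heappush(heap, (weight_sum, doc_id))
        else:
            # Heap full: replace the unit with the least weight value sum.
            heapq.heapreplace(heap, (weight_sum, doc_id))
        if len(heap) == k:
            # Assumption: the threshold tracks the smallest sum in the heap.
            threshold = heap[0][0]
    ranked = [(doc_id, weight_sum) for weight_sum, doc_id in heap]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


best_two = top_k([("a", 1.0), ("b", 3.0), ("c", 2.0), ("d", 0.5), ("e", 4.0)], k=2)
```

A min-heap keeps the least weight value sum at the root, so both the replacement and the threshold update are cheap once the heap is full.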
Specification