System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND)
Abstract
Disclosed are a system architecture, components, and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators, and various adapters. The searching technique is a two-level process. A search query includes a search operator containing a plurality of search sub-expressions, each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
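As a rough illustration of the retrieval rule described in the abstract, the following Python sketch scores each document by summing the weights of the sub-expressions it satisfies and returns those whose sum reaches a threshold. The names `weighted_search` and `contains`, the sample documents, and the choice of word-membership tests as sub-expressions are all hypothetical; the abstract does not say how sub-expressions are evaluated, and the sketch assumes a threshold that is met when the sum is at least the threshold value.

```python
from typing import Callable

# Hypothetical type: a sub-expression is any predicate over a document's text.
SubExpr = Callable[[str], bool]


def contains(term: str) -> SubExpr:
    """Build a sub-expression that is true when `term` appears as a word."""
    return lambda text: term in text.lower().split()


def weighted_search(docs: dict[str, str],
                    query: list[tuple[SubExpr, float]],
                    threshold: float) -> list[str]:
    """Return ids of documents whose satisfied sub-expression weights
    sum to at least `threshold` (assumed comparison: >=)."""
    hits = []
    for doc_id, text in docs.items():
        # Add up the weights of the sub-expressions this document satisfies.
        score = sum(weight for pred, weight in query if pred(text))
        if score >= threshold:
            hits.append(doc_id)
    return hits


# Illustrative data only; not taken from the patent.
docs = {
    "d1": "weighted and operator for search",
    "d2": "unstructured information management",
    "d3": "search engine for unstructured information",
}
query = [
    (contains("search"), 2.0),
    (contains("unstructured"), 1.5),
    (contains("operator"), 1.0),
]
result = weighted_search(docs, query, threshold=2.5)
print(result)
```

This sketch scans every document exhaustively, which is only meant to show the acceptance rule. It does not attempt the two-level process the abstract mentions.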
145 Citations
32 Claims
1. A data processing system for processing stored data, comprising:
data storage for storing a collection of data units; and
coupled to the data storage, a search engine responsive to a query for retrieving at least one data unit from said data storage;
where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, and where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum.
Dependent Claims (2, 3, 4, 5)
6. A data processing system for processing stored document data, comprising:
data storage for storing a collection of document data; and
coupled to the data storage, a search engine responsive to a query for retrieving at least one document from said data storage;
where the query comprises a Boolean predicate that functions as a Weighted AND (WAND), the WAND taking as arguments a list of Boolean variables $X_1, X_2, \ldots, X_k$, a list of associated positive weights $w_1, w_2, \ldots, w_k$, and a threshold $\theta$, where
$$\mathrm{WAND}(X_1, w_1, \ldots, X_k, w_k, \theta)$$
is true if
$$\sum_{1 \le i \le k} x_i w_i \ge \theta,$$
where $x_i$ is the indicator variable for $X_i$, where
$$x_i = \begin{cases} 1, & \text{if } X_i \text{ is true} \\ 0, & \text{otherwise.} \end{cases}$$
Dependent Claims (7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
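The WAND predicate recited in claim 6 can be sketched in a few lines of Python. This is a minimal illustration, assuming the predicate holds when the sum of the weights of the true variables is at least the threshold; the function names `wand`, `wand_as_and`, and `wand_as_or` are hypothetical. Under that assumption, unit weights with a threshold of k behave like a conventional AND, and unit weights with a threshold of 1 behave like a conventional OR.

```python
from typing import Sequence


def wand(variables: Sequence[bool],
         weights: Sequence[float],
         theta: float) -> bool:
    """Weighted AND: true when the weights of the true variables
    sum to at least `theta` (assumed comparison: >=)."""
    if len(variables) != len(weights):
        raise ValueError("each variable needs exactly one weight")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    # Treating each bool as an indicator (1 if true, 0 otherwise),
    # the sum below equals sum(x_i * w_i).
    return sum(w for x, w in zip(variables, weights) if x) >= theta


def wand_as_and(variables: Sequence[bool]) -> bool:
    """Unit weights, threshold k: every variable must be true."""
    return wand(variables, [1] * len(variables), len(variables))


def wand_as_or(variables: Sequence[bool]) -> bool:
    """Unit weights, threshold 1: at least one variable must be true."""
    return wand(variables, [1] * len(variables), 1)


print(wand([True, False, True], [2, 5, 1], 3))   # weights 2 + 1 reach 3
print(wand_as_and([True, True, False]))
print(wand_as_or([True, True, False]))
```

Thresholds between 1 and k with unit weights give intermediate behavior, requiring some but not all of the variables to be true.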
19. A computer program product embodied on a computer-readable medium and comprising program code for directing operation of a text intelligence system in cooperation with at least one application, comprising:
a computer program segment for storing a collection of data units; and
a computer program segment implementing a search engine that is responsive to a query for retrieving at least one stored data unit;
where the query comprises a search operator comprised of a plurality of search sub-expressions each having an associated weight value, and where said search engine returns a data unit having a weight value sum that exceeds a threshold weight value sum.
Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30)
31. A method for processing document data, comprising:
receiving a query; and
responding to the query for retrieving at least one document from a data storage;
where the query comprises a Boolean predicate that functions as a Weighted AND (WAND), the WAND taking as arguments a list of Boolean variables $X_1, X_2, \ldots, X_k$, a list of associated positive weights $w_1, w_2, \ldots, w_k$, and a threshold $\theta$, where
$$\mathrm{WAND}(X_1, w_1, \ldots, X_k, w_k, \theta)$$
is true if
$$\sum_{1 \le i \le k} x_i w_i \ge \theta,$$
where $x_i$ is the indicator variable for $X_i$, where
$$x_i = \begin{cases} 1, & \text{if } X_i \text{ is true} \\ 0, & \text{otherwise.} \end{cases}$$
Dependent Claims (32)
Specification