Structured searching of dynamic structured document corpuses
First Claim
1. A method performed by at least one computer processor executing computer program instructions tangibly stored on at least one non-transitory computer-readable medium, wherein the method is for use with a system, the system comprising:
- (1) a first natural language processing component that parses text within a first document added to a document corpus to generate annotations of a first type within the document corpus, and (2) a search component that performs structured searching of the annotations of the first type but not of annotations of a second type, the method comprising:
(A) detecting addition of a second document to the document corpus;
(B) in response to the detection of the addition of the second document to the document corpus, detecting addition to the system of a second natural language processing component that parses text within the second document added to the document corpus to generate the annotations of the second type;
(C) modifying the search component, in response to the detection of the addition of the second natural language processing component, to enable the search component to perform the structured searching of the annotations of the second type, thereby producing a modified search component;
(D) receiving a query, the query including a term referring to the annotations of the second type; and
(E) using the modified search component to perform the structured searching of the annotations of the second type on the document corpus using the query.
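Steps (A) through (E) of the claim above can be illustrated with a small sketch. This is not the patented implementation: the class names (`Annotation`, `SearchComponent`, `System`), the regex-based stand-ins for natural language processing components, and the `type:value` query syntax are all hypothetical, chosen only to make the control flow concrete.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Annotation:
    doc_id: int
    ann_type: str
    text: str


# Hypothetical stand-in for an NLP component: text in, annotated spans out.
Annotator = Callable[[str], List[str]]


class SearchComponent:
    def __init__(self) -> None:
        self.searchable_types: Set[str] = set()

    def enable(self, ann_type: str) -> None:
        # (C) modify the search component so it can search a new type
        self.searchable_types.add(ann_type)

    def search(self, annotations: List[Annotation], query: str) -> List[int]:
        # (D)/(E) the query term names an annotation type, e.g. "dosage:20 mg"
        ann_type, _, value = query.partition(":")
        if ann_type not in self.searchable_types:
            raise ValueError(f"unknown annotation type: {ann_type}")
        return sorted({a.doc_id for a in annotations
                       if a.ann_type == ann_type and a.text.lower() == value.lower()})


class System:
    def __init__(self) -> None:
        self.components: Dict[str, Annotator] = {}
        self.annotations: List[Annotation] = []
        self.search = SearchComponent()
        self._next_doc_id = 0

    def install(self, ann_type: str, annotator: Annotator) -> None:
        # Installing a component does not, by itself, touch the search component.
        self.components[ann_type] = annotator

    def add_document(self, text: str) -> int:
        doc_id = self._next_doc_id            # (A) a document addition is observed
        self._next_doc_id += 1
        for ann_type in self.components:      # (B) detect components new to search
            if ann_type not in self.search.searchable_types:
                self.search.enable(ann_type)  # (C)
        for ann_type, annotate in self.components.items():
            for span in annotate(text):
                self.annotations.append(Annotation(doc_id, ann_type, span))
        return doc_id

    def query(self, query: str) -> List[int]:
        return self.search.search(self.annotations, query)


system = System()
system.install("medication", re.compile(r"\b(?:aspirin|lisinopril)\b", re.I).findall)
system.add_document("Patient takes aspirin daily.")

system.install("dosage", re.compile(r"\b\d+ mg\b").findall)
try:
    system.query("dosage:20 mg")      # search component not yet modified
except ValueError as err:
    print(err)

system.add_document("Prescribed lisinopril 20 mg.")
print(system.query("dosage:20 mg"))   # [1]
```

In this sketch the check for new components runs whenever a document is added, mirroring the "in response to" ordering of steps (A) and (B); a real system could just as plausibly use a plugin registry or a file watcher.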
Abstract
A system includes a document corpus containing structured documents, which contain both text and annotations of the text. The system also includes a search engine which is adapted to perform structured searches of the structured documents. As new types of annotations are added to the system, the search engine is updated automatically to become capable of performing structured searches for the new types of annotations. For example, if a new natural language processing (NLP) component, adapted to generate annotations of a new type, is added to the system, then the system automatically updates a query language to include a definition of the new type of annotation. The search engine may then immediately be capable of processing structured queries which refer to the new type of annotation.
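One way to picture the automatic query-language update described in the abstract is a registry that notifies the query language whenever a component is added. The sketch below is illustrative only: `QueryLanguage`, `ComponentRegistry`, the attribute lists, and the `type[attr="value"]` query syntax are assumptions introduced here, not details taken from the patent.

```python
import re
from typing import Callable, Dict, List, Sequence, Tuple


class QueryLanguage:
    """Holds the annotation-type definitions that structured queries may use."""

    _TERM = re.compile(r'(\w+)\[(\w+)="([^"]*)"\]')

    def __init__(self) -> None:
        self.definitions: Dict[str, Tuple[str, ...]] = {}

    def define(self, ann_type: str, attributes: Sequence[str]) -> None:
        # Add (or replace) the definition of one annotation type.
        self.definitions[ann_type] = tuple(attributes)

    def parse(self, query: str) -> Tuple[str, str, str]:
        """Parse 'type[attr="value"]' and validate it against the definitions."""
        match = self._TERM.fullmatch(query.strip())
        if match is None:
            raise ValueError(f"malformed query: {query!r}")
        ann_type, attr, value = match.groups()
        if ann_type not in self.definitions:
            raise ValueError(f"undefined annotation type: {ann_type}")
        if attr not in self.definitions[ann_type]:
            raise ValueError(f"{ann_type} has no attribute {attr}")
        return ann_type, attr, value


class ComponentRegistry:
    """Notifies subscribers whenever an NLP component is added."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str, Sequence[str]], None]] = []

    def subscribe(self, listener: Callable[[str, Sequence[str]], None]) -> None:
        self._listeners.append(listener)

    def add_component(self, ann_type: str, attributes: Sequence[str]) -> None:
        for listener in self._listeners:
            listener(ann_type, attributes)


language = QueryLanguage()
registry = ComponentRegistry()
registry.subscribe(language.define)   # the language updates itself on each addition

registry.add_component("medication", ["name"])
registry.add_component("allergy", ["substance", "severity"])
print(language.parse('allergy[substance="penicillin"]'))
```

Because the definition is added at registration time, a query naming the new type parses immediately, with no manual edit to the query language.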
7 Claims
1. A method performed by at least one computer processor executing computer program instructions tangibly stored on at least one non-transitory computer-readable medium, wherein the method is for use with a system, the system comprising:
- (1) a first natural language processing component that parses text within a first document added to a document corpus to generate annotations of a first type within the document corpus, and (2) a search component that performs structured searching of the annotations of the first type but not of annotations of a second type, the method comprising:
(A) detecting addition of a second document to the document corpus;
(B) in response to the detection of the addition of the second document to the document corpus, detecting addition to the system of a second natural language processing component that parses text within the second document added to the document corpus to generate the annotations of the second type;
(C) modifying the search component, in response to the detection of the addition of the second natural language processing component, to enable the search component to perform the structured searching of the annotations of the second type, thereby producing a modified search component;
(D) receiving a query, the query including a term referring to the annotations of the second type; and
(E) using the modified search component to perform the structured searching of the annotations of the second type on the document corpus using the query.
Dependent claims: 2, 3, 4
5. A non-transitory computer-readable medium having computer program instructions tangibly stored thereon, wherein the computer program instructions are executable by at least one computer processor to perform a method for use with a system, the system comprising:
- (1) a first natural language processing component that parses text within a first document added to a document corpus to generate annotations of a first type within the document corpus, and (2) a search component that performs structured searching of the annotations of the first type but not of annotations of a second type, the computer program instructions which, when run on the at least one computer processor, causes the at least one computer processor to:
(A) detect addition of a second document to the document corpus;
(B) detect, in response to the detection of the addition of the second document to the document corpus, addition to the system of a second natural language processing component that parses text within the second document added to the document corpus to generate the annotations of the second type;
(C) modify the search component, in response to the detection of the addition of the second natural language processing component, to enable the search component to perform the structured searching of the annotations of the second type, thereby producing a modified search component;
(D) receive a query, the query including a term referring to the annotations of the second type; and
(E) use the modified search component to perform the structured searching of the annotations of the second type on the document corpus using the query.
Dependent claims: 6, 7
Specification