Structured searching of dynamic structured document corpuses
First Claim
1. A method performed by at least one computer processor executing computer program instructions tangibly stored on at least one non-transitory computer-readable medium, wherein the method is for use with a system, wherein the system comprises:
a first natural language processing component adapted to parse text within a document corpus to generate annotations of a first type within the document corpus; and
a search component adapted to perform structured searching of annotations of the first type but not of annotations of a second type; and
wherein the method comprises;
(A) identifying a first annotation of the second type within the document corpus;
(B) modifying the search component to enable the search component to perform structured searching of annotations of the second type, thereby producing a modified search component;
(C) receiving a query, the query including a term referring to the second type of annotation;
(D) using the modified search component to perform a search on a document corpus using the query; and
(E) before (A), using a second natural language processing component to parse first text within the document corpus to generate the first annotation of the second type and to add the first annotation of the second type to the document corpus, comprising using the second natural language processing component to recognize text within the document corpus that represents a concept corresponding to the second type of annotation, to generate the annotation of the second type within the document corpus, and to associate the first annotation of the second type with the recognized text.
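As an illustration only, steps (A) through (E) of the claim can be sketched in Python. Nothing here comes from the patent's specification: the `Annotation`, `Document`, and `SearchComponent` classes, the `annotate_allergies` function, the "medication" and "allergy" type names, and the regular expression standing in for a second natural language processing component are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str   # annotation type, e.g. "medication" or "allergy" (hypothetical names)
    start: int  # character offset where the annotated text begins
    end: int    # character offset just past the annotated text
    text: str   # the recognized text itself


@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)


class SearchComponent:
    """Structured search restricted to the annotation types it knows about."""

    def __init__(self, known_types):
        self.known_types = set(known_types)

    def enable_type(self, annotation_type):
        # Step (B): modify the search component so it can search the new type.
        self.known_types.add(annotation_type)

    def search(self, corpus, annotation_type, value):
        # Steps (C) and (D): the query names an annotation type and a value.
        if annotation_type not in self.known_types:
            raise ValueError(f"unsupported annotation type: {annotation_type}")
        return [
            doc for doc in corpus
            if any(a.type == annotation_type and a.text.lower() == value.lower()
                   for a in doc.annotations)
        ]


def annotate_allergies(doc):
    # Step (E): a toy stand-in for the second NLP component. It recognizes
    # text representing the concept, generates an annotation of the second
    # type, and associates it with the recognized text via offsets.
    for m in re.finditer(r"allergic to (\w+)", doc.text):
        doc.annotations.append(Annotation("allergy", m.start(1), m.end(1), m.group(1)))


corpus = [
    Document("Patient is allergic to penicillin."),
    Document("Patient takes aspirin daily."),
]
search = SearchComponent(known_types={"medication"})  # first type only

for doc in corpus:                      # (E) runs before (A)
    annotate_allergies(doc)

# (A) identify annotations whose type the search component cannot yet handle
new_types = {a.type for doc in corpus for a in doc.annotations} - search.known_types

for t in new_types:                     # (B) produce the modified search component
    search.enable_type(t)

hits = search.search(corpus, "allergy", "penicillin")  # (C) and (D)
```

In this sketch the search component is modified by widening a set of known type names; a real system might instead regenerate an index or a grammar, which the claim language does not pin down.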
Abstract
A system includes a document corpus containing structured documents, which contain both text and annotations of the text. The system also includes a search engine which is adapted to perform structured searches of the structured documents. As new types of annotations are added to the system, the search engine is updated automatically to become capable of performing structured searches for the new types of annotations. For example, if a new natural language processing (NLP) component, adapted to generate annotations of a new type, is added to the system, then the system automatically updates a query language to include a definition of the new type of annotation. The search engine may then immediately be capable of processing structured queries which refer to the new type of annotation.
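The automatic query-language update described in the abstract might look roughly like the following Python sketch. It is a minimal illustration under stated assumptions, not the patented implementation: the `type:value` query syntax, the `AnnotationTypeDef`, `NlpComponent`, `QueryLanguage`, and `System` classes, and the component names are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationTypeDef:
    name: str         # the term a query uses to refer to this annotation type
    description: str  # human-readable meaning of the type


@dataclass
class NlpComponent:
    name: str
    produces: AnnotationTypeDef  # the annotation type this component generates


class QueryLanguage:
    """Hypothetical query language: each query has the form 'type:value'."""

    def __init__(self):
        self.definitions = {}

    def define(self, type_def):
        self.definitions[type_def.name] = type_def

    def parse(self, query):
        # Reject queries that are malformed or refer to an undefined type.
        type_name, sep, value = query.partition(":")
        if not sep or type_name not in self.definitions:
            raise ValueError(f"cannot parse query: {query!r}")
        return type_name, value


class System:
    def __init__(self):
        self.query_language = QueryLanguage()
        self.components = []

    def add_component(self, component):
        self.components.append(component)
        # Adding a component automatically extends the query language with
        # a definition of the annotation type that component generates.
        self.query_language.define(component.produces)


system = System()
system.add_component(
    NlpComponent("medication-extractor",
                 AnnotationTypeDef("medication", "a drug mentioned in the text"))
)

# Before the new component is added, queries on its type are rejected.
try:
    system.query_language.parse("allergy:penicillin")
    rejected_before = False
except ValueError:
    rejected_before = True

system.add_component(
    NlpComponent("allergy-extractor",
                 AnnotationTypeDef("allergy", "a substance the patient is allergic to"))
)

# Immediately afterwards, the same query parses without further configuration.
parsed = system.query_language.parse("allergy:penicillin")
```

The design choice illustrated is that the component itself declares the annotation type it produces, so registering the component is the only step needed to make the type queryable.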
22 Claims
1. A method performed by at least one computer processor executing computer program instructions tangibly stored on at least one non-transitory computer-readable medium, wherein the method is for use with a system, wherein the system comprises:
a first natural language processing component adapted to parse text within a document corpus to generate annotations of a first type within the document corpus; and
a search component adapted to perform structured searching of annotations of the first type but not of annotations of a second type; and
wherein the method comprises;
(A) identifying a first annotation of the second type within the document corpus;
(B) modifying the search component to enable the search component to perform structured searching of annotations of the second type, thereby producing a modified search component;
(C) receiving a query, the query including a term referring to the second type of annotation;
(D) using the modified search component to perform a search on a document corpus using the query; and
(E) before (A), using a second natural language processing component to parse first text within the document corpus to generate the first annotation of the second type and to add the first annotation of the second type to the document corpus, comprising using the second natural language processing component to recognize text within the document corpus that represents a concept corresponding to the second type of annotation, to generate the annotation of the second type within the document corpus, and to associate the first annotation of the second type with the recognized text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A non-transitory computer-readable medium having computer program instructions tangibly stored thereon, wherein the computer program instructions are executable by at least one computer processor to perform a method for use with a system, wherein the system comprises:
a first natural language processing component adapted to parse text within a document corpus to generate annotations of a first type within the document corpus; and
a search component adapted to perform structured searching of annotations of the first type but not of annotations of a second type; and
wherein the method comprises;
(A) identifying a first annotation of the second type within the document corpus;
(B) modifying the search component to enable the search component to perform structured searching of annotations of the second type, thereby producing a modified search component;
(C) receiving a query, the query including a term referring to the second type of annotation; and
(D) using the modified search component to perform a search on a document corpus using the query;
(E) before (A), using a second natural language processing component to parse first text within the document corpus to generate the first annotation of the second type and to add the first annotation of the second type to the document corpus, comprising using the second natural language processing component to recognize text within the document corpus that represents a concept corresponding to the second type of annotation, to generate the annotation of the second type within the document corpus, and to associate the first annotation of the second type with the recognized text.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
Specification