System and method for topical document searching
Abstract
Systems and methods are provided for searching for documents within topically-defined clusters. A search space is defined, starting with one or more source documents, by examining references from one document to another and following the networks of references to some level of indirection. Depending on the embodiment, references may be followed from a document containing a reference to a referred-to document, or from a referred-to document to a document containing a reference, or both. Once a search space has been defined, a query is executed, and documents within the search space that satisfy the query parameters are identified. In certain embodiments of the invention, the documents primarily relate to legal materials, and one or more source documents are associated with one or more topics within a topic directory. In such embodiments, a search query may be limited to one or more selected topics by executing the search query within a search space defined using the associated document or documents as the source.
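The search-space construction described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the collection is available as an in-memory dictionary mapping each document identifier to the identifiers it references, and the names `build_search_space`, `search`, `references`, and `texts`, as well as the sample documents, are hypothetical. Only forward references (from a citing document to a referred-to document) are followed here, and the query is reduced to a case-insensitive substring match.

```python
def build_search_space(references, sources, max_depth):
    """Follow outgoing references from the source documents, one level per iteration.

    references: dict mapping a document id to the ids it refers to (assumed layout).
    sources:    iterable of source document ids.
    max_depth:  number of levels of indirection to follow.
    """
    space = set(sources)
    frontier = set(sources)  # documents added in the immediately preceding iteration
    for _ in range(max_depth):
        next_frontier = set()
        for doc_id in frontier:
            for ref in references.get(doc_id, ()):
                if ref not in space:  # add only documents not already in the space
                    space.add(ref)
                    next_frontier.add(ref)
        if not next_frontier:  # nothing new was found, so stop early
            break
        frontier = next_frontier
    return space


def search(texts, space, term):
    """Return, in sorted order, the ids in the search space whose text contains the term."""
    term = term.lower()
    return sorted(d for d in space if term in texts.get(d, "").lower())


# Hypothetical sample collection, for illustration only.
references = {
    "statute": ["case_a", "case_b"],
    "case_a": ["case_c"],
    "case_b": [],
    "case_c": ["case_d"],
    "case_d": [],
    "unrelated": ["case_d"],
}
texts = {
    "statute": "Liability for negligence",
    "case_a": "negligence in product design",
    "case_b": "contract dispute",
    "case_c": "negligence and duty of care",
    "case_d": "Negligence per se",
    "unrelated": "negligence in maritime law",
}

space = build_search_space(references, ["statute"], max_depth=2)
hits = search(texts, space, "negligence")
```

In this sample, the document `unrelated` mentions the search term but is never reached from the source by forward references, so it stays outside the search space regardless of depth.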
15 Claims
1. A computerized system for identifying one or more electronic documents within a collection of electronic documents, comprising:
at least one interface; and
at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining at least one result, and (4) provide at least one of the results through one of the interfaces;
wherein obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, each of the source documents comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, and (2) further additional documents identifiable by, for some number of iterations, for each additional document added to the subset in the immediately preceding iteration;
(a) retrieving the additional document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not in the definition of the subset, to the definition of the subset.
3. A computerized system for identifying one or more electronic documents within a collection of electronic documents, comprising:
at least one interface; and
at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining at least one result, and (4) provide at least one of the results through one of the interfaces;
wherein obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration;
(a) finding one or more additional citing documents in the collection, each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset.
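For illustration only, the citing-document expansion recited in steps (a) and (b) might be sketched as follows. This is an assumption-laden example rather than the claimed system: it supposes the collection's references are held in a dictionary from each document identifier to the identifiers it cites, and it inverts that dictionary once so that citing documents can be looked up directly. The names `build_citing_index`, `expand_by_citing`, and `references`, and the sample documents, are hypothetical.

```python
def build_citing_index(references):
    """Invert a 'document -> cited documents' mapping into 'document -> citing documents'."""
    citing = {}
    for doc_id, refs in references.items():
        for ref in refs:
            citing.setdefault(ref, set()).add(doc_id)
    return citing


def expand_by_citing(references, sources, iterations):
    """Grow a subset by repeatedly adding documents that cite the most recently added ones."""
    citing = build_citing_index(references)
    subset = set(sources)
    added_last = set(sources)  # documents added in the immediately preceding iteration
    for _ in range(iterations):
        added_now = set()
        for doc_id in added_last:
            # (a) find documents that contain at least one reference to doc_id
            for citer in citing.get(doc_id, ()):
                # (b) add each citing document not already in the subset
                if citer not in subset:
                    subset.add(citer)
                    added_now.add(citer)
        if not added_now:  # no new citing documents, so further iterations add nothing
            break
        added_last = added_now
    return subset


# Hypothetical sample collection, for illustration only.
references = {
    "review": ["case_x"],
    "case_x": ["statute"],
    "case_y": ["statute"],
    "memo": ["other"],
    "statute": [],
    "other": [],
}

one_level = expand_by_citing(references, ["statute"], iterations=1)
two_levels = expand_by_citing(references, ["statute"], iterations=2)
```

Building the inverted index up front is a design choice made for this sketch; it avoids rescanning every document in the collection on each iteration.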
5. A method of identifying one or more documents within a collection of documents, comprising:
defining a subset of a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one source document within the collection of documents, each source document comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, (2) additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration;
(a) retrieving the document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space;
accepting a search query comprising one or more criteria; and
identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query.
7. A method of identifying one or more documents within a collection of documents, comprising:
defining a subset within a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one source document within the collection of documents, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration;
(a) finding one or more additional citing documents each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset;
accepting a search query comprising one or more criteria; and
identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query.
11. A method of defining a topical subset of a collection of documents, comprising:
defining a subset of a collection of documents to comprise at least one source document within the collection of documents, each source document comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, wherein this defining comprises a first iteration, and defining the subset to comprise additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration;
(a) retrieving the document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space.
13. A method of defining a topical subset of a collection of documents, comprising:
defining a subset of a collection of documents to comprise at least one source document within the collection of documents, such defining constituting a first iteration;
defining the subset to comprise at least one additional document identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration;
(a) finding one or more additional citing documents within the collection of documents, each additional citing document comprising one or more references to the document, and (b) adding each of the additional citing documents to the subset.
Specification