Apparatus and method for document retrieval
Abstract
A document retrieval system and method are disclosed that narrow the gap between the user's retrieval intention, the form of the query, and the document representations in the database, and that allow easy retrieval reflecting the user's intention. The user enumerates, as a primary query, a group of words that come to mind. Upon receipt of the primary query, the system estimates the relational representations that the words of the primary query can take, and then expands the query through partial coincidence between those relational representations and sample spaces extracted from the document data, producing a group of query candidate representations. The expanded group is presented to the user, who simply chooses the relational representation candidate that matches his or her intention. The retrieval execution query is then constituted from the selected representation.
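The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the triple format `(word, relation, word)`, the names `RELATION_INDEX`, `synthesize_candidates`, and `establish_query`, and the ranking by number of matched words are all assumptions made here for illustration.

```python
# Hypothetical precollected relational representations: (word, relation, word)
# triples gathered from the documents ahead of query time.
RELATION_INDEX = [
    ("retrieval", "of", "document"),
    ("retrieval", "by", "user"),
    ("document", "in", "database"),
    ("query", "for", "retrieval"),
]


def synthesize_candidates(primary_query, index):
    """Return indexed triples containing at least one primary-query word,
    with triples that match more query words listed first."""
    words = set(primary_query)
    scored = []
    for triple in index:
        left, _relation, right = triple
        hits = len(words & {left, right})
        if hits:
            scored.append((hits, triple))
    # Stable sort: ties keep their original index order.
    scored.sort(key=lambda item: -item[0])
    return [triple for _hits, triple in scored]


def establish_query(candidates, chosen_positions):
    """Keep only the candidates the user selected (the feedback step)."""
    return [candidates[i] for i in chosen_positions]


primary_query = ["document", "retrieval"]   # words enumerated by the user
candidates = synthesize_candidates(primary_query, RELATION_INDEX)
final_query = establish_query(candidates, [0])  # user picks the first candidate
```

Here the user supplies only loose words; the candidate list supplies the relations between them, and the selection step turns one candidate into the query that is actually executed.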
20 Claims
1. A document retrieval system for execution of document retrieval of one or more documents in an unstructured database, comprising:
a primary query designating part that designates a primary query as a provisional retrieval expression, the primary query being constituted by enumeration of arbitrary words based on an intention of a user;
a query candidate synthesizing part that, on the basis of the primary query designated by the primary query designating part, synthesizes a query candidate group based on precollected information retrieved from the unstructured database during a non-query process for querying the unstructured database; and
a feedback indicating part that performs relevance feedback which presents the query candidate group synthesized by said query candidate synthesizing part to the user, and performs relevance feedback for establishing a query selected from the thus-presented query candidate group as a query for the execution of document retrieval;
a database which holds relational representation data included in the precollected information; and
a relation expanding/reducing part extracting relation representation data corresponding to said primary query from the relational representation data held in said database, wherein said query candidate synthesizing part synthesizes the candidate group based on relational representation data extracted by said relation expanding/reducing part.
Dependent claims: 2, 3, 4, 5, 6
a relation estimating part that estimates a correlation of the words constituting said primary query;
an expanding part that expands the constituent elements of said primary query into a relational representation on the basis of the correlation of the words estimated by said relation estimating part; and
a partial coincidence retrieving part that, on the basis of the relational representation expanded by said expanding part, extracts from said database relational representation data partially coincident with the expanded relational representation.
4. The document retrieval system according to claim 3, wherein said relation expanding/reducing part further comprises a sample holding part that holds sample data obtained by sampling from said database, and the extraction of the relational representation data by said partial coincidence retrieving part is executed for the sample data held by said sample holding part.
5. The document retrieval system according to claim 3, wherein said expanding part classifies the constituent elements of the relational representation of said primary query estimated by said relation estimating part into one or more independent words (W) and relation data (R) showing a correlation of said independent words, and determines an independent word (W) for the retrieval to be executed by said partial coincidence retrieving part, or a combination of the independent word (W) with the relation data (R), and
said partial coincidence retrieving part executes a partial coincidence retrieval on the basis of the independent word (W) or combination of the independent word (W) with relation data (R) determined by said expanding part.
6. The document retrieval system according to claim 3, wherein the relational representation expanded by said expanding part is a representation corresponding to relational representation data cataloged beforehand as index in said database.
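The dependent claims above describe classifying a representation into independent words (W) and relation data (R), retrieving partially coincident data, and doing so over sampled data. A rough sketch of how these pieces might fit together follows. Everything in it is an assumption made for illustration: the fixed `RELATION_MARKERS` set, the triple layout, the matching rule, and the function names are hypothetical and are not taken from the patent.

```python
import random

# Hypothetical closed set of relation markers (R). Every other token is
# treated as an independent word (W); a real system would need proper
# linguistic analysis here.
RELATION_MARKERS = {"of", "in", "by", "for", "with"}


def classify(tokens):
    """Split tokens into independent words (W) and relation data (R)."""
    independent = [t for t in tokens if t not in RELATION_MARKERS]
    relations = [t for t in tokens if t in RELATION_MARKERS]
    return independent, relations


def partially_coincident(expanded, stored):
    """True when the stored triple shares at least one independent word with
    the expanded representation and, if relation data was given, also uses
    one of the given relations."""
    words, relations = classify(expanded)
    left, relation, right = stored
    word_match = bool(set(words) & {left, right})
    relation_match = not relations or relation in relations
    return word_match and relation_match


def sample_holding_part(database, size, seed=0):
    """Hold a fixed-size random sample of the stored triples."""
    rng = random.Random(seed)
    return rng.sample(database, min(size, len(database)))


database = [
    ("retrieval", "of", "document"),
    ("document", "in", "database"),
    ("index", "for", "query"),
]
w_only = [t for t in database if partially_coincident(["document"], t)]
w_and_r = [t for t in database if partially_coincident(["document", "in"], t)]
sample = sample_holding_part(database, 2)
```

Matching on W alone returns every triple that mentions the word, while adding R narrows the result to triples that use the same relation. Running the match over `sample` instead of `database` trades completeness for speed.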
7. A document retrieval method for the execution of document retrieval of one or more documents in an unstructured database, comprising:
a primary query designating step of designating a primary query as a provisional retrieval expression which is constituted by enumeration of arbitrary words based on an intention of a user;
a query candidate synthesizing step of synthesizing a query candidate group for querying the unstructured database based on information pre-retrieved from the unstructured database during a non-query process and the primary query designated in said primary query designating step;
a feedback step of presenting to the user the query candidate group synthesized in said query candidate synthesizing step and establishing a query selected from the thus-presented query candidate group as a query for the execution of document retrieval; and
a relational representation data extracting step of extracting relational representation data corresponding to said primary query from relational representation data generated based on the pre-retrieved information held in a database, wherein said query candidate synthesizing step synthesizes a query candidate group based on the extracted relational representation data.
Dependent claims: 8, 9, 10
a partial coincidence retrieving step which extracts from said database relational representation data partially coincident with the relational representation expanded in said expansion step, on the basis of the expanded relational representation.
9. The document retrieval method according to claim 8, wherein the extraction of the relational representation data in said partial coincidence retrieving step is executed for sample data held by a sample holding part that holds sample data obtained by sampling from said database.
10. The document retrieval method according to claim 8, wherein said expansion step comprises a step of classifying the constituent elements of the relational representation of said primary query estimated in said relation estimating step into one or more independent words (W) and relation data (R) showing a correlation of said independent words and then determining an independent word (W) for the retrieval to be executed by said partial coincidence retrieving part, or a combination of the independent word (W) with the relation data (R),
and said partial coincidence retrieving step executes a partial coincidence retrieval on the basis of the independent word (W) or combination of the independent word (W) with the relation data (R) determined in said expansion step.
11. A document retrieval system that retrieves one or more documents from an unstructured database, comprising:
a primary query designator that receives a primary query that specifies a provisional retrieval expression, the provisional retrieval expression including arbitrary words;
a query candidate synthesizer that synthesizes a candidate query based on the provisional retrieval expression and information pre-retrieved from the unstructured database during a non-query process for querying the unstructured database; and
a feedback indicator that presents the candidate query with relevance information to the user prior to a query execution, and receives acceptance of the candidate query for retrieving one or more documents;
a database which holds relational representation data;
a relation expanding/reducing part extracting relation representation data corresponding to said primary query from the relational representation data held in said database, wherein said query candidate synthesizing part synthesizes the query candidate based on the relational representation data extracted by said relation expanding/reducing part.
Dependent claims: 13, 14, 15, 16, 17
a relation estimating part that estimates a correlation of the words constituting said primary query;
an expanding part that expands constituent elements of said primary query into a relational representation based on the correlation of the words estimated by said relation estimating part; and
a partial coincidence retrieving part that, on the basis of the relational representation expanded by said expanding part, extracts from said database relational representation data partially coincident with the expanded relational representation.
15. The document retrieval system according to claim 14, wherein said relation expanding/reducing part further comprises a sample holding part that holds sample data obtained by sampling from said database, and the extraction of the relational representation data by said partial coincidence retrieving part is executed for the sample data held by said sample holding part.
16. The document retrieval system according to claim 14, wherein said expanding part classifies the constituent elements of the relational representation of said primary query estimated by said relation estimating part into one or more independent words (W) and relation data (R) showing a correlation of said independent words, and determines an independent word (W) for the retrieval to be executed by said partial coincidence retrieving part, or a combination of the independent word (W) with the relation data (R), and
said partial coincidence retrieving part executes a partial coincidence retrieval on the basis of the independent word (W) or combination of the independent word (W) with relation data (R) determined by said expanding part.
17. The document retrieval system according to claim 14, wherein the relational representation expanded by said expanding part is a representation corresponding to relational representation data cataloged beforehand as index in said database.
12. A method for document retrieval from an unstructured database, comprising:
receiving a primary query that specifies a provisional retrieval expression which includes arbitrary words;
synthesizing a candidate query for querying the unstructured database based on information pre-retrieved from the unstructured database during a non-query process and the provisional retrieval expression;
presenting to a user a candidate query with relevance information prior to a query execution;
receiving an acceptance of the candidate query; and
extracting relational representation data corresponding to said primary query from relational representation data generated based on the pre-retrieved information, wherein the candidate query is synthesized based on extracted relational representation data.
Dependent claims: 18, 19, 20
a partial coincidence retrieving step which extracts from said database relational representation data partially coincident with the relational representation expanded in said expansion step, on the basis of the expanded relational representation.
19. The method for document retrieval according to claim 18, wherein the extraction of the relational representation data in said partial coincidence retrieving step is executed for sample data held by a sample holding part that holds sample data obtained by sampling from said database.
20. The method for document retrieval according to claim 18, wherein said expansion step comprises a step of classifying the constituent elements of the relational representation of said primary query estimated in said relation estimating step into one or more independent words (W) and relation data (R) showing a correlation of said independent words and then determining an independent word (W) for the retrieval to be executed by said partial coincidence retrieving step, or a combination of the independent word (W) with the relation data (R), and
said partial coincidence retrieving step executes a partial coincidence retrieval on the basis of the independent word (W) or combination of the independent word (W) with relation data (R) determined in said expansion step.
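The independent claims all rely on information pre-retrieved from the unstructured database during a non-query process. One way such an offline step could look is sketched below, purely as an illustration. The naive word-relation-word pattern, the `RELATION_MARKERS` set, and the names `extract_triples` and `build_index` are assumptions introduced here, not details from the claims.

```python
from collections import defaultdict

# Hypothetical relation markers; anything else counts as an independent word.
RELATION_MARKERS = {"of", "in", "by", "for", "with"}


def extract_triples(text):
    """Naively collect (word, relation, word) triples wherever a relation
    marker sits directly between two non-marker tokens."""
    tokens = text.lower().split()
    triples = []
    for i in range(1, len(tokens) - 1):
        if (tokens[i] in RELATION_MARKERS
                and tokens[i - 1] not in RELATION_MARKERS
                and tokens[i + 1] not in RELATION_MARKERS):
            triples.append((tokens[i - 1], tokens[i], tokens[i + 1]))
    return triples


def build_index(documents):
    """Run once, outside any query, to map each independent word to the
    triples in which it appears."""
    index = defaultdict(list)
    for text in documents:
        for triple in extract_triples(text):
            left, _relation, right = triple
            index[left].append(triple)
            if right != left:
                index[right].append(triple)
    return index


documents = [
    "retrieval of documents in databases",
    "queries for retrieval",
]
index = build_index(documents)
```

At query time, looking up a primary-query word in `index` returns the precollected relational representations for that word without rescanning the documents.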
Specification