Information generation and retrieval method based on standardized format of sentence structure and semantic structure and system using the same
Abstract
The present invention relates to an information generation and retrieval apparatus based on a standardized format of sentence structure and semantic structure, a method thereof, and a computer readable recording medium for recording a program for implementing the method. The method, for use in an apparatus for generating and retrieving information based on standardized formats of sentence structure and semantic structure, comprises a first step of transforming a natural language sentence (information and knowledge) described by an information provider to a conceptual graph depending on standardized formats of sentence structure and semantic structure and indexing the conceptual graph; and a second step of transforming a natural language query sentence inputted from a user to a conceptual graph depending on the standardized formats of sentence structure and semantic structure and searching information relevant to the requirement of the user among the indexed information.
12 Claims
1. An apparatus for generating and retrieving information based on standardized formats of sentence structure and semantic structure, the apparatus comprising:
a data storing means for storing language knowledge data used to analyze a sentence for information supply and a query for information request from a user, semantic representation data for representing sense of sentence as a conceptual graph, and Web documents;
an input means for receiving a natural language query sentence for generation of a natural language sentence for information supply and specification of information request from the user;
an input sentence analyzing means for analyzing sentence structure of the natural language sentence or the natural language query sentence inputted from the user with reference to data stored at the data storing means to generate semantic relation;
semantic structure processing means for partitioning the semantic structure analyzed by the input sentence analyzing means to index and store or for computing semantic relevance to search supply information and document most semantically relevant to the requested information specification;
an interactive processing means for outputting sentence format rule for which failure data from the input sentence analyzing means is corrected depending on the standardized formats of sentence structure and semantic structure, and indexing and searching result; and
an information transferring means for transferring the data from the interactive processing means to the user, wherein the semantic relevance (S(x,y)) is a distance from a node x to another node y in the thesaurus system and can be expressed as;
where d(x,y) is a distance between the nodes x and y in the thesaurus system, and d(x,y), i.e., the distance from the node x to the node y in the thesaurus system, is 0 if the node y is one of the lower nodes and is computed as the number of edges between the nodes otherwise.
Dependent claims: 2, 3, 4
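The distance d(x,y) recited in claim 1 can be illustrated with a minimal Python sketch. The toy hierarchy `PARENT`, the function names, and the reading of "lower nodes" as descendants of x are assumptions made for illustration only. The expression for S(x,y) is not reproduced in the claim text, so only d(x,y) is sketched here.

```python
from collections import deque

# Hypothetical toy thesaurus (child -> parent); not taken from the patent.
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "bird": "animal",
}


def is_lower_node(x, y):
    """Return True if y lies below x in the hierarchy (assumed meaning of 'lower node')."""
    node = PARENT.get(y)
    while node is not None:
        if node == x:
            return True
        node = PARENT.get(node)
    return False


def thesaurus_distance(x, y):
    """d(x, y): 0 if y is a lower node of x, otherwise the number of edges between x and y."""
    if is_lower_node(x, y):
        return 0
    # Treat the hierarchy as an undirected graph and count edges on the shortest path.
    neighbors = {}
    for child, parent in PARENT.items():
        neighbors.setdefault(child, set()).add(parent)
        neighbors.setdefault(parent, set()).add(child)
    queue = deque([(x, 0)])
    seen = {x}
    while queue:
        node, dist = queue.popleft()
        if node == y:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # y is not reachable from x in this toy thesaurus
```

Under these assumptions the distance is asymmetric: `thesaurus_distance("animal", "dog")` is 0 because "dog" lies below "animal", while `thesaurus_distance("dog", "animal")` counts the two edges between them.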
5. A method for generating and retrieving information for use in an apparatus for generating and retrieving information based on standardized formats of sentence structure and semantic structure, the method comprising the steps of:
(a) transforming a natural language sentence (information and knowledge) described by an information provider to a conceptual graph depending on standardized formats of sentence structure and semantic structure and indexing the conceptual graph; and
(b) transforming a natural language query sentence inputted from a user to a conceptual graph depending on the standardized formats of sentence structure and semantic structure and searching information relevant to the natural language query sentence inputted from the user among the indexed information, wherein transforming the natural language sentence (information and knowledge) described by the information provider and the natural language query sentence inputted from the user to the conceptual graph depending on the standardized formats of sentence structure and semantic structure includes the steps of;
(f) morphologically analyzing the natural language sentence by a morphological analyzer when the natural language sentence for information to be provided by the information provider or to be supplied to the information provider is inputted, and checking whether morphological analysis is performed successfully;
(g) if morphological analysis fails, generating failure type data depending on the failure type, and, if morphological analysis is performed successfully, analyzing the sentence structure by using the morphological analysis result;
(h) transforming the sentence analysis tree to the semantic relation depending on the generation of the analyzed sentence structure; and
(i) inputting the semantic relation to a conceptual graph transformer depending on appropriateness of the semantic relation for the standardized format and partitioning the conceptual graph;
wherein the step (f) includes the steps of;
f1) initializing a highest node level (d) and depth (N) of the partitioned graph in order to retrieve request information and document of the information provider;
f2) after the initializing step, searching a relation node (n) that belongs to the level (d) of the conceptual graph depending on comparison result for the highest node level (d) and depth (N) of the partitioned graph;
f3) determining language characteristic search priority nodes (c1, c2) and computing semantic relevance (S(x,y)) of each record searched from a table related to the relation node (n) and depending on the priority rule of the language (L1-Ln) for the determined priority nodes (c1, c2); and
f4) depending on computation of the semantic relevance (S(x,y)), increasing the level (d) of the highest node and repeating the step (j), wherein the semantic relevance (S(x,y)) is a distance from a node x to another node y in the thesaurus system and can be expressed as;
where d(x,y) is a distance between the nodes x and y in the thesaurus system, and d(x,y), i.e., the distance from the node x to the node y in the thesaurus system, is 0 if the node y is one of the lower nodes and is computed as the number of edges between the nodes otherwise.
Dependent claims: 6, 7, 8, 9, 10, 11
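The control flow of steps (f) through (h) in claim 5 can be sketched as a small Python pipeline. This is a toy illustration under stated assumptions: the lexicon `LEXICON`, the failure type strings, the noun-verb-noun sentence pattern, and the relation labels are all hypothetical and stand in for the language knowledge data and analyzers that the claim leaves unspecified. The structure-level failure is also an assumption; the claim itself only recites failure data for morphological analysis.

```python
# Hypothetical lexicon standing in for the "language knowledge data".
LEXICON = {"cats": "NOUN", "chase": "VERB", "mice": "NOUN"}


def analyze_morphology(sentence):
    """Step (f): tag each token, or report a failure type if analysis fails."""
    tokens = sentence.lower().rstrip(".").split()
    if not tokens:
        return None, "empty_input"
    unknown = [t for t in tokens if t not in LEXICON]
    if unknown:
        return None, "unknown_word:" + unknown[0]
    return [(t, LEXICON[t]) for t in tokens], None


def analyze_structure(tagged):
    """Step (g), success branch: accept only a toy NOUN-VERB-NOUN pattern."""
    if [tag for _, tag in tagged] != ["NOUN", "VERB", "NOUN"]:
        return None
    subj, verb, obj = (word for word, _ in tagged)
    return {"verb": verb, "subject": subj, "object": obj}


def to_semantic_relations(tree):
    """Step (h): turn the analysis tree into (relation, head, dependent) triples."""
    return [
        ("agent", tree["verb"], tree["subject"]),
        ("object", tree["verb"], tree["object"]),
    ]


def sentence_to_relations(sentence):
    """Run steps (f)-(h); on failure return failure type data instead of relations."""
    tagged, failure = analyze_morphology(sentence)
    if failure is not None:
        return {"ok": False, "failure_type": failure}
    tree = analyze_structure(tagged)
    if tree is None:
        # Assumption: structure failures are reported the same way.
        return {"ok": False, "failure_type": "unsupported_structure"}
    return {"ok": True, "relations": to_semantic_relations(tree)}
```

In this sketch the failure type data is what an interactive component could show the author so that the sentence is rewritten to fit the standardized format before it is passed on to step (i).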
12. A computer readable medium for recording a program for implementing, at an information generating and retrieving apparatus based on standardized formats of sentence structure and semantic structure having a processor, the functions of:
(a) transforming a natural language sentence (information and knowledge) described by an information provider to a conceptual graph depending on standardized formats of sentence structure and semantic structure and indexing the conceptual graph; and
(b) transforming a natural language query sentence inputted from a user to a conceptual graph depending on the standardized formats of sentence structure and semantic structure and searching information relevant to the requirement of the user among the indexed information, wherein transforming the natural language sentence (information and knowledge) described by the information provider and the natural language query sentence inputted from the user to the conceptual graph depending on the standardized formats of sentence structure and semantic structure includes the steps of;
(f) morphologically analyzing the natural language sentence by a morphological analyzer when the natural language sentence for information to be provided by the information provider or to be supplied to the information provider is inputted, and checking whether morphological analysis is performed successfully;
(g) if morphological analysis fails, generating failure type data depending on the failure type, and, if morphological analysis is performed successfully, analyzing the sentence structure by using the morphological analysis result;
(h) transforming the sentence analysis tree to the semantic relation depending on the generation of the analyzed sentence structure; and
(i) inputting the semantic relation to a conceptual graph transformer depending on appropriateness of the semantic relation for the standardized format and partitioning the conceptual graph, wherein the step (f) includes the steps of;
f1) initializing a highest node level (d) and depth (N) of the partitioned graph in order to retrieve request information and document of the information provider;
f2) after the initializing step, searching a relation node (n) that belongs to the level (d) of the conceptual graph depending on comparison result for the highest node level (d) and depth (N) of the partitioned graph;
f3) determining language characteristic search priority nodes (c1, c2) and computing semantic relevance (S(x,y)) of each record searched from a table related to the relation node (n) and depending on the priority rule of the language (L1-Ln) for the determined priority nodes (c1, c2); and
f4) depending on computation of the semantic relevance (S(x,y)), increasing the level (d) of the highest node and repeating the step (j), wherein the semantic relevance (S(x,y)) is a distance from a node x to another node y in the thesaurus system and can be expressed as;
where d(x,y) is a distance between the nodes x and y in the thesaurus system, and d(x,y), i.e., the distance from the node x to the node y in the thesaurus system, is 0 if the node y is one of the lower nodes and is computed as the number of edges between the nodes otherwise.
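The level-by-level search of steps f1) through f4) can be sketched in Python as follows. Several points are assumptions rather than part of the claims: the shapes of `query_levels` and `index`, the way scores are summed per document, and the interpretation that the repeated step is f2) (the claims refer to a step (j) that is not defined in the text shown). The relevance function is passed in by the caller because the expression for S(x,y) is not reproduced in the claim text.

```python
def search_by_level(query_levels, index, relevance, max_depth):
    """Score indexed records against a partitioned query graph, one level at a time.

    query_levels: {level: [(relation, c1, c2), ...]}  (hypothetical layout)
    index:        {relation: [(c1, c2, doc_id), ...]} (hypothetical layout)
    relevance:    caller-supplied placeholder for S(x, y)
    max_depth:    depth N of the partitioned graph
    """
    scores = {}
    level = 0  # f1) initialize the highest node level d
    while level <= max_depth:  # f2) compare level d against depth N
        for relation, c1, c2 in query_levels.get(level, []):
            # f3) score every record stored in the table for this relation node
            for rec_c1, rec_c2, doc_id in index.get(relation, []):
                score = relevance(c1, rec_c1) + relevance(c2, rec_c2)
                scores[doc_id] = scores.get(doc_id, 0.0) + score
        level += 1  # f4) increase the level and repeat
    # Highest accumulated score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def exact_match(x, y):
    """Placeholder relevance used only for this example: 1.0 on equality, else 0.0."""
    return 1.0 if x == y else 0.0
```

A thesaurus-based relevance function could be substituted for `exact_match` without changing the loop, which is the main reason the scoring function is kept as a parameter in this sketch.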
Specification