Method for synthesizing a self-learning system for extraction of knowledge from textual documents for use in search
Abstract
The invention relates to computer science, information-search and intelligent systems, and can be used in developing information-search and other information and intelligent systems that operate on the basis of the Internet. The invention provides for the automatic creation of knowledge bases by extracting knowledge from textual documents in electronic form in different languages, and for the intelligent processing of textual information and users' requests to extract knowledge in any foreign language. The claimed method provides a self-learning mechanism in the form of a stochastically indexed artificial intelligence system, which automatically instructs the system in the rules of grammatical and semantic analysis. The method includes creating databases of stochastically indexed dictionaries, tables of indices of linguistic texts and knowledge bases of morphological analysis; performing morphological and syntactical analysis, and also stochastic indexing, of textual documents on a given theme from the search system in a given language; and creating a knowledge base of syntactical analysis. Stochastically indexed textual documents pertaining to the given theme are subjected to semantic analysis, and knowledge bases of semantic analysis are created. A user's request is compiled and transformed, in the stochastically indexed form, into a plurality of new requests that are equivalent to the original request, and stochastically indexed fragments of textual documents that comprise all word combinations of the transformed request are selected. A stochastically indexed semantic structure is generated from the selected fragments and, based on said structure, a brief reply of the system is generated by means of logical conclusion. Relevancy of the obtained brief reply is checked by generating an interrogative sentence based on said reply and comparing said sentence with the request. When the user's request is identical to the obtained interrogative sentence, the decision is made that the brief reply of the system is relevant to the request, and the reply is submitted to the user.
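The abstract does not say how a stochastic index is computed. As a rough illustration only, the sketch below assumes a fixed-width binary code derived from a hash of the normalized word combination; the hash function, the 64-bit width, and the names `stochastic_index` and `index_fragment` are hypothetical stand-ins and are not taken from the patent.

```python
import hashlib

INDEX_BITS = 64  # assumed index width; the source does not specify one


def stochastic_index(word_combination: str) -> int:
    """Map a word combination to a fixed-width pseudo-random binary index.

    A hash is used here only as a stand-in for the patent's indexing scheme.
    """
    normalized = " ".join(word_combination.lower().split())
    digest = hashlib.blake2b(
        normalized.encode("utf-8"), digest_size=INDEX_BITS // 8
    ).digest()
    return int.from_bytes(digest, "big")


def index_fragment(fragment: str) -> dict[int, str]:
    """Build a table of indices for the words and adjacent word pairs of a fragment."""
    words = fragment.lower().split()
    combinations = words + [" ".join(pair) for pair in zip(words, words[1:])]
    return {stochastic_index(c): c for c in combinations}


table = index_fragment("stochastic indexing of textual documents")
for index, combination in table.items():
    # Print each index as a string of binary signals next to its word combination.
    print(f"{index:0{INDEX_BITS}b}  {combination}")
```

Because the index depends only on the normalized text, the same word combination receives the same index wherever it occurs, which is what lets indexed fragments be compared without re-reading the text.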
20 Claims
1. A method for synthesizing a self-learning system for extraction of knowledge in a given natural language from textual documents for use in search systems, comprising the following steps:
providing a self-learning mechanism in a form of a stochastically indexed artificial intelligence system, which system is based on application of unique combinations of binary signals of stochastic information indices;
automatically instructing the system on grammatical and semantic analysis rules by using equivalent transformations of stochastically indexed text fragments and a logical conclusion, and by forming linked semantic structures from said fragments and stochastically indexing them for representation in a form of production rules;
carrying out a morphological analysis and a stochastic indexing of linguistic documents in an electronic form in said language, with simultaneous automatic instructing the system on morphological analysis rules;
carrying out a morphological and a syntactical analysis, and a stochastic indexing of textual documents in the electronic form, pertaining to a given theme, in said language, with simultaneous automatic instructing the system on syntactical analysis rules;
carrying out a semantic analysis of the stochastically indexed textual documents in the electronic form, pertaining to the given theme, with simultaneous automatic instructing the system on semantic analysis rules;
forming a user's request in the given natural language and transforming it into the electronic form after stochastically indexing thereof as an interrogative sentence;
transforming the user's request in a stochastically indexed form into a set of new requests equivalent to said user's request;
carrying out a preliminary selection, based on the user's request, of stochastically indexed fragments of textual documents in the electronic form, comprising all word combinations of said new requests;
generating a stochastically indexed semantic structure from said stochastically indexed fragments of textual documents;
based on said structure, generating a brief reply from the system by the logical conclusion providing a link between stochastically indexed fragments of textual documents, and equivalent transformation of texts;
checking a relevancy of said brief reply to the user's request by generating an interrogative sentence from said brief reply, and comparing the generated interrogative sentence with the user's request;
wherein when the generated interrogative sentence is identical to the user's request, confirming the relevancy of said brief reply to the user's request, and presenting said brief reply to the user in the given natural language.
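The preliminary selection step of claim 1 can be sketched as a set-containment test over indices. The sketch below is an assumption-laden illustration, not the patented procedure: it treats a word combination as a run of up to three adjacent words, uses a hash as a stand-in index, and reads "comprising all word combinations of said new requests" as "containing every word combination of at least one equivalent request". All function names are hypothetical.

```python
import hashlib

MAX_COMBINATION_LENGTH = 3  # assumed upper bound on words per combination


def index_of(combination: str) -> int:
    """Stand-in index: a 64-bit hash of the word combination."""
    digest = hashlib.blake2b(combination.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and strip surrounding punctuation from each word."""
    stripped = (w.strip(".,;:?!").lower() for w in text.split())
    return [w for w in stripped if w]


def fragment_indices(fragment: str) -> set[int]:
    """Indices of every run of 1..MAX_COMBINATION_LENGTH adjacent words."""
    words = tokenize(fragment)
    return {
        index_of(" ".join(words[i:i + n]))
        for n in range(1, MAX_COMBINATION_LENGTH + 1)
        for i in range(len(words) - n + 1)
    }


def preselect(fragments: list[str], equivalent_requests: list[list[str]]) -> list[str]:
    """Keep fragments that contain every word combination of some equivalent request."""
    request_sets = [
        {index_of(" ".join(tokenize(c))) for c in request}
        for request in equivalent_requests
    ]
    return [
        f for f in fragments
        if any(rs <= fragment_indices(f) for rs in request_sets)
    ]


fragments = [
    "Paris is the capital of France.",
    "France borders Spain.",
    "The capital city of France is Paris.",
]
# Two hypothetical equivalent forms of the same request, as word combinations.
equivalent_requests = [["capital of France"], ["capital city", "France"]]
selected = preselect(fragments, equivalent_requests)
print(selected)
```

In this example the first fragment matches the first equivalent request and the third fragment matches the second, while the second fragment matches neither and is dropped.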
2. A method for synthesizing a self-learning system for extraction of knowledge in any given natural language from textual documents for use in search systems, comprising the following steps:
providing a self-learning mechanism in a form of a stochastically indexed artificial intelligence system, which system is based on application of unique combinations of binary signals of stochastic information indices for stochastic indexing and search for linguistic text fragments in a given base language, comprising description of grammatical and semantic analysis procedures, and automatically instructing the system on grammatical and semantic analysis rules by using equivalent transformations of stochastically indexed linguistic text fragments and a logical conclusion, and by forming linked semantic structures from said fragments and stochastically indexing said structures for representation in a form of production rules;
carrying out a morphological analysis and a stochastic indexing of linguistic documents in an electronic form in the given base language, with simultaneous automatic instructing the system on morphological analysis rules, building a database of stochastically indexed dictionaries and tables of linguistic text indices for each given foreign language, and a knowledge base of morphological analysis, containing production rules for the base language and each given foreign language;
carrying out a morphological and a syntactical analysis, and a stochastic indexing of textual documents in the electronic form, on a given theme, in each given foreign language, from the search system, representing said documents as tables of indices of textual documents and storing said documents in bases of stochastically indexed texts, with simultaneous automatic instructing the system on syntactical analysis rules using the stochastically indexed linguistic texts in the base language, and building a knowledge base of syntactical analysis for the base language and each given foreign language;
carrying out a semantic analysis of said stochastically indexed textual documents in the electronic form, on the given theme, with simultaneous automatic instructing the system on semantic analysis rules, and building a knowledge base of semantic analysis for the base language and each given foreign language;
forming a user's request in a natural foreign language and transforming it into the electronic form after the stochastic indexing thereof as an interrogative sentence including an interrogative word combination and word combinations determining semantics of the user's request;
transforming the user's request in a stochastically indexed form into a set of new requests equivalent to said user's request;
carrying out a preliminary selection, based on the user's request, of stochastically indexed fragments of textual documents in the electronic form, comprising all word combinations of said new requests;
generating a stochastically indexed semantic structure from said stochastically indexed fragments of textual documents;
based on said structure, generating a brief reply from the system by the logical conclusion providing a link between stochastically indexed fragments of textual documents, and equivalent transformation of the text, which reply contains stochastically indexed word combinations defining the user request semantics, and a reply word group, corresponding to the interrogative word combination of the user request;
checking a relevancy of said brief reply to the user's request by replacing the reply word group by the corresponding stochastically indexed interrogative word combination, and comparing a generated interrogative sentence with the user's request;
wherein when the generated interrogative sentence is identical to the user's request, confirming the relevancy of said brief reply to the user's request, and presenting said brief reply to the user in the given foreign language.
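The relevancy check at the end of claim 2 replaces the reply word group with the interrogative word combination and compares the result with the request. The sketch below illustrates that substitution on plain normalized words; the claim performs the comparison on stochastically indexed forms, so working on words is a simplifying assumption, and the function names are hypothetical.

```python
def normalize(sentence: str) -> list[str]:
    """Lowercase the sentence and strip surrounding punctuation from each word."""
    stripped = (w.strip(".,;:?!").lower() for w in sentence.split())
    return [w for w in stripped if w]


def generate_question(reply: str, reply_word_group: str, interrogative: str) -> list[str]:
    """Replace the reply word group in the reply with the interrogative word combination."""
    reply_words = normalize(reply)
    group = normalize(reply_word_group)
    question_words = normalize(interrogative)
    for i in range(len(reply_words) - len(group) + 1):
        if reply_words[i:i + len(group)] == group:
            return reply_words[:i] + question_words + reply_words[i + len(group):]
    raise ValueError("reply word group not found in reply")


def is_relevant(request: str, reply: str, reply_word_group: str, interrogative: str) -> bool:
    """Confirm relevancy only when the generated question is identical to the request."""
    return generate_question(reply, reply_word_group, interrogative) == normalize(request)


request = "What is the capital of France?"
print(is_relevant(request, "Paris is the capital of France.", "Paris", "What"))
print(is_relevant(request, "Madrid is the capital of Spain.", "Madrid", "What"))
```

The first call confirms the reply because the regenerated question matches the request word for word; the second rejects it because the regenerated question asks about a different country. An exact-identity test of this kind is strict by design, which is why the claim first expands the request into a set of equivalent requests.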
Specification