Method and system for semantic searching
Abstract
A method comprising a preliminary automated analysis of at least one corpus of natural language text is disclosed. For each sentence of a corpus, the method includes performing a syntactic analysis using linguistic descriptions to generate at least one syntactic structure for the sentence, building a semantic structure for the sentence, associating each generated syntactic and semantic structure with the sentence, and saving each structure. For each corpus text that has been preliminarily analyzed, the method performs an indexing operation that indexes the lexical meanings and the values of linguistic parameters of each syntactic structure and each semantic structure associated with sentences in the corpus text. A semantic search then looks, in at least one automatically pre-analyzed corpus, for sentences comprising the searched values of linguistic, syntactic, and semantic parameters. Because the corpus undergoes deep semantic analysis, the search may be executed in various languages, in resources of various languages, and in the text of corpora of various languages, regardless of the language of the query.
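The "associate and save" step described in the abstract can be pictured as one record per sentence. The following is a minimal sketch, not the patented implementation: the class names `SemanticStructure`, `AnalyzedSentence`, and `StructureStore`, the sentence identifier format, and all field types are assumptions made for illustration. Only the four semantic-structure element names (semantic classes, semantemes, deep slots, non-tree links) come from the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SemanticStructure:
    # Element names follow the claims; the concrete types are assumptions.
    semantic_classes: List[str] = field(default_factory=list)
    semantemes: List[str] = field(default_factory=list)
    deep_slots: Dict[str, str] = field(default_factory=dict)
    non_tree_links: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class AnalyzedSentence:
    text: str
    language: str
    syntactic_structure: Dict[str, str]  # hypothetical: parameter -> value
    semantic_structure: SemanticStructure


class StructureStore:
    """Hypothetical in-memory store that saves the structures per sentence."""

    def __init__(self):
        self._records = {}

    def save(self, sentence_id, record):
        self._records[sentence_id] = record

    def get(self, sentence_id):
        return self._records[sentence_id]


# Hand-written example values; a real analyzer would produce these.
store = StructureStore()
store.save(
    "doc1:0",
    AnalyzedSentence(
        text="The boy reads a book.",
        language="en",
        syntactic_structure={"voice": "active"},
        semantic_structure=SemanticStructure(
            semantic_classes=["BOY", "TO_READ", "BOOK"],
            semantemes=["Present"],
            deep_slots={"Agent": "BOY", "Object": "BOOK"},
        ),
    ),
)
```

Keeping the language-specific syntactic structure and the language-independent semantic structure side by side in one record is a design choice of this sketch; it simply mirrors the wording that both structures are associated with the sentence.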
46 Claims
1. A computer-implemented method for facilitating a semantic search, the method comprising:
identifying a corpora of natural language texts including a plurality of sentences;
performing a syntactic-semantic analysis on each sentence of the plurality of sentences using a linguistic description associated with a language of the sentence, wherein the syntactic-semantic analysis comprises:
generating a graph of generalized constituents for each sentence of the plurality of sentences; and
generating one or more syntactic trees based on the graphs of generalized constituents to represent the corresponding sentences;
generating at least one syntactic structure for each sentence of the plurality of sentences by selecting a best syntactic tree from the generated one or more syntactic trees to represent the at least one syntactic structure of the sentence;
generating a semantic structure for each sentence of the corpora of natural language texts, based on the generated at least one syntactic structure of the sentence, wherein the semantic structure is language independent and wherein the semantic structure comprises semantic classes, semantemes, deep slots, and non-tree links;
associating the generated syntactic structures and the generated semantic structures with the respective sentences;
creating syntactic index for each meaning of at least one linguistic parameter of each of the generated syntactic structures;
creating a semantic index for each meaning of at least one parameter of the semantic structures;
receiving a search query comprising semantic language-independent terms;
searching the semantic index based on the semantic language-independent terms and the language-independent semantic structures; and
receiving semantic search results from the semantic index, wherein the search results from the corpora of natural language texts includes sentences in different languages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

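The last three steps of claim 1 (a query in language-independent terms, a lookup in the semantic index, results in different languages) can be illustrated with a small inverted index. This is a sketch under stated assumptions: the corpus entries, the semantic class labels such as `TO_READ`, and the functions `build_semantic_index` and `semantic_search` are hypothetical, and the analysis that assigns classes to sentences is taken as already done.

```python
from collections import defaultdict

# Hypothetical pre-analyzed corpus: id -> (language, text, semantic classes).
# The class labels stand in for the language-independent semantic classes.
CORPUS = {
    "en:1": ("en", "The boy reads a book.", {"BOY", "TO_READ", "BOOK"}),
    "de:1": ("de", "Der Junge liest ein Buch.", {"BOY", "TO_READ", "BOOK"}),
    "en:2": ("en", "The dog sleeps.", {"DOG", "TO_SLEEP"}),
}


def build_semantic_index(corpus):
    """Map each semantic class to the ids of the sentences that contain it."""
    index = defaultdict(set)
    for sentence_id, (_lang, _text, classes) in corpus.items():
        for semantic_class in classes:
            index[semantic_class].add(sentence_id)
    return index


def semantic_search(index, query_classes):
    """Return ids of sentences that contain every class in the query."""
    postings = [index.get(c, set()) for c in query_classes]
    if not postings:
        return set()
    return set.intersection(*postings)


index = build_semantic_index(CORPUS)
hits = semantic_search(index, ["TO_READ", "BOOK"])
languages = {CORPUS[h][0] for h in hits}
```

Because the index keys are language-independent, the same query matches the English and the German sentence without any translation of the query itself.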
15. A computer-implemented method for providing a result of a search, the method comprising:
for each sentence of a corpus of texts, the corpus of texts including a plurality of sentences, generating at least one syntactic structure for each sentence of the plurality of sentences, using at least one linguistic description associated with a source natural language by performing a syntactic-semantic analysis of the sentences comprising:
generating graph of generalized constituents for each sentence of the plurality of sentences;
generating one or more syntactic trees based on the graphs of generalized constituents to represent the corresponding sentences; and
selecting a best syntactic tree from the generated one or more syntactic trees to represent the at least one syntactic structure;
building a language-independent semantic structure for each said sentence based on the at least one syntactic structure of the sentence, wherein the semantic structure comprises semantic classes, semantemes, deep slots, and non-tree links;
associating each generated syntactic structure and each language independent semantic structure with a respective sentence;
indexing at least one meaning of linguistic parameters associated with each sentence;
indexing at least one lexical meaning associated with each lexical unit of each sentence;
indexing at least one value associated with linguistic parameters related to a syntactic structure of each sentence;
indexing at least one value associated with semantic parameters related to the language-independent semantic structure of each sentence; and
receiving a search query comprising semantic language-independent terms;
searching the index of at least one value associated with semantic parameters based on the semantic language-independent terms and the language-independent semantic structures; and
receiving semantic search results from the index of at least one value associated with semantic parameters, wherein the search results include sentences from the corpus of text in different languages.
Dependent claims: 16, 17, 18

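Claim 15 names four separate indexing steps. One way to picture them is as four inverted indexes built in a single pass over the analyzed sentences. The sketch below is illustrative only: the index names in `INDEX_KINDS`, the `(parameter, value)` key format, the `word:CLASS` notation for lexical meanings, and the sample data are all assumptions, not details taken from the patent.

```python
from collections import defaultdict

# Assumed names for the four indexing steps listed in claim 15.
INDEX_KINDS = ("linguistic", "lexical", "syntactic", "semantic")

# Hypothetical analysis output: sentence id -> features per index kind.
ANALYZED = {
    "en:1": {
        "linguistic": [("part_of_speech", "Noun"), ("part_of_speech", "Verb")],
        "lexical": ["boy:BOY", "read:TO_READ"],
        "syntactic": [("surface_slot", "Subject")],
        "semantic": [("deep_slot", "Agent"), ("semantic_class", "TO_READ")],
    },
    "ru:1": {
        "lexical": ["мальчик:BOY", "читать:TO_READ"],
        "semantic": [("deep_slot", "Agent"), ("semantic_class", "TO_READ")],
    },
}


def build_indexes(analyzed):
    """Build one inverted index (key -> sentence ids) per index kind."""
    indexes = {kind: defaultdict(set) for kind in INDEX_KINDS}
    for sentence_id, features in analyzed.items():
        for kind in INDEX_KINDS:
            # A sentence may lack features of a given kind in this sketch.
            for key in features.get(kind, ()):
                indexes[kind][key].add(sentence_id)
    return indexes


indexes = build_indexes(ANALYZED)
```

In this arrangement only the `semantic` index is shared across languages, which is why the search steps of the claim are directed at the index of semantic parameter values.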
19. A system for facilitating a semantic search, the system comprising:
a first processor and a computer readable memory;
a corpus of natural language texts including a plurality of sentences;
an analyzer in the computer readable memory configured to:
perform a syntactic-semantic analysis on each sentence of the plurality of sentences using a linguistic description associated with a language of the sentence wherein the syntactic-semantic analysis comprises:
generating a graph of generalized constituents for each sentence of the plurality of sentences; and
generating one or more syntactic trees based on the graphs of generalized constituents to represent the corresponding sentences;
generating at least one syntactic structure for each sentence of the plurality of sentences by selecting a best syntactic tree from the generated one or more syntactic trees to represent the at least one syntactic structure of the sentence;
generating a semantic structure for each sentence of the corpus of natural language texts, based on the generated at least one syntactic structure of the sentence, wherein the semantic structure is language independent and wherein the semantic structure comprises semantic classes, semantemes, deep slots, and non-tree links; and
associate the generated syntactic structures and the generated semantic structures with the respective sentences;
an index generation component configured to:
create a syntactic index for each meaning of at least one linguistic parameter of the generated syntactic structures; and
create a semantic index for each meaning of at least one parameter of the language independent semantic structures; and
the first processor configured to:
receive a search query comprising semantic language-independent terms;
search the semantic index of at least one value associated with semantic parameters based on the semantic language-independent terms and the language-independent semantic structures; and
receive semantic search results from the semantic index of at least one value associated with semantic parameters, wherein the search results include sentences from the corpus of text in different languages.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32

33. One or more non-transitory computer-readable media having computer-executable instructions, which when executed perform steps, comprising:
instructions for identifying a corpus of natural language texts including a plurality of sentences;
instructions for performing a syntactic-semantic analysis on each sentence of the plurality of sentences using a linguistic description associated with a language of the sentence, wherein the syntactic-semantic analysis comprises:
generating graph of generalized constituents for each sentence of the plurality of sentences; and
generating one or more syntactic trees based on the graphs of generalized constituents to represent the corresponding sentences;
generating at least one syntactic structure for each sentence of the plurality of sentences by selecting a best syntactic tree from the generated one or more syntactic trees to represent the at least one syntactic structure of the sentence;
generating a semantic structure for each sentence of the corpus of natural language texts, wherein the semantic structure is language independent and wherein the semantic structure comprises semantic classes, semantemes, deep slots, and non-tree links;
instructions for associating the language independent semantic structure with a respective sentence;
instructions for creating a syntactic index for each meaning of at least one linguistic parameter of the generated syntactic structures;
instructions for creating a semantic index for each meaning of at least one parameter of the language independent semantic structures;
instructions for receiving a search query comprising semantic language-independent terms;
instructions for searching the semantic index of at least one value associated with semantic parameters based on the semantic language-independent terms and the language-independent semantic structures; and
instructions for receiving semantic search results from the semantic index of at least one value associated with semantic parameters, wherein the search results include sentences from the corpus of text in different languages.
Dependent claims: 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46

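Every independent claim includes the step of selecting a best syntactic tree from the one or more candidate trees. The claims quoted here do not say how "best" is determined, so the sketch below simply assumes each candidate carries a numeric rating and picks the highest one. The `SyntacticTree` class, its `rating` field, the bracketed tree notation, and the example ratings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyntacticTree:
    description: str  # hypothetical bracketed rendering of the tree
    rating: float     # assumed score; the claims do not define how it is computed


def select_best_tree(candidates: List[SyntacticTree]) -> SyntacticTree:
    """Return the candidate with the highest rating (assumed criterion)."""
    if not candidates:
        raise ValueError("at least one syntactic tree is required")
    return max(candidates, key=lambda tree: tree.rating)


# Two readings of an ambiguous prepositional-phrase attachment,
# with made-up ratings.
candidates = [
    SyntacticTree("[saw [the man] [with the telescope]]", rating=0.62),
    SyntacticTree("[saw [the man [with the telescope]]]", rating=0.38),
]
best = select_best_tree(candidates)
```

The selected tree would then serve as the syntactic structure from which the semantic structure is built.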
Specification