Cross-lingual information extraction program
First Claim
1. A method for constructing a cross-lingual information extraction program, the method comprising:
- utilizing at least one processor to execute computer code that performs the steps of:
constructing a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser;
mapping the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation is language independent, language-invariant, and encompasses the plurality of languages, wherein the cross-lingual semantic representation comprises a graphical representation that identifies the semantic relationship between words contained within a text without regard for language of the text and comprises semantic and syntactic components;
constructing the cross-lingual information extraction program based on the cross-lingual semantic representation, wherein the cross-lingual information extraction program comprises a set of rules created from the cross-lingual semantic representation and wherein the cross-lingual information extraction program is language independent and facilitates extraction of structured information from unstructured text different from the text expressed in a plurality of languages; and
extracting structured information from texts in a language using the constructed cross-lingual information extraction program.
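The four steps of the claim can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the toy parsers `parse_en` and `parse_de`, the label inventories in `PREDICATE_MAP` and `ROLE_MAP`, and the rule format. The `Frame` type is a deliberately reduced stand-in for the claimed graphical representation and carries only a predicate and its role-labelled arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Simplified cross-lingual node: a predicate plus role-labelled arguments."""
    predicate: str   # language-neutral predicate label (hypothetical inventory)
    roles: tuple     # sorted tuple of (role, surface text) pairs


# Step 1: toy language-specific parsers (hypothetical stand-ins).
# Each returns (language-specific predicate, {language-specific role: text}).
def parse_en(text):
    subj, obj = text.rstrip(".").split(" acquired ")
    return ("acquire.01", {"A0": subj, "A1": obj})


def parse_de(text):
    subj, obj = text.rstrip(".").split(" übernahm ")
    return ("übernehmen.1", {"Subj": subj, "Obj": obj})


# Step 2: map language-specific labels onto one shared inventory (assumed).
PREDICATE_MAP = {"acquire.01": "ACQUIRE", "übernehmen.1": "ACQUIRE"}
ROLE_MAP = {"A0": "AGENT", "A1": "THEME", "Subj": "AGENT", "Obj": "THEME"}


def to_cross_lingual(parsed):
    predicate, args = parsed
    roles = tuple(sorted((ROLE_MAP[r], t) for r, t in args.items()))
    return Frame(PREDICATE_MAP[predicate], roles)


# Step 3: the "program" is a set of rules written only against shared labels.
RULES = [
    {"predicate": "ACQUIRE", "fields": {"buyer": "AGENT", "target": "THEME"}},
]


# Step 4: apply the rules to a frame, whatever language it came from.
def extract(frame, rules=RULES):
    role_text = dict(frame.roles)
    for rule in rules:
        if rule["predicate"] != frame.predicate:
            continue
        if all(role in role_text for role in rule["fields"].values()):
            return {name: role_text[role] for name, role in rule["fields"].items()}
    return None


english = extract(to_cross_lingual(parse_en("Acme acquired Initech.")))
german = extract(to_cross_lingual(parse_de("Acme übernahm Initech.")))
```

In this sketch the rule never mentions English or German labels, so the same rule yields the same record for both sentences. A realistic system would replace the string-splitting parsers with actual semantic parsers and the lookup tables with a learned or curated mapping.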
Abstract
One embodiment provides a method for constructing a cross-lingual information extraction program, the method including: utilizing at least one processor to execute computer code that performs the steps of: constructing a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser; mapping the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation encompasses the plurality of languages; and constructing the cross-lingual information extraction program based on the cross-lingual semantic representation. Other aspects are described and claimed.
20 Claims
1. A method for constructing a cross-lingual information extraction program, the method comprising:
- utilizing at least one processor to execute computer code that performs the steps of:
constructing a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser;
mapping the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation is language independent, language-invariant, and encompasses the plurality of languages, wherein the cross-lingual semantic representation comprises a graphical representation that identifies the semantic relationship between words contained within a text without regard for language of the text and comprises semantic and syntactic components;
constructing the cross-lingual information extraction program based on the cross-lingual semantic representation, wherein the cross-lingual information extraction program comprises a set of rules created from the cross-lingual semantic representation and wherein the cross-lingual information extraction program is language independent and facilitates extraction of structured information from unstructured text different from the text expressed in a plurality of languages; and
extracting structured information from texts in a language using the constructed cross-lingual information extraction program.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An apparatus for constructing a cross-lingual information extraction program, the apparatus comprising:
- at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code that constructs a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser;
computer readable program code that maps the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation is language independent, language-invariant, and encompasses the plurality of languages, wherein the cross-lingual semantic representation comprises a graphical representation that identifies the semantic relationship between words contained within a text without regard for language of the text and comprises semantic and syntactic components;
computer readable program code that constructs the cross-lingual information extraction program based on the cross-lingual semantic representation, wherein the cross-lingual information extraction program comprises a set of rules created from the cross-lingual semantic representation and wherein the cross-lingual information extraction program is language independent and facilitates extraction of structured information from unstructured text different from the text expressed in a plurality of languages; and
computer readable program code that extracts structured information from texts in a language using the constructed cross-lingual information extraction program.
10. A computer program product for constructing a cross-lingual information extraction program, the computer program product comprising:
- a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code that constructs a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser;
computer readable program code that maps the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation is language independent, language-invariant, and encompasses the plurality of languages, wherein the cross-lingual semantic representation comprises a graphical representation that identifies the semantic relationship between words contained within a text without regard for language of the text and comprises semantic and syntactic components;
computer readable program code that constructs the cross-lingual information extraction program based on the cross-lingual semantic representation, wherein the cross-lingual information extraction program comprises a set of rules created from the cross-lingual semantic representation and wherein the cross-lingual information extraction program is language independent and facilitates extraction of structured information from unstructured text different from the text expressed in a plurality of languages; and
computer readable program code that extracts structured information from texts in a language using the constructed cross-lingual information extraction program.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
18. A method for creating a cross-lingual information extraction program, the method comprising:
- receiving a plurality of phrases, wherein the plurality of phrases comprises phrases each of which is expressed in more than one language;
parsing, using a language-specific semantic parser, each of the plurality of languages;
constructing a plurality of language-specific representations of each of the parsed plurality of languages;
mapping the plurality of language-specific representations to a language-invariant representation, wherein the language-invariant representation is language independent, encompasses the plurality of languages and comprises a graphical representation that identifies the semantic relationship between words contained within a phrase without regard for language of the phrase and comprises semantic and syntactic components;
creating the cross-lingual information extraction program using the language-invariant representation, wherein the cross-lingual information extraction program comprises a set of rules created from the language-invariant representation and wherein the cross-lingual information extraction program is language independent and facilitates extraction of structured information from unstructured text different from the text expressed in a plurality of languages; and
extracting structured information from texts in a language using the constructed cross-lingual information extraction program.
Dependent claims: 19, 20
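Claim 18 starts from phrases that are each available in more than one language. One way to picture rule creation from such input is sketched below, under loud assumptions: the `PARSES` table stands in for the output of real language-specific parsers, `CONCEPTS` and `RELATIONS` are hypothetical mapping tables, and deriving a rule by intersecting the graphs of the translations is a technique chosen for this illustration, not one the claim specifies.

```python
# Hypothetical pre-computed parser output, keyed by (language, phrase).
# Each edge is (head lemma, relation, dependent) in that language's own labels.
PARSES = {
    ("en", "Alice founded Acme"): [("found", "nsubj", "Alice"), ("found", "obj", "Acme")],
    ("es", "Alice fundó Acme"): [("fundar", "suj", "Alice"), ("fundar", "cd", "Acme")],
    ("es", "Bob fundó Initech"): [("fundar", "suj", "Bob"), ("fundar", "cd", "Initech")],
}

# Assumed mappings from language-specific labels to shared ones.
CONCEPTS = {"found": "FOUND", "fundar": "FOUND"}
RELATIONS = {"nsubj": "AGENT", "suj": "AGENT", "obj": "THEME", "cd": "THEME"}


def to_invariant(lang, phrase):
    """Rewrite a language-specific edge list as language-invariant edges."""
    return frozenset(
        (CONCEPTS[head], RELATIONS[rel], dep) for head, rel, dep in PARSES[(lang, phrase)]
    )


def create_rule(translations):
    """Keep the (concept, relation) pairs shared by every translation of a phrase."""
    graphs = [to_invariant(lang, phrase) for lang, phrase in translations]
    shared = set.intersection(*({(c, r) for c, r, _ in g} for g in graphs))
    return sorted(shared)


def apply_rule(rule, lang, phrase):
    """Return {relation: filler} if the phrase's graph satisfies the rule, else None."""
    fillers = {(c, r): dep for c, r, dep in to_invariant(lang, phrase)}
    if all(key in fillers for key in rule):
        return {r: fillers[(c, r)] for c, r in rule}
    return None


rule = create_rule([("en", "Alice founded Acme"), ("es", "Alice fundó Acme")])
record = apply_rule(rule, "es", "Bob fundó Initech")
```

The rule is built from one phrase in two languages and then applied to a different sentence, which mirrors the claim's requirement that extraction work on text other than the text used to build the program.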
Specification