Language ambiguity detection of text
Abstract
Disclosed are systems, computer-readable mediums, and methods for detecting language ambiguity. A sentence is analyzed to determine syntactic relationships among generalized constituents of the sentence. A graph of the generalized constituents is formed based on syntactic relationships and a lexical-morphological structure of the sentence. The graph is analyzed to determine a plurality of syntactic structures of the sentence. Each syntactic structure is rated on its probability that the syntactic structure is an accurate hypothesis about a full syntactic structure of the sentence. Semantic structures corresponding to the syntactic structures are determined. A first semantic structure and a second semantic structure of the semantic structures are selected, where the first and second semantic structures are different and each have a corresponding syntactic structure having a rating of at least a threshold value. A semantic ambiguity in the sentence is determined based on a difference between the first and second semantic structures.
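The selection step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Candidate` type, the `find_ambiguous_pairs` helper, the example ratings, and the use of a plain set of class labels as a stand-in for a semantic structure are all assumptions made for illustration. The sketch keeps only candidates whose syntactic rating is at least the threshold and reports every pair whose semantic structures differ.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    """One syntactic-structure hypothesis (hypothetical type, not from the patent)."""
    rating: float                 # rating of the syntactic structure
    semantic_classes: frozenset   # simplified stand-in for its semantic structure


def find_ambiguous_pairs(candidates, rating_threshold):
    """Return pairs of differing semantic structures whose ratings meet the threshold."""
    plausible = [c for c in candidates if c.rating >= rating_threshold]
    return [
        (a, b)
        for a, b in combinations(plausible, 2)
        if a.semantic_classes != b.semantic_classes
    ]


# Invented ratings and class labels for a sentence with two plausible readings.
candidates = [
    Candidate(0.48, frozenset({"SEE", "PERSON", "TELESCOPE_AS_INSTRUMENT"})),
    Candidate(0.45, frozenset({"SEE", "PERSON", "TELESCOPE_AS_POSSESSION"})),
    Candidate(0.05, frozenset({"SAW_TOOL", "PERSON"})),
]

pairs = find_ambiguous_pairs(candidates, rating_threshold=0.3)
is_ambiguous = len(pairs) > 0
```

Here the low-rated third hypothesis is discarded before comparison, so only the two high-rated readings can trigger an ambiguity.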
17 Claims
1. A method of detecting a semantic ambiguity, comprising:
analyzing, using one or more processors, a first sentence of a first natural language text to identify syntactic relationships among constituents of the first sentence;
forming a graph of the constituents of the first sentence based on the syntactic relationships and a lexical-morphological structure of the first sentence;
analyzing the graph to produce a plurality of syntactic structures representing the first sentence;
determining semantic structures corresponding to the syntactic structures;
selecting a first and a second semantic structure among the semantic structures, wherein each of the first and second semantic structures is associated with a corresponding syntactic structure having a rating exceeding a threshold value;
determining a difference between the first and second semantic structures by computing a function of a sum of differences of pairs of semantic classes, such that a first semantic class of each pair belongs to the first semantic structure and a second semantic class of the pair belongs to the second semantic structure;
identifying a semantic ambiguity in the first sentence based on the difference between the first and second semantic structures;
generating a first translation of the first sentence and a second translation of the first sentence based on the semantic ambiguity; and
presenting the first translation and the second translation via a user interface.
2. The method of claim 1, wherein the semantic structures are language-independent and represent a meaning of the first sentence.
3. The method of claim 1, further comprising:
identifying, in a second natural language text, a second sentence corresponding to the first sentence of the first natural language text, wherein the second natural language text is a language translation of the first natural language text;
determining semantic structures corresponding to the second sentence;
determining a difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence; and
identifying a semantic ambiguity in the first sentence based on the difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence.
4. The method of claim 1, further comprising visually marking the semantic ambiguity in the sentence using a graphical user interface.
5. The method of claim 4, wherein visually marking the semantic ambiguity in the sentence includes showing at least one of a specific word, sentence, phrase, or paragraph associated with the semantic ambiguity.
6. The method of claim 1, wherein nodes of the graph store alternative lexical values for words of the first sentence, and wherein edges of the graph express relationships between the lexical values.
7. The method of claim 1, wherein the graph represents the first sentence in its entirety.
8. A system for detecting semantic ambiguity, the system comprising one or more processors configured to:
analyze a first sentence of a first natural language text to determine syntactic relationships among constituents of the sentence;
form a graph of the constituents of the first sentence based on the syntactic relationships and a lexical-morphological structure of the first sentence;
analyze the graph to produce a plurality of syntactic structures representing the first sentence;
associate each syntactic structure of the plurality of syntactic structures with a rating indicative of a probability of the syntactic structure representing a full syntactic structure of the first sentence;
determine semantic structures corresponding to the syntactic structures;
select a first and a second semantic structure among the semantic structures, wherein each of the first and second semantic structures is associated with a corresponding syntactic structure having a rating exceeding a threshold value;
determine a difference between the first and second semantic structures by computing a function of a sum of differences of pairs of semantic classes, such that a first semantic class of each pair belongs to the first semantic structure and a second semantic class of the pair belongs to the second semantic structure; and
identify a semantic ambiguity in the first sentence based on the difference between the first and second semantic structures;
generate a first translation of the first sentence and a second translation of the first sentence based on the semantic ambiguity; and
present the first translation and the second translation via a user interface.
9. The system of claim 8, wherein the semantic structures are language-independent and represent a meaning of the first sentence.
10. The system of claim 8, wherein the one or more processors are further configured to:
identify, in a second natural language text, a second sentence corresponding to the first sentence of the first natural language text, wherein the second natural language text is a language translation of the first natural language text;
determine semantic structures corresponding to the sentence of the second text;
determine a difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence; and
identify a semantic ambiguity in the first sentence based on the determined difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence.
11. The system of claim 8, wherein the one or more processors are further configured to cause a graphical user interface to visually mark the semantic ambiguity in the sentence.
12. The system of claim 8, wherein nodes of the graph store alternative lexical values for words of the first sentence, and wherein edges of the graph express relationships between the lexical values.
13. A non-transitory computer-readable storage medium having instructions stored thereon, that when executed by a processor, cause the processor to:
analyze a first sentence of a first natural language text to identify syntactic relationships among constituents of the first sentence;
form a graph of the constituents of the first sentence based on the syntactic relationships and a lexical-morphological structure of the first sentence;
instructions to analyze the graph to produce a plurality of syntactic structures representing the first sentence;
associate each syntactic structure of the plurality of syntactic structures with a rating indicative of a probability of the syntactic structure representing a full syntactic structure of the first sentence;
determine semantic structures corresponding to the syntactic structures;
select a first and a second semantic structure among the semantic structures, wherein each of the first and second semantic structures is associated with a corresponding syntactic structure having a rating exceeding a threshold value;
determine a difference between the first and second semantic structures by computing a function of a sum of differences of pairs of semantic classes, such that a first semantic class of each pair belongs to the first semantic structure and a second semantic class of the pair belongs to the second semantic structure; and
identify a semantic ambiguity in the first sentence based on the difference between the first and second semantic structures;
generate a first translation of the first sentence and a second translation of the first sentence based on the semantic ambiguity; and
present the first translation and the second translation via a user interface.
14. The non-transitory computer-readable storage medium of claim 13, further comprising executable instructions that when executed by the processor, cause the processor to:
identify, in a second natural language text, a second sentence corresponding to the first sentence of the first natural language text, wherein the second natural language text is a language translation of the first natural language text;
determine semantic structures corresponding to the second sentence;
determine a difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence; and
identify a semantic ambiguity in the first sentence based on the determined difference between the semantic structures corresponding to the first sentence and the semantic structures corresponding to the second sentence.
15. The method of claim 1, wherein the sum of differences of pairs of semantic classes is normalized by a product of a first cardinality of a first set of semantic classes of the first semantic structure and a second cardinality of a second set of semantic classes of the second semantic structure.
16. The method of claim 15, wherein the first cardinality is different from the second cardinality.
17. The method of claim 1, wherein the rating is indicative of a probability of the selected syntactic structure representing the first sentence.
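The difference computation recited in claims 1 and 15 can be sketched in Python as follows. This is one possible reading, not the claimed implementation: the claims do not define the per-pair difference between two semantic classes, so the 0/1 `class_difference` below is a hypothetical placeholder (a real system might use a distance within a semantic hierarchy). Pairing every class of the first structure with every class of the second is likewise an assumption, chosen because it matches normalization by the product of the two cardinalities.

```python
from itertools import product


def class_difference(a: str, b: str) -> float:
    """Hypothetical per-pair difference: 0 for identical classes, 1 otherwise."""
    return 0.0 if a == b else 1.0


def structure_difference(first: set, second: set) -> float:
    """Sum per-pair differences over all (first, second) class pairs,
    then normalize by |first| * |second| as in claim 15."""
    if not first or not second:
        raise ValueError("both structures need at least one semantic class")
    total = sum(class_difference(a, b) for a, b in product(first, second))
    return total / (len(first) * len(second))


# Invented class labels; the two sets have different cardinalities (claim 16).
first_structure = {"SEE", "PERSON", "INSTRUMENT"}
second_structure = {"SEE", "PERSON"}
difference = structure_difference(first_structure, second_structure)
```

With these inputs, 4 of the 6 class pairs differ, so the normalized difference is 4/6. Under this all-pairs scheme, two identical structures with more than one class still score above zero, so any ambiguity threshold applied to the result would have to account for that.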
Specification