Method and system for the representation of multiple analyses in dependency grammar and parser for generating such representation
Abstract
Method for unambiguously coding multiple parsing analyses of a natural language word sequence in dependency grammar in which dependencies are defined between pairs of words, each pair consisting of a superordinate word or governor and a thereto related word or dependent. For each word in the sequence a word index is determined, representing the rank of order of said word in the sequence. All possible dependents of each word are determined as well as the relation between the word and the dependents using a parsing algorithm in combination with a grammar and a dictionary in which all words of the language are stored together with their syntactic interpretation and an interpretation index, representing the rank order of the syntactic interpretation of the word in the dictionary in order to distinguish between multiple syntactic interpretations of said word. A syntactic network is determined which is represented as a tree consisting of nodes mutually coupled by edges and comprising at least one top node, one or more terminal nodes and eventually a number of intermediate nodes, each node being interpreted as an exclusive OR node serving as a pointer if there is only one alternative and serving as a choice point if there are several alternatives, whereby each of the pointer nodes is assigned to a word of the sequence and each edge is assigned to the syntactic relation between the two nodes coupled by said edge, whereby each node is coded by an identifier which in case of a pointer node is directly related to the entry of a word in the dictionary and in the case of a choice point comprises a list of further identifiers one of which has to be selected.
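The network described above can be illustrated with a small sketch. This is not the patented implementation, only one possible reading of the abstract: the class names `Pointer` and `Choice`, the function `analyses`, the relation labels `SUBJ` and `OBJ`, and the interpretation index values for "time" and "flies" are all assumptions made for the example. A pointer node carries a word index and an interpretation index; a choice point holds a list of alternatives of which exactly one is selected.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Tuple, Union


@dataclass(frozen=True)
class Pointer:
    """Node with a single alternative: one dictionary reading of one word."""
    word: str
    word_index: int            # rank of the word in the sequence (1-based here)
    interpretation_index: int  # rank of the reading in the (hypothetical) dictionary
    # Each edge is a (relation label, dependent node) pair.
    dependents: Tuple[Tuple[str, "Node"], ...] = ()


@dataclass(frozen=True)
class Choice:
    """Choice point (exclusive OR): exactly one alternative has to be selected."""
    alternatives: Tuple["Node", ...]


Node = Union[Pointer, Choice]


def analyses(node: Node) -> Iterator[tuple]:
    """Yield every unambiguous dependency tree encoded by the network."""
    if isinstance(node, Choice):
        # Exclusive OR: each alternative contributes its own analyses.
        for alternative in node.alternatives:
            yield from analyses(alternative)
        return
    labels = [relation for relation, _ in node.dependents]
    options = [list(analyses(child)) for _, child in node.dependents]
    # Combine one selected analysis per dependent edge.
    for combo in product(*options):
        yield (node.word, node.word_index, node.interpretation_index,
               tuple(zip(labels, combo)))


# "time flies": assumed dictionary readings 1 = noun, 2 = verb for both words.
network = Choice((
    # Reading A: "flies" (verb) governs "time" (noun) as subject.
    Pointer("flies", 2, 2, (("SUBJ", Pointer("time", 1, 1)),)),
    # Reading B: "time" (verb) governs "flies" (noun) as object.
    Pointer("time", 1, 2, (("OBJ", Pointer("flies", 2, 1)),)),
))

for tree in analyses(network):
    print(tree)
```

The top node here is a choice point with two alternatives, so the single network stands for both analyses of the sequence. Selecting one alternative at every choice point yields one ordinary dependency tree.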
7 Claims
1. Method for unambiguously coding multiple parsing analyses of a natural language word sequence in dependency grammar in which dependencies are defined between pairs of words, each pair consisting of a superordinate word or governor and a thereto related word or dependent, said method comprising the following steps:
a) determining for each word in said natural language word sequence a word index, representing the rank of order of said word in said sequence, determining for each word all dependents thereof permitted by the grammar and determining the relation between said word and said dependents using a parsing algorithm in combination with a grammar defining all the permitted dependency relations in the language and a dictionary in which all words of the language are stored together with their syntactic interpretation and an interpretation index, representing the rank order of the syntactic interpretation of the word in the dictionary in order to distinguish between multiple syntactic interpretations of said word,
b) defining a syntactic network which is represented as a tree consisting of nodes mutually coupled by edges and comprising at least one top node, one or more terminal nodes and eventually a number of intermediate nodes, each node being interpreted as an exclusive OR node serving as a pointer if there is only one alternative and serving as a choice point if there are several alternatives, whereby each of the pointer nodes is assigned to a word of the sequence and each edge is assigned to the syntactic relation between the two nodes coupled by said edge, whereby each node is coded by an identifier which in case of a pointer node is directly related to the entry of a word in the dictionary and in the case of a choice point comprises a list of further identifiers one of which has to be selected.
Specification