Method and system for the representation of multiple analyses in dependency grammar and parser for generating such representation

US 5,060,155 A
Filed: 01/31/1990
Issued: 10/22/1991
Est. Priority Date: 02/01/1989
Status: Expired due to Fees


Abstract

Method for unambiguously coding multiple parsing analyses of a natural language word sequence in dependency grammar in which dependencies are defined between pairs of words, each pair consisting of a superordinate word or governor and a thereto related word or dependent. For each word in the sequence a word index is determined, representing the rank of order of said word in the sequence. All possible dependents of each word are determined as well as the relation between the word and the dependents using a parsing algorithm in combination with a grammar and a dictionary in which all words of the language are stored together with their syntactic interpretation and an interpretation index, representing the rank order of the syntactic interpretation of the word in the dictionary in order to distinguish between multiple syntactic interpretations of said word. A syntactic network is determined which is represented as a tree consisting of nodes mutually coupled by edges and comprising at least one top node, one or more terminal nodes and eventually a number of intermediate nodes, each node being interpreted as an exclusive OR node serving as a pointer if there is only one alternative and serving as a choice point if there are several alternatives, whereby each of the pointer nodes is assigned to a word of the sequence and each edge is assigned to the syntactic relation between the two nodes coupled by said edge, whereby each node is coded by an identifier which in case of a pointer node is directly related to the entry of a word in the dictionary and in the case of a choice point comprises a list of further identifiers one of which has to be selected.
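
The network described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names `Pointer` and `Choice`, the function `analyses`, and the relation labels `SUBJ` and `OBJ` are hypothetical names chosen for this example, and a node identifier is assumed to be a (word index, interpretation index) pair. A pointer node refers to one dictionary entry, a choice point lists alternatives of which exactly one is selected, and resolving every choice point yields one unambiguous analysis.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Tuple, Union

# Assumed identifier shape: (word index, interpretation index).
NodeId = Tuple[int, int]


@dataclass
class Pointer:
    """Exclusive-OR node with one alternative: refers to one dictionary entry."""
    node_id: NodeId
    # Each edge carries the syntactic relation to a dependent node.
    edges: List[Tuple[str, "Node"]] = field(default_factory=list)


@dataclass
class Choice:
    """Exclusive-OR node with several alternatives: exactly one is selected."""
    alternatives: List["Node"]


Node = Union[Pointer, Choice]


def analyses(node):
    """Yield every tree obtained by selecting one alternative per choice point."""
    if isinstance(node, Choice):
        for alternative in node.alternatives:
            yield from analyses(alternative)
        return
    # Pointer node: combine one resolved subtree for each outgoing edge.
    options = [[(relation, subtree) for subtree in analyses(dependent)]
               for relation, dependent in node.edges]
    for combination in product(*options):
        yield (node.node_id, list(combination))


# Word 2 governs word 1 and word 3; word 3 has two dictionary interpretations.
top = Pointer((2, 1), [
    ("SUBJ", Pointer((1, 1))),
    ("OBJ", Choice([Pointer((3, 1)), Pointer((3, 2))])),
])
for tree in analyses(top):
    print(tree)
```

The single choice point in this example produces two analyses that share the governor and the first dependent, which is the point of the representation: the shared parts are stored once and only the ambiguous part is listed as alternatives.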


7 Claims

1. Method for unambiguously coding multiple parsing analyses of a natural language word sequence in dependency grammar in which dependencies are defined between pairs of words, each pair consisting of a superordinate word or governor and a thereto related word or dependent, said method comprising the following steps:
- a) determining for each word in said natural language word sequence a word index, representing the rank of order of said word in said sequence, determining for each word all dependents thereof permitted by the grammar and determining the relation between said word and said dependents using a parsing algorithm in combination with a grammar defining all the permitted dependency relations in the language and a dictionary in which all words of the language are stored together with their syntactic interpretation and an interpretation index, representing the rank order of the syntactic interpretation of the word in the dictionary in order to distinguish between multiple syntactic interpretations of said word,
- b) defining a syntactic network which is represented as a tree consisting of nodes mutually coupled by edges and comprising at least one top node, one or more terminal nodes and eventually a number of intermediate nodes, each node being interpreted as an exclusive OR node serving as a pointer if there is only one alternative and serving as a choice point if there are several alternatives, whereby each of the pointer nodes is assigned to a word of the sequence and each edge is assigned to the syntactic relation between the two nodes coupled by said edge, whereby each node is coded by an identifier which in case of a pointer node is directly related to the entry of a word in the dictionary and in the case of a choice point comprises a list of further identifiers one of which has to be selected.
  - 2. Method according to claim 1, wherein the syntactic network comprises at least one tree structure each consisting of a top node functioning as governor for at least one dependent, said dependents being selected from the group consisting of lexical nodes and top nodes of subtree structures whereby a tree identifier is added to each top node of a tree or subtree structure, said tree identifier consisting of the node identifier referring to the word from the sequence assigned to said node combined with a duplicate index.
  - 3. Method according to claim 2, wherein the node identifier for each node comprises the word index and interpretation index of the word assigned to said node.
  - 4. Method according to claim 3, wherein the top node of the first tree of the network functions as a network entry node and that the node identifier of said top node comprises a unique combination of a word index and interpretation index, not assigned to any word in the dictionary, and that said top node furthermore comprises a list of reference to at least one of the tree structures in the syntactic network.
  - 5. Method according to claim 4, wherein the references in the list are arranged in a predetermined order.
  - 6. Method according to claim 1, wherein in step a) the words of the sequence are scanned in their natural order, whereby each word belonging to a syntactic category in the dependency grammar and taking dependents is considered as a possible governor and whereby for each possible governor the sequence is searched in the natural order for dependents, the syntactic relation between a possible governor and each dependent is determined and a subtree is added to the governor with a pointer node containing the word index of the dependent, whereby if the dependent is optional the pointer node also contains a reference to the null tree, whereafter a consistence check procedure is carried out determining for each dependent whether or not it has more than one possible governor and determining whether the dependent is optional for all governors, in which case all pointers to the dependent will be changed into choice points, or if the dependent is obligatory for one of the governors in which case only for this governor the pointer to the dependent is maintained and all other pointers to the dependent are deleted.
  - 7. Method according to claim 6, characterized in that if a dependent is obligatory for more than one governor an error correction procedure will be initiated.
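
The consistency check of claims 6 and 7 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed procedure itself: the names `Attachment`, `ObligatoryConflict` and `consistency_check` are hypothetical, candidate governor/dependent pairs are assumed to have been collected already, and the error correction procedure of claim 7 is only represented by raising an exception.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Attachment:
    governor: int           # word index of the possible governor
    dependent: int          # word index of the dependent
    optional: bool          # True if the governor does not require the dependent
    kind: str = "pointer"   # "choice" once the dependent is shared by governors


class ObligatoryConflict(Exception):
    """Stands in for the error correction procedure of claim 7."""


def consistency_check(attachments: List[Attachment]) -> List[Attachment]:
    # Group the candidate attachments by dependent word index.
    by_dependent = defaultdict(list)
    for attachment in attachments:
        by_dependent[attachment.dependent].append(attachment)

    kept = []
    for dependent, group in by_dependent.items():
        if len(group) == 1:
            kept.extend(group)  # only one possible governor: nothing to resolve
            continue
        obligatory = [a for a in group if not a.optional]
        if not obligatory:
            # Optional for all governors: every pointer becomes a choice point.
            kept.extend(replace(a, kind="choice") for a in group)
        elif len(obligatory) == 1:
            # Obligatory for one governor: keep that pointer, delete the others.
            kept.append(obligatory[0])
        else:
            raise ObligatoryConflict(
                f"word {dependent} is obligatory for more than one governor")
    return kept


candidates = [
    Attachment(governor=2, dependent=3, optional=True),
    Attachment(governor=4, dependent=3, optional=True),
    Attachment(governor=2, dependent=5, optional=False),
    Attachment(governor=4, dependent=5, optional=True),
]
for attachment in consistency_check(candidates):
    print(attachment)
```

In this example word 3 is optional for both governors, so both pointers become choice points, while word 5 is obligatory for word 2 only, so the pointer from word 4 is dropped.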


Current Assignee: BSO/Beheer B.V.
Original Assignee: Bso/Buro VOOR Systeemontwikkeling BV
Inventors: van Zuijlen, Job M.
Primary Examiner(s): Shaw, Dale M.
Assistant Examiner(s): Chung, Xuong M.
Application Number: US07/472,831
Time in Patent Office: 629 Days
Field of Search: 364/419, 364/900 MS File, 364/200 MS File, 364/943, 364/274.8, 364/286.2
US Class Current: 704/9
CPC Class Codes:
G06F 40/205   Parsing
G06F 40/211   Syntactic parsing, e.g. bas...
G06F 40/279   Recognition of textual enti...
G06F 40/289   Phrasal analysis, e.g. fini...
