Method and apparatus for processing natural language using tape-intersection
First Claim
Patent Images
1. In a system for processing natural language, a method for intersecting tapes of a first multi-tape automaton (MTA) and a second MTA, with each MTA having a plurality of tapes and a plurality of paths, comprising:
- (a) computing a cross-product MTA using the first MTA and the second MTA;
(b) generating string tuples for paths of the cross-product MTA;
(c) for each string tuple generated at (b), evaluating whether the string of a first tape equals the string of a second tape;
(d) for each string tuple evaluated at (c) having equal strings at the first and second tapes, retaining the corresponding string tuple in the cross-product MTA;
(e) for each string tuple evaluated at (c) having unequal strings at the first and second tapes, restructuring the cross-product MTA to remove the corresponding string tuple;
(f) removing redundant strings in the string tuples retained in the cross-product MTA at (d) to produce an output MTA.
1 Assignment
0 Petitions
Accused Products
Abstract
Operations for weighted and non-weighted multi-tape automata are described for use in natural language processing tasks such as morphological analysis, disambiguation, and entity extraction.
-
Citations
25 Claims
-
1. In a system for processing natural language, a method for intersecting tapes of a first multi-tape automaton (MTA) and a second MTA, with each MTA having a plurality of tapes and a plurality of paths, comprising:
-
(a) computing a cross-product MTA using the first MTA and the second MTA;
(b) generating string tuples for paths of the cross-product MTA;
(c) for each string tuple generated at (b), evaluating whether the string of a first tape equals the string of a second tape;
(d) for each string tuple evaluated at (c) having equal strings at the first and second tapes, retaining the corresponding string tuple in the cross-product MTA;
(e) for each string tuple evaluated at (c) having unequal strings at the first and second tapes, restructuring the cross-product MTA to remove the corresponding string tuple;
(f) removing redundant strings in the string tuples retained in the cross-product MTA at (d) to produce an output MTA. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. In a system for processing natural language, a method for intersecting a first tape of a first multi-tape automaton (MTA) and a second tape of a second MTA, with each MTA having a plurality of tapes and a plurality of paths, comprising:
-
defining a simulated filter automaton (SFA) that controls how epsilon-transitions are composed along pairs of paths in the first MTA and the second MTA;
building an output MTA by;
(a) creating an initial state from the initial states of the first MTA, the second MTA, and the SFA;
(b) intersecting a selected outgoing transition of the first MTA with a selected outgoing transition of the second MTA, where each outgoing transition has a source state, a target state, and a label;
(c) if the label of the first tape of the selected outgoing transition of the first MTA equals the label of the second tape of the selected outgoing transition of the second MTA, creating (i) a transition in the output MTA whose label results from pairing the labels of the selected outgoing transitions, and (ii) a target state corresponding to the target states of the selected outgoing transitions and the initial state of the SFA;
(d) if an epsilon transition is encountered on the first tape, creating a transition in the output MTA with a target state that is a function of (i) the target state of the outgoing transition of the first MTA, (ii) the source state of the outgoing transition of the second MTA, and (iii) a first non-initial state of the SFA;
(e) if an epsilon transition is encountered on the second tape, creating a transition in the output MTA with a target state that is a function of (i) the source state of the outgoing transition of the first MTA, (ii) the target state of the outgoing transition of the second MTA, and (iii) a second non-initial state of the SFA; and
(f) repeating (b)-(e) for each outgoing transition of the first MTA and the second MTA. - View Dependent Claims (13, 14, 15, 16, 17, 18)
-
-
19. In a system for processing natural language, a method for intersecting tapes of a first multi-tape automaton (MTA) and a second MTA, with each MTA having a plurality of tapes and a plurality of paths, comprising:
-
(a) composing the first MTA and the second MTA by intersecting a first tape of the first MTA with a first tape of the second MTA to produce an output MTA;
the first tape of the first MTA and the first tape of the second MTA corresponding to a first intersected tape and a second intersected tape of the output MTA, respectively; and
(b) removing at least one of the first and the second intersected tapes from the output MTA while preserving all its other tapes without modification. - View Dependent Claims (20, 21, 22, 23, 24, 25)
-
Specification