Systems and methods for generating weighted finite-state automata representing grammars
First Claim
1. A method for converting a context-free grammar to a finite-state automaton representing the context-free grammar, the context-free grammar having non-terminal symbols and terminal symbols and having a set of rules, the method comprising:
- generating a first finite-state automaton from the set of rules;
generating, from the first finite-state automaton, at least one second finite-state automaton, each second finite state automaton defining a delayed acceptor for a plurality of the non-terminal symbols of the context free grammar;
receiving a topology that defines an application of the context-free grammar;
generating a third finite-state automaton that represents the received topology; and
expanding the third finite-state automaton based on at least one of the at least one generated second finite-state automaton.
6 Assignments
0 Petitions
Accused Products
Abstract
A context-free grammar can be represented by a weighted finite-state transducer. This representation can be used to efficiently compile that grammar into a weighted finite-state automaton that accepts the strings allowed by the grammar with the corresponding weights. The rules of a context-free grammar are input. A finite-state automaton is generated from the input rules. Strongly connected components of the finite-state automaton are identified. An automaton is generated for each strongly connected component. A topology that defines a number of states, and that uses active ones of the non-terminal symbols of the context-free grammar as the labels between those states, is defined. The topology is expanded by replacing a transition, and its beginning and end states, with the automaton that includes, as a state, the symbol used as the label on that transition. The topology can be fully expanded or dynamically expanded as required to recognize a particular input string.
196 Citations
58 Claims
-
1. A method for converting a context-free grammar to a finite-state automaton representing the context-free grammar, the context-free grammar having non-terminal symbols and terminal symbols and having a set of rules, the method comprising:
-
generating a first finite-state automaton from the set of rules; generating, from the first finite-state automaton, at least one second finite-state automaton, each second finite state automaton defining a delayed acceptor for a plurality of the non-terminal symbols of the context free grammar; receiving a topology that defines an application of the context-free grammar; generating a third finite-state automaton that represents the received topology; and expanding the third finite-state automaton based on at least one of the at least one generated second finite-state automaton. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53)
-
-
54. A system that generates a finite-state automaton from a set of rules of a grammar, comprising:
-
a first finite-state automaton generating circuit, routine or application that generates a first finite-state transducer from the set of rules; a dependency graph generating circuit, routine or application that generates a dependency graph from the first finite-state transducer; a strongly connected component identifying circuit, routine or application that identifies strongly connected components of the dependency graph; and a second finite-state automaton generating circuit, routine or application that generates at least one finite-state automaton based on the at least one identified strongly connected component of the dependency graph. - View Dependent Claims (55, 56, 57, 58)
-
Specification