Method and apparatus for robust efficient parsing
Abstract
The present invention provides a method for improving the efficiency of parsing text. Aspects of the invention include representing parse tokens as integers where a portion of the integer indicates the location in which a definition for the token can be found. In a further aspect, an integer representing a token points to an array of tokens that can be activated by the token. In another aspect, a list of pointers to partial parses is created before attempting to parse a next word in the text string. The list of pointers includes pointers to partial parses that are expecting particular semantic tokens.
10 Claims
1. A method of parsing text in a computing device to form a logical representation of the text, the logical representation having tokens representing non-terminals and words of the text, the method comprising:
selecting a token;
identifying an integer that represents the selected token; and
utilizing the integer to identify at least one non-terminal token of the logical representation that begins with the selected token.
Dependent claims: 2, 3, 4, 5
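The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the grammar, the token names, and the helpers `build_tables` and `activated_by` are all hypothetical, and the only idea taken from the claim is that a token's integer is used as an index to find the non-terminals whose definitions begin with that token.

```python
# Hypothetical toy grammar: each non-terminal maps to the token sequence it expands to.
GRAMMAR = {
    "ShowFlight": ["show", "Flight"],
    "Flight": ["flight", "Number"],
    "ShowMe": ["show", "me"],
}


def build_tables(grammar):
    """Assign every token an integer and index non-terminals by their first token."""
    token_ids = {}

    def intern(token):
        # Reuse the existing integer, or hand out the next free one.
        return token_ids.setdefault(token, len(token_ids))

    for head, body in grammar.items():
        intern(head)
        for token in body:
            intern(token)

    # begins_with[i] holds the integers of non-terminals whose definition
    # starts with the token whose integer is i.
    begins_with = [[] for _ in token_ids]
    for head, body in grammar.items():
        begins_with[token_ids[body[0]]].append(token_ids[head])
    return token_ids, begins_with


def activated_by(token, token_ids, begins_with):
    """Return the names of non-terminals that begin with the given token."""
    names = {i: t for t, i in token_ids.items()}
    return [names[i] for i in begins_with[token_ids[token]]]


token_ids, begins_with = build_tables(GRAMMAR)
print(activated_by("show", token_ids, begins_with))    # ['ShowFlight', 'ShowMe']
print(activated_by("flight", token_ids, begins_with))  # ['Flight']
```

Because the integer is a direct list index, finding the candidate non-terminals is a single array access rather than a string comparison against every rule.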
6. A method of parsing text to form a tokenized representation of the text, the method comprising:
selecting a word from the text;
forming a partial parse of a token based on the selected word;
identifying an item that is needed to extend the partial parse; and
placing a pointer to the partial parse in a table associated with a next word in the text, the pointer being mapped from the item that is needed to extend the parse.
Dependent claims: 7, 8, 9
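One way to picture the table described in claim 6 is the sketch below. It is an assumption-laden illustration, not the specification's algorithm: `PartialParse`, `parse`, and the rule format are hypothetical, the grammar is assumed to have no unit-rule cycles, and `tables[i]` stands for the table associated with the word at position `i`, keyed by the item each waiting partial parse needs next.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialParse:
    head: str     # non-terminal being built
    body: tuple   # items (words or non-terminals) the non-terminal requires
    start: int    # position of the first word covered
    dot: int = 1  # number of body items matched so far

    def needed(self):
        """Item required to extend this parse, or None if it is complete."""
        return self.body[self.dot] if self.dot < len(self.body) else None


def parse(words, rules):
    # tables[i] maps a needed item to the partial parses waiting for it at word i.
    tables = [defaultdict(list) for _ in range(len(words) + 1)]
    completed = []
    for end, word in enumerate(words):
        agenda = [(word, end)]  # items recognized so far that end at this word
        while agenda:
            item, start = agenda.pop()
            # Extend only the partial parses that were expecting this item.
            extended = [PartialParse(p.head, p.body, p.start, p.dot + 1)
                        for p in tables[start][item]]
            # Start new partial parses for rules that begin with this item.
            started = [PartialParse(h, b, start) for h, b in rules if b[0] == item]
            for p in extended + started:
                need = p.needed()
                if need is None:
                    completed.append((p.head, p.start, end))
                    agenda.append((p.head, p.start))
                else:
                    # Register the parse in the table for the next word.
                    tables[end + 1][need].append(p)
    return completed, tables


RULES = [
    ("ShowFlight", ("show", "Flight")),
    ("Flight", ("flight", "Number")),
    ("Number", ("five",)),
]
completed, tables = parse(["show", "flight", "five"], RULES)
print(completed)
```

When the next word arrives, the parser consults only the entries keyed by items that were actually recognized, so partial parses expecting something else are never visited.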
10. A method of parsing text to identify a parse structure containing tokens that represent non-terminals and words, the method comprising:
converting a selected token into a token ID;
using a first portion of the token ID to identify a table containing definitions for tokens of a same type as the selected token;
using a second portion of the token ID to locate the definition for the selected token in the identified table; and
using the definition for the selected token as part of the method of identifying the parse structure.
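The two-part token ID in claim 10 can be sketched with simple bit fields. The 8/24-bit split, the type codes, the table contents, and the function names below are assumptions chosen for illustration; the claim only requires that one portion of the ID selects a table and another portion locates the definition within it.

```python
# Assumed layout: the high bits select the table, the low 24 bits index into it.
INDEX_BITS = 24
INDEX_MASK = (1 << INDEX_BITS) - 1

# Hypothetical token types, one definition table per type.
WORD, NON_TERMINAL = 0, 1

TABLES = {
    WORD: ["show", "flight", "five"],
    NON_TERMINAL: [
        ("ShowFlight", ("show", "Flight")),
        ("Flight", ("flight", "Number")),
    ],
}


def make_token_id(token_type, index):
    """Pack a token type and a table index into one integer."""
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index does not fit in the index portion")
    return (token_type << INDEX_BITS) | index


def lookup(token_id):
    """Use the first portion to pick the table and the second to find the definition."""
    table = TABLES[token_id >> INDEX_BITS]
    return table[token_id & INDEX_MASK]


flight_id = make_token_id(NON_TERMINAL, 1)
print(flight_id)          # 16777217
print(lookup(flight_id))  # ('Flight', ('flight', 'Number'))
```

Packing both portions into one integer means a definition is reached with a shift, a mask, and two index operations, with no search by name.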
Specification