Method and apparatus for expanding dictionaries during parsing
First Claim
1. A method of parsing text in a corpus, the method comprising:
hypothesizing a possible new entry for a dictionary based on a first segment of a text;
forming a successful parse of the first segment of said text using the possible new entry;
changing the dictionary to include the new entry based on the successful parse by making an existing entry in the dictionary active; and
using the new entry in the dictionary to parse a second segment of said text.
2 Assignments
0 Petitions
Abstract
A method is provided for parsing text in a corpus. The method includes hypothesizing a possible new entry for a dictionary based on a first segment of text. A successful parse is then formed for the first segment of text using the possible new entry. Based on the successful parse, the dictionary is changed to include the new entry. The new entry in the dictionary is then used to parse a second segment of text.
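The sequence described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `GRAMMAR` of allowed tag sequences, the tag names, the one-tag-per-word dictionary, and the function names `parse` and `parse_and_learn` do not come from the patent, which does not prescribe a particular parser.

```python
# Toy illustration only: GRAMMAR, the tag names, and all function names
# are hypothetical and are not taken from the patent.
GRAMMAR = {("DET", "NOUN", "VERB"), ("DET", "NOUN", "VERB", "DET", "NOUN")}
CANDIDATE_TAGS = ("NOUN", "VERB")  # tags this sketch is willing to hypothesize


def parse(words, dictionary):
    """Return the tag sequence if the segment parses, else None."""
    tags = tuple(dictionary.get(w) for w in words)
    return tags if tags in GRAMMAR else None


def parse_and_learn(words, dictionary):
    """Parse a segment; on failure, hypothesize an entry for one unknown word."""
    result = parse(words, dictionary)
    if result is not None:
        return result
    unknown = [w for w in words if w not in dictionary]
    if len(unknown) != 1:
        return None
    for tag in CANDIDATE_TAGS:
        trial = {**dictionary, unknown[0]: tag}  # possible new entry
        result = parse(words, trial)
        if result is not None:                   # successful parse
            dictionary[unknown[0]] = tag         # change the dictionary
            return result
    return None


dictionary = {"the": "DET", "barked": "VERB", "chased": "VERB", "cat": "NOUN"}
first = parse_and_learn("the dog barked".split(), dictionary)
second = parse_and_learn("the cat chased the dog".split(), dictionary)
```

In this sketch the first segment only parses once "dog" is hypothesized as a noun, and the second segment then parses using the entry that was added.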
38 Citations
21 Claims
1. A method of parsing text in a corpus, the method comprising:
hypothesizing a possible new entry for a dictionary based on a first segment of a text;
forming a successful parse of the first segment of said text using the possible new entry;
changing the dictionary to include the new entry based on the successful parse by making an existing entry in the dictionary active; and
using the new entry in the dictionary to parse a second segment of said text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
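Claim 1 changes the dictionary "by making an existing entry in the dictionary active". One way to picture that step, offered only as a sketch, is a dictionary whose entries carry an activity flag. The `Entry` class, the `active` field, and the helper names below are hypothetical; the claim does not say how activity is represented.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical dictionary entry: a tag plus an activity flag."""
    tag: str
    active: bool = True


def active_tags(dictionary, word):
    """Tags the parser may use for a word: only those of active entries."""
    return [e.tag for e in dictionary.get(word, []) if e.active]


def activate(dictionary, word, tag):
    """Make an existing but inactive entry active; return True if one was found."""
    for entry in dictionary.get(word, []):
        if entry.tag == tag and not entry.active:
            entry.active = True
            return True
    return False


# "run" has an inactive NOUN entry that the parser cannot yet use.
dictionary = {"run": [Entry("VERB"), Entry("NOUN", active=False)]}
before = active_tags(dictionary, "run")
# Assumed to be called after a successful parse relied on the NOUN hypothesis.
changed = activate(dictionary, "run", "NOUN")
after = active_tags(dictionary, "run")
```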
10. A computer-readable medium having computer-executable instructions for performing steps comprising:
hypothesizing a possible attribute for a word in a text segment, the possible attribute not being listed for the word in a dictionary used to form parse structures from text segments;
creating a parse token based on the possible attribute for the word;
using the parse token to form a parse structure for the text segment through steps comprising;
creating a second parse token for the same word as the parse token;
assigning a low score to the parse token relative to a score for the second parse token;
making the parse token and the second parse token available for forming the parse structure;
selecting the parse token instead of the second parse token when forming the parse structure;
based on the parse token appearing in the parse structure, adding the possible attribute for the word to a dictionary used to form parse structures; and
accessing the dictionary to retrieve the possible attribute for the word in the dictionary as part of forming a parse structure for second segment of said text.
Dependent claims: 11, 12, 13, 14
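The token handling recited in claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the tuple layout `(word, tag, score, hypothesized)`, the score constants, the toy `GRAMMAR`, and the exhaustive highest-score-first search are all hypothetical choices made to keep the example small.

```python
from itertools import product

# All names and values below are hypothetical, chosen for illustration.
GRAMMAR = {("DET", "NOUN", "VERB")}
LOW_SCORE, NORMAL_SCORE = 0.1, 1.0
HYPOTHESES = ("NOUN", "VERB")  # attributes this sketch may hypothesize


def tokens_for(word, dictionary):
    """Dictionary tokens plus low-scored tokens for attributes not listed."""
    listed = dictionary.get(word, set())
    toks = [(word, tag, NORMAL_SCORE, False) for tag in sorted(listed)]
    toks += [(word, tag, LOW_SCORE, True) for tag in HYPOTHESES if tag not in listed]
    return toks


def parse(words, dictionary):
    """Try token combinations, highest total score first; return the first that fits."""
    options = [tokens_for(w, dictionary) for w in words]
    combos = sorted(product(*options), key=lambda c: -sum(t[2] for t in c))
    for combo in combos:
        if tuple(t[1] for t in combo) in GRAMMAR:
            return combo
    return None


def learn(structure, dictionary):
    """Add every hypothesized attribute that appears in the parse structure."""
    for word, tag, _score, hypothesized in structure:
        if hypothesized:
            dictionary.setdefault(word, set()).add(tag)


dictionary = {"the": {"DET"}, "run": {"VERB"}, "ended": {"VERB"}, "started": {"VERB"}}
structure = parse("the run ended".split(), dictionary)
learn(structure, dictionary)
second = parse("the run started".split(), dictionary)
```

Here both the listed VERB token and the low-scored NOUN token for "run" are available, but only the hypothesized token yields a complete structure, so it is selected and its attribute is added to the dictionary. The second segment then parses from dictionary tokens alone.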
15. A method of parsing a text having at least two sentences, the method comprising:
parsing a first sentence of the text to form a first parse structure by using a dictionary to retrieve an attribute of a word in the first sentence;
based on the first parse structure, dynamically modifying the dictionary to form a modified dictionary before parsing another sentence of the text; and
using the modified dictionary to parse a second sentence of the text to form a second parse structure.
Dependent claims: 16, 17, 18
19. A dictionary formed through a dynamic process that utilizes a corpus of text comprising at least two sentences, the process comprising:
forming an initial dictionary;
using the initial dictionary to parse a sentence in the corpus to form a first parse structure;
modifying the initial dictionary based on the first parse structure before parsing another sentence from the corpus to form the dictionary.
Dependent claims: 20, 21
Specification