METHOD AND APPARATUS FOR NAMED ENTITY RECOGNITION IN NATURAL LANGUAGE
Abstract
The present invention provides a method for recognizing a named entity included in natural language, comprising the steps of: performing gradual parsing model training with the natural language to obtain a classification model; performing gradual parsing and recognition according to the obtained classification model to obtain information on positions and types of candidate named entities; performing a refusal recognition process for the candidate named entities; and generating a candidate named entity lattice from the refusal-recognition-processed candidate named entities, and searching for an optimal path. Starting from the forward and backward parsing and recognition results, which are obtained using only local features, the present invention uses a one-class classifier to score these results and thereby obtain the most reliable beginning and end borders of the named entities.
23 Claims
1. A method for recognizing a named entity included in natural language, comprising the steps of:
recognizing candidate named entities by a gradual recognizer;
extracting character-based global features of the recognized candidate named entities by a refusal recognizer;
testing said candidate named entities by using said global features; and
if a score of the testing is larger than a predetermined threshold, accepting said candidate named entity, otherwise, refusing the recognition thereof.
Dependent claims: 2, 3, 4, 5

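The accept-or-refuse test in claim 1 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Candidate` type, the `global_features` layout, and in particular `toy_score`, which stands in for whatever scoring model the refusal recognizer actually uses. Only the final comparison against a predetermined threshold comes from the claim.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    start: int   # index of the first character (inclusive)
    end: int     # index one past the last character (exclusive)
    etype: str   # entity type label, e.g. "PER" (hypothetical label set)


def global_features(text: str, cand: Candidate, context: int = 2) -> List[str]:
    """Character-based features of the whole candidate and its context.

    The "L:", "E:", "R:" prefixes (left context, entity, right context)
    are an illustrative encoding, not one taken from the patent.
    """
    left = text[max(0, cand.start - context):cand.start]
    inner = text[cand.start:cand.end]
    right = text[cand.end:cand.end + context]
    return ([f"L:{c}" for c in left]
            + [f"E:{c}" for c in inner]
            + [f"R:{c}" for c in right])


def toy_score(features: Sequence[str]) -> float:
    """Hypothetical stand-in scorer: share of entity characters that are
    letters or spaces. A real system would use a trained model here."""
    inner = [f[2:] for f in features if f.startswith("E:")]
    return sum(ch.isalpha() or ch == " " for ch in inner) / len(inner)


def refusal_filter(
    text: str,
    candidates: Sequence[Candidate],
    score_fn: Callable[[Sequence[str]], float],
    threshold: float,
) -> Tuple[List[Candidate], List[Candidate]]:
    """Accept a candidate only if its score is larger than the threshold."""
    accepted: List[Candidate] = []
    refused: List[Candidate] = []
    for cand in candidates:
        score = score_fn(global_features(text, cand))
        (accepted if score > threshold else refused).append(cand)
    return accepted, refused


text = "Meet Anna Lee at 9pm"
candidates = [Candidate(5, 13, "PER"), Candidate(17, 20, "PER")]
accepted, refused = refusal_filter(text, candidates, toy_score, threshold=0.8)
```

With this toy scorer, the span "Anna Lee" is accepted and the span "9pm" is refused, because only the former scores above the assumed threshold of 0.8.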
6. A method for recognizing a named entity included in natural language, comprising the steps of:
extracting local features of words or characters included in the center of a feature window by using the feature window;
on the basis of a classification model obtained after training the natural language with a gradual parsing model, gradually parsing and recognizing the natural language so as to obtain information on positions and types of candidate named entities;
extracting global features of the candidate named entities included in the center of the feature window by using the feature window;
performing a refusal recognition process for the candidate named entities; and
generating a candidate named entity lattice from the refusal-recognition-processed candidate named entities, and searching for an optimal path.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17

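The feature window in claim 6 can be pictured as a fixed-width span that slides over the text, with the character being classified at its center. The sketch below is one possible reading, not the patented feature set: the window half-width, the padding symbol `PAD`, and the `char[offset]` feature names are all assumptions.

```python
from typing import Dict, List

PAD = "<PAD>"  # hypothetical filler for positions outside the sentence


def window_features(chars: List[str], center: int, half_width: int = 2) -> Dict[str, str]:
    """Local features for the character at `center`: the characters at
    each relative offset inside the window, padded at sentence edges."""
    feats: Dict[str, str] = {}
    for offset in range(-half_width, half_width + 1):
        pos = center + offset
        inside = 0 <= pos < len(chars)
        feats[f"char[{offset:+d}]"] = chars[pos] if inside else PAD
    return feats


def extract_local_features(sentence: str, half_width: int = 2) -> List[Dict[str, str]]:
    """Slide the window over the sentence, one feature dict per character."""
    chars = list(sentence)
    return [window_features(chars, i, half_width) for i in range(len(chars))]


feats = extract_local_features("NER", half_width=1)
```

Each resulting dictionary would then be turned into a feature vector for the classifier. A window over candidate entities (the global features of claim 6) could follow the same pattern, with whole candidates rather than single characters at the center.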
18. An off-line training method for recognizing a named entity included in natural language, comprising the steps of:
performing forward gradual parsing model training to a natural sentence to obtain a forward gradual classification model;
performing backward gradual parsing model training to the natural sentence to obtain a backward gradual classification model; and
performing refusal recognition model training to candidate named entities based on the obtained forward and backward gradual classification models to obtain a refusal recognition model.

19. An on-line recognizing method for recognizing a named entity included in natural language, comprising the steps of:
recognizing the natural language with a forward gradual classification model to obtain a forward recognition result;
recognizing the natural language with a backward gradual classification model to obtain a backward recognition result;
generating a candidate named entity lattice according to said forward and backward recognition results; and
calculating an optimal path according to said generated candidate named entity lattice to output a named entity.

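The lattice and optimal-path steps of claim 19 can be sketched as a small dynamic program. The claim does not say how a path is scored, so this example assumes that each candidate carries a numeric score, that a path is a set of non-overlapping candidates, and that the best path maximizes the sum of scores. The `Candidate` type and both function names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Sequence, Tuple


class Candidate(NamedTuple):
    start: int    # first character index (inclusive)
    end: int      # one past the last character index (exclusive)
    etype: str    # entity type label
    score: float  # assumed per-candidate score


def build_lattice(forward: Sequence[Candidate], backward: Sequence[Candidate]) -> List[Candidate]:
    """Merge forward and backward results; for identical spans and types
    keep the higher-scoring copy. Empty spans are dropped."""
    best: Dict[Tuple[int, int, str], Candidate] = {}
    for cand in list(forward) + list(backward):
        if cand.start >= cand.end:
            continue
        key = (cand.start, cand.end, cand.etype)
        if key not in best or cand.score > best[key].score:
            best[key] = cand
    return sorted(best.values(), key=lambda c: (c.end, c.start))


def optimal_path(lattice: Sequence[Candidate], length: int) -> List[Candidate]:
    """best[i] holds the highest total score achievable within text[:i]."""
    ending_at: Dict[int, List[Candidate]] = {}
    for cand in lattice:
        ending_at.setdefault(cand.end, []).append(cand)

    best = [0.0] * (length + 1)
    back: List[Optional[Candidate]] = [None] * (length + 1)
    for i in range(1, length + 1):
        best[i], back[i] = best[i - 1], None  # leave position i-1 untagged
        for cand in ending_at.get(i, []):
            total = best[cand.start] + cand.score
            if total > best[i]:
                best[i], back[i] = total, cand

    # Walk the back-pointers from the end of the text to recover the path.
    path: List[Candidate] = []
    i = length
    while i > 0:
        cand = back[i]
        if cand is None:
            i -= 1
        else:
            path.append(cand)
            i = cand.start
    return path[::-1]


forward = [Candidate(0, 3, "LOC", 0.4), Candidate(5, 9, "PER", 0.7)]
backward = [Candidate(0, 5, "ORG", 0.9), Candidate(5, 9, "PER", 0.6)]
lattice = build_lattice(forward, backward)
path = optimal_path(lattice, length=9)
```

In this example the forward and backward passes disagree on the first entity's right border. The path search keeps the higher-scoring ORG reading over characters 0 to 5, followed by the PER span from the forward pass.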
20. An off-line training system for recognizing a named entity included in natural language, comprising:
a local feature extracting device for having a supplied training text generate a named entity training sample denoted by a feature vector and a sample token;
a multiple-class support vector machine training device for training the training text to generate a gradual classification model;
a global feature extracting device for having the named entity training sample generate a character-based refusal recognition training sample denoted by a feature vector and a sample token;
a one-class support vector machine training device for refusal-recognition-training the obtained refusal recognition training sample to generate a refusal recognition classification model; and
a training sample memory for storing the training texts used during the training.

21. An on-line recognizing system for recognizing a named entity included in natural language, comprising:
a local feature extracting device for having a provided training sample generate a local feature vector;
a multiple-class support vector machine recognizing device for recognizing the input text according to the local feature vector of the sample to obtain candidate named entities;
a global feature extracting device for extracting a global feature vector of the candidate named entities and the contexts thereof; and
a one-class support vector machine recognizing device for recognizing the input candidate named entities according to the global feature vector of the sample,
wherein said multiple-class support vector machine recognizing device uses a multiple-class classification model to test the input local feature vector so as to obtain its type token, and forms a candidate named entity according to a series of beginning and continuing tokens belonging to the same type of named entity; and
said one-class support vector machine recognizing device uses a one-class classification model to test the input global feature vector so as to obtain its testing score, subtracts different thresholds from the obtained testing score to obtain a refusal recognition score, searches for an optimal path according to the refusal recognition score, and accepts the candidate named entities on the optimal path.
Dependent claims: 22, 23

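Two operations in claim 21 lend themselves to a short illustration: forming a candidate from a run of beginning and continuing tokens of the same type, and subtracting a threshold from the testing score to obtain a refusal recognition score. In the sketch below, the `B-`/`I-`/`O` token spelling and the per-type threshold table are assumptions, since the claim names neither a token format nor what the different thresholds depend on.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end exclusive, entity type)


def tokens_to_candidates(tokens: List[str]) -> List[Span]:
    """Group a 'B-X' token and the 'I-X' tokens that follow it into one
    candidate span. An 'I-' token with no matching open span is ignored."""
    spans: List[Span] = []
    start: Optional[int] = None
    etype: Optional[str] = None
    for i, tok in enumerate(tokens + ["O"]):  # sentinel closes a trailing span
        if start is not None and tok == f"I-{etype}":
            continue  # the current span keeps growing
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tok.startswith("B-"):
            start, etype = i, tok[2:]
    return spans


def refusal_score(testing_score: float, etype: str, thresholds: Dict[str, float]) -> float:
    """Testing score minus the threshold assumed for this entity type."""
    return testing_score - thresholds[etype]


tokens = ["B-PER", "I-PER", "O", "B-LOC", "I-PER", "B-ORG"]
spans = tokens_to_candidates(tokens)

thresholds = {"PER": 0.5, "LOC": 0.9}  # hypothetical values
per_score = refusal_score(0.8, "PER", thresholds)
loc_score = refusal_score(0.8, "LOC", thresholds)
```

Here the same testing score of 0.8 yields a positive refusal recognition score for a PER candidate and a negative one for a LOC candidate, which shows how different thresholds can shift the later path search toward or away from a candidate.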
Specification