Method to improve the named entity classification
First Claim
1. A method of providing a named entity classification in a computing system, the method comprising:
reading, from an LOD (Linking Open Data) set, an LOD node corresponding to a to-be-classified named entity, wherein the LOD node is associated with a uniform resource identifier and corresponds to a web page, the LOD node further comprising at least a plurality of property entries;
determining a type attribute of the LOD node corresponding to the to-be-classified named entity as a tagged type of the to-be-classified named entity;
reading a candidate type;
computing a possibility of the to-be-classified named entity belonging to the candidate type, comprising:
mapping the candidate type to a node of an intermediate ontology, wherein the intermediate ontology includes at least a structured correlation of generic-specific relationships, the generic-specific relationships including at least an identical relationship, a homologous relationship and a conflicting relationship;
computing an attribute matching score between the candidate type and the tagged type based on a relationship between the mapped node of the intermediate ontology and the candidate type, wherein the attribute matching score of a mapped intermediate ontology having an identical type is based on a predetermined value, the attribute matching score of a mapped intermediate ontology having a generic-specific relationship is based on the predetermined value and a first count of nodes between two mapped nodes of the intermediate ontology, and the attribute matching score of a mapped intermediate ontology having a homologous relationship is based on the predetermined value and a second count of nodes between two mapped nodes of the intermediate ontology to a common source node and a third count of nodes between two mapped nodes of the intermediate ontology to the common source node;
performing statistical processing to attribute matching scores corresponding to the candidate type to obtain a possibility of the to-be-classified named entity belonging to the candidate type, wherein performing statistical processing to each attribute matching score corresponding to a same candidate type further comprises:
converting the attribute matching scores to a node matching score based on the correspondence relationship between the attribute matching scores and the LOD node in order to reduce type attribute entry noise;
performing statistical processing to each node matching score corresponding to a same candidate type, thereby obtaining a possibility of the to-be-classified named entity belonging to the candidate type; and
selecting one or more tagged types based on satisfaction of an attribute matching score threshold.
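The scoring steps of claim 1 can be illustrated with a minimal Python sketch. The claim only says that each score is "based on" a predetermined value and node counts, so the formulas here are assumptions: the predetermined value divided by one plus the hop count, a negative score for conflicting types, and a plain mean for both the per-node conversion and the final statistic. The toy ontology, the `CONFLICTS` set, and all function names are hypothetical.

```python
from statistics import mean

# Hypothetical intermediate ontology, stored as child -> parent links.
PARENT = {
    "Agent": "Thing", "Place": "Thing",
    "Person": "Agent", "Organization": "Agent",
    "Athlete": "Person", "Politician": "Person",
    "City": "Place",
}
# Assumed: branches explicitly declared mutually exclusive (conflicting).
CONFLICTS = {frozenset(("Agent", "Place"))}
BASE = 1.0  # the "predetermined value" of the claim


def ancestors(node):
    """Return [node, parent, grandparent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def attribute_score(candidate, tagged):
    """Score one tagged type against the candidate type (assumed formulas)."""
    a, b = ancestors(candidate), ancestors(tagged)
    if candidate == tagged:                      # identical relationship
        return BASE
    if any(frozenset((x, y)) in CONFLICTS for x in a for y in b):
        return -BASE                             # conflicting relationship
    if tagged in a:                              # tagged is more generic
        return BASE / (1 + a.index(tagged))
    if candidate in b:                           # tagged is more specific
        return BASE / (1 + b.index(candidate))
    common = next((n for n in a if n in b), None)
    if common is None:                           # type not in the ontology
        return 0.0
    # homologous: hops from each type up to the common source node
    return BASE / (1 + a.index(common) + b.index(common))


def node_score(candidate, tagged_types):
    """Collapse one LOD node's type entries into one score (assumed: mean)."""
    return mean(attribute_score(candidate, t) for t in tagged_types)


def possibility(candidate, nodes):
    """Combine node scores across LOD nodes (assumed statistic: mean)."""
    return mean(node_score(candidate, types) for types in nodes.values())


def select_tagged_types(candidate, tagged_types, threshold):
    """Keep tagged types whose attribute score meets the threshold."""
    return [t for t in tagged_types if attribute_score(candidate, t) >= threshold]


# Two hypothetical LOD nodes for the same entity, keyed by URI.
nodes = {
    "http://example.org/a": ["Athlete", "Person"],
    "http://example.org/b": ["City"],
}
print(possibility("Person", nodes))
print(select_tagged_types("Person", nodes["http://example.org/a"], 0.5))
```

Scoring per node before combining means a node that carries many type entries counts once rather than once per entry, which is one plausible reading of how the conversion step reduces type attribute entry noise.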
Abstract
A method is described for providing a named entity classification in a computing system having a processor. The processor reads, from an LOD (Linking Open Data) set, an LOD node corresponding to a to-be-classified named entity. The processor then determines a type attribute of the LOD node corresponding to the to-be-classified named entity as a tagged type of the to-be-classified named entity, and reads a candidate type. Finally, the processor computes, based on the tagged type, a possibility of the to-be-classified named entity belonging to the candidate type.
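As a rough illustration of the first two steps in the abstract, the sketch below models an LOD node as a URI plus a list of property entries and takes its type entries as the tagged types. It assumes that the "type attribute" corresponds to RDF `rdf:type` entries and that nodes are found by matching an `rdfs:label`; the patent text shown here does not specify either. The `LodNode` class, the function names, and the `example.org` data are hypothetical.

```python
from dataclasses import dataclass, field

# Standard RDF / RDFS predicate URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass
class LodNode:
    uri: str                                         # identifies the node
    properties: list = field(default_factory=list)   # (predicate, value) pairs


def read_lod_node(lod_set, entity_name):
    """Return the first node whose label equals the entity name (assumed lookup)."""
    for node in lod_set:
        if (RDFS_LABEL, entity_name) in node.properties:
            return node
    return None


def tagged_types(node):
    """Take every type entry of the node as a tagged type of the entity."""
    return [value for predicate, value in node.properties if predicate == RDF_TYPE]


# Hypothetical one-node LOD set.
LOD_SET = [
    LodNode("http://example.org/resource/Ada_Lovelace", [
        (RDFS_LABEL, "Ada Lovelace"),
        (RDF_TYPE, "http://example.org/ontology/Person"),
        (RDF_TYPE, "http://example.org/ontology/Scientist"),
        ("http://example.org/ontology/birthPlace",
         "http://example.org/resource/London"),
    ]),
]

node = read_lod_node(LOD_SET, "Ada Lovelace")
print(tagged_types(node))
```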
16 Claims
9. An apparatus for named entity classification, comprising:
LOD node reading means configured to read an LOD (Linking Open Data) node corresponding to a to-be-classified named entity from an LOD data set, wherein the LOD node is associated with a uniform resource identifier (URI) and corresponds to a web page, the LOD node further comprising at least a plurality of property entries;
tagged type determining means configured to determine a type attribute of the LOD node corresponding to the to-be-classified named entity as a tagged type of the to-be-classified named entity;
candidate type reading means configured to read a candidate type;
possibility determining means configured to compute, based on the tagged type, a possibility of the to-be-classified named entity belonging to the candidate type, further comprising:
mapping means configured to map the candidate type to a node of an intermediate ontology, wherein the intermediate ontology includes at least a structured correlation of generic-specific relationships, the generic-specific relationships including at least an identical relationship, a homologous relationship and a conflicting relationship;
computing means configured to compute an attribute matching score between the candidate type and the tagged type based on a relationship between the mapped node of the intermediate ontology and the candidate type, wherein the attribute matching score of a mapped intermediate ontology having an identical type is based on a predetermined value, the attribute matching score of a mapped intermediate ontology having a generic-specific relationship is based on the predetermined value and a first count of nodes between two mapped nodes of the intermediate ontology, and the attribute matching score of a mapped intermediate ontology having a homologous relationship is based on the predetermined value and a second count of nodes between two mapped nodes of the intermediate ontology to a common source node and a third count of nodes between two mapped nodes of the intermediate ontology to the common source node;
performing means configured to perform statistical processing to attribute matching scores corresponding to a same candidate type, thereby obtaining a possibility of the to-be-classified named entity belonging to the candidate type, further comprising:
converting means configured to convert the attribute matching scores to a node matching score based on the correspondence relationship between the attribute matching scores and the LOD node in order to reduce type attribute entry noise;
performing means configured to perform statistical processing to each node matching score corresponding to a same candidate type, thereby obtaining a possibility of the to-be-classified named entity belonging to the candidate type; and
selecting means configured to select one or more tagged types based on satisfaction of an attribute matching score threshold.
Specification