Method and apparatus for concept-based classification of natural language discourse
Abstract
Pinnacle concepts are not amenable to detection by the use of keywords. A unit of natural language discourse (UNLD) “refers” to a pinnacle concept “C” of a human language “L” when that UNLD uses linguistic expressions in such a way that “C” is regarded as expressed, used or invoked by an ordinary reader of “L.” A reference can have a “reference level” value that is proportional to the “strength” with which the pinnacle concept is referenced, to the probability that the pinnacle concept is referenced, or to both. Pinnacle concepts can be divided into Quantifiers and non-Quantifiers. A Quantifier can modify the reference level assigned to a non-Quantifier. A concept “C” that is still determined to be referenced by a UNLD “x” after application of its Quantifiers is said to be asserted by “x.” Concept-based classification is the identification of whether a pinnacle concept “C” is asserted by a UNLD. Concept-based classification can be used for concept-based search.
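As a rough illustration of the terms in the abstract, the sketch below treats a reference level as a number, lets a Quantifier rescale it, and calls the concept asserted when the result clears a threshold. The `Reference` class, the `is_asserted` function, the product rule for combining strength and probability, the 0.5 threshold, and the -1.0 negating scale are all assumptions made for this example; none of them is taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A reference to one pinnacle concept (hypothetical representation)."""
    concept: str        # name of the pinnacle concept, e.g. "Good"
    strength: float     # how strongly the concept is referenced, in [0, 1]
    probability: float  # how likely it is that the concept is referenced, in [0, 1]

    @property
    def level(self) -> float:
        # Assumption: the reference level is the product of both factors.
        return self.strength * self.probability


def is_asserted(ref: Reference, quantifier_scale: float = 1.0,
                threshold: float = 0.5) -> bool:
    """Rescale the reference level by a Quantifier, then compare to a threshold."""
    return ref.level * quantifier_scale >= threshold


good = Reference("Good", strength=0.9, probability=0.8)
plain = is_asserted(good)                            # no Quantifier applies
negated = is_asserted(good, quantifier_scale=-1.0)   # hypothetical negating Quantifier
```

Under these assumptions the unmodified reference counts as asserted, while the negated one does not.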
34 Claims
1. A method, performed by computing hardware and programmable memory, for determining whether a first pinnacle concept is referenced by a first unit of natural language discourse, comprising:
parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
adding at least one concept-value pair, each of which indicates a reference to a same non-Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features is approximately complete with respect to the non-Quantifier concept;
adding at least one concept-value pair, each of which indicates a reference to a same Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features is approximately complete with respect to the Quantifier concept;
propagating the at least one concept-value pair for the Quantifier concept;
identifying a first node of the first parse structure that has at least one concept-value pair for the non-Quantifier concept and at least one concept-value pair for the Quantifier concept;
determining a first value to be scaled from the at least one concept-value pair for the non-Quantifier concept;
determining a first scaling value from the at least one concept-value pair for the Quantifier concept;
scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
propagating the at least one concept-value pair for the non-Quantifier concept.
Dependent claims: 2 through 27.

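The steps of claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` class, the word-level dependency-style tree, the toy feature sets for a non-Quantifier concept "Good" and a Quantifier concept "Negation", the one-level upward propagation rule for the Quantifier, and the use of multiplication for scaling are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One sub-unit of the UNLD (here: one word of a dependency-style parse)."""
    word: str
    children: list["Node"] = field(default_factory=list)
    pairs: dict[str, float] = field(default_factory=dict)  # concept -> value


# Hypothetical toy feature sets (word -> value); real sets would be far larger.
GOOD_FEATURES = {"great": 0.9, "nice": 0.6}        # non-Quantifier concept "Good"
NEGATION_FEATURES = {"not": -1.0, "never": -1.0}   # Quantifier concept "Negation"


def add_pairs(node: Node, concept: str, features: dict[str, float]) -> None:
    """Attach a concept-value pair to every node whose word is a known feature."""
    if node.word in features:
        node.pairs[concept] = features[node.word]
    for child in node.children:
        add_pairs(child, concept, features)


def propagate_quantifier(node: Node, quantifier: str) -> None:
    """Assumed rule: a Quantifier pair moves from a child to its parent only.

    The parent is handled before its children are visited, so a pair
    travels up exactly one level.
    """
    for child in node.children:
        if quantifier in child.pairs:
            node.pairs.setdefault(quantifier, child.pairs[quantifier])
    for child in node.children:
        propagate_quantifier(child, quantifier)


def scale_where_both(node: Node, concept: str, quantifier: str) -> None:
    """At each node holding both kinds of pair, scale the concept's value."""
    if concept in node.pairs and quantifier in node.pairs:
        node.pairs[concept] *= node.pairs[quantifier]
    for child in node.children:
        scale_where_both(child, concept, quantifier)


def propagate_concept(node: Node, concept: str) -> None:
    """Assumed rule: the (scaled) concept value is carried up to the root."""
    for child in node.children:
        propagate_concept(child, concept)
        if concept in child.pairs:
            node.pairs.setdefault(concept, child.pairs[concept])


# Hand-built parse of "I think it is not great" (parsing itself is out of scope).
great = Node("great", [Node("it"), Node("is"), Node("not")])
root = Node("think", [Node("I"), great])

add_pairs(root, "Good", GOOD_FEATURES)
add_pairs(root, "Negation", NEGATION_FEATURES)
propagate_quantifier(root, "Negation")
scale_where_both(root, "Good", "Negation")
propagate_concept(root, "Good")
print(root.pairs)
```

In this toy run the node for "great" ends up holding both pairs, its "Good" value is scaled from 0.9 to -0.9, and that scaled value is what reaches the root. A real system would obtain the tree from a parser and would need feature sets that are "approximately complete" in the sense of the claim.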
28. A data processing system, made with computing hardware and programmable memory, for determining whether a first pinnacle concept is referenced by a first unit of natural language discourse, comprising the following sub-systems:
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish adding at least one concept-value pair, each of which indicates a reference to a same non-Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features is approximately complete with respect to the non-Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish adding at least one concept-value pair, each of which indicates a reference to a same Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features is approximately complete with respect to the Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish propagating the at least one concept-value pair for the Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish identifying a first node of the first parse structure that has at least one concept-value pair for the non-Quantifier concept and at least one concept-value pair for the Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish determining a first value to be scaled from the at least one concept-value pair for the non-Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish determining a first scaling value from the at least one concept-value pair for the Quantifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish propagating the at least one concept-value pair for the non-Quantifier concept.

29. A computer program on a non-transitory computer readable medium, having computer-readable code devices embodied therein, for determining whether a first pinnacle concept is referenced by a first unit of natural language discourse, the computer program comprising:
computer readable program code devices configured to accomplish parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
computer readable program code devices configured to accomplish adding at least one concept-value pair, each of which indicates a reference to a same non-Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features is approximately complete with respect to the non-Quantifier concept;
computer readable program code devices configured to accomplish adding at least one concept-value pair, each of which indicates a reference to a same Quantifier concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features is approximately complete with respect to the Quantifier concept;
computer readable program code devices configured to accomplish propagating the at least one concept-value pair for the Quantifier concept;
computer readable program code devices configured to accomplish identifying a first node of the first parse structure that has at least one concept-value pair for the non-Quantifier concept and at least one concept-value pair for the Quantifier concept;
computer readable program code devices configured to accomplish determining a first value to be scaled from the at least one concept-value pair for the non-Quantifier concept;
computer readable program code devices configured to accomplish determining a first scaling value from the at least one concept-value pair for the Quantifier concept;
computer readable program code devices configured to accomplish scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
computer readable program code devices configured to accomplish propagating the at least one concept-value pair for the non-Quantifier concept.

30. A method, performed by computing hardware and programmable memory, for determining whether a concept is referenced by a first unit of natural language discourse, comprising:
parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
adding at least one concept-value pair, each of which indicates a reference to a same first concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features contains many linguistic features;
adding at least one concept-value pair, each of which indicates a reference to a same modifier concept that can modify a reference level assigned to the first concept, to at least one node of the first parse structure, wherein each reference to the modifier concept is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features contains many linguistic features;
propagating the at least one concept-value pair for the modifier concept;
identifying a first node of the first parse structure that has at least one concept-value pair for the first concept and at least one concept-value pair for the modifier concept;
determining a first value to be scaled from the at least one concept-value pair for the first concept;
determining a first scaling value from the at least one concept-value pair for the modifier concept;
scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
propagating the at least one concept-value pair for the first concept.
Dependent claims: 33, 34.

31. A data processing system, made with computing hardware and programmable memory, for determining whether a concept is referenced by a first unit of natural language discourse, comprising:
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish adding at least one concept-value pair, each of which indicates a reference to a same first concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features contains many linguistic features;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish adding at least one concept-value pair, each of which indicates a reference to a same modifier concept that can modify a reference level assigned to the first concept, to at least one node of the first parse structure, wherein each reference to the modifier concept is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features contains many linguistic features;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish propagating the at least one concept-value pair for the modifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish identifying a first node of the first parse structure that has at least one concept-value pair for the first concept and at least one concept-value pair for the modifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish determining a first value to be scaled from the at least one concept-value pair for the first concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish determining a first scaling value from the at least one concept-value pair for the modifier concept;
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
a sub-system configured, as a result of the computing hardware and programmable memory, to accomplish propagating the at least one concept-value pair for the first concept.

32. A data processing system, made with computing hardware and programmable memory, for determining whether a concept is referenced by a first unit of natural language discourse, comprising:
a means for parsing the first unit of natural language discourse into a first parse structure that represents each sub-unit, of the first unit of natural language discourse, by a node;
a means for adding at least one concept-value pair, each of which indicates a reference to a same first concept, to at least one node of the first parse structure, wherein each reference is determined by identifying an occurrence of a first linguistic feature from a first set of linguistic features and the first set of linguistic features contains many linguistic features;
a means for adding at least one concept-value pair, each of which indicates a reference to a same modifier concept that can modify a reference level assigned to the first concept, to at least one node of the first parse structure, wherein each reference to the modifier concept is determined by identifying an occurrence of a second linguistic feature from a second set of linguistic features and the second set of linguistic features contains many linguistic features;
a means for propagating the at least one concept-value pair for the modifier concept;
a means for identifying a first node of the first parse structure that has at least one concept-value pair for the first concept and at least one concept-value pair for the modifier concept;
a means for determining a first value to be scaled from the at least one concept-value pair for the first concept;
a means for determining a first scaling value from the at least one concept-value pair for the modifier concept;
a means for scaling the first value to be scaled with the first scaling value to produce a first scaled value; and
a means for propagating the at least one concept-value pair for the first concept.

Specification