Methods and apparatus for determining theme for discourse
Abstract
A content processing system determines the content of an input discourse. The system includes a theme vector processor that identifies themes in the input discourse and generates a theme strength for each theme, which indicates the relative thematic importance of that theme within the discourse. A knowledge catalog, which includes static ontologies arranged in a hierarchical structure, is also disclosed. The static ontologies are independent of and parallel to one another, and contain knowledge concepts that represent a world view of knowledge. The theme vector processor uses the static ontologies to generate a theme concept for each theme by extracting a knowledge concept from a higher-level node in the hierarchical structure of a static ontology.
324 Citations
11 Claims
1. A method for determining themes in an input discourse, said method comprising the steps of:
storing a thematic profile for said input discourse that includes a plurality of thematic tags for words in said input discourse, wherein said thematic tags indicate the existence or non-existence of a plurality of thematic constructions, and wherein said thematic constructions comprise a plurality of tests made against said words in the exact context of said discourse to determine thematic aspects or information about the overall theme of said discourse;
storing a lexicon comprising a plurality of words and definitional characteristics for said words; and
generating theme terms from words in said input discourse based on existence or non-existence of said thematic constructions as indicated by said thematic tags, and based on definitional characteristics of said words as indicated by said lexicon, wherein said theme terms identify overall content of said input discourse.
Dependent claims: 2, 3, 4, 5, 6, 7
storing a plurality of categories arranged hierarchically in a knowledge catalog; and
classifying themes of said input discourse by mapping at least one theme term into a category of said knowledge catalog.
5. The method as set forth in claim 4, further comprising the step of generating a theme concept for said theme term by extracting a category from a higher level node in said knowledge catalog.
6. The method as set forth in claim 5, further comprising the step of adding a theme concept as a theme term if more than one theme term maps to said theme concept.
7. The method as set forth in claim 1, further comprising the steps of:
determining whether each theme term is essentially non-ambiguous such that said theme term is commonly recognized as having a single sense; and
utilizing only non-ambiguous terms as theme terms.
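The classification steps recited in claims 4 through 6 can be illustrated with a minimal sketch. The catalog contents, the `PARENT` and `TERM_TO_CATEGORY` tables, and the function name `theme_concepts` are all hypothetical; the claims do not specify a data structure or how a theme term is mapped to a category, so this is one possible reading rather than the patented implementation.

```python
from collections import defaultdict

# Hypothetical knowledge catalog fragment: child category -> parent category.
PARENT = {
    "banking": "finance",
    "stock_market": "finance",
    "finance": "business",
    "football": "sports",
}

# Hypothetical mapping of theme terms into catalog categories (claim 4).
TERM_TO_CATEGORY = {
    "loan": "banking",
    "equities": "stock_market",
    "quarterback": "football",
}


def theme_concepts(theme_terms):
    """Return the theme terms plus any concept shared by more than one term."""
    terms_by_concept = defaultdict(set)
    for term in theme_terms:
        category = TERM_TO_CATEGORY.get(term)
        if category is None:
            continue  # assumption: terms missing from the catalog are skipped
        # Claim 5: the theme concept is taken from a higher-level node.
        concept = PARENT.get(category)
        if concept is not None:
            terms_by_concept[concept].add(term)

    result = list(theme_terms)
    for concept, terms in terms_by_concept.items():
        # Claim 6: add the concept only if more than one theme term maps to it.
        if len(terms) > 1 and concept not in result:
            result.append(concept)
    return result
```

In this sketch, "loan" and "equities" both reach the hypothetical "finance" node, so "finance" is added as a theme term, while "sports" is not because only "quarterback" maps to it.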
8. A method for classifying themes of an input discourse, said method comprising the steps of:
receiving a plurality of themes from said input discourse;
storing, to represent a knowledge catalog, a plurality of categories arranged hierarchically such that child categories associated with parent categories include both semantic and linguistic associations, wherein said linguistic associations include associations between at least two concepts where a concept representing a child category is a type of a concept representing a parent category, and semantic associations include associations between at least two concepts, generally associated together in language usage, but concepts of child categories are not a type of concepts of parent categories; and
classifying said themes of said input discourse by relating themes into categories of said knowledge catalog, wherein classification of said themes in categories of said knowledge catalog reflects semantic and linguistic relationships between said themes.
Dependent claims: 9, 10
determining whether each theme term is essentially non-ambiguous such that said theme term is commonly recognized as having a single sense; and
classifying only non-ambiguous terms in said knowledge catalog.
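The distinction claim 8 draws between linguistic associations (the child is a type of the parent) and semantic associations (the child is associated with the parent in usage but is not a type of it) might be represented as a label on each parent link. The `Category` class, the `link` labels, and the example categories below are assumptions made for illustration and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None
    # "linguistic": child is a type of the parent.
    # "semantic": child is associated with the parent but is not a type of it.
    # None for a root category.
    link: Optional[str] = None


def path_to_root(category):
    """Return (name, link_to_parent) pairs from a category up to the root."""
    path = []
    node = category
    while node is not None:
        path.append((node.name, node.link))
        node = node.parent
    return path


# Hypothetical catalog fragment.
vehicles = Category("vehicles")
cars = Category("cars", parent=vehicles, link="linguistic")  # a car is a type of vehicle
parking = Category("parking", parent=cars, link="semantic")  # related to cars, not a type of car
```

Walking from "parking" to the root with `path_to_root(parking)` yields both kinds of association, which is how a classification under this sketch would reflect semantic and linguistic relationships together.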
11. A method for determining theme in input discourse, said method comprising the steps of:
identifying a plurality of words or terms in said input discourse that define thematic content of said input discourse;
determining whether words or terms identified are essentially non-ambiguous such that said words or terms are commonly recognized as having a single sense;
selecting only non-ambiguous words or terms for processing to determine themes of said input discourse; and
processing said non-ambiguous words or terms to determine themes of said input discourse.
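The selection step in claim 11 (and the similar step in claim 7) can be sketched as a filter over candidate terms. The `SENSE_COUNT` table and both function names are hypothetical, and treating a term as non-ambiguous when a lexicon lists exactly one sense for it is an assumption; the claims only require that the term be commonly recognized as having a single sense.

```python
# Hypothetical lexicon: term -> number of commonly recognized senses.
SENSE_COUNT = {
    "bank": 2,
    "pitch": 3,
    "photosynthesis": 1,
    "osteoporosis": 1,
}


def is_non_ambiguous(term):
    """True if the lexicon lists exactly one sense for the term.

    Assumption: terms missing from the lexicon are treated as ambiguous.
    """
    return SENSE_COUNT.get(term.lower()) == 1


def select_theme_candidates(terms):
    """Keep only non-ambiguous terms, preserving their original order."""
    return [term for term in terms if is_non_ambiguous(term)]
```

Only the terms returned by `select_theme_candidates` would then be passed on to whatever theme-determination processing follows.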
Specification