Method for defining and optimizing criteria used to detect a contextually specific concept within a paragraph
Abstract
The present invention discloses building blocks necessary for creating a self-populating directory in which individual paragraphs are mapped to folders, each folder being associated with a specific concept or idea. Criteria for specifying a desired context for the associated concept are inherited by hierarchically subordinate folders. The inheritance of context criteria greatly simplifies the task of designing a self-populating directory. Also disclosed are routines for optimizing the level of recall and precision of the criteria used to populate the folder.
6 Claims
1. A methodology for defining a software folder used to construct a self-populating directory, comprising the steps of:
providing a label describing a concept to be associated with the folder; and
providing a folder definition having folder-specific criteria for detecting an expression of the concept and criteria inherited by hierarchically subordinate folders in the directory for detecting the context of the expression.
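The two kinds of criteria in claim 1 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Folder` class, its field names, the use of lowercase substring tests as a stand-in for word-stem matching, and the choice to apply a folder's context criteria to the folder itself as well as to its subordinates. The claim does not prescribe any of these details.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Folder:
    """Hypothetical folder: a concept label plus two kinds of criteria."""
    label: str
    concept_stems: List[str]  # folder-specific: detect an expression of the concept
    context_stems: List[str] = field(default_factory=list)  # inherited by subfolders
    parent: Optional["Folder"] = None

    def inherited_context(self) -> List[List[str]]:
        """Collect context criteria from this folder and every ancestor."""
        groups = []
        node = self
        while node is not None:
            if node.context_stems:
                groups.append(node.context_stems)
            node = node.parent
        return groups

    def matches(self, paragraph: str) -> bool:
        """True if the paragraph expresses the concept in the required context."""
        text = paragraph.lower()
        # Assumption: a stem "matches" when it occurs as a substring.
        has_concept = any(stem in text for stem in self.concept_stems)
        # Each context group (own or inherited) needs at least one matching stem.
        in_context = all(
            any(stem in text for stem in group)
            for group in self.inherited_context()
        )
        return has_concept and in_context


root = Folder("Employment", concept_stems=["employ"],
              context_stems=["employ", "worker"])
child = Folder("Termination", concept_stems=["terminat", "dismiss"], parent=root)

print(child.matches("The employer dismissed the worker without notice."))
print(child.matches("The contract was terminated by the supplier."))
```

In this sketch the `Termination` folder declares no context criteria of its own; it picks them up from `Employment`, which is the simplification the abstract attributes to inheritance.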
5. A tool for optimizing the recall level of a folder definition having folder-specific criteria for detecting an expression of the concept, comprising:
providing a collection of paragraphs;
providing a collection of noise words;
comparing each paragraph in the collection of paragraphs against the folder definition, and extracting from the collection any paragraph not satisfying the folder definition criteria;
extracting noise words contained in the collection of noise words from the collection of paragraphs;
extracting sentences from the collection of paragraphs which do not contain word stems used to specify the criteria for detecting the expression of the concept;
tabulating and outputting the frequency with which combinations of one, two, three and four adjacent words occur within the sentences remaining in the collection of paragraphs; and
wherein the user visually examines the frequency table to find combinations indicative of the concept, which are not already detected by the existing stem phrases.
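The steps of claim 5 can be sketched as a short Python routine. This is an illustration under stated assumptions, not the patented implementation: "extracting" is read as "removing" throughout (so only sentences containing a criteria stem remain), the folder definition is reduced to a list of stems matched as lowercase substrings, and the function name `ngram_frequencies` and its parameters are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Set


def ngram_frequencies(paragraphs: Iterable[str], stems: List[str],
                      noise_words: Set[str], max_n: int = 4) -> Counter:
    """Count 1- to max_n-word combinations in sentences that contain a stem."""
    counts: Counter = Counter()
    for paragraph in paragraphs:
        text = paragraph.lower()
        # Drop paragraphs that do not satisfy the (simplified) folder definition.
        if not any(stem in text for stem in stems):
            continue
        for sentence in re.split(r"[.!?]+", text):
            # Drop sentences that contain none of the criteria stems.
            if not any(stem in sentence for stem in stems):
                continue
            # Drop noise words before forming adjacent-word combinations.
            words = [w for w in re.findall(r"[a-z']+", sentence)
                     if w not in noise_words]
            for n in range(1, max_n + 1):
                for i in range(len(words) - n + 1):
                    counts[" ".join(words[i:i + n])] += 1
    return counts


sample = [
    "The worker was dismissed without notice. The weather was fine.",
    "She was dismissed without notice yesterday.",
    "Nothing relevant here.",
]
table = ngram_frequencies(sample, stems=["dismiss"],
                          noise_words={"the", "was", "she"})
for phrase, freq in table.most_common(5):
    print(freq, phrase)
```

The resulting table is what the user would scan for recurring combinations, such as "without notice" here, that suggest new stem phrases for the folder definition.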
6. A tool for optimizing the precision level of a folder definition having folder-specific criteria for detecting an expression of the concept and criteria inherited by hierarchically subordinate folders in the directory for detecting the context of the expression, comprising:
providing a collection of paragraphs;
comparing each paragraph in the collection of paragraphs against the folder definition, and extracting from the collection any paragraph not satisfying the folder definition criteria;
examining the collection of paragraphs and flagging those paragraphs in which the concept appears within an irrelevant context;
examining the collection of paragraphs to identify word(s) which recur in the flagged paragraphs at a substantially greater incidence than in the non-flagged paragraphs, and modifying the folder definition to disqualify paragraphs using such word(s);
if no recurring word(s) are detected for excluding the flagged paragraphs from the folder, then identifying word stems which recur in the flagged paragraphs at a substantially greater incidence than in the non-flagged paragraphs, and amending the folder definition to exclude such word stems;
if no recurring word stem is detected, then identifying Proximity Restriction(s) which exclude the flagged paragraphs at a substantially greater incidence than the non-flagged paragraphs, and amending the folder definition to include said Proximity Restriction(s); and
if no suitable Proximity Restriction is detected, then identifying Order Restriction(s) which exclude the flagged paragraphs at a substantially greater incidence than the non-flagged paragraphs, and amending the folder definition to include said Order Restriction(s).
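The first fallback level of claim 6, finding words that recur in flagged paragraphs far more often than in non-flagged ones, might look like the following Python sketch. The names `word_incidence` and `exclusion_candidates` are hypothetical, incidence is assumed to mean the fraction of paragraphs containing a word, and the `min_gap` threshold is an arbitrary stand-in for "substantially greater", which the claim does not quantify. The later fallbacks (word stems, Proximity Restrictions, Order Restrictions) are not covered here.

```python
import re
from typing import Dict, List, Tuple


def word_incidence(paragraphs: List[str]) -> Dict[str, float]:
    """Fraction of paragraphs in which each word appears at least once."""
    counts: Dict[str, int] = {}
    for paragraph in paragraphs:
        for word in set(re.findall(r"[a-z']+", paragraph.lower())):
            counts[word] = counts.get(word, 0) + 1
    total = len(paragraphs) or 1  # avoid division by zero on an empty list
    return {word: n / total for word, n in counts.items()}


def exclusion_candidates(flagged: List[str], unflagged: List[str],
                         min_gap: float = 0.5) -> List[Tuple[str, float, float]]:
    """Words whose incidence in flagged paragraphs exceeds the non-flagged
    incidence by at least min_gap (an assumed threshold)."""
    flagged_rate = word_incidence(flagged)
    unflagged_rate = word_incidence(unflagged)
    candidates = [
        (word, rate, unflagged_rate.get(word, 0.0))
        for word, rate in flagged_rate.items()
        if rate - unflagged_rate.get(word, 0.0) >= min_gap
    ]
    # Largest incidence gap first.
    return sorted(candidates, key=lambda c: c[1] - c[2], reverse=True)


flagged = ["The bank of the river flooded.",
           "Fishing from the river bank is allowed."]
unflagged = ["The bank approved the loan.",
             "A bank charges interest on a loan."]

for word, f_rate, u_rate in exclusion_candidates(flagged, unflagged, min_gap=0.75):
    print(word, f_rate, u_rate)
```

With these sample paragraphs for a hypothetical financial "bank" folder, only "river" clears the 0.75 gap, so it would be the word proposed for disqualifying paragraphs.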
Specification