Method for dynamic knowledge capturing in production printing workflow domain
First Claim
1. A knowledge base system for managing a knowledge base executable by at least one processor for providing for collecting, organizing and receiving a data instance comprising:
at least one storage device accessible by the at least one processor for storing a plurality of data instances;
a user interface device for receiving the at least one data instance; and
a memory storing a series of executable instructions executable by the at least one processor for capturing a received data instance and determining via a field dependent heuristic determination if the received data instance is a duplicate of any data instance of the plurality of stored data instances, wherein the series of executable instructions are further executed by the at least one processor to manage the knowledge base system as a dynamic knowledge base system comprising updating the knowledge base system, which includes storing the received data instance in the at least one storage device as a new data instance only when the determination of duplicity is that the received data instance is not a duplicate of any of the data instances of the plurality of stored data instances, wherein the received data instance and the plurality of stored data instances each include at least one field each having an item, each item including at least one token, each token including a sequence of at least one character;
wherein the determination by the at least one processor comprises:
for each field of the received data instance comparing between tokens of the at least one token of the field and the at least one token of a corresponding field of a respective stored data instance and generating at least one corresponding token similarity value, wherein each token comparison between a first token and a second token includes determining a degree of matching between characters of the at least one character of the first token and the at least one character of the second token, including taking character sequence into account, and outputting a field similarity degree based on the at least one token similarity value; and
for each respective stored data instance generating an instance similarity value based on the field similarity degree corresponding to the respective fields of the received data instance, wherein the determination of duplicity between the received data instance and the respective stored data instance is based on the instance similarity value.
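The three-level comparison recited in the claim (token, field, instance) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: `difflib.SequenceMatcher` stands in for the order-sensitive character matching, each token is scored by its best match in the other field, fields are weighted equally, and the 0.9 duplicate threshold and the field names are hypothetical.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Order-sensitive character match between two tokens, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def field_similarity(item_a: str, item_b: str) -> float:
    """Average, over tokens of item_a, of the best token match in item_b."""
    tokens_a, tokens_b = item_a.split(), item_b.split()
    if not tokens_a or not tokens_b:
        return 1.0 if tokens_a == tokens_b else 0.0
    best = [max(token_similarity(t, u) for u in tokens_b) for t in tokens_a]
    return sum(best) / len(best)


def instance_similarity(received: dict, stored: dict) -> float:
    """Unweighted mean of field similarities (equal weights are an assumption)."""
    fields = list(received)
    if not fields:
        return 0.0
    scores = [field_similarity(received[f], stored.get(f, "")) for f in fields]
    return sum(scores) / len(scores)


def store_if_new(received: dict, knowledge_base: list, threshold: float = 0.9) -> bool:
    """Append the received instance only when no stored instance is a duplicate."""
    if any(instance_similarity(received, s) >= threshold for s in knowledge_base):
        return False
    knowledge_base.append(received)
    return True


kb = [{"problem": "paper jam in finisher", "printer": "press-a"}]
near_copy = {"problem": "paper jam in finishr", "printer": "press-a"}
different = {"problem": "toner low warning", "printer": "press-b"}
added_near_copy = store_if_new(near_copy, kb)   # misspelled copy is rejected
added_different = store_if_new(different, kb)   # unrelated instance is stored
```

A field-dependent variant could replace the equal weights with per-field weights or per-field comparison functions; the claim leaves that choice open.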
Abstract
A system and method are provided for managing a knowledge base system storing a plurality of data instances, each data instance including at least one field, each field having at least one item and provided with an associated field type indicating whether the field is allowed to have only a single item or multiple items. At least one large itemset is determined by generating a plurality of itemsets formed of possible combinations of items selected from items corresponding to fields of the stored data instances. Itemsets having a combination of more than one item corresponding to a field having an associated field type indicating that the field is allowed to have only a single value are eliminated. The remaining itemsets are processed for generating associate rules.
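The pruning step described in the abstract can be sketched as follows. This is a hedged illustration, not the patented implementation: the representation of an item as a `(field, value)` pair, the `"single"`/`"multi"` field-type labels, the size cap `max_size`, and the sample field names are all assumptions.

```python
from itertools import combinations


def candidate_itemsets(instances, max_size=2):
    """Yield every combination of up to max_size (field, value) items."""
    items = sorted({(field, value)
                    for inst in instances
                    for field, values in inst.items()
                    for value in values})
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def violates_single_item_field(itemset, field_types):
    """True if the itemset holds two or more items of a single-item field."""
    seen = set()
    for field, _ in itemset:
        if field_types[field] == "single":
            if field in seen:
                return True
            seen.add(field)
    return False


def prune(itemsets, field_types):
    """Drop itemsets that no valid data instance could ever contain."""
    return [s for s in itemsets if not violates_single_item_field(s, field_types)]


field_types = {"printer": "single", "symptom": "multi"}
instances = [
    {"printer": ["A"], "symptom": ["jam", "streak"]},
    {"printer": ["B"], "symptom": ["jam"]},
]
candidates = list(candidate_itemsets(instances))
remaining = prune(candidates, field_types)
```

Here the only itemset removed is the pair of two different `printer` values, since a single-item field can never hold both in one instance.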
44 Citations
22 Claims
1. A knowledge base system for managing a knowledge base executable by at least one processor for providing for collecting, organizing and receiving a data instance comprising:
at least one storage device accessible by the at least one processor for storing a plurality of data instances;
a user interface device for receiving the at least one data instance; and
a memory storing a series of executable instructions executable by the at least one processor for capturing a received data instance and determining via a field dependent heuristic determination if the received data instance is a duplicate of any data instance of the plurality of stored data instances, wherein the series of executable instructions are further executed by the at least one processor to manage the knowledge base system as a dynamic knowledge base system comprising updating the knowledge base system, which includes storing the received data instance in the at least one storage device as a new data instance only when the determination of duplicity is that the received data instance is not a duplicate of any of the data instances of the plurality of stored data instances, wherein the received data instance and the plurality of stored data instances each include at least one field each having an item, each item including at least one token, each token including a sequence of at least one character;
wherein the determination by the at least one processor comprises:
for each field of the received data instance comparing between tokens of the at least one token of the field and the at least one token of a corresponding field of a respective stored data instance and generating at least one corresponding token similarity value, wherein each token comparison between a first token and a second token includes determining a degree of matching between characters of the at least one character of the first token and the at least one character of the second token, including taking character sequence into account, and outputting a field similarity degree based on the at least one token similarity value; and
for each respective stored data instance generating an instance similarity value based on the field similarity degree corresponding to the respective fields of the received data instance, wherein the determination of duplicity between the received data instance and the respective stored data instance is based on the instance similarity value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A knowledge base system for managing a knowledge base executable by at least one processor for providing for collecting, organizing and receiving a data instance for operation in a production printing workflow environment comprising:
at least one storage device accessible by at least one processor for storing a plurality of data instances; and
a memory storing a series of executable instructions executable by the at least one processor for generating at least one associate rule associated with a plurality of stored data instances, wherein the plurality of stored data instances each include at least one field, each having at least one item, and an associated field type for indicating whether the field is allowed to have one of only a single item and multiple items, wherein the generating at least one associate rule by the at least one processor comprises:
generating a plurality of itemsets formed of possible combinations of at least one item selected from the at least one item corresponding to the at least one field of the plurality of stored data instances;
eliminating at least one itemset from the plurality of itemsets having a combination of more than one item corresponding to a field having an associated field type indicating that the field is allowed to have only a single value; and
processing a remaining at least one itemset for deriving at least one associate rule.
Dependent claims: 14, 15, 16
17. A method for managing a knowledge base system, the method comprising:
storing a plurality of data instances, each data instance of the plurality of data instances including at least one field each having at least one item;
providing each field of the at least one field with an associated field type for indicating whether the field is allowed to have one of only a single item and multiple items;
generating a plurality of itemsets formed of possible combinations of at least one item selected from the at least one item corresponding to the at least one field of the plurality of stored data instances;
eliminating at least one itemset having a combination of more than one item corresponding to a field having an associated field type indicating that the field is allowed to have only a single value; and
processing at least one remaining itemset for generating at least one associate rule.
Dependent claims: 18, 19, 20, 21, 22
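The final step of the method, processing the remaining itemsets into rules, is not spelled out in the claim. One conventional way to do it, shown here purely as an assumption, is the support and confidence filtering familiar from Apriori-style association rule mining. The thresholds, the `(field, value)` item representation, and the sample data below are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of stored instances that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def derive_rules(itemsets, transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, confidence) for each qualifying rule."""
    rules = []
    for itemset in itemsets:
        s = support(itemset, transactions)
        if len(itemset) < 2 or s == 0 or s < min_support:
            continue  # skip itemsets too small or too rare to yield a rule
        for k in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), k)):
                confidence = s / support(antecedent, transactions)
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


p_a, p_b = ("printer", "A"), ("printer", "B")
jam, streak = ("symptom", "jam"), ("symptom", "streak")
transactions = [
    frozenset({p_a, jam}),
    frozenset({p_a, jam}),
    frozenset({p_b, jam}),
    frozenset({p_b, streak}),
]
rules = derive_rules([frozenset({p_a, jam})], transactions)
```

With this data, printer A always co-occurs with a jam, so the rule from printer A to jam passes the confidence threshold, while the reverse rule does not because jams also occur on printer B.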
Specification