INFORMATION FILTERING SYSTEM, INFORMATION FILTERING METHOD AND INFORMATION FILTERING PROGRAM
First Claim
1. An information filtering system comprising:
a first filter unit inputting document data belonging to at least any kind among a plurality of kinds from an input equipment, carrying out a predetermined classifying process to specify a kind to which the document data inputted belongs using a CPU (Central Processing Unit), and specifying the kind to which the document data inputted belongs as first classified information;
a second filter unit inputting the document data from an input equipment, carrying out a predetermined classifying process being different from the classifying process of the first filter unit using a CPU, specifying a kind to which the document data inputted belongs as second classified information;
a first correct answer comparing unit comparing the first classified information of a plurality of pieces of learning document data specified by the first filter unit with treating each of the plurality of pieces of learning document data which belongs to a kind previously specified as the document data and correct answer information of the plurality of pieces of learning document data which belongs to the kind previously specified using a CPU, generating first learning result information of the plurality of pieces of learning document data showing whether the first classified information matches the correct answer information or not based on comparison result, and storing the first learning result information generated of the plurality of pieces of learning document data in a memory equipment;
a second correct answer comparing unit comparing the second classified information of the plurality of pieces of learning document data specified by the second filter unit with treating each of the plurality of pieces of learning document data as the document data and the correct answer information of the plurality of pieces of learning document data using a CPU, generating second learning result information of the plurality of pieces of learning document data showing whether the second classified information matches the correct answer information based on comparison result, and storing the second learning result information generated of the plurality of pieces of learning document data in a memory equipment;
an error rate calculating unit calculating a first error rate showing a rate that the first classified information does not match the correct answer information based on the first learning result information of the plurality of pieces of learning document data generated by the first correct answer comparing unit using a CPU, and as well calculating a second error rate showing a rate that the second classified information does not match the correct answer information based on the second learning result information of the plurality of pieces of learning document data generated by the second correct answer comparing unit using a CPU; and
a result outputting unit specifying a kind to which the classifying target document data belongs using a CPU based on the first classified information specified by the first filter unit with treating classifying target document data which is a target to be classified to a specific kind as the document data, the second classified information specified by the second filter unit with treating the classifying target document data as the document data, the first error rate calculated by the error rate calculating unit, and the second error rate calculated by the error rate calculating unit, and outputting the kind specified to an output equipment as a classified result.
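As a rough illustration of the error rate calculating unit and the result outputting unit recited above, the following minimal sketch computes each filter's error rate from learning results and outputs the kind specified by the filter with the lower error rate. The "lower error rate wins" rule follows the Abstract. The function names, the tie-breaking choice, and the sample kinds ("confidential", "public") are assumptions made for illustration and are not taken from the claim.

```python
from typing import List, Sequence


def learning_results(classified: Sequence[str], correct: Sequence[str]) -> List[bool]:
    """Return True for each learning document whose classified kind matches the correct answer."""
    if len(classified) != len(correct):
        raise ValueError("classified and correct must have the same length")
    return [c == a for c, a in zip(classified, correct)]


def error_rate(results: Sequence[bool]) -> float:
    """Fraction of learning documents whose classified kind did not match the correct answer."""
    if not results:
        raise ValueError("no learning results available")
    mismatches = sum(1 for matched in results if not matched)
    return mismatches / len(results)


def output_result(first_kind: str, second_kind: str,
                  first_error: float, second_error: float) -> str:
    """Output the kind specified by the filter with the smaller error rate.

    Assumption: on a tie, the first filter's kind is used.
    """
    return first_kind if first_error <= second_error else second_kind


# Hypothetical learning data: correct answers and each filter's output.
correct = ["confidential", "public", "confidential", "public"]
first_filter_output = ["confidential", "public", "public", "public"]   # 1 mismatch
second_filter_output = ["public", "public", "public", "public"]        # 2 mismatches

first_error = error_rate(learning_results(first_filter_output, correct))
second_error = error_rate(learning_results(second_filter_output, correct))

# The two filters disagree on the classifying target document.
classified_result = output_result("confidential", "public", first_error, second_error)
```

Here the first filter has the lower error rate on the learning documents, so its kind is output for the classifying target document.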
Abstract
A string matching unit 110 specifies a category of an input document 801 by string matching of the input document 801 and a classifying keyword shown by matching condition information 109. Learning data 209 shows statistic information of each category. A classifying unit 220 specifies the category of the input document 801 based on a correspondence ratio of the input document 801 and the statistic information shown by the learning data 209. A correct answer comparing unit 120 compares the category specified by the string matching unit 110 and a category of correct answer information 803. A learning unit 210 compares the category specified by the classifying unit 220 and the category of the correct answer information 803. An error rate calculating unit 310 calculates a classifying error rate of a string matching filter unit 100 and a learning filter unit 200 based on the comparison result of the correct answer comparing unit 120 and the comparison result of the learning unit 210. A result outputting unit 320 outputs the category specified by the filter having a smaller classifying error rate as a classified result 301 of a classifying target document 804.
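The two filters described in the Abstract can be sketched roughly as follows: one specifies a category by matching classifying keywords against the input document, the other specifies the category whose learned statistics best correspond to the document. The Abstract does not define how the correspondence ratio is computed, so the word-count ratio below is only an assumed stand-in, and all function names, keywords, and sample texts are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def string_matching_filter(document: str,
                           keywords: Dict[str, List[str]]) -> Optional[str]:
    """Return the first category with a classifying keyword found in the document."""
    text = document.lower()
    for category, words in keywords.items():
        if any(word.lower() in text for word in words):
            return category
    return None  # no keyword matched


def learn(samples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build per-category word counts from (text, category) learning samples."""
    stats: Dict[str, Counter] = {}
    for text, category in samples:
        stats.setdefault(category, Counter()).update(text.lower().split())
    return stats


def learning_filter(document: str, stats: Dict[str, Counter]) -> str:
    """Return the category whose statistics best correspond to the document.

    Assumption: correspondence is the summed count of the document's words
    in a category, divided by that category's total word count.
    """
    words = document.lower().split()

    def ratio(counter: Counter) -> float:
        total = sum(counter.values())
        return sum(counter[w] for w in words) / total if total else 0.0

    return max(stats, key=lambda category: ratio(stats[category]))


# Hypothetical matching conditions and learning samples.
keywords = {"confidential": ["secret", "internal only"], "personal": ["home address"]}
samples = [
    ("the secret project budget", "confidential"),
    ("my home address and phone", "personal"),
]
stats = learn(samples)

document = "budget for the secret project"
by_string_matching = string_matching_filter(document, keywords)
by_learning = learning_filter(document, stats)
```

Keeping the two filters behind the same "document in, category out" shape makes it straightforward to compare each one against correct answer information and measure its classifying error rate.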
31 Citations
18 Claims
1. An information filtering system comprising the units recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. An information filtering method comprising:
by a first filter unit, performing a first filter process of inputting document data belonging to at least any kind among a plurality of kinds from an input equipment, carrying out a predetermined classifying process to specify a kind to which the document data inputted belongs using a CPU (Central Processing Unit), and specifying the kind to which the document data inputted belongs as first classified information;
by a second filter unit, performing a second filtering process of inputting the document data from the input equipment, carrying out a predetermined classifying process being different from the classifying process of the first filter unit using a CPU, specifying the kind to which the document data inputted belongs as second classified information;
by a first correct answer comparing unit, performing a first correct answer comparing process of treating each of a plurality of pieces of learning document data which belongs to a kind previously specified as the document data, comparing the first classified information of the plurality of pieces of learning document data specified by the first filter unit and correct answer information of the plurality of pieces of learning document data which belongs to the kind previously specified using a CPU, generating first learning result information of the plurality of pieces of learning document data showing whether the first classified information matches the correct answer information or not based on comparison result, and storing the first learning result information generated of the plurality of pieces of learning document data in a memory equipment;
by a second correct answer comparing unit, performing a second correct answer comparing process of treating each of the plurality of pieces of learning document data as the document data, comparing the second classified information of the plurality of pieces of learning document data specified by the second filter unit and the correct answer information of the plurality of pieces of learning document data using a CPU, generating second learning result information of the plurality of pieces of learning document data showing whether the second classified information matches the correct answer information based on the comparison result, and storing the second learning result information generated of the plurality of pieces of learning document data in the memory equipment;
by an error rate calculating unit, performing an error rate calculating process of calculating a first error rate showing a rate that the first classified information does not match the correct answer information based on the first learning result information of the plurality of pieces of learning document data generated by the first correct answer comparing unit using a CPU, and as well calculating a second error rate showing a rate that the second classified information does not match the correct answer information based on the second learning result information of the plurality of pieces of learning document data generated by the second correct answer comparing unit using a CPU; and
by a result outputting unit, performing a result outputting process of treating classifying target document data which is a target to be classified to a specific kind as the document data, specifying the kind to which the classifying target document data belongs using a CPU based on the first classified information specified by the first filter unit, the second classified information specified by the second filter unit, the first error rate calculated by the error rate calculating unit, and the second error rate calculated by the error rate calculating unit, and outputting the kind specified to an output equipment as a classified result.
Dependent claims: 18
Specification