Dictionary creation device and dictionary creation method
First Claim
1. A dictionary creation device that creates a dictionary used for searching, classifying, or filtering information written as text, said device comprising:
a keyword extraction unit configured to extract a keyword from a text information group made up of one or more pieces of text information;
a keyword statistics unit configured to find statistics regarding an appearance of the keyword within the text information group;
a keyword assessment value calculation unit configured to calculate, using a processor, an assessment value for the keyword based on the statistics;
a keyword storage unit configured to store a pair made up of the keyword and the assessment value of that keyword, said keyword storage unit being a memory unit;
a determination unit configured to determine whether or not to register the keyword in the dictionary, or whether or not to delete the keyword from the dictionary, based on a degree of change between the assessment value newly calculated by said keyword assessment value calculation unit and the assessment value stored in said keyword storage unit; and
a dictionary registration and deletion unit configured to register or delete the keyword in the dictionary based on a result of the determination,
wherein the assessment value calculated by said keyword assessment value calculation unit is an appearance frequency at which the keyword appears in the text information group,
wherein said determination unit is configured to determine to delete the keyword from the dictionary in the case where the keyword is already registered in the dictionary and a degree of change in the appearance frequency is greater than or equal to a predetermined threshold value, the degree of change in the appearance frequency indicating a difference between the appearance frequency and a previously calculated appearance frequency, and
wherein the same keyword is used for calculating the appearance frequency and the previously calculated appearance frequency.
Abstract
A dictionary creation device and dictionary creation method are provided which optimally create and update a dictionary for classifying, searching, or extracting text information in accordance with changes in the content of text information groups. The dictionary creation device includes a keyword extraction unit that extracts a keyword from inputted text information and a keyword statistics unit that finds statistics regarding an appearance of the keyword. The dictionary creation device further includes a keyword assessment value calculation unit that calculates an assessment value of the extracted keyword based on the statistics regarding the appearance of the keyword, a determination unit that determines whether or not to register or delete the keyword based on the calculated assessment value, a dictionary registration and deletion unit that registers the keyword in, or deletes it from, a dictionary database based on a result of the determination performed by the determination unit, and the dictionary database itself.
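The extraction, statistics, and assessment stages described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the tokenizing rule in extract_keywords, the function names, and the sample texts are all assumptions made for illustration, and the appearance frequency is taken here to be the fraction of texts in the group that contain the keyword.

```python
import re
from collections import Counter


def extract_keywords(text):
    """Hypothetical extractor: lowercase alphabetic tokens of 3+ letters."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def appearance_frequency(texts):
    """Assessment value (assumed definition): the fraction of texts in the
    group in which each keyword appears at least once."""
    counts = Counter()
    for text in texts:
        # A set is used so each keyword is counted once per text.
        counts.update(extract_keywords(text))
    total = len(texts)
    return {kw: n / total for kw, n in counts.items()}


# Illustrative text information group (invented sample data).
texts = [
    "Solar panel prices fall",
    "New solar farm opens",
    "Election results announced",
    "Solar subsidy debate",
]
freq = appearance_frequency(texts)
```

In this sample group, "solar" appears in three of the four texts, so its assessment value under the assumed definition is 0.75.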
13 Claims
1. A dictionary creation device that creates a dictionary used for searching, classifying, or filtering information written as text, said device comprising:
a keyword extraction unit configured to extract a keyword from a text information group made up of one or more pieces of text information;
a keyword statistics unit configured to find statistics regarding an appearance of the keyword within the text information group;
a keyword assessment value calculation unit configured to calculate, using a processor, an assessment value for the keyword based on the statistics;
a keyword storage unit configured to store a pair made up of the keyword and the assessment value of that keyword, said keyword storage unit being a memory unit;
a determination unit configured to determine whether or not to register the keyword in the dictionary, or whether or not to delete the keyword from the dictionary, based on a degree of change between the assessment value newly calculated by said keyword assessment value calculation unit and the assessment value stored in said keyword storage unit; and
a dictionary registration and deletion unit configured to register or delete the keyword in the dictionary based on a result of the determination,
wherein the assessment value calculated by said keyword assessment value calculation unit is an appearance frequency at which the keyword appears in the text information group,
wherein said determination unit is configured to determine to delete the keyword from the dictionary in the case where the keyword is already registered in the dictionary and a degree of change in the appearance frequency is greater than or equal to a predetermined threshold value, the degree of change in the appearance frequency indicating a difference between the appearance frequency and a previously calculated appearance frequency, and
wherein the same keyword is used for calculating the appearance frequency and the previously calculated appearance frequency.
12. A dictionary creation method for creating a dictionary used for searching, classifying, or filtering information written as text, said method comprising:
extracting a keyword from a text information group made up of one or more pieces of text information;
finding statistics regarding an appearance of the keyword within the text information group;
calculating, using a processor, an assessment value for the keyword based on the statistics;
storing, in a keyword storage unit, a pair made up of the keyword and the assessment value of that keyword, said keyword storage unit being a memory unit;
determining whether or not to register the keyword in the dictionary, or whether or not to delete the keyword from the dictionary, based on a degree of change between the assessment value newly calculated in said calculating and the assessment value stored in said keyword storage unit; and
registering or deleting the keyword in the dictionary based on a result of the determination,
wherein the assessment value calculated in said calculating is an appearance frequency at which the keyword appears in the text information group,
wherein said determining comprises determining to delete the keyword from the dictionary in the case where the keyword is already registered in the dictionary and a degree of change in the appearance frequency is greater than or equal to a predetermined threshold value, the degree of change in the appearance frequency indicating a difference between the appearance frequency and a previously calculated appearance frequency, and
wherein the same keyword is used for calculating the appearance frequency and the previously calculated appearance frequency.
13. A program embodied on a non-transitory computer-readable storage medium for creating a dictionary used for searching, classifying, or filtering information written as text, said program causing a computer to execute the steps of:
extracting a keyword from a text information group made up of one or more pieces of text information;
finding statistics regarding an appearance of the keyword within the text information group;
calculating, using a processor, an assessment value for the keyword based on the statistics;
storing, in a keyword storage unit, a pair made up of the keyword and the assessment value of that keyword, said keyword storage unit being a memory unit;
determining whether or not to register the keyword in the dictionary, or whether or not to delete the keyword from the dictionary, based on a degree of change between the assessment value newly calculated in said calculating and the assessment value stored in said keyword storage unit; and
registering or deleting the keyword in the dictionary based on a result of the determination,
wherein the assessment value calculated in said calculating is an appearance frequency at which the keyword appears in the text information group,
wherein said determining comprises determining to delete the keyword from the dictionary in the case where the keyword is already registered in the dictionary and a degree of change in the appearance frequency is greater than or equal to a predetermined threshold value, the degree of change in the appearance frequency indicating a difference between the appearance frequency and a previously calculated appearance frequency, and
wherein the same keyword is used for calculating the appearance frequency and the previously calculated appearance frequency.
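The deletion determination recited in claims 1, 12, and 13 can be sketched in Python as follows. This is an illustrative reading, not the claimed implementation: the function name update_dictionary, the use of a set for the dictionary and a dict for the keyword storage unit, the sample values, and the choice of an absolute difference as the "degree of change" are all assumptions. The registration rule is not spelled out in these claims, so the sketch covers deletion only.

```python
def update_dictionary(dictionary, stored, new_freqs, threshold):
    """Delete registered keywords whose appearance frequency changed by at
    least `threshold` since the previously stored value.

    dictionary: set of registered keywords (assumed representation)
    stored:     dict of keyword -> previously calculated frequency
    new_freqs:  dict of keyword -> newly calculated frequency
    Returns the list of keywords deleted in this pass.
    """
    deleted = []
    for kw, new_value in new_freqs.items():
        previous = stored.get(kw)
        if previous is not None and kw in dictionary:
            # Degree of change, assumed here to be the absolute difference
            # between the new and the previously calculated frequency.
            if abs(new_value - previous) >= threshold:
                dictionary.discard(kw)
                deleted.append(kw)
        # Store the (keyword, assessment value) pair for the next pass.
        stored[kw] = new_value
    return deleted


# Invented sample state for illustration.
dictionary = {"solar", "election"}
stored = {"solar": 0.75, "election": 0.25}
new_freqs = {"solar": 0.25, "election": 0.25, "subsidy": 0.5}
deleted = update_dictionary(dictionary, stored, new_freqs, threshold=0.3)
```

With these sample values, the frequency of "solar" changes by 0.5, which meets the assumed threshold of 0.3, so it is removed from the dictionary; "election" is unchanged and stays registered, and "subsidy" is only recorded in storage because it has no previous value to compare against.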
Specification