NORMALIZING NON-NUMERIC FEATURES OF FILES
Abstract
Embodiments include methods, computer program products, and apparatuses for normalizing non-numeric features of files. Aspects include segmenting at least one pair of positive instances of a non-numeric feature of a file into a number of tokens and comparing the tokens in the at least one pair of positive instances to obtain matching tokens. Aspects also include calculating, for the matching tokens, weights of their matching the file, and storing the tokens and their weights in a token base.
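The three steps named in the abstract (segmenting, matching, and weighting into a token base) can be illustrated with a minimal Python sketch. Several details here are assumptions and are not stated in the text above: "positive instances" are taken to be feature values, such as file names, that are known to refer to the same file; segmentation splits on non-alphanumeric characters; and a token's weight is the fraction of positive pairs in which it matched. The function names `segment`, `matching_tokens`, and `build_token_base` are hypothetical.

```python
import re
from collections import Counter


def segment(value):
    """Split a non-numeric feature value into lowercase alphanumeric tokens.

    Assumption: any run of non-alphanumeric characters is a delimiter.
    """
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]


def matching_tokens(first, second):
    """Return the set of tokens that occur in both instances of a pair."""
    return set(segment(first)) & set(segment(second))


def build_token_base(positive_pairs):
    """Map each matching token to a weight.

    Assumption: the weight is the share of positive pairs in which the
    token matched; the source does not define the weighting formula.
    """
    counts = Counter()
    for first, second in positive_pairs:
        counts.update(matching_tokens(first, second))
    total = len(positive_pairs)
    return {token: count / total for token, count in counts.items()}


# Illustrative pairs of file names assumed to describe the same file.
pairs = [
    ("Q3_sales_report.xlsx", "sales-report-final.xlsx"),
    ("Sales Report 2020.pdf", "report_sales_v2.pdf"),
]
token_base = build_token_base(pairs)
```

With these two pairs, `sales` and `report` match in both pairs and receive weight 1.0, while `xlsx` and `pdf` each match in one pair and receive weight 0.5. A different weighting scheme could be substituted in `build_token_base` without changing the other two steps.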
15 Citations
20 Claims
1-9. (canceled)
10. An apparatus for normalizing non-numeric features of files, comprising:
a token segmenting module configured to segment at least one pair of positive instances of a non-numeric feature of a file into a number of tokens;
a token matching module configured to compare the tokens in the at least one pair of positive instances to obtain matching tokens; and
a token base constructing module configured to, for the matching tokens, calculate weights of their matching the file, and store the tokens and their weights in a token base.
18. A computer program product for normalizing non-numeric features of files, the computer program product comprising:
a non-transitory storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:
segmenting at least one pair of positive instances of a non-numeric feature of a file into a number of tokens;
comparing the tokens in the at least one pair of positive instances to obtain matching tokens; and
for each of the matching tokens, calculating weights of their matching the file, and storing the tokens and their weights in a token base.
Specification