Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors
Abstract
An apparatus and method are disclosed for producing a semantic representation of information in a semantic space. The information is first represented in a table that stores values which indicate a relationship with predetermined categories. The categories correspond to dimensions in the semantic space. The significance of the information with respect to the predetermined categories is then determined. A trainable semantic vector (TSV) is constructed to provide a semantic representation of the information. The TSV has dimensions equal to the number of predetermined categories and represents the significance of the information relative to each of the predetermined categories. Various types of manipulation and analysis, such as searching, classification, and clustering, can subsequently be performed on a semantic level.
78 Claims
-
1-40. (Cancelled)
-
41. A method of classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, the method comprising the steps:
-
constructing a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
constructing a trainable semantic vector for each category based on the trainable semantic vectors for the sample datasets;
receiving a new dataset;
constructing a trainable semantic vector for the new dataset;
determining a distance between the trainable semantic vector for the new dataset and the trainable semantic vector of each category; and
classifying the new dataset within the category whose trainable semantic vector has the shortest distance to the trainable semantic vector of the new dataset. - View Dependent Claims (42, 43, 44, 45, 72)
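The steps of claim 41 can be sketched as a nearest-category-vector classifier. This is only an illustrative sketch: the claim does not say how a category vector is derived from the sample vectors or which distance is used, so the component-wise mean and the Euclidean distance below are assumptions, as are the function names and the toy two-dimensional vectors. The trainable semantic vectors themselves are assumed to have been constructed already.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors (assumed way to build a category TSV)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two vectors (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_category_vector(new_tsv, samples_by_category):
    """Return the category whose vector is closest to new_tsv.

    samples_by_category maps a category name to the list of TSVs of the
    sample datasets assigned to that category.
    """
    category_tsvs = {
        category: mean_vector(tsvs)
        for category, tsvs in samples_by_category.items()
    }
    return min(category_tsvs, key=lambda c: euclidean(new_tsv, category_tsvs[c]))


# Hypothetical sample TSVs in a two-category semantic space.
samples = {
    "sports": [[0.9, 0.1], [0.8, 0.2]],
    "finance": [[0.2, 0.8], [0.1, 0.9]],
}
result = classify_by_category_vector([0.7, 0.3], samples)
print(result)
```

Here the new vector lies nearer the mean of the "sports" samples than the mean of the "finance" samples, so it is classified as "sports".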
-
-
46. A method of classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, the method comprising the steps:
-
constructing a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
receiving a new dataset;
constructing a trainable semantic vector for the new dataset;
identifying a select number of sample datasets whose trainable semantic vectors are closest in distance to the trainable semantic vector for the new dataset; and
classifying the new dataset in the category containing the greatest number of the select sample datasets. - View Dependent Claims (47, 48, 49, 73)
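The steps of claim 46 resemble a k-nearest-neighbour vote over the sample vectors. The sketch below assumes the sample TSVs are already constructed, uses Euclidean distance as a stand-in for the unspecified distance, and treats the "select number" as a parameter k. The function names, the toy data, and the tie-breaking behaviour (whichever tied category is encountered first among the neighbours) are all assumptions, not part of the claim.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two vectors (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_nearest_samples(new_tsv, labeled_samples, k=3):
    """Classify new_tsv by majority category among its k closest sample TSVs.

    labeled_samples is a list of (tsv, category) pairs.
    """
    nearest = sorted(labeled_samples, key=lambda s: euclidean(new_tsv, s[0]))[:k]
    votes = Counter(category for _, category in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical sample TSVs with their assigned categories.
labeled = [
    ([0.9, 0.1], "sports"),
    ([0.8, 0.2], "sports"),
    ([0.6, 0.4], "finance"),
    ([0.1, 0.9], "finance"),
    ([0.2, 0.8], "finance"),
]
knn_result = classify_by_nearest_samples([0.7, 0.3], labeled, k=3)
print(knn_result)
```

With k = 3, the three closest samples are two "sports" vectors and one "finance" vector, so the new dataset is placed in "sports" even though "finance" has more samples overall.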
-
-
50. A method of classifying new datasets within a predetermined number of categories, the method comprising the steps:
-
receiving a new dataset;
constructing a trainable semantic vector for the new dataset, where the dimensions of the trainable semantic vector correspond to the predetermined number of categories;
classifying the dataset in the category whose corresponding dimension in the trainable semantic vector has the largest value. - View Dependent Claims (51, 52, 74)
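Claim 50 needs no sample vectors at classification time: because each dimension of the TSV corresponds to one category, the category is simply the dimension holding the largest value. A minimal sketch follows, assuming the TSV has already been constructed and that the category names are supplied in dimension order; the function name and the example values are hypothetical.

```python
def classify_by_largest_dimension(tsv, categories):
    """Return the category whose dimension in tsv has the largest value.

    categories[i] is assumed to name the category for dimension i.
    """
    if len(tsv) != len(categories):
        raise ValueError("TSV must have one dimension per category")
    best_index = max(range(len(tsv)), key=lambda i: tsv[i])
    return categories[best_index]


# Hypothetical three-category TSV.
argmax_result = classify_by_largest_dimension(
    [0.2, 0.7, 0.1], ["sports", "finance", "health"]
)
print(argmax_result)
```

If two dimensions share the largest value, this sketch returns the first of them; the claim does not address ties.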
-
-
53-62. (Cancelled)
-
63. A system for classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, the system comprising:
a computer configured to:
construct a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
construct a trainable semantic vector for each category based on the trainable semantic vectors for the sample datasets;
receive a new dataset;
construct a trainable semantic vector for the new dataset;
determine a distance between the trainable semantic vector for the new dataset and the trainable semantic vector of each category; and
classify the new dataset within the category whose trainable semantic vector has the shortest distance to the trainable semantic vector of the new dataset. - View Dependent Claims (75)
-
64. A system for classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, the system comprising:
a computer configured to:
construct a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
receive a new dataset;
construct a trainable semantic vector for the new dataset;
identify a select number of sample datasets whose trainable semantic vectors are closest in distance to the trainable semantic vector for the new dataset; and
classify the new dataset in the category containing the greatest number of the select sample datasets. - View Dependent Claims (76)
-
65-68. (Cancelled)
-
69. A computer-readable medium carrying one or more sequences of instructions for classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
constructing a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
constructing a trainable semantic vector for each category based on the trainable semantic vectors for the sample datasets;
receiving a new dataset;
constructing a trainable semantic vector for the new dataset;
determining a distance between the trainable semantic vector for the new dataset and the trainable semantic vector of each category; and
classifying the new dataset within the category whose trainable semantic vector has the shortest distance to the trainable semantic vector of the new dataset. - View Dependent Claims (77)
-
-
70. A computer-readable medium carrying one or more sequences of instructions for classifying new datasets within a predetermined number of categories based on assignment of a plurality of sample datasets to each category, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
constructing a trainable semantic vector for each sample dataset relative to the predetermined categories in a multi-dimensional semantic space;
receiving a new dataset;
constructing a trainable semantic vector for the new dataset;
identifying a select number of select datasets whose trainable semantic vectors are closest in distance to the trainable semantic vector for the new dataset; and
classifying the new dataset in the category containing the greatest number of the select datasets. - View Dependent Claims (78)
-
-
71. (Cancelled)
Specification