Learning Discriminative Projections for Text Similarity Measures
Abstract
A model for mapping the raw text representation of a text object to a vector space is disclosed. A function is defined for computing a similarity score given two output vectors. A loss function is defined for computing an error based on the similarity scores and the labels of pairs of vectors. The parameters of the model are tuned to minimize the loss function. The label of two vectors indicates a degree of similarity of the objects. The label may be a binary number or a real-valued number. The function for computing similarity scores may be a cosine, Jaccard, or differentiable function. The loss function may compare pairs of vectors to their labels. Each element of the output vector is a linear or non-linear function of the terms of an input vector. The text objects may be different types of documents and two different models may be trained concurrently.
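The mapping and scoring steps summarized in the abstract can be illustrated with a minimal sketch. It assumes the simplest case the abstract allows: each output element is a linear function of the input terms (a matrix product), and the similarity score is a cosine. The matrix `A`, the three-term vocabulary, and the helper names are hypothetical and chosen by hand for illustration; they are not taken from the patent. The `extended_jaccard` helper uses one common real-valued extension of the Jaccard coefficient, which is likewise an assumption about how a Jaccard score might be applied to vectors.

```python
import math


def project(A, x):
    """Map a raw term vector x (length d) to a compact vector (length k).

    A is a k x d matrix (list of rows); each output element is a linear
    function of the input terms.
    """
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def extended_jaccard(u, v):
    """Real-valued extension of the Jaccard coefficient (assumed form)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sum(a * a for a in u) + sum(b * b for b in v) - dot)


# Hypothetical vocabulary: ["car", "automobile", "banana"].
# The hand-picked matrix sends the first two terms to the same compact
# dimension, so texts using either term land on the same compact vector.
A = [[1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

doc_a = [1.0, 0.0, 0.0]  # text containing only "car"
doc_b = [0.0, 1.0, 0.0]  # text containing only "automobile"

raw_score = cosine(doc_a, doc_b)                              # no shared terms
compact_score = cosine(project(A, doc_a), project(A, doc_b))  # same direction
```

In this toy setup the raw term vectors share no terms, while their compact vectors coincide, which is the kind of effect a learned projection is meant to produce. In the disclosed approach the parameters are tuned from labeled pairs rather than set by hand.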
20 Claims
1. A method performed on at least one processor for optimizing model parameters, comprising:
mapping raw text representations of text objects to a compact vector space using the model parameters;
computing similarity scores based upon compact vectors for two text objects;
calculating error values using a loss function operating on the computed similarity scores and labels associated with pairs of text objects; and
adjusting the model parameters to minimize the error values.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system, comprising:
a data storage device for storing model parameters for use in mapping raw text representations of text objects to a compact vector space;
a circuit for creating a compact vector using model parameters, the compact vector representing a text object;
a circuit for generating a similarity score by applying a similarity function to two compact vectors;
a circuit for applying a loss function to the similarity score and to a label, the label identifying a similarity of the text objects associated with the two compact vectors; and
a circuit for modifying the model parameters in a manner that minimizes an error value generated by the loss function.
Dependent claims: 15, 16, 17, 18
19. One or more computer-readable media having computer-executable instructions, which when executed perform steps, comprising:
mapping raw text representations of text objects to a compact vector space using the model parameters;
computing similarity scores based upon compact vectors for two text objects;
calculating error values using a loss function operating on the computed similarity scores and labels associated with pairs of text objects, wherein the labels indicate a degree of similarity of the pairs of text objects; and
adjusting the model parameters to minimize the error values.
Dependent claims: 20
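The four steps recited in the independent claims (mapping, scoring, computing an error, adjusting the parameters) can be sketched as a small training loop. This is one possible reading under stated assumptions, not the patented implementation: the model is a single linear projection matrix, the similarity is a cosine, the loss is a squared error between score and label, and the adjustment is plain stochastic gradient descent. The function names, learning rate, epoch count, and toy data are all hypothetical.

```python
import math
import random


def project(A, x):
    """Map a raw term vector x to a compact vector using k x d matrix A."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def similarity(A, x, y):
    """Cosine similarity of the compact vectors of two text objects."""
    u, v = project(A, x), project(A, y)
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def total_loss(A, pairs):
    """Sum of squared errors between similarity scores and labels."""
    return sum(0.5 * (similarity(A, x, y) - label) ** 2
               for x, y, label in pairs)


def train(A, pairs, lr=0.05, epochs=300):
    """Return a copy of A adjusted by gradient descent on the squared error."""
    A = [row[:] for row in A]  # leave the caller's parameters untouched
    for _ in range(epochs):
        for x, y, label in pairs:
            u, v = project(A, x), project(A, y)
            nu = math.sqrt(sum(a * a for a in u))
            nv = math.sqrt(sum(b * b for b in v))
            s = sum(a * b for a, b in zip(u, v)) / (nu * nv)
            err = s - label  # derivative of the loss with respect to s
            for i in range(len(A)):
                # Partial derivatives of the cosine w.r.t. u[i] and v[i].
                ds_du = v[i] / (nu * nv) - s * u[i] / (nu * nu)
                ds_dv = u[i] / (nu * nv) - s * v[i] / (nv * nv)
                for j in range(len(x)):
                    # Chain rule: u[i] depends on A[i][j] through x[j],
                    # and v[i] depends on A[i][j] through y[j].
                    A[i][j] -= lr * err * (ds_du * x[j] + ds_dv * y[j])
    return A


# Toy labeled pairs over a hypothetical 3-term vocabulary:
# (term vector, term vector, label), where label 1.0 means similar.
pairs = [
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 1.0),
    ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], 0.0),
]

rng = random.Random(0)
initial = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(2)]
trained = train(initial, pairs)
```

Comparing `total_loss(initial, pairs)` with `total_loss(trained, pairs)` shows whether the adjustment reduced the error on the labeled pairs. The labels here are binary, but the same loop accepts real-valued labels expressing a degree of similarity.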
Specification