Generating and using a knowledge-enhanced model
Abstract
Functionality is described herein for generating a model on the basis of user-behavioral data and knowledge data. In one case, the user-behavioral data identifies queries submitted by users, together with selections made by the users in response to the queries. The knowledge data represents relationships among linguistic items, as expressed by one or more structured knowledge resources. The functionality leverages the knowledge data to supply information regarding semantic relationships which may not be adequately captured by the user-behavioral data, to thereby produce a more robust and accurate model (compared to a model produced on the basis of only user-behavioral data). Functionality is also described herein for applying the model, once trained. In one case, the model may correspond to a deep learning model.
20 Claims
1. A method implemented by one or more computing devices, the method comprising:
sampling first click-through data from a repository, the first click-through data identifying queries submitted by users to a search engine and specific result items that the users clicked from search results provided by the search engine in response to the queries;
sampling structured knowledge data from one or more structured knowledge resources, the structured knowledge data providing semantic distances between various nouns identified in the one or more structured knowledge resources;
processing the structured knowledge data to obtain second click-through data, the second click-through data representing respective semantic distances between semantically related nouns as corresponding click values; and
training a model using the first click-through data and the second click-through data as training data, the model being trained using a machine-learning training process, wherein the model is configured to process input linguistic items and identify output linguistic items that are related to the input linguistic items.

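The processing step of claim 1 can be pictured as a conversion from a semantic distance to a pseudo click count, so that knowledge-derived pairs share a schema with real search-log rows. The following is a minimal sketch under stated assumptions, not the patented implementation: the claim does not fix a conversion formula, so the linear mapping, the `max_clicks` scale, the sample data, and every name used here (`ClickRecord`, `distance_to_click_value`, `knowledge_to_click_data`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickRecord:
    """One training row in click-through format: (query, item, click value)."""
    query: str
    item: str
    clicks: float


def distance_to_click_value(distance: float, max_distance: float,
                            max_clicks: float = 100.0) -> float:
    """Map a semantic distance to a click value (assumed linear mapping).

    Distance 0 yields max_clicks; max_distance or more yields 0.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    clamped = min(max(distance, 0.0), max_distance)
    closeness = 1.0 - clamped / max_distance
    return max_clicks * closeness


def knowledge_to_click_data(pairs, max_distance):
    """Convert (noun_a, noun_b, distance) triples into click-style records."""
    return [
        ClickRecord(a, b, distance_to_click_value(d, max_distance))
        for a, b, d in pairs
    ]


# Hypothetical inputs: one search-log row and three knowledge-resource distances.
first_click_data = [ClickRecord("seattle weather", "weather.example/seattle", 37.0)]
knowledge_pairs = [("dog", "canine", 1.0), ("dog", "cat", 2.0), ("dog", "car", 4.0)]

second_click_data = knowledge_to_click_data(knowledge_pairs, max_distance=4.0)

# Both sources now share one schema and can be passed to the same trainer.
training_data = first_click_data + second_click_data
```

In this sketch, closer nouns receive larger click values, which mirrors how a frequently clicked result signals a strong query-to-item relationship in the first click-through data.
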
11. A computer readable storage medium storing computer readable instructions, the computer readable instructions providing a semantic transformation module when executed by one or more processing devices, the computer readable instructions comprising:
logic configured to:
use a deep learning model to map an input linguistic item into a concept vector in a high-level conceptual space; and
identify, in the high-level conceptual space, one or more output linguistic items that are related to the input linguistic item,
the deep learning model capturing semantic relationships learned in a machine-learning training process performed on training instances of user-behavioral data and other training instances of structured knowledge data, the training instances of user-behavioral data identifying user-submitted linguistic items submitted by users to a search engine together with user clicks made by the users on user-clicked result items provided by the search engine in response to the user-submitted linguistic items, and the other training instances of structured knowledge data representing semantic distances between nouns expressed by one or more structured knowledge resources as corresponding click values.

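The two operations recited in claim 11, mapping an input linguistic item into a concept vector and then finding related items in that space, can be sketched as follows. This is an illustration only. The claim does not specify input features, network architecture, or a similarity measure, so the hashed letter-trigram features, the single tanh layer, the cosine comparison, the dimensions, and all function names are assumptions. The weights are random stand-ins for what a training process would learn, so only the exact-match behavior is meaningful here.

```python
import math
import random
import zlib

INPUT_DIM = 64    # size of the hashed letter-trigram input vector (assumed)
CONCEPT_DIM = 8   # size of the concept vector (assumed)

# Stand-in weights. A trained model would learn these from the training data.
_rng = random.Random(0)
WEIGHTS = [[_rng.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)]
           for _ in range(CONCEPT_DIM)]


def trigram_features(text: str) -> list[float]:
    """Hash the letter trigrams of `text` into a fixed-size count vector."""
    padded = f"#{text.lower()}#"
    vec = [0.0] * INPUT_DIM
    for i in range(len(padded) - 2):
        bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % INPUT_DIM
        vec[bucket] += 1.0
    return vec


def concept_vector(text: str) -> list[float]:
    """Project the input features through one tanh layer."""
    x = trigram_features(text)
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in WEIGHTS]


def cosine(u, v) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def related_items(query: str, candidates: list[str], top_k: int = 2):
    """Return the top_k (score, candidate) pairs closest to the query."""
    q = concept_vector(query)
    scored = [(cosine(q, concept_vector(c)), c) for c in candidates]
    return sorted(scored, reverse=True)[:top_k]


results = related_items("dog food", ["dog food", "cat toys", "car parts"])
```

Comparing items in the concept space rather than by surface text is what would let a trained model relate items that share no words, such as a noun and its synonym from a structured knowledge resource.
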
12. A computer system, comprising:
a processing device; and
a storage resource storing instructions which, when executed by the processing device, cause the processing device to implement:
a search engine configured to receive an input linguistic item, and configured to identify at least one output item that has been determined to be relevant to the input linguistic item, the search engine being configured to identify said at least one output item using a model configured to map the input linguistic item and the at least one output item into a semantic space, the model being trained by a machine-learning training process based at least on user-behavioral training data and structured knowledge training data, the user-behavioral training data identifying user-submitted linguistic items submitted by users together with user clicks made by the users on specific result items provided by a search engine in response to the user-submitted linguistic items, and the structured knowledge training data representing, as corresponding click values, semantic distances between semantically-related nouns, expressed by one or more structured knowledge resources.

Specification