SELF-LEARNING STATISTICAL NATURAL LANGUAGE PROCESSING FOR AUTOMATIC PRODUCTION OF VIRTUAL PERSONAL ASSISTANTS
Abstract
Technologies for natural language request processing include a computing device having a semantic compiler to generate a semantic model based on a corpus of sample requests. The semantic compiler may generate the semantic model by extracting contextual semantic features or processing ontologies. The computing device generates a semantic representation of a natural language request by generating a lattice of candidate alternative representations, assigning a composite weight to each candidate, and finding the best route through the lattice. The composite weight may include semantic weights, phonetic weights, and/or linguistic weights. The semantic representation identifies a user intent and slots associated with the natural language request. The computing device may perform one or more dialog interactions based on the semantic request, including generating a request for additional information or suggesting additional user intents. The computing device may support automated analysis and tuning to improve request processing. Other embodiments are described and claimed.
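The lattice decoding summarized in the abstract can be illustrated with a minimal sketch. It assumes a lattice whose candidates each span a range of tokens and whose composite weight is the product of semantic, phonetic, and linguistic weights (summed in log space). The `Candidate` class, the `best_route` function, and the product combination are hypothetical choices made for illustration; the source does not specify how the weights are combined or how the route is searched.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    start: int         # index of the first token covered
    end: int           # index one past the last token covered (end > start)
    label: str         # hypothetical interpretation of the covered tokens
    semantic: float    # each weight is assumed to lie in (0, 1]
    phonetic: float
    linguistic: float

    def composite(self) -> float:
        # Assumption: the composite weight is the product of the three
        # weights, computed as a sum of logarithms.
        return (math.log(self.semantic)
                + math.log(self.phonetic)
                + math.log(self.linguistic))


def best_route(candidates, n_tokens):
    """Return the highest-scoring candidate sequence covering all tokens."""
    # best[i] holds (score, route) for the best route covering tokens [0, i).
    best = {0: (0.0, [])}
    for end in range(1, n_tokens + 1):
        for cand in candidates:
            if cand.end == end and cand.start in best:
                score = best[cand.start][0] + cand.composite()
                if end not in best or score > best[end][0]:
                    best[end] = (score, best[cand.start][1] + [cand])
    # An empty list means no route covers the whole request.
    return best.get(n_tokens, (float("-inf"), []))[1]
```

Because every candidate ends after it starts, processing end positions in increasing order guarantees that the best route to a candidate's start position is final before the candidate is considered.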
24 Claims
1. A computing device for interpreting natural language requests, the computing device comprising:
a semantic compiler module to generate a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent; and
a request decoder module to:
(i) generate, using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
(ii) assign a composite confidence weight to each candidate alternative as a function of the semantic model;
(iii) determine an optimal route through the candidate alternative lattice based on the associated confidence weight; and
(iv) generate a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
wherein to generate the semantic model comprises to:
(i) identify a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
(ii) determine a first probability of the contextual semantic feature given the user intent; and
(iii) determine a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
wherein to identify the contextual semantic feature using the unsupervised algorithm comprises to:
(i) identify predefined named entities and relationships in a first group of predefined sample queries in the corpus;
(ii) cluster the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
(iii) assign a user intent and slots to each cluster of the plurality of clusters.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13
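The two probabilities recited in claim 1 for generating the semantic model can be illustrated with simple counting. The sketch below is one possible reading and not the claimed method: it takes the first probability as P(feature | intent), the fraction of an intent's requests that contain the feature, and takes the normalized probability as P(intent | feature), obtained by dividing by the feature's rate of occurrence in the corpus. The function name `estimate_probabilities` and the corpus layout (pairs of an intent label and a collection of features, each feature a tuple of frozensets standing in for a sequence of lexical sets) are assumptions.

```python
from collections import Counter


def estimate_probabilities(corpus):
    """corpus: iterable of (intent, features) pairs; features must be hashable.

    Returns two dicts keyed by (feature, intent):
    P(feature | intent) and P(intent | feature).
    """
    intent_count = Counter()    # requests per intent
    feature_count = Counter()   # requests containing each feature
    joint = Counter()           # requests with both the feature and the intent
    for intent, features in corpus:
        intent_count[intent] += 1
        for feature in set(features):   # count each feature once per request
            feature_count[feature] += 1
            joint[(feature, intent)] += 1

    # First probability: how often the feature appears among the intent's requests.
    p_feature_given_intent = {
        key: n / intent_count[key[1]] for key, n in joint.items()
    }
    # Normalized probability (assumed reading): joint count divided by the
    # feature's overall occurrence count, so values sum to 1 over intents.
    p_intent_given_feature = {
        key: n / feature_count[key[0]] for key, n in joint.items()
    }
    return p_feature_given_intent, p_intent_given_feature
```

Under this reading, a feature that occurs with many intents contributes little evidence for any one of them, while a feature seen with only one intent yields a normalized probability of 1 for that intent.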
2-4. (canceled)
14. A method for interpreting natural language requests, the method comprising:
generating, by a computing device, a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent;
generating, by the computing device using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
assigning, by the computing device, a composite confidence weight to each candidate alternative as a function of the semantic model;
determining, by the computing device, an optimal route through the candidate alternative lattice based on the associated confidence weight; and
generating, by the computing device, a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
wherein generating the semantic model comprises:
(i) identifying a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
(ii) determining a first probability of the contextual semantic feature given the user intent; and
(iii) determining a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
wherein identifying the contextual semantic feature using the unsupervised algorithm comprises:
(i) identifying predefined named entities and relationships in a first group of predefined sample queries in the corpus;
(ii) clustering the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
(iii) assigning a user intent and slots to each cluster of the plurality of clusters.
Dependent claims: 16, 17, 18, 19
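The last three steps of claim 14 (identifying named entities, clustering the sample queries, and assigning an intent and slots to each cluster) can be sketched as follows. Everything here is an illustrative assumption: the `GAZETTEER` lookup stands in for a named-entity recognizer, single-pass leader clustering over Jaccard similarity stands in for the unspecified unsupervised clustering algorithm, and labeling a cluster by its most frequent non-entity token is a placeholder for however intents are actually assigned.

```python
from collections import Counter

# Hypothetical entity lookup; a real system would use a proper recognizer.
GAZETTEER = {"mom": "<CONTACT>", "bob": "<CONTACT>",
             "jazz": "<GENRE>", "rock": "<GENRE>"}


def tag_entities(query):
    """Replace known entity tokens with their entity-type placeholder."""
    return [GAZETTEER.get(tok, tok) for tok in query.lower().split()]


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(queries, threshold=0.5):
    """Leader clustering: join the first cluster whose leader is similar enough."""
    clusters = []   # each cluster is a list of tagged token lists
    for query in queries:
        tagged = tag_entities(query)
        for members in clusters:
            if jaccard(tagged, members[0]) >= threshold:
                members.append(tagged)
                break
        else:
            clusters.append([tagged])   # no match: start a new cluster
    return clusters


def label_cluster(members):
    """Assign an intent (most frequent non-entity token) and a slot list."""
    words = Counter(t for m in members for t in m if not t.startswith("<"))
    intent = words.most_common(1)[0][0] if words else "unknown"
    slots = sorted({t for m in members for t in m if t.startswith("<")})
    return intent, slots
```

Tagging entities before clustering lets queries such as "call mom" and "call bob" collapse to the same token pattern, which is why the entity step precedes the clustering step in this sketch.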
15. (canceled)
20. One or more computer-readable storage media comprising a plurality of instructions that in response to being executed cause a computing device to:
generate a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent;
generate, using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
assign a composite confidence weight to each candidate alternative as a function of the semantic model;
determine an optimal route through the candidate alternative lattice based on the associated confidence weight; and
generate a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
wherein to generate the semantic model comprises to:
(i) identify a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
(ii) determine a first probability of the contextual semantic feature given the user intent; and
(iii) determine a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
wherein to identify the contextual semantic feature using the unsupervised algorithm comprises to:
(i) identify predefined named entities and relationships in a first group of predefined sample queries in the corpus;
(ii) cluster the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
(iii) assign a user intent and slots to each cluster of the plurality of clusters.
Dependent claims: 21, 22, 23, 24
Specification