Self-learning statistical natural language processing for automatic production of virtual personal assistants

US 9,798,719 B2
Filed: 10/24/2016
Issued: 10/24/2017
Est. Priority Date: 07/25/2013
Status: Expired due to Fees

Abstract

Technologies for natural language request processing include a computing device having a semantic compiler to generate a semantic model based on a corpus of sample requests. The semantic compiler may generate the semantic model by extracting contextual semantic features or processing ontologies. The computing device generates a semantic representation of a natural language request by generating a lattice of candidate alternative representations, assigning a composite weight to each candidate, and finding the best route through the lattice. The composite weight may include semantic weights, phonetic weights, and/or linguistic weights. The semantic representation identifies a user intent and slots associated with the natural language request. The computing device may perform one or more dialog interactions based on the semantic request, including generating a request for additional information or suggesting additional user intents. The computing device may support automated analysis and tuning to improve request processing. Other embodiments are described and claimed.
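
The lattice decoding summarized in the abstract can be sketched as a best-path search. The following is a minimal illustration only, not the patented implementation: the `Candidate` fields, the linear weighting in `composite_weight`, and the example scores are all assumptions introduced here, and the sketch treats each candidate as an edge spanning one or more token positions with `end > start`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate alternative in the lattice (hypothetical structure)."""
    start: int         # first token position covered
    end: int           # one past the last token position covered (end > start)
    label: str         # candidate interpretation of the covered tokens
    semantic: float    # assumed semantic weight in [0, 1]
    phonetic: float    # assumed phonetic weight in [0, 1]
    linguistic: float  # assumed linguistic weight in [0, 1]


def composite_weight(c, w_sem=0.5, w_pho=0.3, w_lin=0.2):
    # Assumption: the composite weight is a fixed linear mix of the three weights.
    return w_sem * c.semantic + w_pho * c.phonetic + w_lin * c.linguistic


def best_route(candidates, n_tokens):
    """Return the highest-weight route covering tokens [0, n_tokens), or None."""
    # best[i] holds (total weight, route) for the best route covering tokens [0, i).
    best = {0: (0.0, [])}
    for pos in range(n_tokens):
        if pos not in best:
            continue  # no route reaches this position
        score, route = best[pos]
        for c in candidates:
            if c.start != pos:
                continue
            total = score + composite_weight(c)
            if c.end not in best or total > best[c.end][0]:
                best[c.end] = (total, route + [c])
    return best[n_tokens][1] if n_tokens in best else None


# Hypothetical lattice for a two-token request such as "call mom".
EXAMPLE = [
    Candidate(0, 1, "intent:call", 0.9, 0.8, 0.7),
    Candidate(0, 1, "intent:recall", 0.2, 0.6, 0.5),
    Candidate(1, 2, "contact:Mom", 0.8, 0.9, 0.6),
    Candidate(1, 2, "contact:Tom", 0.4, 0.7, 0.6),
]
```

Calling `best_route(EXAMPLE, 2)` selects one candidate per position by total composite weight. Because positions are processed left to right and every edge moves forward, each position's best score is final by the time it is expanded.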

37 Citations

20 Claims

1. A computing device for interpreting natural language requests, the computing device comprising:
- a semantic compiler module to generate a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent; and
  
  a request decoder module to:
  
  (i) generate, using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
  
  (ii) assign a composite confidence weight to each candidate alternative as a function of the semantic model;
  
  (iii) determine an optimal route through the lattice of candidate alternatives based on the assigned composite confidence weight; and
  
  (iv) generate a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
  
  wherein to generate the semantic model comprises to:
  
  (i) identify a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
  
  (ii) determine a first probability of the contextual semantic feature given the user intent; and
  
  (iii) determine a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
  
  wherein to identify the contextual semantic feature using the unsupervised algorithm comprises to:
  
  (i) identify predefined named entities and relationships in a first group of predefined sample queries in the corpus;
  
  (ii) cluster the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
  
  (iii) assign a user intent and slots to each cluster of the plurality of clusters.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
- - 2. The computing device of claim 1, wherein to generate the semantic model comprises to:
    - generate an ontological index as a function of a predefined ontology associated with the user intent, wherein the predefined ontology includes a plurality of objects describing a knowledge domain.
  - 3. The computing device of claim 2, wherein to generate the ontological index comprises to:
    - assign a vector space to an object type of the predefined ontology, wherein the vector space includes a plurality of coordinates, wherein each coordinate encodes a lexical token representing an associated object of the predefined ontology; and
      
      map a slot of the user intent associated with the predefined ontology to the vector space.
  - 4. The computing device of claim 1, wherein the request decoder module is further to:
    - receive a representation of speech data indicative of the natural language request; and
      
      convert the representation of speech data to a first lattice of candidate alternatives indicative of the natural language request, wherein to convert the representation of speech data comprises to convert the representation of speech data using a language model generated as a function of a domain-biased web corpus;
      
      wherein to generate the lattice of candidate alternatives comprises to generate the lattice of candidate alternatives in response to conversion of the representation of speech data to the first lattice of candidate alternatives.
  - 5. The computing device of claim 1, wherein to generate the lattice of candidate alternatives comprises to:
    - generate a candidate alternative corresponding to a user intent and associated slots of a mapping of the semantic model; and
      
      generate a candidate alternative using a language model, as a function of phonetic similarity to the natural language request.
  - 6. The computing device of claim 1, wherein to assign the composite confidence weight further comprises to assign a confidence weight as a function of a language model.
  - 7. The computing device of claim 6, wherein to assign the confidence weight as a function of the language model comprises to assign a phonetic similarity score, a general language model confidence score, a domain language model score, or a syntactic confidence score.
  - 8. The computing device of claim 1, further comprising a dialog management module to:
    - determine whether the semantic representation of the natural language request includes sufficient information to perform a user intent of the semantic representation;
      
      perform the user intent in response to a determination that the semantic representation includes sufficient information;
      
      generate a response as a function of the semantic representation;
      
      generate a natural language representation of the response using the semantic model; and
      
      record a user dialog session including the natural language request and the natural language representation of the response into a corpus of recorded user dialog sessions.
  - 9. The computing device of claim 1, further comprising a tuning module to update the semantic model in response to generating the semantic representation.
  - 10. The computing device of claim 9, wherein to update the semantic model comprises to:
    - determine the semantic representation was generated with no ambiguities;
      
      identify a token of the natural language request that was not decoded;
      
      or identify an ambiguous decoding of a slot of the semantic representation.
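
Claim 1 recites a first probability of a contextual semantic feature given the user intent, and a normalized probability of the user intent as a function of the feature's rate of occurrence in the corpus. One possible reading, sketched below under stated assumptions, estimates the first as a relative frequency and the second by Bayes' rule. The function name `estimate_probabilities`, the use of plain strings as stand-ins for features (the claim describes sequences of lexical sets), and the sample data are hypothetical and not taken from the patent.

```python
from collections import Counter


def estimate_probabilities(samples):
    """samples: list of (intent, iterable of features) pairs.

    Returns two dicts keyed by (feature, intent):
      p_f_given_i: relative frequency of the feature among samples of the intent
      p_i_given_f: P(f | i) * P(i) / P(f), where P(f) is the feature's
                   rate of occurrence across the whole corpus
    """
    n = len(samples)
    intent_count = Counter(intent for intent, _ in samples)
    feature_count = Counter()
    joint_count = Counter()
    for intent, features in samples:
        for f in set(features):  # count each feature once per sample
            feature_count[f] += 1
            joint_count[(f, intent)] += 1

    p_f_given_i = {
        (f, i): c / intent_count[i] for (f, i), c in joint_count.items()
    }
    p_i_given_f = {}
    for (f, i), p in p_f_given_i.items():
        p_i = intent_count[i] / n   # prior of the intent in the corpus
        p_f = feature_count[f] / n  # rate of occurrence of the feature
        p_i_given_f[(f, i)] = p * p_i / p_f
    return p_f_given_i, p_i_given_f


# Hypothetical four-sample corpus.
CORPUS = [
    ("book_flight", ["fly to", "ticket"]),
    ("book_flight", ["fly to"]),
    ("book_event", ["ticket"]),
    ("book_event", ["ticket", "concert"]),
]
```

In this toy corpus "ticket" appears in half of the `book_flight` samples but also in every `book_event` sample, so normalizing by its rate of occurrence lowers its evidence for `book_flight`, while "fly to" remains a strong indicator.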

11. A method for interpreting natural language requests, the method comprising:
- generating, by a computing device, a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent;
  
  generating, by the computing device using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
  
  assigning, by the computing device, a composite confidence weight to each candidate alternative as a function of the semantic model;
  
  determining, by the computing device, an optimal route through the lattice of candidate alternatives based on the assigned composite confidence weight; and
  
  generating, by the computing device, a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
  
  wherein generating the semantic model comprises:
  
  (i) identifying a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
  
  (ii) determining a first probability of the contextual semantic feature given the user intent; and
  
  (iii) determining a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
  
  wherein identifying the contextual semantic feature using the unsupervised algorithm comprises:
  
  (i) identifying predefined named entities and relationships in a first group of predefined sample queries in the corpus;
  
  (ii) clustering the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
  
  (iii) assigning a user intent and slots to each cluster of the plurality of clusters.
- Dependent Claims (12, 13, 14, 15)
- - 12. The method of claim 11, wherein generating the semantic model comprises:
    - generating an ontological index as a function of a predefined ontology associated with the user intent, wherein the predefined ontology includes a plurality of objects describing a knowledge domain.
  - 13. The method of claim 11, wherein generating the lattice of candidate alternatives comprises:
    - generating a candidate alternative corresponding to a user intent and associated slots of a mapping of the semantic model; and
      
      generating a candidate alternative using a language model, as a function of phonetic similarity to the natural language request.
  - 14. The method of claim 11, further comprising:
    - determining, by the computing device, whether the semantic representation of the natural language request includes sufficient information to perform a user intent of the semantic representation;
      
      performing, by the computing device, the user intent in response to determining that the semantic representation includes sufficient information;
      
      generating, by the computing device, a response as a function of the semantic representation;
      
      generating, by the computing device, a natural language representation of the response using the semantic model; and
      
      recording, by the computing device, a user dialog session including the natural language request and the natural language representation of the response into a corpus of recorded user dialog sessions.
  - 15. The method of claim 11, further comprising:
    - updating, by the computing device, the semantic model in response to generating the semantic representation.
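
The feature-identification steps of claim 11 (identify named entities, cluster the sample queries, assign slots to each cluster) can be illustrated with a small sketch. The claims do not name a clustering algorithm, so this example substitutes a simple single-pass "leader" clustering over Jaccard similarity as a stand-in. The `GAZETTEER` entries, the similarity threshold, and all function names are assumptions made for illustration.

```python
# Hypothetical named-entity gazetteer mapping surface forms to slot placeholders.
GAZETTEER = {
    "paris": "<city>",
    "london": "<city>",
    "mom": "<contact>",
    "bob": "<contact>",
}


def tag_entities(query):
    """Lowercase, split on whitespace, and replace known entities with placeholders."""
    return [GAZETTEER.get(tok, tok) for tok in query.lower().split()]


def jaccard(a, b):
    """Jaccard similarity between the token sets of two tagged queries."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_queries(queries, threshold=0.5):
    """Single-pass leader clustering: join the first cluster whose leader
    (its first member) is similar enough, otherwise start a new cluster."""
    clusters = []
    for query in queries:
        tagged = tag_entities(query)
        for members in clusters:
            if jaccard(tagged, members[0]) >= threshold:
                members.append(tagged)
                break
        else:
            clusters.append([tagged])
    return clusters


def slots_of(members):
    """Slots assigned to a cluster: the entity placeholders seen in its members."""
    return sorted({tok for m in members for tok in m if tok.startswith("<")})


SAMPLE_QUERIES = ["fly to Paris", "fly to London", "call Mom", "call Bob now"]
```

With these sample queries, replacing entities by placeholders makes "fly to Paris" and "fly to London" identical, which is why entity identification precedes clustering in the claim. Naming the intent for each cluster is left out of the sketch.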

16. One or more computer-readable storage media comprising a plurality of instructions that in response to being executed cause a computing device to:
- generate a semantic model as a function of a corpus of predefined requests, wherein the semantic model includes a plurality of mappings between a natural language request and a semantic representation of the natural language request, wherein the semantic representation identifies a user intent and zero or more slots associated with the user intent;
  
  generate, using the semantic model, a lattice of candidate alternatives indicative of a natural language request, wherein each candidate alternative corresponds to a token of the natural language request;
  
  assign a composite confidence weight to each candidate alternative as a function of the semantic model;
  
  determine an optimal route through the lattice of candidate alternatives based on the associated confidence weight; and
  
  generate a semantic representation of the natural language request as a function of the candidate alternatives of the optimal route;
  
  wherein to generate the semantic model comprises to:
  
  (i) identify a contextual semantic feature in the corpus using an unsupervised algorithm, wherein the contextual semantic feature comprises a sequence of lexical sets associated with a user intent and zero or more slots associated with the user intent;
  
  (ii) determine a first probability of the contextual semantic feature given the user intent; and
  
  (iii) determine a normalized probability of the user intent as a function of a rate of occurrence of the contextual semantic feature in the corpus; and
  
  wherein to identify the contextual semantic feature using the unsupervised algorithm comprises to:
  
  (i) identify predefined named entities and relationships in a first group of predefined sample queries in the corpus;
  
  (ii) cluster the predefined sample queries using an unsupervised clustering algorithm to generate a plurality of clusters; and
  
  (iii) assign a user intent and slots to each cluster of the plurality of clusters.
- Dependent Claims (17, 18, 19, 20)
- - 17. The one or more computer-readable storage media of claim 16, wherein to generate the semantic model comprises to:
    - generate an ontological index as a function of a predefined ontology associated with the user intent, wherein the predefined ontology includes a plurality of objects describing a knowledge domain.
  - 18. The one or more computer-readable storage media of claim 16, wherein to generate the lattice of candidate alternatives comprises to:
    - generate a candidate alternative corresponding to a user intent and associated slots of a mapping of the semantic model; and
      
      generate a candidate alternative using a language model, as a function of phonetic similarity to the natural language request.
  - 19. The one or more computer-readable storage media of claim 16, further comprising a plurality of instructions that in response to being executed cause the computing device to:
    - determine whether the semantic representation of the natural language request includes sufficient information to perform a user intent of the semantic representation;
      
      perform the user intent in response to determining that the semantic representation includes sufficient information;
      
      generate a response as a function of the semantic representation;
      
      generate a natural language representation of the response using the semantic model; and
      
      record a user dialog session including the natural language request and the natural language representation of the response into a corpus of recorded user dialog sessions.
  - 20. The one or more computer-readable storage media of claim 16, further comprising a plurality of instructions that in response to being executed cause the computing device to update the semantic model in response to generating the semantic representation.
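
The dialog-management steps of claim 19 (check whether the semantic representation has sufficient information, respond, and record the session) might look like the following sketch. The `REQUIRED_SLOTS` table, the `SemanticRepresentation` class, and the response strings are hypothetical. Actually performing the intent is reduced to a placeholder message, and the prompt for missing slots follows the abstract's mention of requesting additional information.

```python
from dataclasses import dataclass, field

# Hypothetical table of the slots each intent needs before it can be performed.
REQUIRED_SLOTS = {
    "book_flight": {"destination", "date"},
    "call_contact": {"contact"},
}


@dataclass
class SemanticRepresentation:
    intent: str
    slots: dict = field(default_factory=dict)


def missing_slots(rep):
    """Names of required slots that the representation does not yet fill."""
    required = REQUIRED_SLOTS.get(rep.intent, set())
    return sorted(required - set(rep.slots))


def handle_request(request_text, rep, session_log):
    """Respond to one request and record the exchange in session_log."""
    missing = missing_slots(rep)
    if missing:
        # Insufficient information: ask the user for what is missing.
        response = "Please provide: " + ", ".join(missing)
    else:
        # Sufficient information: a real system would perform the intent here.
        response = "OK: " + rep.intent
    session_log.append((request_text, response))
    return response
```

For example, a `book_flight` representation carrying only a `destination` slot yields a prompt for the date, while the same representation with both slots filled proceeds to the placeholder action. Each exchange is appended to `session_log`, mirroring the recorded corpus of user dialog sessions.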

Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Karov, Yael; Breakstone, Micha; Shilon, Reshef; Keller, Orgad; Shellef, Eric
Primary Examiner(s): Riley, Marcus T
Application Number: US15/332,084
Publication Number: US 20170039181A1
Time in Patent Office: 365 Days
Field of Search: None
US Class Current
CPC Class Codes

G06F 16/36   Creation of semantic tools,...

G06F 16/90332   Natural language query form...

G06F 40/211   Syntactic parsing, e.g. bas...

G06F 40/284   Lexical analysis, e.g. toke...

G06F 40/30   Semantic analysis

G06F 40/35   Discourse or dialogue repre...
