Parsing rule generalization by n-gram span clustering

  • US 9,092,505 B1
  • Filed: 06/25/2013
  • Issued: 07/28/2015
  • Est. Priority Date: 06/25/2013
  • Status: Active Grant
First Claim

1. A computer-implemented method performed by a data processing apparatus, comprising:

  • accessing command sentences stored in a data store, wherein each command sentence is a collection of n-grams and each command sentence includes at least one n-gram that is a non-terminal n-gram that maps to a non-terminal type, and wherein the command sentences include non-terminal n-grams that collectively map to a plurality of different non-terminal types;

    for each of the non-terminal types:

    identifying n-gram spans, each n-gram span being a proper subset of a set of n-grams that constitute a command sentence and including a non-terminal n-gram of the non-terminal type and one or more terminal n-grams that do not map to a non-terminal type;

    determining clusters of the n-gram spans, each cluster including n-gram spans meeting a measure of similarity of n-gram spans that belong to the cluster; and

    for each cluster of n-gram spans, determining, from the n-gram spans belonging to the cluster, a new non-terminal type to which the terminal n-grams of the n-gram spans map.
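
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not specify any of the following, which are assumptions made for illustration only: non-terminal n-grams are written as tokens in angle brackets (such as `<CITY>`), a span is the non-terminal plus up to `window` terminal tokens immediately before it, similarity is the Jaccard overlap of terminal tokens, clustering is a greedy single pass against each cluster's first member, and new type names follow a hypothetical `<TYPE_CTX_k>` pattern.

```python
def is_non_terminal(token):
    # Assumption: non-terminal n-grams are marked with angle brackets.
    return token.startswith("<") and token.endswith(">")

def identify_spans(sentence, nt_type, window=2):
    """Collect spans made of nt_type plus the terminal tokens just before it."""
    spans = []
    for i, tok in enumerate(sentence):
        if tok != nt_type:
            continue
        start = i
        # Extend left over at most `window` terminal tokens.
        while start > 0 and i - start < window and not is_non_terminal(sentence[start - 1]):
            start -= 1
        # A span must be a proper subset of the sentence, so trim if it covers everything.
        if i + 1 - start == len(sentence):
            start += 1
        # A span must keep at least one terminal token.
        if start < i:
            spans.append(tuple(sentence[start:i + 1]))
    return spans

def terminals(span):
    return frozenset(t for t in span if not is_non_terminal(t))

def jaccard(a, b):
    # Both sets are non-empty because every span holds at least one terminal.
    return len(a & b) / len(a | b)

def cluster_spans(spans, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed span is similar enough."""
    clusters = []
    for span in spans:
        for cluster in clusters:
            if jaccard(terminals(span), terminals(cluster[0])) >= threshold:
                cluster.append(span)
                break
        else:
            clusters.append([span])
    return clusters

def generalize(sentences, window=2, threshold=0.3):
    """Map each cluster to a new (hypothetically named) non-terminal type."""
    nt_types = sorted({t for s in sentences for t in s if is_non_terminal(t)})
    new_types = {}
    for nt in nt_types:
        spans = [sp for s in sentences for sp in identify_spans(s, nt, window)]
        for k, cluster in enumerate(cluster_spans(spans, threshold)):
            name = f"<{nt.strip('<>')}_CTX_{k}>"
            # The new type covers the terminal part of every span in the cluster.
            new_types[name] = sorted({" ".join(sp[:-1]) for sp in cluster})
    return new_types

# Made-up example data, not taken from the patent.
sentences = [
    ["navigate", "to", "<CITY>", "now"],
    ["drive", "to", "<CITY>", "now"],
    ["call", "<CONTACT>", "please"],
    ["please", "call", "<CONTACT>", "now"],
]
result = generalize(sentences)
print(result)
```

With this example data, the spans around `<CITY>` fall into one cluster whose new type covers "drive to" and "navigate to", and the spans around `<CONTACT>` fall into another covering "call" and "please call". A different similarity measure or clustering method would satisfy the claim language equally well; the choices here only keep the sketch short.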

  • 2 Assignments