GENERATING DOCUMENT TEMPLATES THAT ARE ROBUST TO STRUCTURAL VARIATIONS

  • US 20090276506A1
  • Filed: 05/02/2008
  • Published: 11/05/2009
  • Est. Priority Date: 05/02/2008
  • Status: Active Grant
First Claim

1. A network device configured to manage document templates, comprising:

  • a transceiver to send and receive data over a network; and

  • a processor that is operative to enable actions for:

    receiving a tree-based regular expression that represents the template;

    below a given level in the tree-based regular expression, performing:

      forming clusters of sub-trees of the tree-based regular expression via a cost measure;

      generating a nested pattern regular expression based on the clusters;

      merging sub-trees based on the nested pattern regular expression;

      replacing sub-trees in the tree-based regular expression at the given level with the merged sub-trees; and

    repeating, for a next higher level of the tree-based regular expression that is closer to a root of the corresponding tree, the actions of forming clusters, generating a nested pattern regular expression, merging sub-trees, and replacing sub-trees in the tree-based regular expression.
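
The bottom-up loop recited in the claim (cluster sub-trees, derive a pattern, merge, replace, then repeat one level closer to the root) can be illustrated with a small Python sketch. This is only a simplified illustration under stated assumptions, not the claimed method itself: the `Node` class, the `cost` function (a count of child labels that differ), the threshold of 1, the greedy clustering of adjacent siblings, and the `?` / `+` quantifier notation are all hypothetical choices made for the example. The claim does not specify any of them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a (hypothetical) tree-based regular expression."""
    label: str
    children: List["Node"] = field(default_factory=list)
    quant: str = ""  # "", "?" (optional) or "+" (one or more)


def to_regex(n: Node) -> str:
    """Render a tree as a nested pattern string, e.g. ul(li(a img?)+)."""
    body = "(" + " ".join(to_regex(c) for c in n.children) + ")" if n.children else ""
    return n.label + body + n.quant


def size(n: Node) -> int:
    return 1 + sum(size(c) for c in n.children)


def cost(a: Node, b: Node) -> int:
    """Assumed cost measure: child labels present in only one of the two sub-trees."""
    if a.label != b.label:
        return size(a) + size(b)  # different roots are treated as expensive to merge
    return len({c.label for c in a.children} ^ {c.label for c in b.children})


def merge(a: Node, b: Node) -> Node:
    """Merge two same-label sub-trees; children missing on one side become optional.
    Assumes child labels are unique within each sub-tree (a simplification)."""
    merged = Node(a.label, [], a.quant or b.quant)
    b_by_label = {c.label: c for c in b.children}
    seen = set()
    for c in a.children:
        seen.add(c.label)
        if c.label in b_by_label:
            merged.children.append(merge(c, b_by_label[c.label]))
        else:
            merged.children.append(Node(c.label, c.children, "?"))
    for c in b.children:
        if c.label not in seen:
            merged.children.append(Node(c.label, c.children, "?"))
    return merged


def generalize_children(node: Node, threshold: int) -> None:
    """Cluster adjacent child sub-trees by cost, merge each cluster, replace in place."""
    clusters: List[List[Node]] = []
    for child in node.children:
        if clusters and cost(clusters[-1][0], child) <= threshold:
            clusters[-1].append(child)
        else:
            clusters.append([child])
    replaced = []
    for cluster in clusters:
        rep = cluster[0]
        for other in cluster[1:]:
            rep = merge(rep, other)
        if len(cluster) > 1:
            rep.quant = "+"  # the cluster collapses to one repeated pattern
        replaced.append(rep)
    node.children = replaced


def generalize(root: Node, threshold: int = 1) -> Node:
    """Apply the cluster/merge/replace step level by level, deepest level first."""
    levels = [[root]]
    while True:
        below = [c for n in levels[-1] for c in n.children]
        if not below:
            break
        levels.append(below)
    for level in reversed(levels):  # move toward the root
        for node in level:
            generalize_children(node, threshold)
    return root


example = Node("ul", [
    Node("li", [Node("a")]),
    Node("li", [Node("a"), Node("img")]),
    Node("li", [Node("a")]),
])
print(to_regex(generalize(example)))  # prints: ul(li(a img?)+)
```

In this example the three `li` sub-trees differ only by an `img` child, so they fall into one cluster and are replaced by a single repeated sub-tree with `img` marked optional. A template generalized this way would still match a list with more items or with images missing, which is the kind of structural variation the title refers to.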
