Solution recommendation based on incomplete data sets

US 7,415,449 B2
Filed: 01/30/2006
Issued: 08/19/2008
Est. Priority Date: 01/30/2006
Status: Active Grant


Abstract

In accordance with one aspect of the present exemplary embodiment, a system determines a solution based on received data. An intake component receives an incomplete data set from one or more sources. A recommendation system transforms the incomplete data set into a semantic data set via latent semantic indexing, classifies the semantic data set into an existing cluster and provides one or more solutions of the existing cluster as one or more recommendations.
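The classification step described in the abstract can be illustrated with a nearest-centroid lookup by cosine similarity. This is a minimal sketch, not the patented implementation: it assumes the case has already been mapped into the semantic space, that a missing attribute is encoded as 0.0, and that the cluster centroids and workflow names (all hypothetical) were computed beforehand.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(semantic_vector, clusters):
    """Return the workflows of the cluster whose centroid has the largest cosine with the input."""
    best = max(clusters, key=lambda c: cosine(semantic_vector, c["centroid"]))
    return best["workflows"]

# Hypothetical clusters: centroids and workflow names are made up for illustration.
CLUSTERS = [
    {"centroid": [0.9, 0.1, 0.0], "workflows": ["color-digital-press"]},
    {"centroid": [0.1, 0.8, 0.3], "workflows": ["mono-high-volume", "mono-with-finishing"]},
]

# Incomplete case: the third attribute is unknown and encoded as 0.0 (an assumption).
incomplete_case = [0.2, 0.7, 0.0]
print(recommend(incomplete_case, CLUSTERS))
```

Because cosine similarity depends only on direction, a case with some attributes left at zero can still land in a sensible cluster as long as the known attributes point the same way as that cluster's centroid.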


9 Claims

1. A method that provides at least one print solution, wherein the solution is based at least in part upon data received from a software application, web interface, or questionnaire, comprising:
- receiving at least one print process data set that includes at least one of a color requirement, a media characteristic, a print volume, a printer speed, a finishing characteristic, a desired process output, a current process output and a process capacity;
  
  mapping the at least one data set into one or more vectors in a case constraint space to create a case log vector;
  
  mapping the case log vector into a semantic vector with reduced dimensionality via a latent semantic index transformation to eliminate excessive data from the case log vector such that only relevant data remains;
  
  classifying the semantic vector into an existing case cluster whose cluster centroid vector has the largest cosine product with the semantic vector;
  
  returning one or more representative workflows of the existing case cluster as one or more recommended print process workflow solutions; and
  
  storing at least one record of at least one previous case, wherein the at least one record includes one or more print process related constraints, at least one generated print process workflow and at least one interested print process workflow, wherein the at least one generated print process workflow is provided by the recommendation system and the at least one interested print process workflow is selected from one of the generated print process workflows.
  - 2. The method of claim 1, wherein clustering the at least one semantic case log vector into at least one group, further includes:
    - clustering n vectors into K clusters by using a K-means algorithm with refined initial centroids;
      
      evaluating the performance of the K clusters by Bayesian Information Criterion (BIC) scores; and
      
      storing the K cluster centroids and their evaluation score as a clustering scheme.
  - 3. The method of claim 2, wherein clustering n vectors into K clusters further includes:
    - a) randomly choosing K initial vectors and setting them as K initial centroid vectors;
      
      b) associating each vector x of the n input vectors with a centroid vector with the largest cosine product;
      
      c) updating each centroid vector by taking the average of all vectors associated with the centroid vector with the largest cosine product;
      
      d) performing b and c until the centroid vectors do not change;
      
      e) storing the computed K cluster centroids in C_i, a K-vector data structure;
      
      f) evaluating the total distortion score for C_i by computing the sum of the cosine products between each input vector and its associated centroid vector; and
      
      g) returning C_i with the lowest total distortion score.
  - 4. The method of claim 3, wherein evaluating the performance of the K clusters by Bayesian Information Criterion (BIC) scores further includes:
    - evaluating the K-clustering scheme C_i by one or more BIC scores, which gives the posterior probability of the input points, wherein a BIC score is defined as BIC(C_i) = 2L(C_i) − npar · log n, where L(C_i) is the posterior log-likelihood of C_i, npar, and L(C_i) is defined as
  - 5. The method of claim 2, further including:
    - returning the clustering scheme with the best evaluation score.
  - 6. The method of claim 2, further including:
    - calculating at least one representative solution for each cluster in the final clustering scheme.
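Steps a through g of claim 3 can be sketched as a K-means loop that uses the cosine product instead of Euclidean distance, repeated over several random initializations. This is an illustrative sketch under stated assumptions, not the claimed method itself: the function names are hypothetical, the distortion is taken here as the sum of (1 − cosine) so that a lower score means a tighter clustering, and the BIC scoring of claim 4 is left out because the likelihood L(C_i) is not defined in the text above.

```python
import math
import random

def cosine(u, v):
    """Cosine product of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans_cosine(vectors, k, rng, max_iter=100):
    """One run of steps a-d; returns (centroids, assignment)."""
    centroids = [list(v) for v in rng.sample(vectors, k)]  # step a: random initial centroids
    assignment = None
    for _ in range(max_iter):
        # step b: attach each vector to the centroid with the largest cosine product
        new_assignment = [
            max(range(k), key=lambda j: cosine(x, centroids[j])) for x in vectors
        ]
        if new_assignment == assignment:
            break  # step d: assignments (and hence centroids) no longer change
        assignment = new_assignment
        # step c: move each centroid to the average of its members
        for j in range(k):
            members = [x for x, a in zip(vectors, assignment) if a == j]
            if members:
                centroids[j] = mean_vector(members)
    return centroids, assignment

def distortion(vectors, centroids, assignment):
    """Assumed distortion: sum of (1 - cosine) between each vector and its centroid."""
    return sum(1.0 - cosine(x, centroids[a]) for x, a in zip(vectors, assignment))

def best_of_restarts(vectors, k, restarts=10, seed=0):
    """Steps e-g: keep the run with the lowest distortion; returns (score, centroids, assignment)."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        centroids, assignment = kmeans_cosine(vectors, k, rng)
        score = distortion(vectors, centroids, assignment)
        if best is None or score < best[0]:
            best = (score, centroids, assignment)
    return best

# Hypothetical semantic case log vectors forming two directional groups.
vectors = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.2], [0.1, 1.0], [0.0, 0.8], [0.2, 1.0]]
score, centroids, assignment = best_of_restarts(vectors, k=2)
print(assignment, round(score, 4))
```

Running the loop for K = 1 to K_max and scoring each result, as claim 7 describes, would wrap `best_of_restarts` in one more loop with whatever evaluation score is chosen.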

7. A method for providing representative workflows based at least in part upon one or more case logs and storing those workflows on a computer-readable medium, comprising:
- mapping a new case into a vector in a case constraint space to produce a case log vector;
  
  utilizing a latent semantic indexing transformation matrix to map the case log vector into a semantic vector with reduced dimensionality;
  
  clustering the semantic vectors into groups based on their mutual correlations, wherein the number of case log vectors is n and the maximum number of case clusters is K_max, for K=1 to K_max, the clustering algorithm a) clusters n vectors into K clusters by using K-means algorithm with refined initial centroids, b) evaluates the performance of the K clusters by Bayesian Information Criterion (BIC) scores, and c) stores the above K cluster centroids and their evaluation score as a clustering scheme, wherein the clustering scheme with the best evaluation score is output;
  
  classifying the semantic vector into a particular case cluster, which is determined by the case cluster whose cluster centroid vector has the largest cosine product with the semantic vector; and
  
  identifying one or more recommended solutions via correlation to one or more predefined data clusters, wherein the solution is a workflow that defines a process automated by at least one automation device.
  - 8. The system according to claim 7, wherein mapping the one or more case logs into one or more case log vectors is based at least in part on one or more attribute values and the importance of each attribute, and maps the case log vectors into semantic case log vectors.
  - 9. The system according to claim 7, the case log vectors are mapped into semantic case log vectors wherein d randomly sampled case logs are represented by a t×d matrix M, where each column vector is the vector corresponding to the j th case log, and wherein r is the rank of M, where the singular value decomposition (SVD) of M is M=T×S×D′, where T, a t×r matrix, and D, a d×r matrix, have orthonormal column vectors and S is a diagonal matrix with singular values ranked in a descending order, and wherein T_k, S_k, D_k, are the resulted matrices by keeping only the first k columns of T, S, D, where k should be the number of semantic concepts in the case logs which produces a case log vector x that is folded into the k-dimensional semantic space by x_k=x×T_k, which maps x, a t-dimensional case log vector, into x_k, a k-dimensional semantic case log vector.
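The fold-in of claim 9 can be checked by hand on a deliberately small case. The numbers below are hypothetical and chosen so the SVD is obvious: t = 3 attributes, d = 2 sampled case logs, rank r = 2, and k = 1 retained semantic concept. The new case log x is treated as a row vector, which is an assumption made so that x×T_k is dimensionally valid.

```latex
M = \begin{pmatrix} 3 & 0 \\ 0 & 2 \\ 0 & 0 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}}_{T\ (3 \times 2)}
    \underbrace{\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}}_{S\ (2 \times 2)}
    \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{D'\ (2 \times 2)}
\qquad
T_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
x = \begin{pmatrix} 4 & 5 & 6 \end{pmatrix}, \quad
x_1 = x \, T_1 = 4
```

Keeping only the first column of T discards the directions tied to the smaller singular value, so the three-dimensional case log x collapses to a single semantic coordinate.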


Current Assignee
Xerox Corporation (Xerox Holdings Corp.)
Original Assignee
Xerox Corporation (Xerox Holdings Corp.)
Inventors
Sun, Tong; Shepherd, Michael D.; Zhong, Ming; Coté, Alan T.
Primary Examiner(s)
Holmes, Michael B

Application Number
US11/342,755
Publication Number
US 20070179924A1
Time in Patent Office
932 Days
Field of Search
706/55
US Class Current
706/55
CPC Class Codes
G06F 16/353 into predefined classes
