SYSTEM, METHOD AND COMPUTER EXECUTABLE PROGRAM FOR INFORMATION TRACKING FROM HETEROGENEOUS SOURCES

  • US 20090006377A1
  • Filed: 01/23/2008
  • Published: 01/01/2009
  • Est. Priority Date: 01/23/2007
  • Status: Active Grant
First Claim

1. A system for information clustering, said system comprising:

    a data accumulation part for accumulating and clustering documents in a document repository, said documents including loosely related clusters between said documents being time sliced so as to define chunks of said documents;

    a vector space generation part for generating document-keyword vectors, said document-keyword vectors consisting of sparse numeral values depending on presence of keywords in said documents;

    a dimension reduction part for reducing dimensions of said keywords to create a dimension reduction matrix of said document-keyword matrix;

    a centroid vector determination part for generating a centroid vector of said cluster, said cluster being retrieved from said document-keyword vector using a principal component in a same line of said dimension reduction matrix, said centroid vectors being defined from keywords and weight of documents within said cluster; and

    an item repository for storing said centroid vectors together with said keywords and said weights of said centroid vector.
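
The parts recited in the claim above can be illustrated with a small sketch. It is not the patented implementation, only one plausible reading under stated assumptions: keyword presence is encoded as 0/1, the "principal component" is approximated by the leading singular direction of the document-keyword matrix (found by power iteration, without mean centering), the cluster is taken to be the top_k documents with the largest projection on that direction, and the centroid is the projection-weighted average of their keyword vectors. Time slicing into chunks is omitted. All function names, the top_k parameter, and the sample documents are hypothetical.

```python
import math

def build_doc_keyword_matrix(docs):
    """Return (keywords, matrix); matrix[i][j] is 1.0 if keyword j occurs in doc i."""
    keywords = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(keywords)}
    matrix = []
    for d in docs:
        row = [0.0] * len(keywords)
        for w in set(d.lower().split()):
            row[index[w]] = 1.0
        matrix.append(row)
    return keywords, matrix

def top_component(matrix, iters=200):
    """Leading right singular vector of the matrix, via power iteration on A^T A.

    Assumption: this stands in for the claim's "principal component".
    """
    n = len(matrix[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]   # u = A v
        w = [sum(matrix[i][j] * u[i] for i in range(len(matrix)))    # w = A^T u
             for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def centroid_for_cluster(matrix, component, top_k):
    """Pick the top_k documents along the component and average them by score."""
    scores = [sum(a * b for a, b in zip(row, component)) for row in matrix]
    members = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in members)
    if total == 0.0:
        raise ValueError("selected documents have no weight on this component")
    n = len(matrix[0])
    centroid = [sum(scores[i] * matrix[i][j] for i in members) / total
                for j in range(n)]
    return members, centroid

# Hypothetical sample documents, for illustration only.
docs = [
    "stock market falls",
    "stock market rally",
    "market stock crash",
    "cat sat mat",
]
keywords, matrix = build_doc_keyword_matrix(docs)
component = top_component(matrix)
members, centroid = centroid_for_cluster(matrix, component, top_k=3)

# "Item repository" entry: keywords stored together with their centroid weights.
item = {kw: weight for kw, weight in zip(keywords, centroid) if weight > 0.0}
print(members, item)
```

A production system would more likely use a sparse matrix library and a truncated SVD routine; the pure-Python loops here only keep the sketch self-contained.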
