Method of vector analysis for a document

  • US 7,562,066 B2
  • Filed: 11/15/2001
  • Issued: 07/14/2009
  • Est. Priority Date: 11/20/2000
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for extracting important document segments from an input document, comprising:

  • detecting k terms that occur in said input document;

    segmenting said input document into N document segments, each segment being a predetermined part of said document;

    generating document segment vectors, each vector including as its element values occurrence of frequencies of said terms occurring in said document segments, wherein a n-th document vector dn (n=1, . . . , N) is represented by (dn1, dn2, . . . , dnk) and dni represents the occurrence frequency of an i-th term among a total k terms in a n-th document segment;

    determining eigenvalues and eigenvectors of a square sum matrix A, where said square sum matrix A is a k×k matrix, k>1, and each component Aab of said square sum matrix A indicates a degree of co-occurrence of a-th and b-th terms (a,b=1, . . . , k) in said input document and is calculated by;

    $$A_{ab} = \sum_{n=1}^{N} d_{na} d_{nb},$$

    and a rank of said square sum matrix is represented by R;

    selecting, from said eigenvectors, a plural of (L) eigenvectors to be used for determining importance;

    calculating a weighted sum of squared projections of said document segment vectors onto the selected eigenvectors; and

    selecting documents having significant importance based on said calculated weighted sum of square projections of the document segment vectors;

    wherein the vector dn after the projection is represented by zn=(zn1, zn2, . . . , znL), a projection value of dn to a m-th eigenvector is given by $z_{nm} = \phi_m^{t} d_n$, where φm represents the m-th eigenvector and t represents transpose;

    a sum of squared projections onto a L dimensional subspace being given by;

    $$\sum_{m=1}^{L} z_{nm}^{2} \quad \text{or} \quad \sum_{m=1}^{L} \lambda_m z_{nm}^{2},$$

    where λm represents the eigenvalue of the m-th eigenvector.
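The steps recited in the claim can be sketched numerically as follows. This is only an illustrative reading, not the patentee's implementation: the function name `rank_segments`, the use of NumPy, and the choice of the L eigenvectors with the largest eigenvalues are all assumptions (the claim says only that L eigenvectors are selected). Term detection and segmentation are assumed to have already produced the N×k frequency matrix.

```python
import numpy as np


def rank_segments(D, L, weighted=True):
    """Score document segments by projecting them onto eigenvectors.

    D        : (N, k) array; D[n, i] is the frequency of term i in segment n.
    L        : number of eigenvectors to keep (assumed: largest eigenvalues).
    weighted : if True, weight each squared projection by its eigenvalue.
    Returns (importance, order), where order lists segment indices from
    most to least important.
    """
    D = np.asarray(D, dtype=float)

    # Square sum matrix: A[a, b] = sum over n of D[n, a] * D[n, b].
    A = D.T @ D

    # A is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(A)

    # Assumption: keep the L eigenvectors with the largest eigenvalues.
    top = np.argsort(eigvals)[::-1][:L]
    lam = eigvals[top]        # shape (L,)
    Phi = eigvecs[:, top]     # shape (k, L), columns are phi_m

    # Projection values: Z[n, m] = phi_m^t d_n.
    Z = D @ Phi

    # Importance of segment n: sum_m z_nm^2, or sum_m lambda_m * z_nm^2.
    weights = lam if weighted else np.ones(len(top))
    importance = (Z ** 2) @ weights

    order = np.argsort(importance)[::-1]
    return importance, order


# Hypothetical toy input: 3 segments, 3 terms.
D_example = np.array([[2, 0, 1],
                      [0, 1, 0],
                      [1, 1, 1]])
importance_example, order_example = rank_segments(D_example, L=2)
```

Because the eigenvectors form an orthonormal basis, taking L = k without weighting reduces each score to the squared length of the segment vector, which gives a quick sanity check on any implementation.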
