Automatically summarising topics in a collection of electronic documents

  • US 20040205457A1
  • Filed: 10/31/2001
  • Published: 10/14/2004
  • Est. Priority Date: 10/31/2001
  • Status: Abandoned Application
First Claim

1. A method of detecting and summarising at least one topic in at least one document of a document set, each document in said document set having a plurality of terms and a plurality of sentences comprising said plurality of terms, wherein said plurality of terms and said plurality of sentences are represented as a plurality of vectors in a two-dimensional space, said method comprising the steps of:

  • pre-processing said at least one document to extract a plurality of significant terms and to create a plurality of basic terms;
  • formatting said at least one document and said plurality of basic terms;
  • reducing said plurality of basic terms;
  • reducing said plurality of sentences;
  • creating a matrix of said reduced plurality of basic terms and said reduced plurality of sentences;
  • utilising said matrix to correlate said plurality of basic terms;
  • transforming a two-dimensional coordinate associated with each of said correlated plurality of basic terms to an n-dimensional coordinate;
  • clustering said reduced plurality of sentence vectors in said n-dimensional space; and
  • associating magnitudes of said reduced plurality of sentence vectors with said at least one topic.
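
As an illustration only, the claimed steps can be loosely sketched in Python. This is not the applicant's implementation: the claim does not specify how terms are reduced, correlated, or clustered, so every choice here is an assumption. The stopword list, the `min_count` cut-off, the cosine-similarity threshold, and the greedy single-pass clustering are all hypothetical, as are the function names. The sketch treats each retained term as one axis of the n-dimensional space and skips the explicit term-correlation and coordinate-transformation steps.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the claim does not say how terms are filtered.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "on"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def basic_terms(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def build_matrix(sentences, min_count=2):
    """Reduce terms and sentences, then build a sentence-by-term count matrix.

    Assumption: a term is kept if it occurs in at least `min_count` sentences,
    and a sentence is kept if it contains at least one kept term.
    """
    counts = Counter(t for s in sentences for t in set(basic_terms(s)))
    vocab = sorted(t for t, c in counts.items() if c >= min_count)
    index = {t: i for i, t in enumerate(vocab)}
    kept, rows = [], []
    for s in sentences:
        vec = [0.0] * len(vocab)
        for t in basic_terms(s):
            if t in index:
                vec[index[t]] += 1.0
        if any(vec):
            kept.append(s)
            rows.append(vec)
    return vocab, kept, rows


def magnitude(vec):
    return math.sqrt(sum(x * x for x in vec))


def cosine(u, v):
    nu, nv = magnitude(u), magnitude(v)
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster(rows, threshold=0.3):
    """Greedy single-pass clustering: join the first cluster whose seed
    vector is similar enough, otherwise start a new cluster."""
    clusters = []
    for i, vec in enumerate(rows):
        for members in clusters:
            if cosine(vec, rows[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarise(text, threshold=0.3):
    """Return one sentence per cluster: the one with the largest vector magnitude."""
    vocab, kept, rows = build_matrix(split_sentences(text))
    return [
        kept[max(members, key=lambda i: magnitude(rows[i]))]
        for members in cluster(rows, threshold)
    ]
```

Calling `summarise(text)` on a short document returns one representative sentence per detected cluster, with each cluster standing in for a topic. A closer reading of the claim would add a term-correlation step before clustering, which this sketch leaves out.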
