Automatic generation of document summaries through use of structured text

  • US 7,509,572 B1
  • Filed: 07/16/1999
  • Issued: 03/24/2009
  • Est. Priority Date: 07/16/1999
  • Status: Expired due to Fees
First Claim

1. A computer readable medium comprising a plurality of instructions stored thereon when executed, generate summaries for a plurality of documents comprising text, the computer readable medium comprising sets of instructions for:

  • a) for each document in the plurality of documents, generating text structure tags for the document including generating text structure tags in accordance with Text Encoding Initiative (TEI), the text structure tags identifying a plurality of argumentative text types, wherein a text type comprises a type of argumentative content for an associated portion of a document, the types of argumentative content comprising an argument premise giving support, evidence, or reasoning for or against a conclusion or the conclusion comprising a resulting determination made using one or more argument premises;

    b) for each document in the plurality of documents, encoding the document to generate a tree structure comprising a plurality of nodes, wherein the nodes correspond with the text types and hierarchical relationships among the nodes reflect argumentative relationships among the text types, and wherein encoding the document comprises mapping a base hierarchical structure, utilizing DTD of the eXtensible Markup Language (“XML”), to reflect said hierarchical relationships; and

    processing the document to generate the tree structure in accordance with the base hierarchical structure;

    c) selecting a plurality of tree structures for the plurality of documents;

    d) combining, as a single logical tree structure, the plurality of tree structures; and

    e) generating a summary for the plurality of documents by:

    i) receiving from a user a selection of one or more particular text types for summarization, the one or more particular text types comprising the argument premise text type; and

    ii) identifying, based upon the text type tags, a set of nodes from the plurality of tree structures corresponding to the one or more selected text types including one or more nodes corresponding to the argument premise text type; and

    iii) extracting portions of text from the plurality of documents that correspond to the identified set of nodes selected to form the summary of the plurality of documents.
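
Steps b) through e) of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the documents have already been tagged (step a), and it uses hypothetical element names `premise` and `conclusion` in place of real TEI markup. The `corpus` wrapper element, the `doc` attribute, and the functions `combine_trees` and `summarize` are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical pre-tagged documents; the element names are assumptions,
# not actual TEI tags or anything specified by the patent.
DOCS = [
    "<argument><premise>Sales rose 10%.</premise>"
    "<conclusion>The campaign worked.</conclusion></argument>",
    "<argument><premise>Costs fell.</premise><premise>Churn dropped.</premise>"
    "<conclusion>Margins improved.</conclusion></argument>",
]


def combine_trees(xml_docs):
    """Parse each document into a tree and attach all trees to one root."""
    corpus = ET.Element("corpus")  # single logical tree (step d)
    for index, xml_text in enumerate(xml_docs):
        doc_root = ET.fromstring(xml_text)  # per-document tree (step b)
        doc_root.set("doc", str(index))     # remember the source document
        corpus.append(doc_root)
    return corpus


def summarize(corpus, selected_types):
    """Collect the text of every node whose tag is a selected text type."""
    summary = []
    for node in corpus.iter():  # depth-first, document order
        if node.tag in selected_types:
            text = "".join(node.itertext()).strip()
            if text:
                summary.append(text)
    return summary


corpus = combine_trees(DOCS)
# The user selects the "premise" text type (step e.i); matching nodes are
# identified (e.ii) and their text is extracted to form the summary (e.iii).
premise_summary = summarize(corpus, {"premise"})
print(premise_summary)
```

Passing `{"conclusion"}` or `{"premise", "conclusion"}` as the selected types yields a summary built from those nodes instead, still in document order.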
