Method for summarizing natural language text

US 7,925,496 B1
Filed: 04/23/2007
Issued: 04/12/2011
Est. Priority Date: 04/23/2007
Status: Expired due to Fees

Abstract

A method includes the steps of comparing a first body of text with a user-created summary of the first body of text, creating rules based on the comparison of the first body of text with the user-created summary of the first body of text, selecting one or more summary rules for generating a computer-created summary of a second body of text, and applying the selected summary rules to the second body of text to generate a computer-created summary of the second body of text. The first body of text may be a user-corrected summary of a computer-created summary of the first body of text. The rules may be selected based on previous use, frequency of use, context of the body of text, or most-specific applicability. The rules may be iteratively applied to generate a summary. A method is also provided for generating a heading for a summary of text.
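
The selection criteria named in the abstract can be illustrated with a small sketch. The ordering used here (validated before non-validated, then most-specific context, then most-frequently used) follows the wording of claim 1. The `SummaryRule` class, its field names, and the integer `specificity` score are hypothetical and chosen only for illustration. The claims also do not say how to choose among rules that are all non-validated, so applying the same tie-breakers to them is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SummaryRule:
    """Hypothetical record for one summary rule associated with a word."""
    replacement: str   # text the rule contributes to the summary
    validated: bool    # whether the rule counts as validated
    specificity: int   # assumed score: higher means a narrower context
    use_count: int     # how many times the rule has been used


def select_rule(candidates: list[SummaryRule]) -> Optional[SummaryRule]:
    """Pick one rule for a word using the priority order recited in claim 1."""
    if not candidates:
        return None
    # Tuples compare left to right, so validation dominates,
    # then specificity, then frequency of use.
    return max(
        candidates,
        key=lambda rule: (rule.validated, rule.specificity, rule.use_count),
    )
```

With this ordering, a validated rule of low specificity is still chosen over a non-validated rule of high specificity, which is one reading of "a validated summary rule is selected over a non-validated summary rule".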

6 Claims

1. A computer-implemented method comprising the steps of:
- eliminating, from a portion of a body of text, the most-frequently occurring words that do not relate to the specific context of the portion of the body of text;
- searching a data storage having a plurality of words and at least one summary rule associated with each word stored therein, for the remaining words within the portion of the body of text;
- selecting, from the data storage, one of the summary rules associated with each remaining word found in the data storage, wherein
  - if more than one summary rule is associated with one of the remaining words, a validated summary rule is selected over a non-validated summary rule,
  - if more than one validated summary rule is associated with one of the remaining words, the validated summary rule that covers the most-specific context of the remaining word is selected, and
  - if equally-specific validated summary rules are associated with one of the remaining words, the most-frequently-used validated summary rule that covers the most-specific context of the remaining word is selected;
- creating a computer-generated summary by applying the selected summary rules to the body of text;
- determining that a user has made user corrections to the computer-generated summary;
- creating one or more new summary rules based upon the user corrections by
  - modifying one or more existing associations of a word and a summary rule stored in the data storage to reflect the user corrections if the user added one or more words to the computer-generated summary that are stored in the data storage, removed one or more words from the computer-generated summary, or performed spelling/grammar corrections to the computer-generated summary, and
  - creating a new association of a word and a summary rule if the user added one or more words to the computer-generated summary that are not stored in the data storage, the new association associating the non-stored word with a word sequence from a portion of the computer-generated summary; and
- storing the one or more new summary rules in the data storage.

2. The computer-implemented method of claim 1 further comprising the step of, prior to eliminating the most-frequently occurring words from the portion of the body of text, normalizing the words of the portion of the body of text.

3. The computer-implemented method of claim 1 further comprising the step of, prior to eliminating the most-frequently occurring words from the portion of the body of text, eliminating commonly used expressions from the portion of the body of text.

4. The computer-implemented method of claim 1, wherein the portion of the body of text is a sentence.

5. The computer-implemented method of claim 1, wherein the portion of the body of text is a paragraph.
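
The preprocessing and lookup steps of claims 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: normalization is assumed to mean lowercasing and stripping punctuation, and `COMMON_EXPRESSIONS` and `FREQUENT_WORDS` are small hypothetical lists, since the claims do not enumerate either set or say how context-irrelevant frequent words are identified. The data storage is modeled as a plain dictionary from word to a list of rule identifiers.

```python
import re

# Hypothetical lists; the claims do not enumerate them.
COMMON_EXPRESSIONS = ["as a matter of fact", "in other words"]
FREQUENT_WORDS = {"the", "a", "an", "of", "to", "is", "and", "in"}


def normalize(portion: str) -> str:
    """Assumed normalization: lowercase, replace punctuation with spaces."""
    return re.sub(r"[^a-z0-9']+", " ", portion.lower()).strip()


def remaining_words(portion: str) -> list[str]:
    """Return the words left in a sentence or paragraph after filtering."""
    text = normalize(portion)  # claim 2: normalize first
    for expression in COMMON_EXPRESSIONS:  # claim 3: drop common expressions
        text = re.sub(r"\b" + re.escape(expression) + r"\b", " ", text)
    # Claim 1: eliminate the most-frequently occurring words.
    return [word for word in text.split() if word not in FREQUENT_WORDS]


def lookup(words: list[str], storage: dict[str, list[str]]) -> dict[str, list[str]]:
    """Search the data storage for each remaining word; keep only the hits."""
    return {word: storage[word] for word in words if word in storage}
```

The `portion` argument can be a sentence (claim 4) or a paragraph (claim 5); the sketch treats both the same way.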

6. A non-transitory computer-readable medium having a method stored thereon, the method represented by computer readable programming code, the method comprising the steps of:
- eliminating, from a portion of a body of text, the most-frequently occurring words that do not relate to the specific context of the portion of the body of text;
- searching a data storage having a plurality of words and at least one summary rule associated with each word stored therein, for the remaining words within the portion of the body of text;
- selecting, from the data storage, one of the summary rules associated with each remaining word, wherein
  - if more than one summary rule is associated with one of the remaining words, a validated summary rule is selected over a non-validated summary rule,
  - if more than one validated summary rule is associated with one of the remaining words, the validated summary rule that covers the most-specific context of the remaining word is selected, and
  - if equally-specific validated summary rules are associated with one of the remaining words, the most-frequently-used validated summary rule that covers the most-specific context of the remaining word is selected;
- creating a computer-generated summary by applying the selected summary rules to the body of text;
- determining that a user has corrected the computer-generated summary;
- creating one or more new summary rules based upon the user corrections by
  - modifying one or more existing associations of a word and a summary rule stored in the data storage to reflect the user corrections if the user added one or more words to the computer-generated summary that are stored in the data storage, removed one or more words from the computer-generated summary, or performed spelling/grammar corrections to the computer-generated summary, and
  - creating a new association of a word and a summary rule if the user added one or more words to the computer-generated summary that are not stored in the data storage, the new association associating the non-stored word with a word sequence from a portion of the computer-generated summary; and
- storing the one or more new summary rules in the data storage.
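
The correction-handling steps shared by claims 1 and 6 can be sketched as a word-level diff between the computer-generated summary and the user's corrected version. Everything concrete here is an assumption made for illustration: the rule shape (`{"sequence": [...], "validated": bool}`), the `window` of neighboring words used as the "word sequence from a portion of the computer-generated summary", the choice to mark new rules as not validated, and the reading of "modifying an existing association" as replacing the stored sequence with the corrected wording. The claims do not specify any of these details. The sketch uses Python's standard `difflib.SequenceMatcher`.

```python
import difflib


def learn_from_corrections(generated, corrected, storage, window=2):
    """Update `storage` in place from user edits; return the rules touched.

    `storage` maps a word to a hypothetical rule dict of the form
    {"sequence": list[str], "validated": bool}.
    """
    old, new = generated.split(), corrected.split()
    if old == new:
        return []  # the user made no corrections
    touched = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Neighboring words around the edit, before and after correction.
        old_context = old[max(0, i1 - window): i2 + window]
        new_context = new[max(0, j1 - window): j2 + window]
        # Removed or replaced words (covers deletions and spelling fixes):
        # modify the existing association if the word is stored.
        for word in old[i1:i2]:
            if word in storage:
                storage[word]["sequence"] = new_context
                touched.append(("modified", word))
        for word in new[j1:j2]:
            if word in storage:
                # Added word already stored: modify its association.
                storage[word]["sequence"] = new_context
                touched.append(("modified", word))
            else:
                # Added word not stored: create a new association with a
                # word sequence taken from the generated summary.
                storage[word] = {"sequence": old_context, "validated": False}
                touched.append(("created", word))
    return touched
```

Because `storage` is updated in place, the final "storing" step is folded into the same function here; a persistent store would need a separate write.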

Current Assignee: The United States of America as represented by the Secretary of the Navy
Original Assignee: The United States of America as represented by the Secretary of the Navy
Inventors: Rubin, Stuart Harvey
Primary Examiner(s): Saint Cyr, Leonard
Application Number: US11/789,129
Time in Patent Office: 1,450 Days
Field of Search: 704/7-10
US Class Current: 704/7
CPC Class Codes: G06F 16/345 (Summarisation for human users)
