Method and system for filtering website content

  • US 7,549,119 B2
  • Filed: 11/18/2004
  • Issued: 06/16/2009
  • Est. Priority Date: 11/18/2004
  • Status: Expired due to Fees
First Claim

1. A computer implemented method for filtering content submitted by a user for dissemination over a communication forum, the method comprising the steps of:

  • (a) intercepting the content submitted by the user at the time of submission by the user to the communication forum;

    (b) preprocessing a copy of said intercepted content through a preprocessing subroutine to yield a modified content by reducing said intercepted content to its least common denominator, wherein said preprocessing step further comprises the steps of;

    (b1) analyzing said intercepted content for HTML tags, wherein when there are no HTML tags, performing steps (b2) through (b7), and when there are HTML tags, performing steps (b8) through (b12);

    (b2) converting each white space to a space, wherein said white space is a one of a space, a tab, a return, an end of line character, and any other character that is displayed on a display device as said white space to a viewer;

    (b3) removing each punctuation character at an end of a word, wherein said word is a string of characters;

    (b4) converting each uppercase letter into a corresponding lowercase letter;

    (b5) performing a character mapping on the results of said steps (b2), (b3), and (b4) of the intercepted content;

    (b6) utilizing the results of said step (b5), changing a three or more of any consecutively repeated character to two of said consecutively repeated character or to a one of said consecutively repeated character based upon a predefined list; and

    (b7) deleting any remaining spaces at the end of said intercepted content;

    (b8) separating said HTML tags from a non-HTML text of said intercepted content;

    (b9) concatenating said non-HTML text with a space where said HTML tag was located in said intercepted content;

    (b10) sending said concatenated non-HTML text to said converting step (b2) for continued processing;

    (b11) copying a text inside said HTML tags to a file; and

    (b12) processing said text inside each said HTML tags through steps (b2), (b4), and (b7);

    (c) breaking said modified content down through a content breakdown subroutine into a plurality of strings of words, wherein each successive string of words drops the first word from the previous string of words;

    (d) processing each of said plurality of strings of words through a recursive comparison subroutine to attempt to identify at least one undesirable term that matches a previously identified undesirable term stored in a secondary database of undesirable terms, wherein each of said previously identified undesirable terms is a word or a phrase; and

    (e) when said at least one undesirable term is identified, blocking the content submitted by the user to the communication forum from appearing on the communication forum.
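
For illustration only, steps (a) through (e) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the patented implementation: the character map, the "predefined list" of step (b6), and the term set standing in for the secondary database are hypothetical placeholders, since the claim does not enumerate them. The prefix-shrinking recursion is one possible reading of the "recursive comparison subroutine". Steps (b11) and (b12), which handle the text inside the tags, are omitted for brevity.

```python
import re
import string
from typing import List, Optional

# Hypothetical tables: the claim names them but does not list their contents.
CHAR_MAP = str.maketrans({"@": "a", "$": "s", "0": "o", "1": "i", "3": "e"})  # (b5)
COLLAPSE_TO_ONE = {"a", "i", "u"}          # (b6) assumed "predefined list"
UNDESIRABLE_TERMS = {"bad word", "spam"}   # (d) stand-in for the secondary database
MAX_TERM_WORDS = max(len(t.split()) for t in UNDESIRABLE_TERMS)

TAG_RE = re.compile(r"<[^>]*>")  # simplistic HTML tag matcher, an assumption


def normalize_plain(text: str) -> str:
    """Steps (b2) through (b7) for text without HTML tags."""
    text = re.sub(r"\s", " ", text)                                   # (b2)
    words = [w.rstrip(string.punctuation) for w in text.split(" ")]   # (b3)
    text = " ".join(words).lower()                                    # (b4)
    text = text.translate(CHAR_MAP)                                   # (b5)
    # (b6) runs of three or more identical characters become one or two
    text = re.sub(
        r"(.)\1{2,}",
        lambda m: m.group(1) * (1 if m.group(1) in COLLAPSE_TO_ONE else 2),
        text,
    )
    return text.rstrip(" ")                                           # (b7)


def preprocess(content: str) -> str:
    """Step (b): branch on the presence of HTML tags, per (b1)."""
    if TAG_RE.search(content):
        content = TAG_RE.sub(" ", content)  # (b8), (b9): a space replaces each tag
    return normalize_plain(content)         # (b10) feeds back into (b2)


def breakdown(modified: str) -> List[List[str]]:
    """Step (c): each successive string drops the first word of the previous one."""
    words = modified.split()
    return [words[i:] for i in range(len(words))]


def find_term(words: List[str], n: Optional[int] = None) -> Optional[str]:
    """Step (d): recursively try shorter and shorter prefixes of the string."""
    if n is None:
        n = min(len(words), MAX_TERM_WORDS)
    if n == 0:
        return None
    candidate = " ".join(words[:n])
    if candidate in UNDESIRABLE_TERMS:
        return candidate
    return find_term(words, n - 1)


def should_block(content: str) -> bool:
    """Step (e): block the submission if any string yields an undesirable term."""
    return any(find_term(s) is not None for s in breakdown(preprocess(content)))
```

In this sketch, a submission such as `This is <b>SP@M</b>!!!` reduces to the words `this`, `is`, `spam`, and the third string produced by `breakdown` matches the placeholder term `spam`, so `should_block` returns `True`.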
