System and method for automatic tagging of documents
Abstract
The present invention provides a system and method for automatically tagging documents with a given set of user-defined tags. The present invention takes as input the document to be tagged, and also a list of tags along with keywords belonging to these tags. The present invention then selects a tag, and scans the document for sentences that have keywords corresponding to the selected tag. Sentences that match the keywords are tagged with the selected tag. Once the whole document has been scanned, the present invention selects the next tag and repeats the whole process. This process is repeated until all tags have been seen.
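The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `split_sentences` and `tag_document`, the regex-based sentence splitter, and the case-insensitive whole-word keyword match are all assumptions made for the example.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tag_document(text, tag_keywords):
    """Return a list of (sentence, tags) pairs.

    tag_keywords maps each user-defined tag to its list of keywords.
    The outer loop selects one tag at a time; the inner loop scans the
    whole document for sentences containing any keyword of that tag.
    """
    tagged = [(sentence, []) for sentence in split_sentences(text)]
    for tag, keywords in tag_keywords.items():
        patterns = [
            re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
            for keyword in keywords
        ]
        for sentence, tags in tagged:
            if any(p.search(sentence) for p in patterns):
                tags.append(tag)
    return tagged


if __name__ == "__main__":
    doc = "The engine overheated. The invoice was paid late. Weather was fine."
    tags = {"mechanical": ["engine", "brake"], "billing": ["invoice", "payment"]}
    for sentence, found in tag_document(doc, tags):
        print(found, sentence)
```

Iterating tag by tag, rather than sentence by sentence, mirrors the order given in the abstract; a sentence that matches keywords of several tags simply accumulates all of them.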
57 Citations
13 Claims
1. A method for automatically tagging text in an input text document, the method taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the method comprising the steps of:
a. modifying the input text document; and
b. tagging the input text document by repeatedly selecting a tag from the list of user-defined tags, and tagging text in the input text document that has keywords corresponding to this selected tag.
Dependent claims: 2, 3, 4
5. A system for automatically tagging text in an input text document, the system taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the system comprising:
a. a modifier portion for modifying the input text document; and
b. a tagger portion for tagging the input text document.
Dependent claims: 6
7. A computer program product for use with a computer, the computer program product comprising a computer usable medium having a computer readable program code embodied therein for automatically tagging text in an input text document, the computer program product taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the computer program code performing the steps of:
a. modifying the input text document; and
b. tagging the input text document by repeatedly selecting a tag from the list of user-defined tags, and tagging text in the input text document that has keywords corresponding to this selected tag.
Dependent claims: 8, 9, 10
11. A method for automatically tagging text in an input text document, the method taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the method comprising the steps of:
a. modifying the input text document to increase informational content and minimize overlapping tags;
wherein modifying the input text document to increase informational content and minimize overlapping tags comprises:
   i. checking spelling of words in the input text document;
   ii. removing stop words from the input text document;
   iii. replacing synonyms of words in the input text document; and
   iv. decomposing sentences and parts of speech in the input text document; and
b. tagging the input text document with XML tags;
wherein tagging the input text document with XML tags comprises:
   i. selecting a tag from the list of user-defined tags;
   ii. searching the input text document for text containing keywords corresponding to the selected tag;
   iii. tagging text in the input text document with tags, if the text has keywords corresponding to the selected tag;
   iv. iteratively repeating steps i and ii until all tags in the list of user-defined tags have been selected; and
   v. displaying the tagged input text document.
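The modifying and XML tagging steps recited in claim 11 might be sketched as below. This is only an illustration under stated assumptions: the word lists `STOP_WORDS`, `SPELLING` and `SYNONYMS` are tiny hypothetical stand-ins for a real spell checker, stop-word list and thesaurus; the functions `modify` and `tag_as_xml` are invented names; part-of-speech decomposition is omitted; and tag names are assumed to be valid XML element names.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical resources; a real system would use far larger ones.
STOP_WORDS = {"the", "a", "an", "was", "is", "of", "for"}
SPELLING = {"engin": "engine"}                      # misspelling -> correction
SYNONYMS = {"motor": "engine", "bill": "invoice"}   # synonym -> canonical keyword


def modify(text):
    """Return one token list per sentence after the modifying steps."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    modified = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z0-9']+", sentence.lower())
        tokens = [SPELLING.get(t, t) for t in tokens]        # i. check spelling
        tokens = [t for t in tokens if t not in STOP_WORDS]  # ii. remove stop words
        tokens = [SYNONYMS.get(t, t) for t in tokens]        # iii. replace synonyms
        modified.append(tokens)
    return modified


def tag_as_xml(sentences, tag_keywords):
    """Wrap each sentence in one XML element per matching tag."""
    tags_per_sentence = [[] for _ in sentences]
    for tag, keywords in tag_keywords.items():        # select a tag
        wanted = {k.lower() for k in keywords}
        for i, tokens in enumerate(sentences):        # search the document
            if wanted.intersection(tokens):
                tags_per_sentence[i].append(tag)      # tag matching text
    lines = []
    for tokens, tags in zip(sentences, tags_per_sentence):
        body = escape(" ".join(tokens))
        for tag in reversed(tags):                    # first-selected tag ends up outermost
            body = f"<{tag}>{body}</{tag}>"
        lines.append(body)
    return "\n".join(lines)


if __name__ == "__main__":
    keywords = {"mechanical": ["engine"], "billing": ["invoice"]}
    # Display the tagged document.
    print(tag_as_xml(modify("The motor was loud. Bill for the engin."), keywords))
```

Normalizing synonyms to a single canonical keyword before matching is one plausible way the modifying step could reduce the number of keywords each tag needs; the claim itself does not prescribe how the replacement is done.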
12. A system for automatically tagging text in an input text document, the system taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the system comprising:
a. a modifier portion for modifying the input text document to increase informational content and minimize overlapping tags;
wherein the modifier portion:
   i. checks the spelling of words in the input text document;
   ii. removes stop words from the input text document;
   iii. replaces synonyms of words in the input text document; and
   iv. decomposes sentences and parts of speech in the input text document; and
b. a tagger portion for tagging the input text document with XML tags;
wherein the tagger portion:
   i. selects a tag from the list of user-defined tags;
   ii. searches the input text document for text containing keywords corresponding to the selected tag;
   iii. tags text in the input text document with tags, if the text has keywords corresponding to the selected tag;
   iv. iteratively repeats steps a and b until all tags in the list of user-defined tags have been selected; and
   v. displays the tagged input text document.
13. A computer program product for use with a computer, the computer program product comprising a computer usable medium having a computer readable program code embodied therein for automatically tagging text in an input text document, the computer program product taking as input a list of user-defined tags and a list of keywords corresponding to the tags, the computer program code performing the steps of:
a. modifying the input text document to increase informational content and minimize overlapping tags;
wherein modifying the input text document to increase informational content and minimize overlapping tags comprises:
   i. checking spelling of words in the input text document;
   ii. removing stop words from the input text document;
   iii. replacing synonyms of words in the input text document; and
   iv. decomposing sentences and parts of speech in the input text document; and
b. tagging the input text document with XML tags;
wherein tagging the input text document with XML tags comprises:
   i. selecting a tag from the list of user-defined tags;
   ii. searching the input text document for text containing keywords corresponding to the selected tag;
   iii. tagging text in the input text document with tags, if the text has keywords corresponding to the selected tag;
   iv. iteratively repeating steps i and ii until all tags in the list of user-defined tags have been selected; and
   v. displaying the tagged input text document.
Specification