AUTOMATICALLY FINDING ACRONYMS AND SYNONYMS IN A CORPUS
Abstract
Acronym and synonym pairs can be identified and retrieved automatically in a corpus and/or across an enterprise based on customer settings globally or for a single instance. Possible acronym and synonym term pairs can be identified using a rule such as a heuristic, user-defined rule. Rules selected by the user can be used to rank acronym and synonym pairs using factors such as occurrence frequency and maximum term length. A rule interpreter engine executes the user-defined rule set to properly identify and retrieve the user-selected acronym and synonym pairs through the utilization of a shallow pause read step. Finally, the user-selected acronym and synonym pairs are ranked according to the user preferences, and can be displayed or held for subsequent use in searching.
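The flow the abstract describes (identify candidate pairs per sentence, count how often each occurs, then rank by frequency and term length) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`PAREN_ACRONYM`, `split_sentences`, `candidate_pairs`, `rank_pairs`) is hypothetical, and the single heuristic rule used here, an all-caps token in parentheses preceded by words whose initials spell it, is an assumed example of a rule. Term length is taken to be the number of words in the expansion, which is also an assumption, since the abstract does not define it. Synonym detection and the rule interpreter engine are not modeled.

```python
import re
from collections import Counter

# Hypothetical heuristic rule: an all-caps token in parentheses is treated as
# an acronym whose expansion is the run of words immediately before it.
PAREN_ACRONYM = re.compile(r"\(([A-Z]{2,10})\)")


def split_sentences(text):
    """Naive sentence splitter; a real system would use a proper tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_pairs(sentence):
    """Yield (acronym, expansion) pairs whose word initials spell the acronym."""
    for match in PAREN_ACRONYM.finditer(sentence):
        acronym = match.group(1)
        # Words that appear before the parenthesized token.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", sentence[:match.start()])
        # Take as many preceding words as the acronym has letters.
        window = words[-len(acronym):]
        initials = "".join(w[0].upper() for w in window)
        if len(window) == len(acronym) and initials == acronym:
            yield acronym, " ".join(window).lower()


def rank_pairs(corpus):
    """Rank pairs by occurrence frequency, then by expansion length in words.

    Assumption: "length" means the word count of the expansion.
    Returns tuples of (acronym, expansion, frequency, length).
    """
    counts = Counter()
    for sentence in split_sentences(corpus):
        counts.update(candidate_pairs(sentence))
    scored = [
        (acronym, expansion, freq, len(expansion.split()))
        for (acronym, expansion), freq in counts.items()
    ]
    # Higher frequency first, longer expansions next, acronym as a tie-breaker.
    return sorted(scored, key=lambda row: (-row[2], -row[3], row[0]))


if __name__ == "__main__":
    sample = (
        "The Internet Protocol (IP) routes packets. "
        "Natural Language Processing (NLP) is broad. "
        "Hosts use the Internet Protocol (IP) daily."
    )
    for row in rank_pairs(sample):
        print(row)
```

Sorting on a tuple key keeps the two ranking factors separate, so a user preference that weights length above frequency would only require reordering the key.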
19 Claims
1. A method in a computer system for identifying acronym and synonym pairs for a selected target corpus, the method comprising:
- analyzing each sentence in the target corpus to identify possible acronym and synonym pairs;
- determining an occurrence frequency of each identified possible acronym and synonym pair;
- determining a maximum possible length for each identified possible acronym and synonym pair;
- ranking each identified possible acronym and synonym pair based on the occurrence frequency and maximum possible length; and
- displaying the ranked acronym and synonym pairs to the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer program product embedded in a computer readable medium for identifying acronym and synonym pairs for a selected target corpus, comprising:
- program code for analyzing each sentence in the target corpus to identify possible acronym and synonym pairs;
- program code for determining an occurrence frequency of each identified possible acronym and synonym pair;
- program code for determining a maximum possible length for each identified possible acronym and synonym pair;
- program code for ranking each identified possible acronym and synonym pair based on the occurrence frequency and maximum possible length; and
- program code for displaying the ranked acronym and synonym pairs to the user.
Dependent claims: 13, 14, 15
16. A system for identifying acronym and synonym pairs for a selected target corpus, the system comprising a processor operable to execute instructions and a data storage medium for storing the instructions that, when executed by the processor, cause the processor to:
- analyze each sentence in the target corpus to identify possible acronym and synonym pairs;
- determine an occurrence frequency of each identified possible acronym and synonym pair;
- determine a maximum possible length for each identified possible acronym and synonym pair;
- rank each identified possible acronym and synonym pair based on the occurrence frequency and maximum possible length; and
- display the ranked acronym and synonym pairs to the user.
Dependent claims: 17, 18, 19
Specification