Method, system, and apparatus for selecting an acronym expansion
First Claim
1. A method of selecting an expansion for an acronym in a document of a set of linked documents, the method comprising:
obtaining for each occurrence of the acronym in the set of linked documents one or more possible acronym expansions and an associated probability that the one or more possible acronym expansions is the correct acronym expansion;
identifying a sub-set of documents from the set of linked documents in which the acronym occurs;
recalculating the associated probabilities for a first occurrence of an acronym in the sub-set of documents based, in part, on the associated probabilities of other occurrences of the acronym in the sub-set of documents and the distance between the first occurrence and the other occurrences;
normalizing the recalculated associated probabilities for the first occurrence such that the sum of the recalculated associated probabilities equals one; and
iteratively recalculating the associated probabilities for the first occurrence of the acronym using any previously recalculated associated probabilities of the other occurrences of the acronym.
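The steps of the claim can be sketched in Python as follows. The claim does not fix a weighting formula, so this is only an illustration under stated assumptions: each other occurrence contributes its probabilities scaled by `1 / (1 + distance)`, the distance matrix is supplied by the caller (for example, the number of link hops between the documents), and updates are applied in place so that later recalculations use previously recalculated values. The names `normalize`, `recalculate`, and `propagate` are hypothetical.

```python
from typing import Dict, List

Probs = Dict[str, float]  # expansion -> probability for one occurrence


def normalize(probs: Probs) -> Probs:
    """Scale the values so that they sum to one."""
    total = sum(probs.values())
    if total == 0:
        return dict(probs)
    return {expansion: p / total for expansion, p in probs.items()}


def recalculate(index: int, occurrences: List[Probs],
                distance: List[List[float]]) -> Probs:
    """Recalculate one occurrence from its own and the other occurrences.

    Assumed weighting (not taken from the claim): 1 / (1 + distance).
    """
    scores = dict(occurrences[index])
    for j, other in enumerate(occurrences):
        if j == index:
            continue
        weight = 1.0 / (1.0 + distance[index][j])
        for expansion, p in other.items():
            scores[expansion] = scores.get(expansion, 0.0) + weight * p
    return normalize(scores)


def propagate(occurrences: List[Probs], distance: List[List[float]],
              iterations: int = 10) -> List[Probs]:
    """Iterate the recalculation, reusing already recalculated values."""
    current = [dict(p) for p in occurrences]  # leave the input untouched
    for _ in range(iterations):
        for i in range(len(current)):
            current[i] = recalculate(i, current, distance)
    return current


# Hypothetical data: three occurrences of "ATM" in three linked documents.
occurrences = [
    {"automated teller machine": 0.5, "asynchronous transfer mode": 0.5},
    {"automated teller machine": 0.1, "asynchronous transfer mode": 0.9},
    {"automated teller machine": 0.2, "asynchronous transfer mode": 0.8},
]
distance = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]  # assumed link-hop distances
result = propagate(occurrences, distance, iterations=5)
```

In this example the first occurrence starts out undecided, and the two nearby occurrences pull it toward "asynchronous transfer mode". A fixed iteration count is used for simplicity; a convergence threshold would be an equally plausible stopping rule.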
Abstract
According to one embodiment of the present invention, there is provided a method of selecting an expansion for an acronym in a document of a set of linked documents. The method comprises obtaining for each occurrence of the acronym in the set of linked documents one or more possible acronym expansions and an associated probability that the one or more possible acronym expansions is the correct acronym expansion. The method further comprises identifying a sub-set of documents from the set of linked documents in which the acronym occurs. The method further comprises recalculating the associated probabilities for a first occurrence of an acronym in the sub-set of documents based, in part, on the associated probabilities of other occurrences of the acronym in the sub-set of documents and the distance between the first occurrence and the other occurrences.
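One way to write the recalculation as a formula is given here, purely as an illustration. The text above does not state a formula, so the symbols are assumptions: $p_i(e)$ is the probability that expansion $e$ is correct for occurrence $i$, $d_{ij}$ is the distance between occurrences $i$ and $j$, and $w$ is some decreasing weight function, for example $w(d) = 1/(1+d)$.

```latex
\tilde{p}_i(e) = p_i(e) + \sum_{j \neq i} w(d_{ij})\, p_j(e),
\qquad
p_i'(e) = \frac{\tilde{p}_i(e)}{\sum_{e'} \tilde{p}_i(e')}
```

The second expression rescales the recalculated values so that they sum to one over all candidate expansions of occurrence $i$. Repeating the update with the $p_j$ replaced by their most recently recalculated values gives an iterative scheme.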
20 Claims
1. A method of selecting an expansion for an acronym in a document of a set of linked documents, the method comprising:
obtaining for each occurrence of the acronym in the set of linked documents one or more possible acronym expansions and an associated probability that the one or more possible acronym expansions is the correct acronym expansion;
identifying a sub-set of documents from the set of linked documents in which the acronym occurs;
recalculating the associated probabilities for a first occurrence of an acronym in the sub-set of documents based, in part, on the associated probabilities of other occurrences of the acronym in the sub-set of documents and the distance between the first occurrence and the other occurrences;
normalizing the recalculated associated probabilities for the first occurrence such that the sum of the recalculated associated probabilities equals one; and
iteratively recalculating the associated probabilities for the first occurrence of the acronym using any previously recalculated associated probabilities of the other occurrences of the acronym.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 16, 17, 18
11. A computing device, comprising:
a processor, a memory in communication with the processor, and computer executable instructions stored in the memory and executable on the processor to determine an acronym expansion of an acronym in a web page of a web site by:
obtaining, through a content-based analysis, for each occurrence of the acronym in the web site, one or more possible acronym expansions and an associated probability that the one or more acronym expansions is the correct acronym expansion;
identifying a set of web pages from the web site in which the acronym occurs;
calculating a new set of probabilities for a first occurrence of the acronym in the identified set of web pages, the calculation being made in part based on the associated probabilities of other occurrences of the acronym in other ones of the set of web pages and the number of links linking the other ones of the set of web pages;
normalizing the new set of probabilities for the first occurrence such that the sum of the recalculated associated probabilities equals one; and
iteratively recalculating the associated probabilities for the first occurrence of the acronym using any previously recalculated associated probabilities of the other occurrences of the acronym.
Dependent claims: 12, 13, 14
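Claim 11 weights other occurrences by the number of links between web pages instead of by a distance. A minimal Python sketch of a single recalculation step under that reading follows. It rests on several assumptions not stated in the claim: there is one occurrence per page, the weight of another page is simply the count of links in either direction between it and the page being recalculated, and the page's own probabilities carry a weight of one. The names `links_between` and `recalc_by_links`, the page names, and the data are hypothetical.

```python
from typing import Dict, Tuple

Probs = Dict[str, float]  # expansion -> probability


def links_between(link_counts: Dict[Tuple[str, str], int],
                  a: str, b: str) -> int:
    """Count links in either direction between pages a and b."""
    return link_counts.get((a, b), 0) + link_counts.get((b, a), 0)


def recalc_by_links(page: str, page_probs: Dict[str, Probs],
                    link_counts: Dict[Tuple[str, str], int]) -> Probs:
    """Recalculate one page's probabilities, weighting the other pages
    by how many links connect them to this page, then normalize."""
    scores = dict(page_probs[page])
    for other, other_probs in page_probs.items():
        if other == page:
            continue
        weight = links_between(link_counts, page, other)
        for expansion, p in other_probs.items():
            scores[expansion] = scores.get(expansion, 0.0) + weight * p
    total = sum(scores.values())
    if total == 0:
        return scores
    return {expansion: s / total for expansion, s in scores.items()}


# Hypothetical web site with three pages that mention "ATM".
page_probs = {
    "index.html": {"asynchronous transfer mode": 0.5,
                   "automated teller machine": 0.5},
    "network.html": {"asynchronous transfer mode": 0.9,
                     "automated teller machine": 0.1},
    "billing.html": {"asynchronous transfer mode": 0.1,
                     "automated teller machine": 0.9},
}
link_counts = {
    ("index.html", "network.html"): 2,
    ("network.html", "index.html"): 1,
    ("index.html", "billing.html"): 1,
}
updated = recalc_by_links("index.html", page_probs, link_counts)
```

With these numbers, `index.html` is connected to `network.html` by three links and to `billing.html` by one, so its probability for "asynchronous transfer mode" moves from 0.5 to about 0.66. Applying the same step to every page in turn, and repeating, would give the iterative recalculation recited in the claim.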
15. A non-transitory machine-readable storage carrying computer-implementable instructions that, when interpreted by a computer, cause the computer to:
obtain for each occurrence of the acronym in the set of linked documents one or more possible acronym expansions and an associated probability that the one or more possible acronym expansions is the correct acronym expansion;
identify a sub-set of documents from the set of linked documents in which the acronym occurs;
recalculate the associated probabilities for a first occurrence of an acronym in the sub-set of documents based, in part, on the associated probabilities of other occurrences of the acronym in the sub-set of documents and the distance between the first occurrence and the other occurrences;
normalize the recalculated associated probabilities for the first occurrence such that the sum of the recalculated associated probabilities equals one; and
iteratively recalculate the associated probabilities for the first occurrence of the acronym using any previously recalculated associated probabilities of the other occurrences of the acronym.
Dependent claims: 19, 20
Specification