Key term extraction
Abstract
An improved solution for extracting and identifying key terms in content is provided. In the solution, one or more documents can be obtained and one or more candidate terms can be obtained from the content of each document. Subsequently, one or more exclusion conditions can be used to filter the candidate terms thereby generating a set of key terms. In particular, all common terms and/or all near duplicate candidate terms can be excluded from the set of key terms. The set of key terms can be used in translating the content, generating a glossary/index for the content, detecting and/or correcting incorrect terms and usages of terms in the content, building a terminology repository, and/or the like. In one embodiment, the key term extraction is performed to facilitate the translation of product documentation into a second language in order to release the product in one or more nations. In this case, candidate terms that have already been translated into the second language can also be excluded from the key terms.
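The filtering step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the source does not define what counts as a "common term" or a "near duplicate", so the `COMMON_TERMS` list, the `normalize` helper, and its naive plural stripping are all hypothetical. The sketch also keeps the first occurrence of each near-duplicate group, which is one possible reading of "excluding all near duplicate candidate terms".

```python
import re

# Hypothetical common-term list; the source does not say what it contains.
COMMON_TERMS = {"the", "file", "click", "page"}


def normalize(term):
    """Reduce a term to a comparison key (assumed near-duplicate test).

    Case is folded, runs of hyphens/underscores/whitespace collapse to a
    single space, and a trailing plural "s" is stripped naively.
    """
    key = term.casefold()
    key = re.sub(r"[-_\s]+", " ", key).strip()
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def extract_key_terms(candidates, common_terms=COMMON_TERMS):
    """Filter candidate terms with the two general exclusion conditions."""
    common_keys = {normalize(t) for t in common_terms}
    seen = set()
    key_terms = []
    for term in candidates:
        key = normalize(term)
        if key in common_keys:
            continue  # exclusion condition: term appears in the common terms
        if key in seen:
            continue  # exclusion condition: near duplicate of a kept term
        seen.add(key)
        key_terms.append(term)
    return key_terms


candidates = ["Data Source", "data sources", "the", "data-source",
              "Query Optimizer", "Files"]
print(extract_key_terms(candidates))
```

Comparing normalized keys rather than raw strings lets one test serve both conditions; a real system would likely use a proper stemmer or an edit-distance threshold instead of the plural rule assumed here.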
38 Citations
20 Claims
1. A method of managing content, the method comprising:
obtaining a set of candidate terms from the content; and
filtering the set of candidate terms based on a set of general exclusion conditions, wherein the set of general exclusion conditions includes at least one of:
an exclusion condition for excluding all candidate terms that appear in a set of common terms or an exclusion condition for excluding all near duplicate candidate terms.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for managing content, the system comprising:
a system for obtaining a set of candidate terms from the content; and
a system for filtering the set of candidate terms based on a set of general exclusion conditions, wherein the set of general exclusion conditions includes at least one of:
an exclusion condition for excluding all candidate terms that appear in a set of common terms or an exclusion condition for excluding all near duplicate candidate terms.
Dependent claims: 9, 10, 11, 12
13. A method of managing the translation of content, the method comprising:
obtaining a set of candidate terms from the content for translation from a source language to a target language;
obtaining a set of general exclusion conditions, wherein the set of general exclusion conditions includes at least one of:
an exclusion condition for excluding all candidate terms that appear in a set of common terms or an exclusion condition for excluding all near duplicate candidate terms;
filtering the set of candidate terms based on the set of general exclusion conditions;
translating the filtered set of candidate terms; and
providing the translated set of candidate terms and the content for use by a translator.
Dependent claims: 14, 15, 16, 19, 20
17. A system for managing the translation of content, the system comprising:
a system for obtaining a set of candidate terms from the content for translation from a source language to a target language;
a system for filtering the set of candidate terms based on a set of general exclusion conditions, wherein the set of general exclusion conditions includes at least one of:
an exclusion condition for excluding all candidate terms that appear in a set of common terms or an exclusion condition for excluding all near duplicate candidate terms;
a system for translating the filtered set of candidate terms; and
a system for providing the translated set of candidate terms and the content for use by a translator.
Dependent claims: 18
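The translation workflow recited in claims 13 and 17 (translate the filtered terms, then hand them to a translator together with the content) might be sketched as below. Everything named here is an assumption for illustration: `TranslationPackage`, `prepare_translation_package`, the `existing_translations` mapping, and the caller-supplied `translate_term` callable do not appear in the source. Skipping terms that already have a translation follows the abstract's remark that such terms can also be excluded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class TranslationPackage:
    """Hypothetical bundle handed to a human translator."""
    content: str
    glossary: Dict[str, str]  # source-language term -> target-language term


def prepare_translation_package(
    content: str,
    key_terms: Iterable[str],
    existing_translations: Dict[str, str],
    translate_term: Callable[[str], str],
) -> TranslationPackage:
    """Translate filtered key terms and bundle them with the content.

    Terms already present in `existing_translations` are skipped, on the
    assumption that they need no new translation work.
    """
    glossary = {}
    for term in key_terms:
        if term in existing_translations:
            continue  # already translated into the target language
        glossary[term] = translate_term(term)
    return TranslationPackage(content=content, glossary=glossary)


# Usage with a stand-in translator; a real one would call a terminology
# team or a translation service, neither of which the source specifies.
package = prepare_translation_package(
    content="Configure the Data Source before running the Query Optimizer.",
    key_terms=["Data Source", "Query Optimizer"],
    existing_translations={"Data Source": "Datenquelle"},
    translate_term=lambda term: "<" + term + ">",
)
print(package.glossary)
```

Passing the translator in as a callable keeps the sketch independent of any particular translation backend.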
Specification