Selecting terms in a document
Abstract
Determining a mapping between a textual representation in a document and a concept is disclosed. A document is received. A set of candidate textual representations in the document is identified. For at least one candidate textual representation included in the set, an associated concept included in a taxonomy of concepts is determined. The candidate textual representation and the associated concept are provided as output.
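The first two steps of the abstract (identifying candidate textual representations and looking up associated concepts) can be illustrated with a minimal sketch. The patent text here does not specify how candidates are found; the n-gram dictionary lookup, the `TAXONOMY` contents, and the function name `find_candidates` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy taxonomy: surface form -> concepts it may denote.
TAXONOMY = {
    "jaguar": ["Jaguar (animal)", "Jaguar (car)"],
    "big cat": ["Felidae"],
    "rainforest": ["Rainforest"],
}

def find_candidates(text, taxonomy, max_len=3):
    """Return (surface form, concepts) pairs for each word n-gram found in the taxonomy."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    found = []
    for n in range(max_len, 0, -1):  # try longer n-grams first
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in taxonomy:
                found.append((phrase, taxonomy[phrase]))
    return found

mapping = find_candidates("The jaguar is a big cat of the rainforest.", TAXONOMY)
```

In this sketch an ambiguous surface form such as "jaguar" maps to more than one concept, which is the situation the similarity scoring recited in the claims is meant to resolve.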
20 Claims
1. A system for determining a mapping between a textual representation in a document and a concept, comprising:
a communications interface configured to receive a document;
a processor configured to:
identify a set of candidate textual representations in the document;
determine, for the set of candidate textual representations, a set of associated concepts included in a taxonomy of concepts;
sum a plurality of category vectors to generate a document vector, each category vector associated with an associated concept of the set of associated concepts and indicating correspondence of related concepts to the associated concept;
compute a set of document similarity scores for the set of associated concepts according to a correspondence of the category vectors corresponding thereto and the document vector;
select at least one representative concept of the associated concepts according to the set of document similarity scores; and
provide as output the representative concept and a candidate textual representation of the set of candidate textual representations corresponding thereto; and
a memory coupled to the processor and configured to provide the processor with instructions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A method for determining a mapping between a textual representation in a document and a concept, comprising:
receiving a document;
identifying a set of candidate textual representations in the document;
determining, for the set of candidate textual representations, a set of associated concepts included in a taxonomy of concepts;
summing a plurality of category vectors to generate a document vector, each category vector associated with an associated concept of the set of associated concepts and indicating correspondence of related concepts to the associated concept;
computing a set of document similarity scores for the set of associated concepts according to a correspondence of the category vectors corresponding thereto and the document vector;
selecting at least one representative concept of the associated concepts according to the set of document similarity scores; and
providing as output the at least one representative concept and a candidate textual representation of the set of candidate textual representations corresponding thereto.
Dependent claims: 18, 19
20. A computer program product for determining a mapping between a textual representation in a document and a concept, the computer program product being embodied in a computer readable storage medium and comprising computer instructions for:
receiving a document;
identifying a set of candidate textual representations in the document;
determining, for the set of candidate textual representations, a set of associated concepts included in a taxonomy of concepts;
summing a plurality of category vectors to generate a document vector, each category vector associated with an associated concept of the set of associated concepts and indicating correspondence of related concepts to the associated concept;
computing a set of document similarity scores for the set of associated concepts according to a correspondence of the category vectors corresponding thereto and the document vector;
selecting at least one representative concept of the associated concepts according to the set of document similarity scores; and
providing as output the at least one representative concept and a candidate textual representation of the set of candidate textual representations corresponding thereto.
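The vector steps recited in the independent claims (summing category vectors into a document vector, scoring each associated concept against it, and selecting a representative concept) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claims say only that scores follow "a correspondence" between each category vector and the document vector, so cosine similarity is an assumed choice, and the `CATEGORY_VECTORS` data and all function names are hypothetical.

```python
import math

# Hypothetical category vectors: concept -> {related concept: weight}.
CATEGORY_VECTORS = {
    "Jaguar (animal)": {"Felidae": 1.0, "Rainforest": 0.5},
    "Jaguar (car)": {"Automobile": 1.0, "Luxury goods": 0.5},
    "Felidae": {"Felidae": 1.0, "Mammal": 0.5},
    "Rainforest": {"Rainforest": 1.0, "Biome": 0.5},
}

def sum_vectors(vectors):
    """Add sparse vectors component-wise to form the document vector."""
    total = {}
    for vec in vectors:
        for key, weight in vec.items():
            total[key] = total.get(key, 0.0) + weight
    return total

def cosine(a, b):
    """Cosine similarity of two sparse vectors (assumed similarity measure)."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_concepts(concepts, category_vectors):
    """Score each associated concept against the document vector, best first."""
    doc_vec = sum_vectors(category_vectors[c] for c in concepts)
    scores = {c: cosine(category_vectors[c], doc_vec) for c in concepts}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranked = rank_concepts(list(CATEGORY_VECTORS), CATEGORY_VECTORS)
representative = ranked[0][0]  # concept with the highest similarity score
```

With this toy data, the animal sense of "jaguar" scores higher than the car sense because its category vector shares components with the vectors of the other concepts found in the document.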
Specification