Information processing to search for related expressions
First Claim
1. An information processing apparatus, comprising:
- a processor configured to:
obtain an input expression;
extract, based on the input expression, a plurality of expressions and a plurality of expression feature amounts from a plurality of documents, wherein the plurality of expression feature amounts is associated with the plurality of expressions, wherein a first expression of the plurality of expressions is distinguished from a second expression of the plurality of expressions based on respective expression feature amounts of the first expression and the second expression, and wherein the first expression and the second expression have a same notation;
cluster the plurality of expressions based on the plurality of expression feature amounts;
calculate a plurality of assignment degree vectors for the plurality of expressions based on an average value of first assignment degree vectors of the input expression, wherein each of the plurality of assignment degree vectors comprises a plurality of assignment degrees of a respective expression of the plurality of expressions, and wherein the assignment degree vectors include features common to the plurality of expressions;
determine related expression candidates, from the plurality of expressions, based on the input expression;
calculate a first score for each of the related expression candidates based on comparison between the related expression candidates and the input expression;
extract, from the plurality of expressions, a set of related expressions of a plurality of related expressions based on the calculated first score, wherein the set of related expressions has second assignment degree vectors;
integrate the set of related expressions into at least one expression, wherein each of the set of related expressions has the same notation;
determine a combined score for the at least one expression based on an addition of scores of the set of related expressions;
determine combined assignment degree vectors for the at least one expression based on a weighted addition of the second assignment degree vectors of the set of related expressions;
extract, as a synonym of the input expression, an expression from the set of related expressions based on each of the combined score for the at least one expression and the combined assignment degree vectors for the at least one expression;
recommend at least one item based on the extracted expression of the set of related expressions; and
transmit information to a user device, wherein the information comprises the at least one item, the at least one expression, the related expression, the combined assignment degree vectors, and the combined score to the user device, wherein the transmitted information is displayed on a display screen of the user device.
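The integration steps recited above (merging related expressions that share a notation, adding their scores, and forming a weighted addition of their assignment degree vectors) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `integrate_same_notation`, the tuple layout, and the choice of score-proportional weights are all assumptions, since the claim does not say how the weights are chosen.

```python
from collections import defaultdict


def integrate_same_notation(related):
    """Merge related expressions that share a notation.

    `related` is a list of (notation, score, assignment_vector) tuples, one
    per distinguished expression. Returns a dict mapping each notation to
    (combined_score, combined_assignment_vector).
    """
    groups = defaultdict(list)
    for notation, score, vector in related:
        groups[notation].append((score, vector))

    combined = {}
    for notation, members in groups.items():
        # Combined score: plain addition of the member scores.
        total = sum(score for score, _ in members)
        dim = len(members[0][1])
        merged = [0.0] * dim
        for score, vector in members:
            # ASSUMPTION: weight each member by its share of the total score;
            # fall back to equal weights if the total is zero.
            weight = score / total if total else 1.0 / len(members)
            for k in range(dim):
                merged[k] += weight * vector[k]
        combined[notation] = (total, merged)
    return combined


# Hypothetical data: two senses of "apple" and one sense of "pear".
related = [
    ("apple", 0.75, [0.9, 0.1]),
    ("apple", 0.25, [0.1, 0.9]),
    ("pear", 0.50, [0.8, 0.2]),
]
for notation, (score, vector) in integrate_same_notation(related).items():
    print(notation, score, vector)
```

With score-proportional weights, the merged vector stays closest to the sense that scored highest, which is one plausible reason to use a weighted rather than a plain addition.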
Abstract
Disclosed is an information processing apparatus including an expression extraction unit, a feature extraction unit, a clustering unit, a related expression extraction unit, and an output unit. The expression extraction unit extracts a plurality of expressions from a plurality of documents. The feature extraction unit extracts feature amounts of the extracted respective expressions while distinguishing the expressions having the same notation. The clustering unit clusters the extracted respective expressions together while distinguishing the expressions having the same notation and calculates assignment degree vectors having assignment degrees of the respective expressions to two or more respective clusters as components. The related expression extraction unit extracts related expressions having the assignment degree vectors similar to those of a provided input expression while distinguishing the expressions having the same notation. The output unit outputs the related expressions and identification information for identifying the related expressions.
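To make the notion of an assignment degree vector concrete, the sketch below computes, for each occurrence of an expression, a degree of assignment to every cluster. It is only an illustration under stated assumptions: the abstract does not name a clustering method, so a softmax over negative Euclidean distances to fixed cluster centroids is used here as a stand-in, and the names `assignment_degrees`, `centroids`, and the sample feature values are hypothetical.

```python
import math


def assignment_degrees(feature, centroids):
    """Return a soft assignment of one feature vector to each cluster.

    The result has one component per cluster and sums to 1.
    ASSUMPTION: degrees come from a softmax over negative distances.
    """
    dists = [math.dist(feature, c) for c in centroids]
    d_min = min(dists)  # shift by the minimum for numerical stability
    weights = [math.exp(-(d - d_min)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical clusters, e.g. "finance" and "geography".
centroids = [[1.0, 0.0], [0.0, 1.0]]

# Two occurrences with the same notation are kept apart by an occurrence id,
# so each gets its own feature amounts and its own assignment degree vector.
features = {
    ("bank", 0): [0.9, 0.1],  # context suggests the financial sense
    ("bank", 1): [0.2, 0.8],  # context suggests the river sense
}
vectors = {key: assignment_degrees(f, centroids) for key, f in features.items()}
for key, vec in vectors.items():
    print(key, [round(v, 3) for v in vec])
```

Keying by (notation, occurrence id) rather than by notation alone is what lets expressions with the same notation end up with different assignment degree vectors.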
13 Claims
1. An information processing apparatus, comprising a processor configured as recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. An information processing method, comprising:
extracting, based on an input expression, a plurality of expressions and a plurality of expression feature amounts from a plurality of documents, wherein the plurality of expression feature amounts is associated with the plurality of expressions, wherein a first expression of the plurality of expressions is distinguished from a second expression of the plurality of expressions based on respective expression feature amounts of the first expression and the second expression, and wherein the first expression and the second expression have a same notation;
clustering the plurality of expressions based on the plurality of expression feature amounts;
calculating a plurality of assignment degree vectors for the plurality of expressions based on an average value of first assignment degree vectors of the input expression, wherein each of the plurality of assignment degree vectors comprises a plurality of assignment degrees of a respective expression of the plurality of expressions, and wherein the assignment degree vectors include features common to the plurality of expressions;
determining related expression candidates, from the plurality of expressions, based on the input expression;
calculating a score for each of the related expression candidates based on comparison between the related expression candidates and the input expression;
extracting, from the plurality of expressions, a set of related expressions of a plurality of related expressions based on the calculated score, wherein the set of related expressions has second assignment degree vectors;
integrating the set of related expressions into at least one expression, wherein each of the set of related expressions has the same notation;
determining a combined score for the at least one expression based on an addition of scores of the set of related expressions;
determining combined assignment degree vectors for the at least one expression based on a weighted addition of the second assignment degree vectors of the set of related expressions;
extracting, as a synonym of the input expression, an expression from the set of related expressions based on each of the combined score for the at least one expression and the combined assignment degree vectors for the at least one expression;
recommending at least one item based on the extracted expression of the set of related expressions; and
transmitting information to a user device, wherein the information comprises the at least one item, the at least one expression, the related expression, the combined assignment degree vectors, and the combined score, wherein the transmitted information is displayed on a display screen of the user device.
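The candidate-scoring steps of the method can be sketched as below. This is one possible reading, not the claimed procedure itself: it assumes the input expression has several assignment degree vectors (one per distinguished occurrence) that are averaged into a single query vector, and it assumes cosine similarity as the "comparison" that yields the score, which the claim does not specify. The names `mean_vector`, `cosine`, `related_expressions`, and `top_n` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise average of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def related_expressions(input_vectors, candidates, top_n=2):
    """Score each candidate against the averaged input vector.

    `candidates` maps (notation, occurrence_id) to an assignment degree
    vector. Returns the `top_n` highest-scoring (key, score) pairs.
    """
    query = mean_vector(input_vectors)
    scored = sorted(
        ((cosine(query, vec), key) for key, vec in candidates.items()),
        reverse=True,
    )
    return [(key, score) for score, key in scored[:top_n]]


# Hypothetical input expression with two occurrences, and three candidates.
input_vectors = [[1.0, 0.0], [0.8, 0.2]]
candidates = {
    ("x", 0): [1.0, 0.0],
    ("y", 0): [0.0, 1.0],
    ("z", 0): [0.5, 0.5],
}
for key, score in related_expressions(input_vectors, candidates):
    print(key, round(score, 3))
```

The pairs returned here could then be grouped by notation and merged, as the integrating and combined-score steps of the claim describe.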
13. A non-transitory computer-readable medium having stored thereon computer executable instructions that, when executed by a processor, cause a computer to execute operations, the operations comprising:
extracting, based on an input expression, a plurality of expressions and a plurality of expression feature amounts from a plurality of documents, wherein the plurality of expression feature amounts are associated with the plurality of expressions, wherein a first expression of the plurality of expressions is distinguished from a second expression of the plurality of expressions based on respective expression feature amounts of the first expression and the second expression, and wherein the first expression and the second expression have a same notation;
clustering the plurality of expressions based on the plurality of expression feature amounts;
calculating a plurality of assignment degree vectors for the plurality of expressions based on an average value of first assignment degree vectors of the input expression, wherein each of the plurality of assignment degree vectors comprises a plurality of assignment degrees of a respective expression of the plurality of expressions, and wherein the assignment degree vectors include features common to the plurality of expressions;
determining related expression candidates, from the plurality of expressions, based on the input expression;
calculating a score for each of the related expression candidates based on comparison between the related expression candidates and the input expression;
extracting, from the plurality of expressions, a set of related expressions of a plurality of expressions based on the calculated score, wherein the set of related expressions has second assignment degree vectors;
integrating the set of related expressions into at least one expression, wherein each of the set of related expressions has the same notation;
determining a combined score for the at least one expression based on an addition of scores of the set of related expressions;
determining combined assignment degree vectors for the at least one expression based on a weighted addition of the second assignment degree vectors of the set of related expressions;
extracting, as a synonym of the input expression, an expression from the set of related expressions based on each of the combined score for the at least one expression and the combined assignment degree vectors for the at least one expression;
recommending at least one item based on the extracted expression of the set of related expressions; and
transmitting information to a user device, wherein the information comprises the at least one item, the at least one expression, the related expression, the combined assignment degree vectors, and the combined score, wherein the transmitted information is displayed on a display screen of the user device.
Specification