Establishing “is a” relationships for a taxonomy
First Claim
1. A method establishing is-a relationships, the method comprising:
receiving, by a server, a query from a user, the query including a string and one or more context words;
generating a first co-occurrence vector indicating co-occurrence statistics for terms in a reference corpus with respect to the string;
generating a second co-occurrence vector indicating terms and co-occurrence statistics for terms in current social media postings;
determining a difference between the first and second co-occurrence vectors;
determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition;
in response to determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition, requesting from the user additional context words;
retrieving homonym concepts corresponding to the string and concept vectors for each homonym concept;
comparing the concept vectors to the one or more context words;
selecting at least one homonym concept according to the comparison;
retrieving from a category-concept mapping database at least one category for the selected at least one homonym concept, the category-concept mapping database establishing is-a relationships between a plurality of categories and a plurality of concepts; and
transmitting, by the server, the retrieved at least one category for display to the user.
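The comparison of the two co-occurrence vectors recited above can be illustrated with a minimal sketch. The claim does not specify how the vectors are built, how the difference is measured, or what the threshold condition is, so everything below is an assumption made for illustration: the window-based counting, the use of cosine distance as the "difference", the threshold value of 0.5, and the function names `cooccurrence_vector`, `cosine_distance`, and `needs_more_context`.

```python
from collections import Counter
import math


def cooccurrence_vector(documents, target, window=2):
    """Count terms that appear within `window` tokens of `target`.

    Assumption: whitespace tokenization and a fixed window; the claim
    only says the vector holds "co-occurrence statistics".
    """
    counts = Counter()
    target = target.lower()
    for doc in documents:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally different
    return 1.0 - dot / (norm_a * norm_b)


def needs_more_context(reference_docs, social_docs, string, threshold=0.5):
    """True when current usage has drifted from the reference corpus.

    Assumption: "exceeds a threshold condition" is modeled as the
    cosine distance being greater than a fixed threshold.
    """
    ref = cooccurrence_vector(reference_docs, string)
    soc = cooccurrence_vector(social_docs, string)
    return cosine_distance(ref, soc) > threshold


reference = ["the jaguar is a big cat", "a jaguar hunts prey"]
social = ["new jaguar car launch", "jaguar car price"]
ask_user = needs_more_context(reference, social, "jaguar")
```

In this toy data the terms around "jaguar" in the reference corpus and in the social media postings do not overlap at all, so the sketch would ask the user for additional context words.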
Abstract
Disclosed are methods for returning to a user an answer to the question “what is <string>.” Concepts and classes to which the concepts belong are determined from a corpus, such as a taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likelihood of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned.
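The final steps of the abstract (selecting a homonym concept by its vector and returning its categories) can be sketched as follows. This is a hypothetical illustration only: the in-memory dictionaries `HOMONYMS`, `CONCEPT_VECTORS`, and `CATEGORY_OF`, their contents and weights, and the simple sum-of-weights scoring are all assumptions, standing in for the homonym store, concept vectors, and category-concept mapping database that the source names but does not detail.

```python
# Hypothetical stand-ins for the stores named in the abstract.
HOMONYMS = {
    "jaguar": ["jaguar (animal)", "jaguar (car)"],
}
CONCEPT_VECTORS = {
    "jaguar (animal)": {"cat": 0.9, "prey": 0.7, "jungle": 0.6},
    "jaguar (car)": {"car": 0.9, "engine": 0.7, "luxury": 0.5},
}
# Category-concept mapping: each entry encodes "<concept> is a <category>".
CATEGORY_OF = {
    "jaguar (animal)": ["mammal"],
    "jaguar (car)": ["automobile brand"],
}


def score(concept_vector, context_words):
    """Assumed comparison: sum the concept's weights for each context word."""
    return sum(concept_vector.get(word.lower(), 0.0) for word in context_words)


def categories_for(string, context_words):
    """Pick the best-matching homonym concept and return its categories."""
    candidates = HOMONYMS.get(string.lower(), [])
    if not candidates:
        return []
    best = max(
        candidates,
        key=lambda concept: score(CONCEPT_VECTORS[concept], context_words),
    )
    return CATEGORY_OF.get(best, [])


answer = categories_for("jaguar", ["engine", "luxury"])
```

With the context words "engine" and "luxury", only the car concept has matching weights, so the sketch answers "what is jaguar" with the category mapped to that concept. A real system would also need a rule for ties and for context words that match no concept, which the source does not describe.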
21 Claims
1. A method establishing is-a relationships, the method comprising:
receiving, by a server, a query from a user, the query including a string and one or more context words;
generating a first co-occurrence vector indicating co-occurrence statistics for terms in a reference corpus with respect to the string;
generating a second co-occurrence vector indicating terms and co-occurrence statistics for terms in current social media postings;
determining a difference between the first and second co-occurrence vectors;
determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition;
in response to determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition, requesting from the user additional context words;
retrieving homonym concepts corresponding to the string and concept vectors for each homonym concept;
comparing the concept vectors to the one or more context words;
selecting at least one homonym concept according to the comparison;
retrieving from a category-concept mapping database at least one category for the selected at least one homonym concept, the category-concept mapping database establishing is-a relationships between a plurality of categories and a plurality of concepts; and
transmitting, by the server, the retrieved at least one category for display to the user.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for establishing is-a relationships, the system comprising one or more processors and one or more memory devices operably coupled to the one or more processors and storing executable and operational data effective to cause the one or more processors to:
receive a query from a user, the query including a string and one or more context words;
generate a first co-occurrence vector indicating co-occurrence statistics for terms in a reference corpus with respect to the string;
generate a second co-occurrence vector indicating terms and co-occurrence statistics for terms in current social media postings;
determine a difference between the first and second co-occurrence vectors;
determine that the difference between the first and second co-occurrence vectors exceeds a threshold condition;
in response to determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition, request from the user additional context words;
retrieve homonym concepts corresponding to the string and concept vectors for each homonym concept;
compare the concept vectors to the one or more context words;
select at least one homonym concept according to the comparison;
retrieve from a category-concept mapping database at least one category for the selected at least one homonym concept, the category-concept mapping database establishing is-a relationships between a plurality of categories and a plurality of concepts; and
transmit the retrieved at least one category for display to the user.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for establishing is-a relationships, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for:
receiving a query from a user, the query including a string and one or more context words;
generating a first co-occurrence vector indicating co-occurrence statistics for terms in a reference corpus with respect to the string;
generating a second co-occurrence vector indicating terms and co-occurrence statistics for terms in current social media postings;
determining a difference between the first and second co-occurrence vectors;
determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition;
in response to determining that the difference between the first and second co-occurrence vectors exceeds a threshold condition, requesting from the user additional context words;
retrieving homonym concepts corresponding to the string and concept vectors for each homonym concept;
comparing the concept vectors to the one or more context words;
selecting at least one homonym concept according to the comparison;
retrieving from a category-concept mapping database at least one category for the selected at least one homonym concept, the category-concept mapping database establishing is-a relationships between a plurality of categories and a plurality of concepts; and
transmitting the retrieved at least one category for display to the user.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification