Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module
First Claim
1. An intelligent query method, comprising the steps of:
providing a plurality of multimedia documents each containing a plurality of content items using an electronic document feed;
categorizing each of the documents into at least one of a plurality of pre-defined content based taxonomies, each taxonomy having an associated substantive content and corresponding taxonomy elements; and
using a computer-implemented categorization engine;
filtering the plurality of content items in each document into at least two groups;
discarding at least one of the groups of content items for each document;
for each document, correlating each of the non-discarded content items in the document with the taxonomy elements corresponding to the at least one taxonomy in which the document is categorized;
storing the correlated taxonomy elements and non-discarded content items for each document in an electronic database; and
calculating a correlation value between the non-discarded content items for each document and the correlated taxonomy elements;
wherein calculating the correlation value comprises applying a phrase code frequency inverse phrase code document frequency (PCF-IPCDF) scoring model to the correlated taxonomy elements; and
wherein the PCF-IPCDF scoring model comprises a phrase-code frequency (PCF), the PCF being the number of times a phrase p appears in multimedia documents containing a code c, multiplied by an inverse phrase code document frequency (IPCDF), the IPCDF being a logarithm of:
the number of documents coded with c divided by the number of documents for which the phrase p and code c appear together.
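The scoring model recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each document is already reduced to a list of phrases plus a set of taxonomy codes, it assumes a natural logarithm (the claim does not state a base), and it returns 0.0 when a phrase never co-occurs with a code. The function name `pcf_ipcdf` and the sample documents are hypothetical.

```python
import math

def pcf_ipcdf(documents, phrase, code):
    """Score a (phrase, code) pair following the claim's wording.

    documents: list of (phrases, codes) pairs, where phrases is a list of
    phrase strings (repeats allowed) and codes is a set of taxonomy codes.
    This data layout is an assumption made for illustration.
    """
    # Keep only the phrase lists of documents coded with `code`.
    coded = [phrases for phrases, codes in documents if code in codes]

    # PCF: number of times the phrase appears in documents containing the code.
    pcf = sum(phrases.count(phrase) for phrases in coded)

    # Number of documents in which the phrase and the code appear together.
    together = sum(1 for phrases in coded if phrase in phrases)
    if together == 0:
        return 0.0  # assumption: no co-occurrence means no correlation

    # IPCDF: log of (documents coded with c) / (documents with both p and c).
    ipcdf = math.log(len(coded) / together)
    return pcf * ipcdf

# Hypothetical sample corpus.
docs = [
    (["home run", "strike outs", "home run"], {"BASE"}),
    (["home run"], {"BASE"}),
    (["dividend"], {"BASE"}),
    (["home run", "dividend"], {"EQUITIES"}),
]

base_score = pcf_ipcdf(docs, "home run", "BASE")
equities_score = pcf_ipcdf(docs, "home run", "EQUITIES")
```

Under the formula as worded, a phrase that appears in every document carrying a given code yields an IPCDF of log(1) = 0, which is why the single "EQUITIES" document in this sample contributes a score of zero.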
Abstract
An intelligent query system and method used in a search and retrieval system provides an end-user the most relevant, meaningful, up-to-date, and precise search results. The system and method allows an end-user to benefit from an experienced recommendation that is tailored to a specific industry. The system and method recognizes that the phrases “strike outs” and “home run” are much more strongly correlated with “BASE” as opposed to “EQUITIES.” When a search is conducted or a lookup is done in a map, the system and method recommends the strongest correlation as “BASE.”
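The lookup described in the abstract, recommending the code with the strongest correlation for a phrase, might look like the following sketch. It assumes correlation values were computed beforehand and stored in a nested dictionary. The score values, the `SCORE_MAP` name, and the `recommend_code` helper are hypothetical and chosen only to mirror the "BASE" versus "EQUITIES" example.

```python
# Hypothetical precomputed correlation values: phrase -> {code: score}.
SCORE_MAP = {
    "home run": {"BASE": 4.2, "EQUITIES": 0.3},
    "strike outs": {"BASE": 3.7, "EQUITIES": 0.1},
}

def recommend_code(score_map, phrase):
    """Return the code most strongly correlated with the phrase, or None."""
    scores = score_map.get(phrase)
    if not scores:
        return None  # phrase not present in the map
    # Pick the code whose stored correlation value is largest.
    return max(scores, key=scores.get)

recommended = recommend_code(SCORE_MAP, "home run")
```

With these illustrative numbers, both "home run" and "strike outs" resolve to "BASE", and a phrase missing from the map yields no recommendation.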
2 Claims
Specification