Intelligent search and retrieval system and method
First Claim
1. An intelligent search and retrieval method, comprising the steps of:
providing a query profiler having a taxonomy database, the taxonomy database including a plurality of taxonomy codes which have explicitly defined contextual relationship and are semantically related;
receiving a query from a user;
accessing the taxonomy database of the query profiler to identify the taxonomy codes that are relevant to the query;
wherein taxonomy codes are identified using a phrase-code frequency-inverse phrase-code document frequency (pcf-ipcdf) score;
wherein phrase-code frequency, pcf(p,c), is defined as a number of times a phrase p appears in one or more categorized documents containing a code c;
wherein inverse phrase-code document frequency, ipcdf(p,c), is defined as the logarithm of a number of the documents coded with code c, D(c), divided by a number of the documents for which the phrase p and code c appear together, df(p,c);
wherein the pcf-ipcdf score, s(p,c), is defined as pcf(p,c) multiplied by ipcdf(p,c);
augmenting the query using the taxonomy codes;
generating feedback information to the user for query refinement, the feedback information including a plurality of query terms associated with the query and to be selected by the user;
presenting the feedback information to the user;
receiving one of the query terms from the user; and
identifying a source of the query term and presenting to the user;
wherein the taxonomy database is generated by:
parsing the natural language from the one or more categorized documents;
parsing one or more associated taxonomy codes into a data structure;
filtering unnecessary or undesirable code elements from the data structure;
extracting phrases from the text of the one or more categorized documents;
sorting and collating the extracted phrases into a counted phrase list; and
mapping the counted phrase list and the one or more associated taxonomy codes into a data table.
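The table-building steps and the pcf-ipcdf score recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the `(text, codes)` input format, the word n-gram phrase extractor, and the `excluded_codes` filter are all assumptions made for illustration; the claim does not specify how phrases are extracted or which code elements are filtered. Only the formula s(p,c) = pcf(p,c) * log(D(c) / df(p,c)) is taken from the claim text.

```python
import math
import re
from collections import Counter


def extract_phrases(text, max_len=2):
    """Assumed phrase extractor: lowercase word n-grams of 1..max_len words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    ]


def build_phrase_code_table(documents, excluded_codes=frozenset()):
    """Map (phrase, code) pairs to pcf-ipcdf scores.

    documents: iterable of (text, codes) pairs, i.e. categorized documents.
    excluded_codes: hypothetical stand-in for the "filtering" step.
    """
    pcf = Counter()        # pcf(p, c): occurrences of p in documents coded c
    df = Counter()         # df(p, c): documents where p and c appear together
    doc_count = Counter()  # D(c): documents coded with c

    for text, codes in documents:
        kept_codes = set(codes) - set(excluded_codes)
        # Sort-and-collate step: a counted phrase list for this document.
        counted = Counter(extract_phrases(text))
        for code in kept_codes:
            doc_count[code] += 1
            for phrase, n in counted.items():
                pcf[phrase, code] += n
                df[phrase, code] += 1

    # s(p, c) = pcf(p, c) * log(D(c) / df(p, c)), as defined in the claim.
    return {
        (phrase, code): n * math.log(doc_count[code] / df[phrase, code])
        for (phrase, code), n in pcf.items()
    }
```

Under the formula as written, a phrase that occurs in every document carrying a given code has D(c) = df(p,c), so its score for that code is zero regardless of how often it appears.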
Abstract
An intelligent search and retrieval system and method are provided to give an end user effortless access to the most relevant, meaningful, up-to-date, and precise search results as quickly and efficiently as possible. The method may include providing a query profiler having a taxonomy database; receiving a query from a user; accessing the taxonomy database of the query profiler to identify a plurality of codes that are relevant to the query; augmenting the query using the codes to generate feedback information to the user for query refinement, the feedback information including a plurality of query terms associated with the query and to be selected by the user; presenting the feedback information to the user; receiving one of the query terms from the user; and identifying a source of the query term and presenting it to the user. The system may include a query profiler having a taxonomy database to be accessed upon receiving a query from a user, which identifies a plurality of codes that are relevant to the query; means for augmenting the query using the codes to generate feedback information to the user for query refinement, the feedback information including a plurality of query terms associated with the query and to be selected by the user; and means for identifying a source of the query term, upon receiving one of the query terms from the user.
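The query-time flow summarized in the abstract (identify relevant codes, augment the query, offer refinement terms with their sources) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the system described in the specification. The function names, the `scores` dictionary keyed by `(phrase, code)`, the `code_terms` lookup, and the choice to rank codes by summing the scores of matched phrases are all hypothetical. Treating the "source" of a refinement term as the taxonomy code that suggested it is likewise an assumption.

```python
from collections import defaultdict


def relevant_codes(query, scores, top_n=3):
    """Rank codes by the summed score of table phrases found in the query.

    scores: dict mapping (phrase, code) to a relevance score (assumed input).
    """
    # Pad with spaces so phrases only match on whole-word boundaries.
    text = " " + " ".join(query.lower().split()) + " "
    totals = defaultdict(float)
    for (phrase, code), score in scores.items():
        if f" {phrase} " in text:
            totals[code] += score
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:top_n]]


def augment_query(query, codes):
    """Attach the identified taxonomy codes to the original query text."""
    return {"text": query, "codes": list(codes)}


def feedback_terms(codes, code_terms):
    """Return (term, source_code) pairs to present for query refinement.

    code_terms: hypothetical lookup from a code to suggested query terms.
    """
    return [(term, code) for code in codes for term in code_terms.get(code, [])]
```

A caller would pass the user's query through `relevant_codes`, build the augmented query, and present the `(term, source_code)` pairs so that the source of whichever term the user selects is already known.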
62 Citations
3 Claims
Specification