Method for normalizing document metadata to improve search results using an alias relationship directory service
Abstract
The present invention provides methods, systems, and computer program products for normalizing document search terms through use of an alias database, as may be found in an alias relationship file, such as a directory service. A gatherer module receives as input (or crawls through) several documents in series or in parallel and can recognize data segments as related to one of the aliases in the alias relationship file. The gatherer then associates the document appropriately so that a search engine may find all documents associated with a search term, regardless of whether the term has undergone several name changes (various aliases) over the course of time. Accordingly, a user may then search for a person's name, and receive as a search result all documents listing the person's name, as well as documents listing, for example, only the person's email address.
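The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory dictionary standing in for the alias directory service, the sample documents, and the function names `parse_segments`, `gather`, and `search` are all assumptions made for illustration. The sketch follows links from a first document to a second, parses each into segments, matches segments against the alias directory, and indexes each document under the datum its alias maps to.

```python
import re
from collections import defaultdict

# Hypothetical alias directory: alias (lowercased) -> datum it correlates with.
ALIAS_DIRECTORY = {
    "jane doe": "Jane Doe",
    "jane smith": "Jane Doe",          # assumed earlier name
    "jdoe@example.com": "Jane Doe",    # assumed email alias
}

# Hypothetical corpus: document id -> (text, ids of linked documents).
DOCUMENTS = {
    "doc1": ("Report written by Jane Smith.", ["doc2"]),
    "doc2": ("Questions? Contact jdoe@example.com", []),
    "doc3": ("Unrelated note.", []),
}


def parse_segments(text):
    """Split text into candidate segments: emails, single words, adjacent word pairs."""
    tokens = re.findall(r"[\w.+-]+@[\w.-]+\w|\w+", text.lower())
    pairs = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + pairs


def gather(start_id, documents, directory):
    """Crawl from start_id along links and index documents by the datum of any alias found."""
    index = defaultdict(set)
    queue, seen = [start_id], set()
    while queue:
        doc_id = queue.pop(0)
        if doc_id in seen or doc_id not in documents:
            continue
        seen.add(doc_id)
        text, links = documents[doc_id]
        for segment in parse_segments(text):
            datum = directory.get(segment)
            if datum is not None:
                index[datum].add(doc_id)  # associate the document with the datum
        queue.extend(links)               # follow links to further documents
    return index


def search(term, index, directory):
    """Resolve the search term through the directory, then look up associated documents."""
    datum = directory.get(term.lower(), term)
    return sorted(index.get(datum, set()))


index = gather("doc1", DOCUMENTS, ALIAS_DIRECTORY)
print(search("Jane Doe", index, ALIAS_DIRECTORY))
```

In this sketch a search for the name returns both `doc1`, which lists only the assumed earlier name, and `doc2`, which lists only the email address, because both aliases resolve to the same datum in the directory.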
24 Claims
1. In a computerized environment, a method of normalizing document data to improve results of search requests, the method comprising the acts of:
receiving a first document containing first document data and a link to a second document, wherein the second document contains second document data;
parsing the first document data into one or more first document segments;
identifying at least one of the one or more first document segments as a first document alias that correlates with a first document datum found in an alias directory service;
associating the received first document with the first document alias so that, upon request for the first document datum through a search engine, the received first document is returned to a requester by association of the first document datum with the first document alias;
parsing the second document data into one or more second document segments;
identifying at least one of the one or more second document segments as a second document alias that correlates with a first document datum found in the alias directory service; and
associating the received second document with the second document alias so that, upon request for the second document datum through the search engine, the received second document is returned to the requester by association of the second document datum with the second document alias.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. In a computerized environment, a method of normalizing document data to improve results of search requests, the method comprising:
an act of receiving a first document containing first document data and a link to a second document, wherein the second document contains second document data;
an act of parsing the first document data into one or more first document segments;
an act of receiving the second document containing second document data;
an act of parsing the second document data into one or more second document segments; and
a step for normalizing document metadata used as a reference by a search engine by maintaining one or more relationships between a search term and an alternate search term, a search term property and/or alternative search term property.
Dependent claims: 10, 11, 12
13. A computer program product having computer-executable instructions for performing a method of normalizing document data to improve results of search requests, the method comprising the acts of:
receiving a first document containing first document data and a link to a second document, wherein the second document contains second document data;
parsing the first document data into one or more first document segments;
identifying at least one of the one or more first document segments as a first document alias that correlates with a first document datum found in an alias directory service;
associating the received first document with the first document alias so that, upon request for the first document datum through a search engine, the received first document is returned to a requester by association of the first document datum with the first document alias;
parsing the second document data into one or more second document segments;
identifying at least one of the one or more second document segments as a second document alias that correlates with a first document datum found in the alias directory service; and
associating the received second document with the second document alias so that, upon request for the second document datum through the search engine, the received second document is returned to the requester by association of the second document datum with the second document alias.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
21. A computer program product having computer-executable instructions for performing a method of normalizing document data to improve results of search requests, the method comprising:
an act of receiving a first document containing first document data and a link to a second document, wherein the second document contains second document data;
an act of parsing the first document data into one or more first document segments;
an act of receiving the second document containing second document data;
an act of parsing the second document data into one or more second document segments; and
a step for normalizing document metadata used as a reference by a search engine by maintaining one or more relationships between a search term and an alternate search term, a search term property and/or alternative search term property.
Dependent claims: 22, 23, 24
Specification