Content data indexing
First Claim
1. A computer system for searching and retrieving information from at least one content source containing multiple content entities, comprising:
a build process for storing content information associated with the multiple content entities in a searchable content database comprising at least one index, wherein the content information includes text content information for the multiple content entities and further includes association information identifying relationships between different content entities and wherein the build process includes;
creating an alt word table of the at least one content source; and
apart from the alt word table, identifying and creating one or more associations between different content entities of the at least one content source; and
a run-time process operative to receive at least one search term entered by a user and to process the at least one search term against the at least one index in the searchable content database to attempt to identify a first content entity as a first match in a search result, wherein the run-time process operates after the build process is complete, and wherein;
if said alt word table includes at least one alternate word associated with the at least one search term, the run-time process is further operative to identify a second content entity as a second match between the alternate word and the index and to return at least one search result corresponding to the second match; and
if search result corresponding to the first content entity or the second content entity has an associated other content entity, the run-time process is further operative to identify a third content entity as a third match between the at least one search term or alternate word and the third content entity, even though the third content entity does not include the at least one search term or the alternate word.
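The three kinds of match recited in claim 1 (direct, alternate-word, and association-based) can be illustrated with a minimal Python sketch. The function names `build` and `search`, the dictionary layout, and the sample entities are all hypothetical and are not taken from the patent; the claim does not prescribe any particular data structure.

```python
from collections import defaultdict


def build(entities, alt_words, associations):
    """Build step (hypothetical layout): index entity text by word and keep
    the alt word table and the association table as separate structures."""
    index = defaultdict(set)
    for entity_id, text in entities.items():
        for word in text.lower().split():
            index[word].add(entity_id)
    return {"index": index, "alt_words": alt_words, "associations": associations}


def search(db, term):
    """Run-time step: direct matches first, then alternate-word matches,
    then associated entities that need not contain the term at all."""
    term = term.lower()
    results = []

    def add(entity_ids):
        # Append in a stable order, skipping entities already returned.
        for entity_id in sorted(entity_ids):
            if entity_id not in results:
                results.append(entity_id)

    add(db["index"].get(term, ()))                 # first match
    for alt in db["alt_words"].get(term, ()):      # second match via alt word
        add(db["index"].get(alt, ()))
    for entity_id in list(results):                # third match via association
        add(db["associations"].get(entity_id, ()))
    return results


# Illustrative data only.
entities = {
    "a1": "the automobile industry",
    "a2": "car racing history",
    "a3": "henry ford biography",
}
alt_words = {"car": ["automobile"]}
associations = {"a1": ["a3"]}

db = build(entities, alt_words, associations)
print(search(db, "car"))
```

In this sketch the entity `a3` is returned for the query "car" only because it is associated with `a1`; its own text contains neither "car" nor "automobile".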
Abstract
A full text indexing system is provided for processing content associated with data applications such as encyclopedia and dictionary applications. A build process collects data from various sources, processes the data into constituent parts, including alternative word sets, and stores the constituent parts in structured database tables. A run-time process is used to query the database tables and the results in order to provide effective matches in an efficient manner. Run-time processing is optimized by preprocessing all steps that are query-independent during the build process. A double word table representing all possible word pair combinations for each index entry and an alternative word table are used to further optimize run-time processing.
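As an illustration of the double word table described in the abstract, the following Python sketch precomputes every unordered word pair for each index entry at build time, so that a two-word query becomes a single lookup at run time. The function names, the tuple key format, and the sample entries are assumptions made for this example; the abstract does not specify how the table is stored.

```python
from collections import defaultdict
from itertools import combinations


def build_double_word_table(index_entries):
    """Map each unordered word pair to the index entries that contain it."""
    table = defaultdict(set)
    for entry_id, title in index_entries.items():
        # Sorting the unique words gives every pair one canonical order.
        words = sorted(set(title.lower().split()))
        for pair in combinations(words, 2):
            table[pair].add(entry_id)
    return table


def lookup_pair(table, first_word, second_word):
    """Look up a word pair regardless of the order the words were typed in."""
    key = tuple(sorted((first_word.lower(), second_word.lower())))
    return sorted(table.get(key, ()))


# Illustrative entries only.
entries = {1: "New York City", 2: "York Minster", 3: "New Mexico"}
table = build_double_word_table(entries)
print(lookup_pair(table, "York", "new"))
```

Because the pairs are generated once during the build, the run-time cost of a two-word query in this sketch does not depend on the length of the index entries.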
22 Claims
1. (Independent claim; full text appears under First Claim above.)
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method for searching and retrieving content from at least one content source that includes multiple content entities, the method comprising the steps of:
building a search index table comprising;
index entries corresponding to content information contained in the content source, the search index table including a double word table that includes unordered, unique word pairs, said double word table having at least one word pair corresponding to at least one of the index entries; and
association information corresponding to associations between different content entities of the at least one content source;
receiving a search term entered by a user;
processing the search term against a portion of the search index table including a word pair corresponding to the search term to determine whether a first match corresponding to a first content entity is available;
processing association information corresponding to the first content entity to identify a second match corresponding to a second content entity; and
returning a search result identifying the first and second content entities, in response to a determination that the first and second matches are available, wherein the second content entity is a match despite not including the at least one search term.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22
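The steps of claim 14 can be sketched end to end in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the names `build_search_index` and `search`, the dictionary layout, and the sample data are hypothetical, and the sketch handles only queries of exactly two distinct words.

```python
from itertools import combinations


def build_search_index(entities, associations):
    """Build a table of unordered, unique word pairs plus association info."""
    pairs = {}
    for entity_id, title in entities.items():
        words = sorted(set(title.lower().split()))
        for pair in combinations(words, 2):
            pairs.setdefault(pair, []).append(entity_id)
    return {"pairs": pairs, "associations": associations}


def search(index, query):
    """Match the query against the word pair table, then follow associations."""
    words = sorted(set(query.lower().split()))
    if len(words) != 2:
        return []  # simplification: only two-word queries are handled here
    first_matches = index["pairs"].get(tuple(words), [])
    results = list(first_matches)
    for entity_id in first_matches:
        # Associated entities count as matches even without the search term.
        for other_id in index["associations"].get(entity_id, []):
            if other_id not in results:
                results.append(other_id)
    return results


# Illustrative data only.
entities = {
    "e1": "solar system",
    "e2": "planet jupiter",
    "e3": "operating system",
}
associations = {"e1": ["e2"]}

index = build_search_index(entities, associations)
print(search(index, "system solar"))
```

Here `e2` is returned for the query "system solar" through its association with `e1`, although its own text contains neither word.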
Specification