Indexing of digitized entities
First Claim
1. A server/client system for searching for digitized non-text entities in a data collection comprising:
- an indexing input device for collecting basic information pertaining to at least one distinctive feature and at least one locator for each digitized non-text entity in a set of non-text entities from the data collection, the indexing input device including an index generator for receiving the basic information and producing in response thereto at least one rank parameter for each digitized non-text entity,
- an index database for storing index information relating to the digitized non-text entities in the set,
- a search engine for receiving search directives and in response thereto performing searches in the index database, and
- a user client interface for receiving a search request from at least one user client terminal, forwarding the search request as a search directive to the search engine, receiving a hit list of digitized non-text entities and returning a result of a corresponding search in the index database to the at least one user client terminal,
wherein the index database is organized such that the index information for a particular digitized non-text entity comprises the at least one rank parameter, which is indicative of a degree of relevance for at least one distinctive feature associated with said digitized non-text entity,
wherein the at least one rank parameter is based on a first rank component that is generated by ranking each distinctive feature associated with each digitized non-text entity based on a relative number of occurrences of the distinctive feature in association with multiple copies of the digitized non-text entity in the data collection, and
wherein the at least one rank parameter is further based on a second rank component that is generated by ranking each individual distinctive feature related to the digitized non-text entity based on a position of the distinctive feature in a descriptive field associated with the digitized non-text entity,
wherein the rank parameter is a combination of the first rank component with the second rank component.
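The two rank components recited in the claim can be illustrated with a minimal Python sketch. The claim does not fix any formulas, so everything concrete here is an assumption made for illustration: the first component is taken as the fraction of copies of an entity in which a feature occurs, the second as the reciprocal of the feature's best position in a descriptive field, and the combination as a weighted sum. The function names and the `weight` parameter are hypothetical.

```python
from collections import Counter


def first_rank_component(copies):
    """Fraction of copies in which each feature occurs (assumed reading of
    'relative number of occurrences').

    copies: one list of features per copy of the same entity, in the order
    the features appear in that copy's descriptive field.
    """
    counts = Counter()
    for features in copies:
        counts.update(set(features))  # count each feature once per copy
    total = len(copies)
    return {feature: n / total for feature, n in counts.items()}


def second_rank_component(copies):
    """Position-based score: 1 / (1 + position), best value over all copies.

    The reciprocal form is an assumption; the claim only says the rank
    depends on the feature's position in a descriptive field.
    """
    scores = {}
    for features in copies:
        for position, feature in enumerate(features):
            score = 1.0 / (1 + position)
            scores[feature] = max(scores.get(feature, 0.0), score)
    return scores


def rank_parameters(copies, weight=0.5):
    """Combine both components with a weighted sum (assumed combination)."""
    first = first_rank_component(copies)
    second = second_rank_component(copies)
    return {
        feature: weight * first[feature] + (1 - weight) * second[feature]
        for feature in first
    }


# Three copies of the same image, each with its own descriptive field.
copies = [
    ["sunset", "beach"],
    ["beach", "sunset", "holiday"],
    ["sunset"],
]
ranks = rank_parameters(copies)
```

In this example "sunset" appears in every copy and leads two descriptive fields, so it receives the highest rank parameter, while "holiday" appears once in third position and receives the lowest.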
Abstract
The invention relates to indexing of digitized entities in a large and comparatively unstructured data collection, for instance the Internet, such that text-based searches of the data collection can be ordered via a user client terminal. Index information is generated for each digitized entity and contains distinctive features that are ranked according to a rank parameter. The rank parameter indicates the degree of relevance of a particular distinctive feature with respect to a given digitized entity and is derived from fields or tags associated with one or more copies of the digitized entity in the data collection. The index information is stored in a searchable database, which is accessible via a user client interface and a search engine. The derived distinctive features and the rank parameter thus make it possible to carry out text-based searches for non-text digitized entities, such as images, audio files and video sequences, and to obtain a highly relevant search result.
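The search path described in the abstract (ranked index information stored in a database and queried with text) can be sketched as a small in-memory index. This is an illustration under stated assumptions, not the patented implementation: the `IndexDatabase` class, its method names, the example locators, and the rule of summing rank parameters over matching query terms are all hypothetical.

```python
from collections import defaultdict


class IndexDatabase:
    """Toy index: maps each distinctive feature to {locator: rank parameter}."""

    def __init__(self):
        self._postings = defaultdict(dict)

    def add(self, locator, ranked_features):
        """Store the rank parameter of every feature for one entity."""
        for feature, rank in ranked_features.items():
            self._postings[feature.lower()][locator] = rank

    def search(self, query):
        """Return a hit list of (locator, score), most relevant first.

        Scores are the sum of rank parameters over matching query terms;
        the summation rule is an assumption made for this sketch.
        """
        totals = defaultdict(float)
        for term in query.lower().split():
            for locator, rank in self._postings.get(term, {}).items():
                totals[locator] += rank
        # Sort by descending score, then by locator for a stable order.
        return sorted(totals.items(), key=lambda hit: (-hit[1], hit[0]))


db = IndexDatabase()
db.add("http://example.com/a.jpg", {"sunset": 0.9, "beach": 0.4})
db.add("http://example.com/b.jpg", {"sunset": 0.3})
hits = db.search("sunset beach")
```

Here a text query retrieves image locators, and the entity whose features carry the higher rank parameters comes first in the hit list.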
10 Claims
Specification