SYSTEM AND METHOD FOR CONTENT SELECTION FOR WEB PAGE INDEXING
First Claim
1. A method for indexing a document comprising:
selecting indexable content of the document; and
indexing the indexable content;
wherein selecting the indexable content comprises:
dividing the document into a plurality of document elements;
determining an attention history for a plurality of the document elements; and
determining one or more document elements that meet an attention history requirement.
Abstract
An indexing system for documents such as web pages divides a document into elements, such as document object model elements. User attention data from prior interactions with the document are analyzed to determine those elements of a document that satisfy a threshold requirement of user attention. Elements meeting the user attention threshold requirement are added to a set of indexable content for the document. Furthermore, document sections are determined based on attention data and each section is indexed separately. Indexing is per section and based only on the indexable content, thereby enhancing the index relevance, increasing the efficiency of search engines and reducing spamdexing.
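As a rough illustration of the selection step described in the abstract, the sketch below keeps only those document elements whose recorded user attention meets a threshold. The `Element` type, the `attention_seconds` field, and the 2.0-second threshold are assumptions made for this example; the abstract does not state how attention is measured or what threshold value is used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    """One document element (for example, a DOM node) with its attention data."""
    element_id: str
    text: str
    attention_seconds: float  # hypothetical measure of prior user attention


def select_indexable(elements: List[Element], threshold: float) -> List[Element]:
    """Return the elements whose attention meets or exceeds the threshold."""
    return [e for e in elements if e.attention_seconds >= threshold]


# Hypothetical page: navigation, main article, and a keyword-stuffed footer.
page = [
    Element("nav", "Home | About | Contact", 0.4),
    Element("article", "How attention-based indexing works", 12.5),
    Element("footer-keywords", "cheap cheap cheap deals deals", 0.0),
]

indexable = select_indexable(page, threshold=2.0)
indexable_ids = [e.element_id for e in indexable]
```

In this toy input, the keyword-stuffed footer receives no attention and is therefore excluded from the indexable content, which is the kind of filtering the abstract credits with reducing spamdexing.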
20 Claims
1. A method for indexing a document comprising:
selecting indexable content of the document; and
indexing the indexable content;
wherein selecting the indexable content comprises:
dividing the document into a plurality of document elements;
determining an attention history for a plurality of the document elements; and
determining one or more document elements that meet an attention history requirement.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system for indexing web pages comprising:
a content selection module that processes interaction data for a web page to select indexable content of the web page; and
an indexing module that indexes the indexable content of the web page.
Dependent claims: 14, 15, 16, 17
18. A computer-readable medium comprising computer-executable instructions for execution by a processor, that, when executed, cause the processor to:
determine interaction data for a web page;
determine a document object model element of the web page associated with the interaction data; and
add the document object model element to a set of indexable content for the web page.
Dependent claims: 19, 20
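The three instructions recited in claim 18 can be sketched as follows. The claim does not say how a document object model element is associated with interaction data; this example assumes, purely for illustration, that interactions are click coordinates and that association is done by hit-testing against element bounding boxes. The `Box` type, the function names, and the sample layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Box:
    """Bounding box of a DOM element in page coordinates (assumed layout data)."""
    element_id: str
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def element_at(boxes: List[Box], x: float, y: float) -> Optional[str]:
    """Return the id of the first box containing the point, or None."""
    for box in boxes:
        if box.contains(x, y):
            return box.element_id
    return None


def collect_indexable(boxes: List[Box],
                      clicks: Iterable[Tuple[float, float]]) -> Set[str]:
    """Map each interaction to an element and add that element to the set."""
    indexable: Set[str] = set()
    for x, y in clicks:
        element_id = element_at(boxes, x, y)
        if element_id is not None:
            indexable.add(element_id)
    return indexable


# Hypothetical, non-overlapping layout and recorded clicks.
layout = [
    Box("article", 100, 50, 600, 400),
    Box("sidebar", 700, 50, 100, 400),
]
clicks = [(150, 100), (750, 100), (900, 10)]  # the last click hits no element

indexable_ids = collect_indexable(layout, clicks)
```

A real page would have nested elements, so a fuller version would need a rule for choosing among overlapping boxes; the non-overlapping layout here sidesteps that question.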
Specification