EXTRACTION OF ANCHOR EXPLANATORY TEXT BY MINING REPEATED PATTERNS
Abstract
A method and system for identifying explanatory text for a referenced web page based on a reference to the referenced web page contained in a repeated pattern of a referencing web page is provided. An anchor explanatory text (“AET”) system uses the hierarchical organization of the web page to identify a repeated pattern of hierarchical elements that contain references to other display pages. After the AET system identifies a repeated pattern, it identifies the dominant reference or anchor within each occurrence of the pattern. The AET system uses the explanatory text surrounding a dominant anchor as a description of the referenced web page.
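The pipeline outlined in the abstract (find a repeated pattern of hierarchical elements, pick the dominant anchor in each occurrence, take the surrounding text) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a "repeated pattern" means sibling elements with identical tag structure, and that the "dominant" anchor is the one with the longest link text; the abstract specifies neither rule. All names here (Node, TreeBuilder, signature, extract_explanatory_text) are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag

class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # child Nodes and text strings, in document order

class TreeBuilder(HTMLParser):
    """Builds a simple element tree from HTML (hypothetical helper)."""
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.cur)
        self.cur.children.append(node)
        if tag not in VOID:
            self.cur = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        n = self.cur
        while n is not self.root and n.tag != tag:
            n = n.parent
        if n is not self.root:
            self.cur = n.parent

    def handle_data(self, data):
        if data.strip():
            self.cur.children.append(data.strip())

def elements(node):
    return [c for c in node.children if isinstance(c, Node)]

def signature(node):
    # Structural shape of a subtree: tag names only, text ignored.
    return (node.tag, tuple(signature(c) for c in elements(node)))

def text_of(node, skip=None):
    # Concatenated text under node, leaving out the subtree `skip`.
    parts = []
    for c in node.children:
        if c is skip:
            continue
        parts.append(c if isinstance(c, str) else text_of(c, skip))
    return " ".join(p for p in parts if p)

def anchors(node):
    found = [node] if node.tag == "a" and "href" in node.attrs else []
    for c in elements(node):
        found.extend(anchors(c))
    return found

def walk(node):
    yield node
    for c in elements(node):
        yield from walk(c)

def extract_explanatory_text(html, min_repeats=2):
    """Map each dominant anchor's href to the text surrounding it."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    results = {}
    for parent in walk(builder.root):
        groups = defaultdict(list)
        for child in elements(parent):
            groups[signature(child)].append(child)
        for occurrences in groups.values():
            if len(occurrences) < min_repeats:
                continue  # not a repeated pattern under this parent
            for occ in occurrences:
                links = anchors(occ)
                if not links:
                    continue
                # Assumed rule: the dominant anchor has the longest link text.
                dominant = max(links, key=lambda a: len(text_of(a)))
                results.setdefault(dominant.attrs["href"], text_of(occ, skip=dominant))
    return results
```

Exact signature matching is a deliberate simplification: real pages often vary slightly between occurrences, so a practical system would likely need a more tolerant similarity measure.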
20 Claims
1. A computer system for identifying explanatory text for a display page, comprising:
a find repeated patterns component that identifies repeated patterns of elements within a display page, a repeated pattern having a reference to a referenced display page along with text associated with the reference; and
an extract text component that extracts from a repeated pattern the text associated with the reference, wherein the extracted text represents explanatory text for the referenced display page.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-readable medium containing instructions for controlling a computer system to identify explanatory text for a referenced web page from a referencing web page, by a method comprising:
identifying repeated patterns of elements within the referencing web page, an element of a repeated pattern having a reference to a web page along with text surrounding the reference; and
for each occurrence of a repeated pattern, identifying a dominant reference to a web page; and
extracting the text surrounding the dominant reference as explanatory text for the referenced web page.
Dependent claims: 13, 14, 15, 16, 17
18. A computer system for identifying explanatory text for a referenced web page from a referencing web page, comprising:
a component that identifies repeated patterns of elements within the referencing web page, an occurrence of a repeated pattern having a reference to a web page along with text surrounding the reference;
a component that identifies a dominant reference for each repeated pattern; and
a component that extracts text surrounding the dominant reference as explanatory text for the referenced web page.
Dependent claims: 19, 20
Specification