Automatic classification of segmented portions of web pages
Abstract
Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems.
20 Claims
1. A method comprising:
with one or more special purpose computing devices:
for at least one of a plurality of segmented portions obtained from at least one of a plurality of displayable web pages as represented by one or more digital signals of one or more data files, using one or more machine learned models to:
identify one or more feature properties of said segmented portion, wherein at least one of said one or more feature properties affects a presentation of said segmented portion within a rendered version of at least one displayable web page and corresponds to one or more query dependent properties based, at least in part, on one or more historical queries for said at least one of a plurality of displayable web pages;
classify said segmented portion as being at least one of a plurality of segment types based, at least in part, on said one or more identified feature properties; and
generating one or more digital signals representative of at least part of an index for said plurality of segmented portions, said index being based, at least in part, on said segment type.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. An apparatus comprising:
memory having stored therein one or more digital signals representing at least one data file of at least one displayable web page;
at least one processing unit coupled to said memory and programmed with instructions to:
for at least one of a plurality of segmented portions obtained from said displayable web page, use one or more machine learned models to:
identify one or more feature properties of said segmented portion, wherein at least one of said one or more feature properties affects a presentation of said segmented portion within a rendered version of at least one displayable web page and corresponds to one or more query dependent properties based, at least in part, on one or more historical queries for said at least one of a plurality of displayable web pages;
classify said segmented portion as being at least one of a plurality of segment types based, at least in part, on said one or more identified feature properties; and
establish an index for said plurality of segmented portions that is based, at least in part, on said segment type.
Dependent claims: 13, 14, 15, 16
17. An article comprising:
a non-transitory computer readable medium having computer implementable instructions stored thereon that are executable by one or more processing units in a computing device to:
for at least one of a plurality of segmented portions obtained from at least one of a plurality of displayable web pages as represented by one or more digital signals of one or more data files, use one or more machine learned models to:
identify one or more feature properties of said segmented portion, wherein at least one of said one or more feature properties affects a presentation of said segmented portion within a rendered version of at least one displayable web page and corresponds to one or more query dependent properties based, at least in part, on one or more historical queries for said at least one of a plurality of displayable web pages;
classify said segmented portion as being at least one of a plurality of segment types based, at least in part, on said one or more identified feature properties; and
maintain an index for said plurality of segmented portions that is based, at least in part, on said segment type.
Dependent claims: 18, 19, 20
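The three steps shared by the independent claims (identify feature properties, classify each segmented portion, index by segment type) might be sketched as follows. This is only an illustrative sketch and not the claimed implementation: the `Segment` fields, the three features, the segment type names "main_content" and "navigation", and the nearest-centroid classifier standing in for the "machine learned model" are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical segmented portion of a web page."""
    segment_id: str
    text: str
    font_size: float  # assumed presentation-affecting property
    link_count: int   # assumed number of hyperlinks in the segment


def extract_features(segment, historical_queries):
    """Return (scaled font size, link density, query overlap) for a segment."""
    words = segment.text.lower().split()
    link_density = segment.link_count / max(len(words), 1)
    vocab = set(words)
    # Query-dependent property: share of past queries with a term in the segment.
    hits = sum(
        1 for q in historical_queries
        if any(term in vocab for term in q.lower().split())
    )
    overlap = hits / len(historical_queries) if historical_queries else 0.0
    return (segment.font_size / 10.0, link_density, overlap)


def train_centroids(labeled):
    """Learn one mean feature vector per segment type from (features, type) pairs."""
    sums, counts = {}, defaultdict(int)
    for feats, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] += 1
    return {lbl: tuple(v / counts[lbl] for v in acc) for lbl, acc in sums.items()}


def classify(feats, centroids):
    """Assign the segment type whose centroid is nearest in squared distance."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(feats, center))
    return min(centroids, key=lambda lbl: dist2(centroids[lbl]))


def build_index(segments, historical_queries, centroids):
    """Map each predicted segment type to the ids of the segments assigned to it."""
    index = defaultdict(list)
    for seg in segments:
        feats = extract_features(seg, historical_queries)
        index[classify(feats, centroids)].append(seg.segment_id)
    return dict(index)


def demo():
    """Train on two hand-labeled segments, then index two unseen ones."""
    queries = ["cheap flights paris", "paris hotel reviews"]
    train = [
        (Segment("t1", "paris hotel reviews and cheap flights guide", 14.0, 0),
         "main_content"),
        (Segment("t2", "home about contact login", 10.0, 4), "navigation"),
    ]
    centroids = train_centroids(
        [(extract_features(seg, queries), label) for seg, label in train]
    )
    unseen = [
        Segment("s1", "best paris hotel deals", 13.0, 0),
        Segment("s2", "privacy terms sitemap", 9.0, 3),
    ]
    return build_index(unseen, queries, centroids)
```

Calling `demo()` trains the stand-in model and returns the index for two unseen segments. Keying the index by segment type is one simple reading of "based, at least in part, on said segment type"; a retrieval system could equally store the type as a per-segment attribute inside a conventional inverted index.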
Specification