Updating taxonomy based on webpage
Abstract
According to an example implementation, a computer-implemented method may include extracting, by a computing device, structured content from a website, determining a recent taxonomy by applying category rules to the structured content, the recent taxonomy including multiple categories and a new category, and updating a stored taxonomy based on the determined recent taxonomy by adding the new category to the stored taxonomy.
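The three steps named in the abstract (extract, determine a recent taxonomy, update the stored taxonomy) can be illustrated with a minimal sketch. It assumes the structured content has already been extracted into a dictionary keyed by page area. The names `CategoryRule`, `determine_recent_taxonomy`, `update_stored_taxonomy`, and the sample data are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryRule:
    # Hypothetical rule: every item found in `area` is a subcategory of `category`.
    area: str
    category: str


def determine_recent_taxonomy(structured_content, rules):
    """Map each known category to the subcategories currently seen on the site."""
    recent = {}
    for rule in rules:
        subcategories = recent.setdefault(rule.category, [])
        for item in structured_content.get(rule.area, []):
            if item not in subcategories:
                subcategories.append(item)
    return recent


def update_stored_taxonomy(stored, recent):
    """Add entries present in `recent` but missing from `stored`; return what was added."""
    added = []
    for category, subcategories in recent.items():
        known = stored.setdefault(category, [])
        for sub in subcategories:
            if sub not in known:
                known.append(sub)
                added.append((category, sub))
    return added


# Assumed sample data: one page area whose items are subcategories of "Electronics".
structured_content = {"left-nav": ["Laptops", "Tablets", "Smart Glasses"]}
rules = [CategoryRule(area="left-nav", category="Electronics")]
stored = {"Electronics": ["Laptops", "Tablets"]}

recent = determine_recent_taxonomy(structured_content, rules)
added = update_stored_taxonomy(stored, recent)
print(added)  # [('Electronics', 'Smart Glasses')]
```

In this sketch the stored taxonomy only grows; whether and how entries are ever removed is not addressed here.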
19 Claims
1. A computer-implemented method comprising:
extracting, by a computing device, structured content from a web page;
determining multiple subcategories of a known category by applying category rules to the structured content from the web page, the category rules being customized for a known structure of the web page, the multiple subcategories including multiple known subcategories of the known category and a new subcategory, the new subcategory being based on a new item within the known structure of the web page; and
updating a stored taxonomy by adding the new subcategory to the stored taxonomy.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A non-transitory computer-readable storage medium including executable code tangibly embodied thereon, the executable code being configured to, when executed by at least one processor, cause a data processing apparatus to:
extract structured content from a website associated with a stored taxonomy;
determine a recent taxonomy by applying category rules to the structured content, the category rules being customized for a known structure of a page of the website and dictating that items within an area of the website are subcategories of a known category, the recent taxonomy including multiple known subcategories of the known category and a new subcategory of the known category, the new subcategory being based on a new object within the known structure of the page; and
update the stored taxonomy based on the determined recent taxonomy by adding the new subcategory of the known category to the stored taxonomy.
17. A non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed by at least one processor, are configured to cause a computing device to at least:
extract, by a computing device, structured content from a web page, the web page being associated with a known category;
determine multiple subcategories of the known category by applying category rules to the structured content from the web page, the category rules being customized for a known structure of the web page, the multiple subcategories including multiple known subcategories of the known category and a new subcategory, the new subcategory being based on a new item within the known structure of the web page; and
update a stored taxonomy by adding the new subcategory to the stored taxonomy.
18. A computer-implemented method comprising:
extracting, by a computing device, structured content from a web page;
determining multiple subcategories of a known category by applying category rules to the structured content from the web page, the category rules being customized for a known structure of the web page, the multiple subcategories including multiple known subcategories of the known category and a new subcategory, the new subcategory being based on a new object within the known structure of the web page; and
updating a stored taxonomy by adding the new subcategory to the stored taxonomy.
19. A computer-implemented method comprising:
extracting, by a computing device, structured content from a web page;
determining multiple subcategories of a known category by applying category rules to the structured content from the web page, the category rules being customized for a known structure of the web page, the multiple subcategories including multiple known subcategories of the known category and a new subcategory, the new subcategory being based on a new option within the known structure of the web page; and
updating a stored taxonomy by adding the new subcategory to the stored taxonomy.
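The claims recite category rules "customized for a known structure of the web page", and claim 16 adds that items within an area of the site are subcategories of a known category. One way such a rule might look in practice is sketched below with Python's standard `html.parser`. The markup, the `departments` element id, and the names `AreaItemExtractor` and `subcategories_from_page` are assumptions made for illustration, not details taken from the patent.

```python
from html.parser import HTMLParser

# Elements with no closing tag; skipped so they do not disturb depth tracking.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class AreaItemExtractor(HTMLParser):
    """Collect the text of <li> elements nested inside the element whose id is `area_id`."""

    def __init__(self, area_id):
        super().__init__()
        self.area_id = area_id
        self.depth = 0          # nesting depth inside the target area (0 means outside)
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag == "li":
                self.in_item = True
                self.items.append("")
        elif dict(attrs).get("id") == self.area_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item:
            self.items[-1] += data


def subcategories_from_page(html, area_id):
    """Apply the area rule: each list item inside `area_id` names a subcategory."""
    parser = AreaItemExtractor(area_id)
    parser.feed(html)
    return [text.strip() for text in parser.items if text.strip()]


# Assumed sample page: only the "departments" list is covered by the rule.
PAGE = """
<div id="header"><ul><li>Sign in</li></ul></div>
<ul id="departments">
  <li><a href="/laptops">Laptops</a></li>
  <li><a href="/tablets">Tablets</a></li>
  <li><a href="/drones">Drones</a></li>
</ul>
"""

known = {"Laptops", "Tablets"}
found = subcategories_from_page(PAGE, "departments")
new = [name for name in found if name not in known]
print(new)  # ['Drones']
```

The sketch assumes properly nested markup and flat lists; a rule for a real page would need to match that page's actual structure.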
Specification