Search engine recency using content preview
Abstract
Disclosed herein is use of a preview of content from a target document, as provided by a content preview source such as a Really Simple Syndication (RSS) feed, by a search engine. The content preview source includes the preview of the target document's content and a reference, e.g., a Universal Resource Locator (URL) or other link. A content preview document is generated using data extracted from the content preview source. The content preview document is made available in a searchable index used by a search engine to respond to a search query. A fetch operation is scheduled to fetch the target document using the reference provided in the content preview source. Once fetched, the data extracted from the content preview source can be associated with the target document, and can be used in presenting the target document in search results.
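The flow summarized in the abstract (index the preview first, schedule a fetch of the target, then associate the preview data with the fetched target) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the dictionary-based index, the `deque` used as a fetch schedule, and the `fetch` callable that stands in for a real crawler.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional

# Hypothetical index layout: target URL -> record holding preview and content.
Index = Dict[str, Dict[str, Optional[str]]]


def index_preview(index: Index, fetch_queue: Deque[str],
                  url: str, preview_text: str) -> None:
    """Make the preview searchable right away and schedule the target fetch."""
    index[url] = {"preview": preview_text, "content": None}
    fetch_queue.append(url)


def run_scheduled_fetches(index: Index, fetch_queue: Deque[str],
                          fetch: Callable[[str], str]) -> None:
    """Fetch each scheduled target and attach it to its existing record."""
    while fetch_queue:
        url = fetch_queue.popleft()
        # The preview extracted from the feed stays associated with the target.
        index[url]["content"] = fetch(url)


def result_snippet(index: Index, url: str) -> Optional[str]:
    """Use the feed-provided preview when presenting the search result."""
    return index[url]["preview"]
```

In this sketch the record is searchable by its preview before `run_scheduled_fetches` ever runs, which is the recency benefit the abstract describes; the later fetch only enriches the same record.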
20 Claims
1. A method comprising:
receiving, by a computing device, crawling data from a crawler crawling a content preview source, the content preview source comprising data and a link to a target document;
determining, by the computing device, a plurality of features for the content preview source;
determining, by the computing device, a measure of quality in connection with the content preview source using a quality measurement statistical model and the plurality of features determined for the content preview source;
making, by the computing device and using the determined measure of quality, a determination to create a content preview document, the determination to create the content preview document comprising comparing the measure of quality to a quality threshold to determine that the determined measure of quality exceeds the quality threshold;
creating, by the computing device in connection with crawling the content preview source, a content preview document in response to the determination that the determined measure of quality exceeds the quality threshold, creation of the content preview document performed by the computing device in connection with crawling the content preview source comprising;
extracting, from the content preview source, the data and the link to the target document; and
creating, using the data extracted from the content preview source and without using the target document, the content preview document, the created content preview document is different from the target document;
making, by the computing device and in response to the determination that the determined measure of quality exceeds the quality threshold, the created content preview document available for searching by a search engine in an index prior to the target document being made available for searching by the search engine in the index.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
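The steps recited in claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the claim does not specify which features are used or what form the quality measurement statistical model takes, so the three features, the logistic model, its weights, and the 0.5 threshold below are all hypothetical, as are the names `FeedEntry`, `PreviewDocument`, and `process_entry`.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FeedEntry:
    """One item from a content preview source (for example, an RSS item)."""
    title: str
    summary: str
    link: str  # link to the target document


@dataclass
class PreviewDocument:
    """Built only from feed data; the target document is not used."""
    title: str
    body: str
    target_url: str


# Hypothetical logistic-model parameters; a real system would learn these.
WEIGHTS = {"title_words": 0.3, "summary_words": 0.05, "has_link": 1.0}
BIAS = -3.0
QUALITY_THRESHOLD = 0.5  # assumed value


def extract_features(entry: FeedEntry) -> Dict[str, float]:
    """Determine a plurality of features for the content preview source."""
    has_link = entry.link.startswith(("http://", "https://"))
    return {
        "title_words": float(len(entry.title.split())),
        "summary_words": float(len(entry.summary.split())),
        "has_link": 1.0 if has_link else 0.0,
    }


def quality_score(features: Dict[str, float]) -> float:
    """Map the features to a quality measure in (0, 1) with a logistic model."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def process_entry(entry: FeedEntry,
                  index: Dict[str, PreviewDocument]) -> Optional[PreviewDocument]:
    """Create and index a preview document only if quality exceeds the threshold."""
    score = quality_score(extract_features(entry))
    if score <= QUALITY_THRESHOLD:
        return None
    # Extract the data and the link; the target document is never fetched here.
    doc = PreviewDocument(entry.title, entry.summary, entry.link)
    index[entry.link] = doc  # searchable before the target document is indexed
    return doc
```

A sparse entry with no usable link scores well below the assumed threshold and is skipped, while an entry with a descriptive title, a substantial summary, and a valid link is indexed immediately.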
13. A non-transitory computer readable storage medium tangibly encoded with computer-executable instructions that when executed by a processor associated with a computing device perform a method comprising:
receiving crawling data from a crawler crawling a content preview source, the content preview source comprising data and a link to a target document;
determining a plurality of features for the content preview source;
determining a measure of quality in connection with the content preview source using a quality measurement statistical model and the plurality of features determined for the content preview source;
making, using the determined measure of quality, a determination to create a content preview document, the determination to create the content preview document comprising comparing the measure of quality to a quality threshold to determine that the determined measure of quality exceeds the quality threshold;
creating, in connection with crawling the content preview source, a content preview document in response to the determination that the determined measure of quality exceeds the quality threshold, creation of the content preview document performed in connection with crawling the content preview source comprising;
extracting, from the content preview source, the data and the link to the target document; and
creating, using the data extracted from the content preview source and without using the target document, the content preview document, the created content preview document is different from the target document;
making, in response to the determination that the determined measure of quality exceeds the quality threshold, the created content preview document available for searching by a search engine in an index prior to the target document being made available for searching by the search engine in the index.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A computing device comprising:
a processor; and
a non-transitory storage medium for tangibly storing thereon program logic for execution by the processor, the program logic comprising;
receiving logic executed by the processor for receiving crawling data from a crawler crawling a content preview source, the content preview source comprising data and a link to a target document;
determining logic executed by the processor for determining a plurality of features for the content preview source;
determining logic executed by the processor for determining a measure of quality in connection with the content preview source using a quality measurement statistical model and the plurality of features determined for the content preview source;
making logic executed by the processor for making, using the determined measure of quality, a determination to create a content preview document, the determination to create the content preview document comprising comparing the measure of quality to a quality threshold to determine that the determined measure of quality exceeds the quality threshold;
creating logic executed by the processor for creating, in connection with crawling the content preview source, a content preview document in response to the determination that the determined measure of quality exceeds the quality threshold, creation of the content preview document performed in connection with crawling the content preview source comprising;
extracting logic executed by the processor for extracting, from the content preview source, the data and the link to the target document; and
creating logic executed by the processor for creating, using the data extracted from the content preview source and without using the target document, the content preview document, the created content preview document is different from the target document;
making logic executed by the processor for making, in response to the determination that the determined measure of quality exceeds the quality threshold, the created content preview document available for searching by a search engine in an index prior to the target document being made available for searching by the search engine in the index.
Specification