Separating content from noisy context in template-based documents for search indexing
Abstract
In one embodiment, a mechanism for separating content from noisy context in template-based documents for search indexing is disclosed. In one embodiment, a method includes selecting a plurality of documents for index comparison, identifying one or more identical elements found in each of the plurality of documents, and removing the one or more identical elements from consideration in an indexing process of the plurality of documents.
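The three steps named in the abstract (select, identify, remove) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: it assumes an "element" is a stripped, non-empty line of text and that "identical" means exact string equality, neither of which the abstract specifies. The function names `split_elements`, `find_identical_elements`, and `remove_identical_elements` are hypothetical, as is the sample data.

```python
def split_elements(document: str) -> list[str]:
    """Split a document into candidate elements.

    Assumption: an element is a stripped, non-empty line of text.
    """
    return [line.strip() for line in document.splitlines() if line.strip()]


def find_identical_elements(documents: list[str]) -> set[str]:
    """Return the elements that appear in every selected document."""
    if not documents:
        return set()
    common = set(split_elements(documents[0]))
    for doc in documents[1:]:
        common &= set(split_elements(doc))
    return common


def remove_identical_elements(documents: list[str]) -> list[str]:
    """Return modified copies of the documents with the shared elements removed.

    With fewer than two documents there is nothing to compare against,
    so the input is returned unchanged.
    """
    if len(documents) < 2:
        return list(documents)
    common = find_identical_elements(documents)
    return [
        "\n".join(e for e in split_elements(doc) if e not in common)
        for doc in documents
    ]


# Hypothetical template-based pages sharing a header and a footer.
docs = [
    "ACME Corp | Home | Products\nWidget A ships in May.\n(c) ACME Corp",
    "ACME Corp | Home | Products\nWidget B is out of stock.\n(c) ACME Corp",
    "ACME Corp | Home | Products\nContact us about Widget C.\n(c) ACME Corp",
]

cleaned = remove_identical_elements(docs)
for text in cleaned:
    print(text)
```

The cleaned strings, rather than the original pages, would then be handed to whatever indexing process follows, so that the shared header and footer do not contribute terms to every document's index entry.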
24 Citations
20 Claims
1. A method comprising:
- selecting, by a processing device, a plurality of documents for comparison;
- identifying, by the processing device, an identical element comprising information common to each of the plurality of documents; and
- removing, by the processing device, the identical element from each of the plurality of documents to form modifications to the plurality of documents, prior to a subsequent indexing process of the plurality of documents.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer system comprising:
- a memory; and
- a processing device coupled to the memory, the processing device to:
  - identify an identical element comprising information common to each of the plurality of documents; and
  - remove the identical element from each of the plurality of documents to form modifications to the plurality of documents, prior to a subsequent indexing process of the plurality of documents.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory machine-readable storage medium programmed to include instructions executable by a processing device to cause the processing device to perform operations comprising:
- selecting, by the processing device, a plurality of documents for index comparison;
- identifying, by the processing device, an identical element comprising information common to each of the plurality of documents; and
- removing, by the processing device, the identical element from each of the plurality of documents to form modifications to the plurality of documents, prior to a subsequent indexing process of the plurality of documents.
Dependent claims: 16, 17, 18, 19, 20
Specification