Identifying a document by performing spectral analysis on the contents of the document
First Claim
1. A method, comprising:
creating an index, with one or more entries, for a rendered document at a computer system, wherein each of the one or more index entries contains a word and one or more values, wherein each value represents a characteristic of the word;
selecting a subset of the one or more index entries from the index using the computer system;
creating an ordered sequence of values from the selected subset of the one or more index entries, wherein each value comprises a position of the entry relative to another entry in the selected subset of the one or more index entries;
building an identifier for the rendered document at the computer system based on the ordered sequence of the values, wherein each index entry in the subset satisfies an identifier rule for a document identifier; and
locating an electronic counterpart to the rendered document based on the identifier for the rendered document using the computer system.
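The steps of the claim can be read as a small pipeline. The sketch below is one possible illustration, not the patented implementation: the function names, the choice of word frequency and first position as the per-word values, and the identifier rule ("keep words that occur at most `max_count` times") are all assumptions made for the example.

```python
import re
from collections import Counter


def build_index(text):
    """Index entry per word: its frequency and the position of its first use."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    index = {}
    for pos, word in enumerate(words):
        if word not in index:
            index[word] = {"count": counts[word], "first_pos": pos}
    return index


def select_entries(index, max_count=1):
    """Hypothetical identifier rule: keep only words used at most max_count times.

    Returns (word, first_pos) pairs ordered by position in the document.
    """
    chosen = [(w, v["first_pos"]) for w, v in index.items() if v["count"] <= max_count]
    return sorted(chosen, key=lambda item: item[1])


def build_identifier(selected):
    """Ordered sequence of each selected entry's offset from the previous one."""
    positions = [pos for _, pos in selected]
    return tuple(b - a for a, b in zip(positions, positions[1:]))


def identify(text):
    """Run index -> subset -> relative positions -> identifier for one text."""
    return build_identifier(select_entries(build_index(text)))


def locate_counterpart(identifier, corpus):
    """Return the name of the first stored document with the same identifier."""
    for name, text in corpus.items():
        if identify(text) == identifier:
            return name
    return None


rendered = "the cat sat on the mat and the dog sat too"
corpus = {
    "a": "completely different words appear here",
    "b": "the cat sat on the mat and the dog sat too",
}
doc_id = identify(rendered)
match = locate_counterpart(doc_id, corpus)
```

Because the identifier in this sketch stores only the gaps between selected words, it does not depend on the words themselves once the subset has been chosen. An exact-equality lookup is the simplest form of the locating step and tolerates no capture errors.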
Abstract
A system and method for identifying a document based on a spectral analysis of the text of the document is described. In some examples, the system generates a document identifier for a rendered document based on assigning values to words in the rendered document, such as values associated with the frequency of use of the word by the rendered document, the absolute or relative position of the word in the rendered document, and so on. The system may use the document identifier to generate a group of documents having similar document identifiers, and choose a likely match from the group of documents based on predictive analysis.
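The abstract's last step, grouping documents with similar identifiers and picking a likely match, might look like the following sketch. The abstract does not say how similarity or the "predictive analysis" is computed, so the use of `difflib.SequenceMatcher` as a similarity score, the `threshold` value, and the function names are assumptions standing in for whatever the system actually does.

```python
from difflib import SequenceMatcher


def similarity(id_a, id_b):
    """Score two identifier sequences between 0.0 (unrelated) and 1.0 (identical)."""
    return SequenceMatcher(None, id_a, id_b).ratio()


def candidate_group(query_id, known_ids, threshold=0.6):
    """Documents whose identifiers are similar enough, best score first."""
    scored = [(similarity(query_id, doc_id), name) for name, doc_id in known_ids.items()]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


def likely_match(query_id, known_ids, threshold=0.6):
    """Pick the highest-scoring candidate, or None if the group is empty."""
    group = candidate_group(query_id, known_ids, threshold)
    return group[0][1] if group else None


# Hypothetical identifiers: sequences of relative word positions.
known = {
    "exact": (2, 2, 1, 2, 2),
    "near": (2, 2, 1, 2, 3),
    "far": (7, 9, 4),
}
query = (2, 2, 1, 2, 2)
group_names = [name for _, name in candidate_group(query, known)]
best = likely_match(query, known)
```

Ranking by a single similarity score is the simplest possible selection rule. A fuller system could weigh other evidence when choosing among the candidates.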
1091 Citations
19 Claims
1. A method (independent claim; full text under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. An article of manufacture comprising a non-transitory computer-readable medium having instructions stored thereon that, if the instructions are executed by a computer, cause the computer to perform functions comprising:
creating an index, with one or more entries, for a rendered document, wherein each of the one or more index entries contains a word and one or more values, wherein each value represents a characteristic of the word;
selecting a subset of the one or more index entries from the index;
creating an ordered sequence of values from the selected subset of one or more index entries, wherein each value comprises a position of the entry relative to another entry in the selected subset of the one or more index entries;
building an identifier for the rendered document based on the ordered sequence of the values, wherein each index entry in the subset satisfies an identifier rule for a document identifier; and
locating an electronic counterpart to the rendered document based on the identifier for the rendered document.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
Specification