Computerized methods of data compression and analysis

  • US 10,387,377 B2
  • Filed: 05/19/2017
  • Issued: 08/20/2019
  • Est. Priority Date: 05/19/2017
  • Status: Active Grant
First Claim

1. A computerized method of compressing symbolic information organized into a plurality of documents, each document having a plurality of symbols, the method comprising:

  • (a) with a first document of the plurality of documents as an input document, automatically with a computer:

      (i) identifying a plurality of symbol pairs, each symbol pair consisting of two sequential symbols in the input document; and

      (ii) for each unique symbol pair of the plurality of symbol pairs, updating a count identifying the number of appearances of the unique symbol pair;

  • (b) performing part (a) on each of the other documents of the plurality of documents, wherein the respective counts for the symbol pairs identifies the number of previous appearances of that symbol pair in any of the plurality of documents; and

  • (c) after part (b), for at least one of the plurality of documents, producing a compressed document by causing the compressed document to include, at each position associated with one of the plurality of symbol pairs from the input document, a replacement symbol associated by a compression dictionary with the unique symbol pair matching the one of the plurality of symbol pairs, if the count for the unique symbol pair exceeds a threshold.
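
As an informal illustration only, and not the patentee's implementation, the three parts of the claim can be sketched in Python. The sketch makes several assumptions that the claim text does not state: documents are lists of string symbols, replacement symbols are the hypothetical placeholders `<0>`, `<1>`, and so on, and overlapping pairs are resolved by a greedy left-to-right scan. The function names `count_pairs`, `build_dictionary`, and `compress` are invented for this example.

```python
from collections import Counter


def count_pairs(documents):
    """Parts (a) and (b): count sequential symbol pairs across all documents."""
    counts = Counter()
    for doc in documents:
        # Each pair consists of two sequential symbols in the document.
        counts.update(zip(doc, doc[1:]))
    return counts


def build_dictionary(counts, threshold):
    """Map each pair whose count exceeds the threshold to a replacement symbol.

    The "<n>" naming scheme is a placeholder chosen for this sketch.
    """
    frequent = sorted(pair for pair, n in counts.items() if n > threshold)
    return {pair: f"<{i}>" for i, pair in enumerate(frequent)}


def compress(doc, dictionary):
    """Part (c): emit a replacement symbol wherever a frequent pair occurs.

    Assumption: pairs are matched greedily from left to right, so a symbol
    consumed by one replacement is not reused in an overlapping pair.
    """
    out = []
    i = 0
    while i < len(doc):
        pair = tuple(doc[i:i + 2])
        if len(pair) == 2 and pair in dictionary:
            out.append(dictionary[pair])
            i += 2
        else:
            out.append(doc[i])
            i += 1
    return out
```

In this sketch the counts are gathered over the whole collection before any document is compressed, which mirrors the "after part (b)" ordering in part (c). For the three toy documents `["a", "b", "c"]`, `["a", "b", "d"]`, and `["a", "b", "a", "b"]` with a threshold of 2, only the pair `("a", "b")` qualifies for replacement.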
