Method and system for constructing a document redundancy graph

  • US 8,914,720 B2
  • Filed: 07/31/2009
  • Issued: 12/16/2014
  • Est. Priority Date: 07/31/2009
  • Status: Active Grant
First Claim

1. A method for constructing a document redundancy graph, said method comprising:

  • representing each paragraph associated with a document set as a node among a plurality of nodes, wherein each node among said plurality of nodes with respect to said redundancy graph represents a unique cluster of information related to said each paragraph;

  • providing said each paragraph with a unique paragraph identifier;

  • constructing a hash table of all paragraph identifiers comprising identifiers of all paragraphs reachable from said each paragraph;

  • merging said plurality of nodes associated with redundant information by configuring said hash table with respect to a pair of paragraph identifiers in association with a probability value, wherein said probability value sorts a plurality of information matches in an order of decreasing certainty of common content, wherein a pair of said paragraph identifiers associated with an increased certainty of common content are selected to merge; and

  • combining said plurality of nodes unique to a single document by expressing a pair of nodes with overlapping common content as a combined node, wherein said combined node comprises an empty intersection of said pair of nodes and comparing each paragraph identifier among said pair of paragraph identifiers to a probability value associated with an entry in said hash table in an order wherein said hash table eliminates inconsistency associated with said plurality of information matches.
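
The identifier, hash-table, and merging steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. The function name `build_redundancy_graph`, the input shapes, and in particular the consistency rule (a merge is rejected if it would place two paragraphs of the same document in one cluster) are assumptions made for illustration only; the claim does not specify them. The final combining step is not modeled.

```python
from typing import Dict, List, Set, Tuple


def build_redundancy_graph(
    paragraphs: Dict[str, str],
    matches: List[Tuple[str, str, float]],
) -> Dict[str, Set[str]]:
    """Illustrative sketch only.

    paragraphs: unique paragraph identifier -> document identifier
                (hypothetical input shape).
    matches:    (id_a, id_b, probability of common content).
    Returns a hash table mapping each paragraph identifier to the set of
    paragraph identifiers reachable from it (its cluster).
    """
    # Each paragraph starts as its own node: it can reach only itself.
    reachable: Dict[str, Set[str]] = {pid: {pid} for pid in paragraphs}

    # Visit candidate matches in order of decreasing certainty.
    for a, b, _prob in sorted(matches, key=lambda m: m[2], reverse=True):
        if reachable[a] is reachable[b]:
            continue  # the two paragraphs already share a cluster

        merged = reachable[a] | reachable[b]

        # ASSUMED consistency rule (not taken from the claim): a cluster
        # may hold at most one paragraph per document.
        docs = [paragraphs[pid] for pid in merged]
        if len(docs) != len(set(docs)):
            continue  # lower-certainty match conflicts with earlier merges

        # Point every member at the same merged set.
        for pid in merged:
            reachable[pid] = merged

    return reachable
```

Because matches are processed from most to least certain, a weaker match that contradicts an earlier, stronger merge is simply skipped, which is one plausible reading of how the ordering removes inconsistent matches.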
