Identifying Relationships Among Database Records
Abstract
Identifying relationships among records includes accessing a search record and corpus records. The search record comprises search tokens, where a search token is associated with a search token count. A corpus record comprises corpus tokens, where a corpus token is associated with a corpus token count. The following are repeated for each of at least a subset of the search tokens: identifying corpus tokens corresponding to the search token, and comparing the search token with the identified corpus tokens to yield comparisons. A relationship between the search record and at least one corpus record is determined in accordance with the comparisons.
71 Claims
1. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
repeating the following for each search token of at least a subset of the plurality of search tokens:
identifying one or more corpus tokens corresponding to the each search token; and
comparing the each search token with the one or more corresponding corpus tokens to yield one or more comparisons; and
determining a relationship between the search record and at least one corpus record in accordance with the one or more comparisons.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
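The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace tokenizer, the helper names (`tokenize`, `build_index`, `compare_record`), and the use of the smaller of the two counts as the "comparison" are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Split text into lowercase word tokens and count each one (assumed tokenizer)."""
    return Counter(text.lower().split())

def build_index(corpus):
    """Map each corpus token to {record_id: corpus token count}."""
    index = defaultdict(dict)
    for record_id, text in corpus.items():
        for token, count in tokenize(text).items():
            index[token][record_id] = count
    return index

def compare_record(search_text, index):
    """For each search token, find matching corpus tokens and compare counts."""
    overlap = Counter()
    for token, search_count in tokenize(search_text).items():
        # Identify corpus tokens corresponding to this search token.
        for record_id, corpus_count in index.get(token, {}).items():
            # Assumed comparison: the count shared by both records.
            overlap[record_id] += min(search_count, corpus_count)
    return overlap

corpus = {
    "a": "the quick brown fox",
    "b": "the lazy dog sleeps",
    "c": "quick brown fox jumps over the lazy dog",
}
index = build_index(corpus)
overlap = compare_record("quick brown fox", index)
# Records that share tokens with the search record, with their shared counts.
related = dict(overlap)
```

In this sketch a relationship is reported for any corpus record with a nonzero shared count; a real system would likely apply a scoring formula and a cutoff instead.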
9. A system for identifying one or more relationships among a plurality of records, comprising:
a memory operable to:
store a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count; and
a processor coupled to the memory and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
repeat the following for each search token of at least a subset of the plurality of search tokens:
identify one or more corpus tokens corresponding to the each search token; and
compare the each search token with the one or more corresponding corpus tokens to yield one or more comparisons; and
determine a relationship between the search record and at least one corpus record in accordance with the one or more comparisons.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. Logic for identifying one or more relationships among a plurality of records, the logic encoded in a computer-readable storage media and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
access a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
repeat the following for each search token of at least a subset of the plurality of search tokens:
identify one or more corpus tokens corresponding to the each search token; and
compare the each search token with the one or more corresponding corpus tokens to yield one or more comparisons; and
determine a relationship between the search record and at least one corpus record in accordance with the one or more comparisons.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
25. A system for identifying one or more relationships among a plurality of records, comprising:
means for accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
means for accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
means for repeating the following for each search token of at least a subset of the plurality of search tokens:
identifying one or more corpus tokens corresponding to the each search token; and
comparing the each search token with the one or more corresponding corpus tokens to yield one or more comparisons; and
means for determining a relationship between the search record and at least one corpus record in accordance with the one or more comparisons.
26. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count, the search token count and the corpus token count each comprising one of:
an integer value; or
a binary value;
accessing a token-based index, the token-based index identifying one or more corpus records having a particular token count for a particular corpus token, each particular token count comprising one of:
an integer value; or
a binary value;
repeating the following for each search token of at least a subset of the plurality of search tokens:
identifying one or more corpus tokens corresponding to the each search token; and
comparing the each search token with the one or more corresponding corpus tokens to yield one or more comparisons by:
performing one of:
comparing the each search token with the corresponding corpus tokens according to a symmetrical differential scoring formula; or
comparing each search token with the corresponding corpus tokens according to an asymmetrical subset scoring formula;
comparing the search token count of the each search token with the one or more corpus token counts of the one or more corresponding corpus tokens; and
filtering the one or more corresponding corpus tokens according to information content of the one or more corresponding corpus tokens;
determining a relationship between the search record and at least one corpus record in accordance with the one or more comparisons;
establishing a weight for each corresponding corpus token of the one or more corresponding corpus tokens to yield one or more weights, the weight reflecting an information content of the each corresponding corpus token; and
calculating one or more partial scores for the one or more corresponding corpus tokens using the one or more weights.
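Claim 26 recites integer or binary token counts, weights that reflect information content, and partial scores computed from those weights. The sketch below shows one way these pieces could fit together. The weight formula (the logarithm of the number of records divided by the number of records containing the token) and the partial-score formula (weight times the shared count) are assumptions; the claim does not specify either.

```python
import math
from collections import Counter

def token_counts(text, binary=False):
    """Count tokens; with binary=True each present token has a count of 1."""
    counts = Counter(text.lower().split())
    if binary:
        return Counter({token: 1 for token in counts})
    return counts

def information_weights(records):
    """Assumed weight: log(N / number of records containing the token)."""
    n = len(records)
    doc_freq = Counter(t for counts in records.values() for t in counts)
    return {t: math.log(n / df) for t, df in doc_freq.items()}

def partial_scores(search, record, weights):
    """One partial score per token shared by the search and corpus record."""
    return {t: weights[t] * min(search[t], record[t])
            for t in search if t in record}

records = {
    "r1": token_counts("red red blue"),
    "r2": token_counts("red green"),
}
weights = information_weights(records)
search = token_counts("red red red blue")
partials = partial_scores(search, records["r1"], weights)
```

Because "red" appears in every record of this toy corpus, its assumed weight is zero and it contributes nothing to the score, while "blue" carries the full weight.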
27. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
filtering the plurality of corpus tokens according to information content of the plurality of corpus tokens to yield one or more discriminating tokens; and
determining a relationship between the search record and at least one corpus record according to the one or more discriminating tokens.
Dependent claims: 28, 29, 30, 31, 32, 33, 34
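Filtering by information content, as recited in claim 27, might look like the following sketch. The measure used here (the negative base-2 logarithm of the fraction of records that contain the token) and the threshold value are assumptions chosen for illustration; the claim leaves the measure open.

```python
import math

def information_content(records):
    """Assumed measure: -log2 of the fraction of records containing the token."""
    n = len(records)
    doc_freq = {}
    for tokens in records.values():
        for token in set(tokens):
            doc_freq[token] = doc_freq.get(token, 0) + 1
    return {t: -math.log2(df / n) for t, df in doc_freq.items()}

def discriminating_tokens(records, threshold_bits):
    """Keep only tokens whose information content meets the threshold."""
    info = information_content(records)
    return {t for t, bits in info.items() if bits >= threshold_bits}

records = {
    "r1": ["the", "invoice", "acme"],
    "r2": ["the", "invoice", "globex"],
    "r3": ["the", "memo", "acme"],
    "r4": ["the", "memo", "initech"],
}
keep = discriminating_tokens(records, threshold_bits=1.5)
```

Under these assumptions a token found in every record carries zero bits and is filtered out, while a token found in one of four records carries two bits and is kept.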
35. A system for identifying one or more relationships among a plurality of records, comprising:
a memory operable to:
store a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count; and
a processor coupled to the memory and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
filter the plurality of corpus tokens according to information content of the plurality of corpus tokens to yield one or more discriminating tokens; and
determine a relationship between the search record and at least one corpus record according to the one or more discriminating tokens.
Dependent claims: 36, 37, 38, 39, 40, 41, 42
43. Logic for identifying one or more relationships among a plurality of records, the logic encoded in a computer-readable storage media and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
access a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
filter the plurality of corpus tokens according to information content of the plurality of corpus tokens to yield one or more discriminating tokens; and
determine a relationship between the search record and at least one corpus record according to the one or more discriminating tokens.
Dependent claims: 44, 45, 46, 47, 48, 49, 50
51. A system for identifying one or more relationships among a plurality of records, comprising:
means for accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
means for accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
means for filtering the plurality of corpus tokens according to information content of the plurality of corpus tokens to yield one or more discriminating tokens; and
means for determining a relationship between the search record and at least one corpus record according to the one or more discriminating tokens.
52. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
filtering the plurality of corpus tokens according to information content of the plurality of corpus tokens to yield one or more discriminating tokens by:
identifying one or more corpus tokens each corresponding to a search token of the plurality of search tokens;
determining a first portion of the one or more discriminating tokens from the one or more identified corpus tokens according to the information content of the one or more identified corpus tokens;
sorting the one or more identified corpus tokens according to the information content of the one or more identified corpus tokens to yield a token order from a higher information content to a lower information content;
comparing at least a subset of the one or more identified corpus tokens to the corresponding search token in the token order;
determining a second portion of the one or more discriminating tokens according to a plurality of predetermined discriminating tokens;
determining a third portion of the one or more discriminating tokens according to an information content threshold;
removing one or more non-discriminating tokens from an index of the plurality of corpus records;
removing the one or more non-discriminating tokens from the plurality of search tokens; and
excluding the one or more non-discriminating tokens from an index of the plurality of corpus records; and
determining a relationship between the search record and at least one corpus record according to the one or more discriminating tokens.
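The filtering sub-steps of claim 52 can be sketched as follows. This is an illustrative reading, not the claimed method itself: the information-content values in `info` are made-up numbers, and the function names and the `top_k` cutoff used for the first portion are hypothetical.

```python
def select_tokens(search_tokens, info, predetermined, threshold, top_k):
    """Return the token order and the union of the three portions."""
    # Corpus tokens that correspond to a search token.
    identified = [t for t in dict.fromkeys(search_tokens) if t in info]
    # Token order from higher to lower information content.
    order = sorted(identified, key=lambda t: (-info[t], t))
    first = set(order[:top_k])                # from the identified tokens
    second = set(predetermined) & set(info)   # predetermined discriminating tokens
    third = {t for t, bits in info.items() if bits >= threshold}
    return order, first | second | third

def prune(index, search_tokens, discriminating):
    """Remove non-discriminating tokens from the index and the search tokens."""
    pruned_index = {t: ids for t, ids in index.items() if t in discriminating}
    pruned_search = [t for t in search_tokens if t in discriminating]
    return pruned_index, pruned_search

# Hypothetical information-content values and index, for illustration only.
info = {"the": 0.0, "invoice": 1.0, "acme": 1.0, "globex": 2.0, "memo": 1.0}
index = {
    "the": {"r1", "r2", "r3"},
    "invoice": {"r1", "r2"},
    "acme": {"r1", "r3"},
    "globex": {"r2"},
    "memo": {"r3"},
}
search = ["the", "invoice", "globex", "unknown"]

order, discriminating = select_tokens(
    search, info, predetermined=["acme"], threshold=2.0, top_k=1)
pruned_index, pruned_search = prune(index, search, discriminating)
```

Comparing tokens in the returned order lets the most informative tokens be examined first, which is one plausible reason for the sorting step.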
53. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
comparing the plurality of search tokens with at least a subset the plurality of corpus tokens; and
calculating a score operable to distinguish a first corpus record that is a subset of the search record from a second corpus record that is approximately equivalent to the search record.
Dependent claims: 54
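A score that separates a subset record from an approximately equivalent record, as recited in claim 53, can be illustrated by contrasting a symmetric measure with an asymmetric one. The two formulas below (shared counts over the larger counts, and shared counts over the corpus record's own counts) and the 0.9 cutoff are assumptions; they are not taken from the patent.

```python
from collections import Counter

def symmetric_score(search, record):
    """Assumed symmetric measure: shared counts over the larger counts."""
    tokens = set(search) | set(record)
    shared = sum(min(search[t], record[t]) for t in tokens)
    total = sum(max(search[t], record[t]) for t in tokens)
    return shared / total if total else 0.0

def subset_score(search, record):
    """Assumed asymmetric measure: fraction of the corpus record found in the search record."""
    shared = sum(min(search[t], record[t]) for t in record)
    total = sum(record.values())
    return shared / total if total else 0.0

def classify(search, record, cutoff=0.9):
    """Label the relationship using a hypothetical cutoff."""
    if symmetric_score(search, record) >= cutoff:
        return "approximately equivalent"
    if subset_score(search, record) >= cutoff:
        return "subset"
    return "other"

search = Counter("a b c d".split())
part = Counter("a b".split())       # contained in the search record
whole = Counter("a b c d".split())  # same tokens as the search record
```

Here `part` scores 1.0 on the asymmetric measure but only 0.5 on the symmetric one, so the pair of scores tells the two cases apart.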
55. A system for identifying one or more relationships among a plurality of records, comprising:
a memory operable to:
store a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count; and
a processor coupled to the memory and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
compare the plurality of search tokens with at least a subset the plurality of corpus tokens; and
calculate a score operable to distinguish a first corpus record that is a subset of the search record from a second corpus record that is approximately equivalent to the search record.
Dependent claims: 56
57. Logic for identifying one or more relationships among a plurality of records, the logic encoded in a computer-readable storage media and operable to:
access a search record comprising a plurality of search tokens, a search token associated with a search token count;
access a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
compare the plurality of search tokens with at least a subset the plurality of corpus tokens; and
calculate a score operable to distinguish a first corpus record that is a subset of the search record from a second corpus record that is approximately equivalent to the search record.
Dependent claims: 58
59. A system for identifying one or more relationships among a plurality of records, comprising:
means for accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
means for accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
means for comparing the plurality of search tokens with at least a subset the plurality of corpus tokens; and
means for calculating a score operable to distinguish a first corpus record that is a subset of the search record from a second corpus record that is approximately equivalent to the search record.
60. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a search record comprising a plurality of search tokens, a search token associated with a search token count;
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens, a corpus token associated with a corpus token count;
comparing the plurality of search tokens with at least a subset the plurality of corpus tokens; and
calculating a score operable to distinguish a first corpus record that is a subset of the search record from a second corpus record that is approximately equivalent to the search record, by:
calculating the score according to a symmetrical differential scoring formula.
61. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens;
repeating the following for one or more iterations to yield one or more final groups:
sorting a current group of corpus records to yield a plurality of next groups by performing the following for each corpus record of at least a subset of the current group:
designating the each corpus record as a search record comprising a plurality of search tokens; and
comparing the plurality of search tokens with the plurality of corresponding corpus tokens of each of the other corpus records, the comparisons indicating a degree of similarity between the search record and the each of the other corpus records; and
forming the plurality of next groups in accordance with the comparisons; and
identifying at least similar corpus records according the one or more final groups.
Dependent claims: 62, 63
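The iterative grouping of claim 61 can be sketched as a repeated split of each group around a designated search record. The similarity measure (shared tokens over all tokens of the two records), the per-iteration thresholds, and the greedy way groups are formed are all assumptions; the claim does not prescribe them.

```python
def similarity(a, b):
    """Assumed similarity: shared tokens divided by all tokens of both records."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def split_group(group, records, threshold):
    """Split one group into next groups around successive search records."""
    remaining = list(group)
    next_groups = []
    while remaining:
        seed = remaining.pop(0)  # designated as the search record
        members, rest = [seed], []
        for other in remaining:
            if similarity(records[seed], records[other]) >= threshold:
                members.append(other)
            else:
                rest.append(other)
        remaining = rest
        next_groups.append(members)
    return next_groups

def refine(records, thresholds):
    """Run one grouping iteration per threshold and return the final groups."""
    groups = [sorted(records)]
    for threshold in thresholds:
        groups = [g for group in groups
                  for g in split_group(group, records, threshold)]
    return groups

records = {
    "d1": {"a", "b", "c", "d"},
    "d2": {"a", "b", "c", "e"},
    "d3": {"a", "b", "x", "y"},
    "d4": {"p", "q", "r", "s"},
}
final_groups = refine(records, thresholds=[0.3, 0.5])
similar = [g for g in final_groups if len(g) > 1]
```

With these toy records the first pass separates `d4` from the rest, and the stricter second pass leaves only `d1` and `d2` grouped together.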
64. A system for identifying one or more relationships among a plurality of records, comprising:
a memory operable to:
store a plurality of corpus records, a corpus record comprising a plurality of corpus tokens; and
a processor coupled to the memory and operable to:
repeat the following for one or more iterations to yield one or more final groups:
sort a current group of corpus records to yield a plurality of next groups by performing the following for each corpus record of at least a subset of the current group:
designate the each corpus record as a search record comprising a plurality of search tokens; and
compare the plurality of search tokens with the plurality of corresponding corpus tokens of each of the other corpus records, the comparisons indicating a degree of similarity between the search record and the each of the other corpus records; and
form the plurality of next groups in accordance with the comparisons; and
identify at least similar corpus records according the one or more final groups.
Dependent claims: 65, 66
67. Logic for identifying one or more relationships among a plurality of records, the logic encoded in a computer-readable storage media and operable to:
access a plurality of corpus records, a corpus record comprising a plurality of corpus tokens;
repeat the following for one or more iterations to yield one or more final groups:
sort a current group of corpus records to yield a plurality of next groups by performing the following for each corpus record of at least a subset of the current group:
designate the each corpus record as a search record comprising a plurality of search tokens; and
compare the plurality of search tokens with the plurality of corresponding corpus tokens of each of the other corpus records, the comparisons indicating a degree of similarity between the search record and the each of the other corpus records; and
form the plurality of next groups in accordance with the comparisons; and
identify at least similar corpus records according the one or more final groups.
Dependent claims: 68, 69
70. A system for identifying one or more relationships among a plurality of records, comprising:
means for accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens;
means for repeating the following for one or more iterations to yield one or more final groups:
sorting a current group of corpus records to yield a plurality of next groups by performing the following for each corpus record of at least a subset of the current group:
designating the each corpus record as a search record comprising a plurality of search tokens; and
comparing the plurality of search tokens with the plurality of corresponding corpus tokens of each of the other corpus records, the comparisons indicating a degree of similarity between the search record and the each of the other corpus records; and
forming the plurality of next groups in accordance with the comparisons; and
means for identifying at least similar corpus records according the one or more final groups.
71. A method for identifying one or more relationships among a plurality of records, comprising:
accessing a plurality of corpus records, a corpus record comprising a plurality of corpus tokens;
repeating the following for one or more iterations to yield one or more final groups:
sorting a current group of corpus records to yield a plurality of next groups by performing the following for each corpus record of at least a subset of the current group:
designating the each corpus record as a search record comprising a plurality of search tokens, a search token of the plurality of search tokens comprising an ordered set of a plurality of words; and
comparing the plurality of search tokens with the plurality of corresponding corpus tokens of each of the other corpus records, the comparisons indicating a degree of similarity between the search record and the each of the other corpus records; and
forming the plurality of next groups in accordance with the comparisons;
identifying at least similar corpus records according the one or more final groups; and
sorting the plurality of corpus records according to document size.
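Claim 71 adds two details: a search token may be an ordered set of several words, and the corpus records are sorted by document size. A minimal sketch of both follows, assuming that an "ordered set of words" can be modeled as a run of consecutive words and that size is measured in words; the function names are hypothetical.

```python
def word_ngrams(text, n=2):
    """Tokens as ordered runs of n consecutive words (assumed interpretation)."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def sort_by_size(corpus):
    """Order record ids from the smallest to the largest document, by word count."""
    return sorted(corpus, key=lambda record_id: len(corpus[record_id].split()))

tokens = word_ngrams("the quick brown fox", n=2)

corpus = {"long": "a b c d e", "short": "a b", "mid": "a b c"}
by_size = sort_by_size(corpus)
```

Multi-word tokens preserve word order, so two records that share the same words in a different sequence produce different tokens.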
Specification