Method and system for creating and updating an entity name alias table
First Claim
1. A computing system implemented method for creating an entity name alias table comprising the following, which when executed individually or collectively by any set of one or more processors perform a process including:
obtaining initial identification data including data representing known entity names associated with one or more entities;
defining a search time window;
obtaining historical entity name search data indicating entity name searches entered by a searching party in the defined search time window;
analyzing the historical entity name search data to identify a pair of potentially related entity name searches, each pair of potentially related entity name searches including two associated entity names, wherein analyzing the historical entity name search data to identify a pair of potentially related entity name searches includes;
transforming at least part of the historical entity name search data into entity name search data strings representing entity names associated with entity name searches;
determining a raw string distance between entity name search data strings;
determining a normalized string distance between entity name search data strings, wherein the raw string distance between a first entity name search data string and a second entity name search data string is determined as the number of characters in the first entity name search data string that are different from the characters in the second entity name search data string, wherein the normalized string distance between the first and second entity name search data strings is equal to the raw string distance between the first and second entity name search data strings divided by the square root of the length of the first entity name search data string multiplied by the length of the second entity name search data string;
defining a threshold normalized string distance;
identifying a pair of entity name searches as potentially related entity name searches if the entity name search data strings representing the entity names of the pair of entity name searches have a normalized string distance less than the threshold normalized string distance;
identifying a matched known entity name in the initial identification data that matches a first entity name of the two associated entity names; and
adding the second entity name of the two associated entity names to an entity name alias table associated with the matched known entity name.
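The distance test recited in the claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not say how characters are aligned, so the sketch compares the strings position by position and counts characters of the first string that run past the end of the second as different; it also reads the denominator as the square root of the product of the two lengths. The function names and the threshold value 0.3 are hypothetical.

```python
import math


def raw_distance(first: str, second: str) -> int:
    """Count characters of `first` that differ from `second` at the same position.

    Assumption: characters of `first` beyond the end of `second` count as different.
    """
    return sum(
        1
        for i, ch in enumerate(first)
        if i >= len(second) or ch != second[i]
    )


def normalized_distance(first: str, second: str) -> float:
    """Raw distance divided by sqrt(len(first) * len(second))."""
    if not first or not second:
        raise ValueError("both search strings must be non-empty")
    return raw_distance(first, second) / math.sqrt(len(first) * len(second))


def potentially_related(first: str, second: str, threshold: float = 0.3) -> bool:
    """True when the normalized distance is below the (hypothetical) threshold."""
    return normalized_distance(first, second) < threshold
```

Under this reading, "chase bank" and "chase bonk" differ in one position, so the normalized distance is 1 divided by the square root of 10 times 10. An edit distance such as Levenshtein would be another plausible reading of the raw distance and could be swapped into `raw_distance` without changing the rest.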
Abstract
A list of known names of entities is obtained along with entity name search data entered by searching parties in an attempt to identify one or more entities. The historical entity name search data entered by each individual searching party in a defined search time window is aggregated and analyzed to identify pairs of potentially related entity name searches that represent two attempts by the searching party to identify the same entity. The data representing the potentially related entity name searches is analyzed to identify a matched entity name in the list of known names that matches one of the entity names of the pair of potentially related entity name searches. Both of the entity names of the pair of potentially related entity name searches are then added to an alias list associated with the matched entity name.
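The flow described in the abstract can be sketched roughly as below. Everything named here is an assumption made for illustration: the record layout `(party_id, timestamp_seconds, entered_name)`, the choice to measure the time window between two searches by the same party, the lower-casing used for matching, and the `is_related` callback, which stands in for whatever relatedness test is applied to a pair of searches.

```python
from collections import defaultdict
from itertools import combinations
from typing import Callable, Dict, Iterable, Set, Tuple


def build_alias_table(
    known_names: Iterable[str],
    searches: Iterable[Tuple[str, float, str]],
    window_seconds: float,
    is_related: Callable[[str, str], bool],
) -> Dict[str, Set[str]]:
    """Map each matched known name to the search strings treated as its aliases."""
    # Aggregate searches per searching party (hypothetical record layout).
    by_party = defaultdict(list)
    for party, timestamp, name in searches:
        by_party[party].append((timestamp, name.strip().lower()))

    # Lower-cased lookup of known names; normalization is an assumption.
    known = {name.strip().lower(): name for name in known_names}
    aliases: Dict[str, Set[str]] = defaultdict(set)

    for entries in by_party.values():
        entries.sort()  # chronological order, so t2 >= t1 in each pair below
        for (t1, a), (t2, b) in combinations(entries, 2):
            if t2 - t1 > window_seconds or a == b:
                continue  # outside the window, or the same string twice
            if not is_related(a, b):
                continue
            # If either name of the pair is a known name, record both as aliases.
            for candidate in (a, b):
                if candidate in known:
                    aliases[known[candidate]].update({a, b})
    return dict(aliases)
```

For example, if one party searches "chase bonk" and then "Chase Bank" thirty seconds later, with a 60-second window and "Chase Bank" in the known list, both strings end up in the alias set for "Chase Bank". Pairs from different parties, or pairs separated by more than the window, are never compared.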
16 Claims
1. A computing system implemented method for creating an entity name alias table comprising the following, which when executed individually or collectively by any set of one or more processors perform a process including:
obtaining initial identification data including data representing known entity names associated with one or more entities;
defining a search time window;
obtaining historical entity name search data indicating entity name searches entered by a searching party in the defined search time window;
analyzing the historical entity name search data to identify a pair of potentially related entity name searches, each pair of potentially related entity name searches including two associated entity names, wherein analyzing the historical entity name search data to identify a pair of potentially related entity name searches includes;
transforming at least part of the historical entity name search data into entity name search data strings representing entity names associated with entity name searches;
determining a raw string distance between entity name search data strings;
determining a normalized string distance between entity name search data strings, wherein the raw string distance between a first entity name search data string and a second entity name search data string is determined as the number of characters in the first entity name search data string that are different from the characters in the second entity name search data string, wherein the normalized string distance between the first and second entity name search data strings is equal to the raw string distance between the first and second entity name search data strings divided by the square root of the length of the first entity name search data string multiplied by the length of the second entity name search data string;
defining a threshold normalized string distance;
identifying a pair of entity name searches as potentially related entity name searches if the entity name search data strings representing the entity names of the pair of entity name searches have a normalized string distance less than the threshold normalized string distance;
identifying a matched known entity name in the initial identification data that matches a first entity name of the two associated entity names; and
adding the second entity name of the two associated entity names to an entity name alias table associated with the matched known entity name.
Dependent claims: 2, 3, 4
5. A computing system implemented method for creating a financial institution name alias table comprising the following, which when executed individually or collectively by any set of one or more processors perform a process including:
obtaining initial identification data including data representing known financial institution names associated with one or more financial institutions;
defining a search time window;
obtaining historical financial institution name search data indicating financial institution name searches entered by a searching party in the defined search time window;
analyzing the historical financial institution name search data to identify a pair of potentially related financial institution name searches, each pair of potentially related financial institution name searches including two associated financial institution names, wherein analyzing the historical financial institution name search data to identify a pair of potentially related financial institution name searches includes;
transforming at least part of the historical financial institution name search data into financial institution name search data strings representing financial institution names associated with financial institution name searches;
determining a raw string distance between financial institution name search data strings, wherein the raw string distance between a first financial institution name search data string and a second financial institution name search data string is determined as the number of characters in the first financial institution name search data string that are different from the characters in the second financial institution name search data string;
determining a normalized string distance between financial institution name search data strings, wherein the normalized string distance between the first and second financial institution name search data strings is equal to the raw string distance between the first and second financial institution name search data strings divided by the square root of the length of the first financial institution name search data string multiplied by the length of the second financial institution name search data string;
defining a threshold normalized string distance;
identifying a pair of financial institution name searches as potentially related financial institution name searches if the financial institution name search data strings representing the financial institution names of the pair of financial institution name searches have a normalized string distance less than the threshold normalized string distance;
identifying a matched known financial institution name in the initial identification data that matches a first financial institution name of the two associated financial institution names; and
adding the second financial institution name of the two associated financial institution names to a financial institution name alias table associated with the matched known financial institution name.
Dependent claims: 6, 7, 8
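Written as a formula, the normalized string distance recited in this claim can be expressed as below. The symbols are introduced here for illustration only, and the denominator is read as the square root of the product of the two string lengths, which is an interpretation of the claim wording rather than something the claim states in symbols. The worked numbers (a raw distance of 2, two 16-character strings, and a threshold of 0.2) are hypothetical.

```latex
% s_1, s_2: the two financial institution name search data strings
% |s|: length of string s in characters
\[
  d_{\mathrm{norm}}(s_1, s_2)
    = \frac{d_{\mathrm{raw}}(s_1, s_2)}{\sqrt{|s_1| \cdot |s_2|}}
\]

% Hypothetical worked example: d_raw = 2, |s_1| = |s_2| = 16, threshold = 0.2
\[
  d_{\mathrm{norm}} = \frac{2}{\sqrt{16 \cdot 16}} = \frac{2}{16} = 0.125 < 0.2
\]
```

With these example values the pair would be identified as potentially related, since the normalized distance falls below the threshold.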
9. A computer program product for creating an entity name alias table comprising:
a nontransitory computer readable medium; and
computer program code, encoded on the computer readable medium, comprising computer readable instructions which, when executed via any set of one or more processors, perform the following;
obtaining initial identification data including data representing known entity names associated with one or more entities;
defining a search time window;
obtaining historical entity name search data indicating entity name searches entered by a searching party in the defined search time window;
analyzing the historical entity name search data to identify a pair of potentially related entity name searches, each pair of potentially related entity name searches including two associated entity names, wherein analyzing the historical entity name search data to identify a pair of potentially related entity name searches includes;
transforming at least part of the historical entity name search data into entity name search data strings representing entity names associated with entity name searches;
determining a raw string distance between entity name search data strings, wherein the raw string distance between a first entity name search data string and a second entity name search data string is determined as the number of characters in the first entity name search data string that are different from the characters in the second entity name search data string;
determining a normalized string distance between entity name search data strings, wherein the normalized string distance between the first and second entity name search data strings is equal to the raw string distance between the first and second entity name search data strings divided by the square root of the length of the first entity name search data string multiplied by the length of the second entity name search data string;
defining a threshold normalized string distance;
identifying a pair of entity name searches as potentially related entity name searches if the entity name search data strings representing the entity names of the pair of entity name searches have a normalized string distance less than the threshold normalized string distance;
identifying a matched known entity name in the initial identification data that matches a first entity name of the two associated entity names; and
adding the second entity name of the two associated entity names to an entity name alias table associated with the matched known entity name.
Dependent claims: 10, 11, 12
13. A system for creating an entity name alias table comprising:
at least one processor; and
at least one memory coupled to the at least one processor, the at least one memory having stored therein instructions which when executed by any set of the one or more processors, perform a process for creating an entity name alias table, the process for creating an entity name alias table including;
obtaining initial identification data including data representing known entity names associated with one or more entities;
defining a search time window;
obtaining historical entity name search data indicating entity name searches entered by a searching party in the defined search time window;
analyzing the historical entity name search data to identify a pair of potentially related entity name searches, each pair of potentially related entity name searches including two associated entity names, wherein analyzing the historical entity name search data to identify a pair of potentially related entity name searches includes;
transforming at least part of the historical entity name search data into entity name search data strings representing entity names associated with entity name searches;
determining a raw string distance between entity name search data strings wherein the raw string distance between a first entity name search data string and a second entity name search data string is determined as the number of characters in the first entity name search data string that are different from the characters in the second entity name search data string;
determining a normalized string distance between entity name search data strings, wherein the normalized string distance between the first and second entity name search data strings is equal to the raw string distance between the first and second entity name search data strings divided by the square root of the length of the first entity name search data string multiplied by the length of the second entity name search data string;
defining a threshold normalized string distance;
identifying a pair of entity name searches as potentially related entity name searches if the entity name search data strings representing the entity names of the pair of entity name searches have a normalized string distance less than the threshold normalized string distance;
identifying a matched known entity name in the initial identification data that matches a first entity name of the two associated entity names; and
adding the second entity name of the two associated entity names to an entity name alias table associated with the matched known entity name.
Dependent claims: 14, 15, 16
Specification