Automated assistance for generating relevant and valuable search results for an entity of interest

US 10,235,461 B2
Filed: 05/02/2017
Issued: 03/19/2019
Est. Priority Date: 05/02/2017
Status: Active Grant


Abstract

Systems and methods are provided for identifying relevant information for an entity, referred to as a seed entity. A plurality of search queries can be generated, each comprising a property of the seed entity or of one of the entities associated with the seed entity (seed-linked entities). Preferably, the collection of search queries includes queries representing different properties of the seed entity and properties of different seed-linked entities. Optionally, the collection of search queries is optimized to reduce search burden. Searches can then be conducted with the search queries in one or more data sources to obtain a plurality of search results, wherein each search result comprises a hit entity and one or more entities associated with the hit entity (hit-linked entities). For each of the search results, a score can be determined taking as input (a) likelihood of match between the seed entity and the hit entity or between a seed-linked entity and a hit-linked entity, (b) presence of a new entity in the search result not present in the search queries or a difference between the new entity and an entity present in the search queries, and (c) a characteristic of the new entity in the search result. Based on the scores, high priority search results can be presented to a user for further analysis.
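
The query-generation step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Entity` class, the `generate_queries` function, the pairing of one seed property with one linked-entity property, and the `max_queries` cap (standing in for the optional step of reducing search burden) are all assumptions made for illustration, as are the sample values.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity record: a label plus (property kind, value) pairs."""
    label: str
    properties: tuple


def generate_queries(seed, linked, max_queries=None):
    """Build queries from seed properties and seed-linked entity properties.

    Emits one single-property query per seed property, then one query per
    (seed property, linked-entity property) pair. `max_queries` is an
    illustrative stand-in for optimizing the collection of queries.
    """
    queries = [(prop,) for prop in seed.properties]
    for other in linked:
        for seed_prop, other_prop in product(seed.properties, other.properties):
            queries.append((seed_prop, other_prop))
    # Drop duplicate queries while preserving first-seen order.
    unique = list(dict.fromkeys(queries))
    return unique if max_queries is None else unique[:max_queries]


# Made-up example data.
seed = Entity("seed", (("name", "Jane Roe"), ("email", "jroe@example.com")))
spouse = Entity("spouse", (("name", "John Roe"),))
employer = Entity("employer", (("name", "Acme Ltd"), ("phone", "555-0100")))

queries = generate_queries(seed, [spouse, employer])
for query in queries:
    print(query)
```

Each query is a tuple of property pairs, so a search backend could treat the tuple members as terms that must co-occur. How the queries are actually combined and pruned is not specified in the abstract.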

20 Claims

1. A system for identifying relevant information for an entity comprising:
- one or more processors; and
  
  memory storing instructions that, when executed by the one or more processors, cause the system to:
  
  generate a plurality of search queries comprising a seed entity and a plurality of entities associated with the seed entity, the plurality of entities including at least one first entity and at least one second entity, the at least one first entity being associated with the seed entity based on a hard link between the at least one first entity and the seed entity, the at least one second entity being associated with the seed entity based on a soft link between the at least one second entity and the seed entity, the soft link being generated based on one or more prior search queries;
  
  conduct searches, with the search queries, in one or more data sources to obtain a plurality of search results, wherein each search result comprises a hit cluster, each hit cluster including properties of a hit entity and properties of one or more entities associated with the hit entity; and
  
  determine a score for each of the hit clusters, taking as input (a) likelihood of match between the seed entity and the hit entity or between an entity associated with the seed entity and an entity associated with the hit entity, (b) presence of a new entity in the search result not present in the search queries and a difference between the new entity and an entity present in the search queries, and (c) characteristic of the new entity in the search result.
- Dependent claims:
  - 2. The system of claim 1, wherein the instructions further cause the system to provide one or more search results based on the scores to a user for analysis.
  - 3. The system of claim 1, wherein at least one of the search queries comprises a third entity associated with one of the entities associated with the seed entity, wherein the third entity is at least not known as directly associated with the seed entity.
  - 4. The system of claim 3, wherein the third entity is identified from a pre-search with a search query that comprises the seed entity and the one or more entities associated with the seed entity.
  - 5. The system of claim 1, wherein the seed entity or the one or more entities associated with the seed entity is represented by a property of the respective entity, wherein the property is selected from the group consisting of name, address, date of birth, social security number, city of birth, image, social networking account, phone number and email address.
  - 6. The system of claim 1, wherein the instructions further cause the system to eliminate search queries less likely to return desired search results.
  - 7. The system of claim 1, wherein determination of likelihood of match comprises the use of a data compression method to determine a likelihood that the entities match with each other by chance.
  - 8. The system of claim 7, wherein the data compression method comprises the use of Huffman coding.
  - 9. The system of claim 1, wherein when an entity is represented by a person's name, the determination of likelihood of match comprises determination of frequency of use of the name.
  - 10. The system of claim 1, wherein the characteristic of the entity is compared to a predefined list of characteristics of entities to determine the value of the characteristic.
  - 12. The method of claim 1, further comprising providing one or more search results based on the scores to a user for analysis.
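
Claims 7 to 9 refer to a data compression method, such as Huffman coding, and to name frequency for judging whether two entities match by chance. The claims do not say how the code is used, so the following Python sketch shows only one possible reading: a Huffman code is built over a table of name counts, and a name's code length in bits is treated as a measure of how unlikely a chance match is (roughly 2 to the power of minus the length). The function names and the name counts are hypothetical.

```python
import heapq
from itertools import count


def huffman_code_lengths(frequencies):
    """Return {symbol: code length in bits} for a Huffman code over `frequencies`."""
    if len(frequencies) == 1:
        return {symbol: 1 for symbol in frequencies}
    tiebreak = count()  # unique counter so the heap never compares dict payloads
    heap = [(weight, next(tiebreak), {symbol: 0})
            for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def chance_match_probability(name, code_lengths):
    """Approximate the probability of a chance match on `name` as 2 ** -length."""
    return 2.0 ** -code_lengths[name]


# Made-up name counts; a real system would need a large reference table.
name_counts = {"Smith": 8, "Jones": 4, "Patel": 2, "Quigley": 1, "Zubiri": 1}
lengths = huffman_code_lengths(name_counts)

for name in name_counts:
    print(name, lengths[name], chance_match_probability(name, lengths))
```

Under this reading, a match on a common name receives a short code and therefore a high chance-match probability, while a match on a rare name receives a long code and is treated as stronger evidence that the entities are the same.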

11. A computer-implemented method comprising:
- generating, on a suitably programmed computing device, a plurality of search queries comprising a seed entity and a plurality of entities associated with the seed entity, the plurality of entities including at least one first entity and at least one second entity, the at least one first entity being associated with the seed entity based on a hard link between the at least one first entity and the seed entity, the at least one second entity being associated with the seed entity based on a soft link between the at least one second entity and the seed entity, the soft link being generated based on one or more prior search queries;
  
  conducting searches, with the search queries, in one or more data sources to obtain a plurality of search results, wherein each search result comprises a hit cluster, each hit cluster including properties of a hit entity and properties of one or more entities associated with the hit entity; and
  
  determining a score for each of the hit clusters, taking as input (a) likelihood of match between the seed entity and the hit entity or between an entity associated with the seed entity and an entity associated with the hit entity, (b) presence of a new entity in the search result not present in the search queries and a difference between the new entity and an entity present in the search queries, and (c) characteristic of the new entity in the search result.
- Dependent claims:
  - 13. The method of claim 11, wherein at least one of the search queries comprises a third entity associated with one of the entities associated with the seed entity, wherein the third entity is at least not known as directly associated with the seed entity.
  - 14. The method of claim 11, wherein the seed entity or the one or more entities associated with the seed entity is represented by a property of the respective entity, wherein the property is selected from the group consisting of name, address, date of birth, social security number, city of birth, image, social networking account, phone number and email address.
  - 15. The method of claim 11, further comprising eliminating search queries less likely to return desired search results.
  - 16. The method of claim 11, wherein determination of likelihood of match comprises the use of a data compression method to determine a likelihood that the entities match with each other by chance.
  - 17. The method of claim 16, wherein the data compression method comprises the use of Huffman coding.
  - 18. The method of claim 11, wherein when an entity is represented by a person's name, the determination of likelihood of match comprises determination of frequency of use of the name.
  - 19. The method of claim 11, wherein the characteristic of the entity is compared to a predefined list of characteristics of entities to determine the value of the characteristic.
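
The scoring step of claim 11, together with the predefined list of claim 19, can be sketched in Python as follows. The claims name three inputs but not how they are combined, so the multiplicative formula, the `CHARACTERISTIC_VALUES` table, the default value of 0.1, and the use of `difflib.SequenceMatcher` as the difference measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical predefined list mapping a characteristic to its value.
CHARACTERISTIC_VALUES = {"phone": 0.9, "email": 0.8, "address": 0.6, "name": 0.3}


def difference(a, b):
    """Dissimilarity in [0, 1] between two strings, from difflib's similarity ratio."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score_hit_cluster(match_likelihood, cluster_entities, query_entities):
    """Combine (a) match likelihood, (b) novelty, and (c) characteristic value.

    `cluster_entities` and `query_entities` are lists of (kind, value) pairs.
    The multiplicative combination is an assumption, not the claimed formula.
    """
    known = {value.lower() for _, value in query_entities}
    best = 0.0
    for kind, value in cluster_entities:
        if value.lower() in known:
            continue  # already present in the search queries, so not new
        # (b) distance from the new entity to its nearest query entity
        novelty = min((difference(value, q) for _, q in query_entities), default=1.0)
        # (c) value of the new entity's characteristic from the predefined list
        worth = CHARACTERISTIC_VALUES.get(kind, 0.1)
        best = max(best, novelty * worth)
    # (a) scale by the likelihood that the hit matches the seed
    return match_likelihood * best


query_entities = [("name", "Jane Roe")]
score_a = score_hit_cluster(0.8, [("name", "Jane Roe")], query_entities)
score_b = score_hit_cluster(
    0.8, [("name", "Jane Roe"), ("phone", "555-0100")], query_entities
)
print(score_a, score_b)
```

In this sketch a hit cluster that only repeats what was searched for scores zero, while a cluster that contributes a previously unseen phone number scores higher, which reflects the stated goal of surfacing results that are both likely matches and informative.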

20. A non-transitory computer readable medium comprising instructions that, when executed, cause one or more processors to perform:
- generating, on a suitably programmed computing device, a plurality of search queries comprising a seed entity and a plurality of entities associated with the seed entity, the plurality of entities including at least one first entity and at least one second entity, the at least one first entity being associated with the seed entity based on a hard link between the at least one first entity and the seed entity, the at least one second entity being associated with the seed entity based on a soft link between the at least one second entity and the seed entity, the soft link being generated based on one or more prior search queries;
  
  conducting searches, with the search queries, in one or more data sources to obtain a plurality of search results, wherein each search result comprises a hit cluster, each hit cluster including properties of a hit entity and properties of one or more entities associated with the hit entity; and
  
  determining a score for each of the hit clusters, taking as input (a) likelihood of match between the seed entity and the hit entity or between an entity associated with the seed entity and an entity associated with the hit entity, (b) presence of a new entity in the search result not present in the search queries and a difference between the new entity and an entity present in the search queries, and (c) characteristic of the new entity in the search result.

Current Assignee
Palantir Technologies Incorporated
Original Assignee
Palantir Technologies Incorporated
Inventors
Elkherj, Matthew; Einspahr, Ashley; Bunge, Breanna; Hammett, Chris; Crawford Tom, Erika; Beard, Mitchell; Beiermeister, Ryan; Sinton, Seelig; Hao, Sharon; Ayers, William; Robinson, Seth
Primary Examiner(s)
Nguyen, Kim T

Application Number

US15/584,423
Publication Number

US 20180322198A1
Time in Patent Office

686 Days
Field of Search

707707
US Class Current
CPC Class Codes

G06F 16/38   Retrieval characterised by ...

G06F 16/951   Indexing; Web crawling tech...

G06F 16/9538   Presentation of query results
