Information relation generation

US 10,198,431 B2
Filed: 08/22/2011
Issued: 02/05/2019
Est. Priority Date: 09/28/2010
Status: Active Grant

Abstract

For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Dirichlet Allocation (LDA) is performed on text associated with the named entities rather than on an entire document. LDA for relationship mining may include context information and/or supervised learning.
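The iterative thresholding described in the abstract can be illustrated with a small sketch. It assumes that relevance is monotone in the word score, so each answer from the user also labels every word scored above (if relevant) or below (if not relevant) the queried word. The binary-search query order, the function name `label_words_by_threshold`, and the `is_relevant` callback are assumptions made for illustration; the abstract does not specify how the queried word is chosen.

```python
from typing import Callable, Dict, List, Tuple


def label_words_by_threshold(
    scores: Dict[str, float],
    is_relevant: Callable[[str], bool],
) -> Tuple[List[str], List[str], int]:
    """Split words into relevant and irrelevant lists using few user queries.

    Assumption (not from the source): relevance is monotone in score, so one
    answer labels all words on one side of the queried word.
    """
    # Rank words from highest to lowest score.
    ranked = sorted(scores, key=lambda w: scores[w], reverse=True)
    lo, hi = 0, len(ranked)  # ranked[:lo] are relevant, ranked[hi:] are not
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if is_relevant(ranked[mid]):
            # The queried word and every higher-scored word are relevant.
            lo = mid + 1
        else:
            # The queried word and every lower-scored word are irrelevant.
            hi = mid
    return ranked[:lo], ranked[lo:], queries
```

In practice `is_relevant` would prompt the user; with this query order the number of prompts grows with the logarithm of the vocabulary size rather than with the vocabulary size itself.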

9 Claims

1. A method for mining a relationship of at least a first and a second named entity comprising:
- identifying a sentence with at least the first and the second named entity in a document;
  
  defining, by a processor, a first instance comprising the first and the second named entity, a type of named entity for each of the first and the second named entity, and text in the sentence between the first and the second named entity;
  
  applying, by a processor, latent Dirichlet allocation (LDA) to the document, the LDA including an input of the first instance, and then determining a distribution of types of relationship as an output, the types of relationship comprising labels of how the first named entity relates to the second named entity; and
  
  selecting one of the types of the relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a supervised maximum entropy discrimination LDA with the characteristic types of relationships as observed response variables of an output for supervision of the supervised maximum entropy discrimination LDA.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein identifying the sentence comprises pairing named entities in sentences of the document, and wherein defining comprises defining a plurality of instances including the first instance.
  - 3. The method of claim 2 wherein applying the LDA comprises identifying the relationship for each instance of the plurality of instances.
  - 4. The method of claim 1, wherein determining the distribution of types of relationship comprises discriminating between types of relationship by using a machine learning classifier.
  - 5. The method of claim 4, wherein for discriminating, the machine learnt classifier uses training data with known types of relationship and specific instances.
  - 6. The method of claim 4, wherein for the first instance, the type of relationship indicated is identified by a support vector machine.
  - 7. The method of claim 1, wherein the selection of one of the types of the relationship as the relationship for the first and the second named entities is based on an average over all possible models and latent topics, wherein latent topics are hidden semantic features discovered by topic models.
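The identifying and defining steps of claim 1, together with the pairing of claim 2, can be sketched as follows. This is a minimal illustration, assuming the named entities and their types have already been found by some upstream recognizer; the `Entity` and `Instance` classes, the `entity_at` helper, and type labels such as "PERSON" are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass(frozen=True)
class Entity:
    text: str
    type: str   # hypothetical type label, e.g. "PERSON" or "ORG"
    start: int  # character offset of the entity in the sentence
    end: int    # offset one past the last character


@dataclass(frozen=True)
class Instance:
    first: str
    first_type: str
    second: str
    second_type: str
    between: str  # sentence text between the two entities


def entity_at(sentence: str, text: str, type: str) -> Entity:
    """Hypothetical helper: locate the first occurrence of `text`."""
    start = sentence.index(text)
    return Entity(text, type, start, start + len(text))


def build_instances(sentence: str, entities: List[Entity]) -> List[Instance]:
    """Pair the entities of one sentence and define one instance per pair."""
    ordered = sorted(entities, key=lambda e: e.start)
    instances = []
    for a, b in combinations(ordered, 2):
        if a.end > b.start:
            continue  # skip overlapping mentions; there is no text between
        between = sentence[a.end:b.start].strip()
        instances.append(Instance(a.text, a.type, b.text, b.type, between))
    return instances
```

Each resulting `Instance` holds the three elements recited in the claim: the two entities, a type for each, and the intervening sentence text. The instances would then serve as input to the LDA step, which is not implemented here.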

8. A method for mining a relationship of at least a first and a second named entity comprising:
- identifying a sentence with at least the first and the second named entity in a document;
  
  defining, by a processor, a first instance comprising the first and the second named entity, a type of named entity for each of the first and the second named entity, and text in the sentence between the first and the second named entity;
  
  applying, by a processor, latent Dirichlet allocation (LDA) to the document, the LDA including an input of the first instance, and then determining a distribution of types of relationship as an output, the types of relationship comprising labels of how the first named entity relates to the second named entity; and
  
  selecting one of the types of the relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a labeled LDA without a labeling prior probability.

9. A method for mining a relationship of at least a first named entity and a second named entity on a non-transitory computer readable storage media having stored therein data representing instructions executable by a programmed processor, the method comprising:
- applying latent Dirichlet allocation (LDA) to a document, which is stored on the non-transitory computer readable storage media, with identified sentences with the first and the second named entity, the LDA including an input of the first instance, the first instance comprising the first and the second named entity, a type of named entity for the first and the second named entity and text in a sentence between the first and the second named entity, and then determining a distribution of types of relationship as an output of the LDA, the types of relationships comprising labels of how the first named entity relates to the second named entity; and
  
  selecting one of the types of relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a supervised maximum entropy discrimination LDA with the characteristic types of relationships as observed response variables of an output for supervision of the supervised maximum entropy discrimination LDA.
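The selecting step shared by claims 1, 8, and 9 takes a distribution over relationship types and picks one type. A minimal sketch follows, assuming the distribution is available as a mapping from label to probability and that selection means taking the most probable label; the claims do not state the selection rule, so the argmax is an assumption. The averaging function is a simplified stand-in for the averaging in claim 7: it averages a finite list of distributions, not "all possible models and latent topics". The function names and relationship labels are hypothetical, and no LDA variant is implemented here.

```python
from typing import Dict, List


def average_distributions(dists: List[Dict[str, float]]) -> Dict[str, float]:
    """Average several label distributions (simplified stand-in for claim 7)."""
    if not dists:
        raise ValueError("at least one distribution is required")
    # Collect every label seen in any distribution; a missing label counts as 0.
    labels = sorted(set().union(*dists))
    return {
        label: sum(d.get(label, 0.0) for d in dists) / len(dists)
        for label in labels
    }


def select_relationship(dist: Dict[str, float]) -> str:
    """Pick the most probable relationship type (assumed selection rule)."""
    if not dist:
        raise ValueError("distribution is empty")
    return max(dist, key=lambda label: dist[label])
```

For example, averaging `{"employs": 0.6, "located_in": 0.4}` and `{"employs": 0.3, "located_in": 0.7}` gives roughly 0.45 and 0.55, so `select_relationship` returns `"located_in"`.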

Current Assignee: Siemens Corp. (Siemens AG)
Original Assignee: Siemens Corp. (Siemens AG)
Inventors: Somasundaran, Swapna; Li, Dingcheng; Chakraborty, Amit
Primary Examiner(s): Raab, Christopher J
Application Number: US13/214,291
Publication Number: US 20120078918A1
Time in Patent Office: 2,724 Days
Field of Search: 707736, 707748

CPC Class Codes
G06F 16/353   into predefined classes
G06F 16/36   Creation of semantic tools,...
G06F 40/284   Lexical analysis, e.g. toke...
G06F 40/295   Named entity recognition
