AUTOMATED DOCUMENT ANALYSIS COMPRISING COMPANY NAME RECOGNITION
Abstract
At least two processing device-implemented company name recognition components, operating upon a body of text in a document, identify at least one company name occurrence in the body of text based at least in part on a company identifier list. The company name recognition techniques implemented by each of the at least two company name recognition components are different from each other. The at least one company name occurrence is used to update the company identifier list. The updated company identifier list is then used by the at least two company name recognition components to identify at least one additional name occurrence in the same body of text. This process of repeatedly identifying occurrences of company names in the body of text and updating the company identifier list is performed until such time that no further company name occurrences are identified in the body of text.
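The repeat-until-stable loop described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say which recognition techniques are used, so the two recognizers here (a literal list lookup and a parenthesised-alias detector), along with every function name, are assumptions chosen only to show how a list update in one pass can expose further occurrences in the next.

```python
import re


def find_listed_names(text, identifiers):
    """Technique A (assumed): literal lookup of each identifier in the text."""
    spans = set()
    for name in identifiers:
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        spans.update(m.span() for m in re.finditer(pattern, text))
    return spans


def find_defined_aliases(text, identifiers):
    """Technique B (assumed): a known name followed by a quoted alias in parentheses."""
    spans = set()
    for name in identifiers:
        pattern = re.escape(name) + r'\s*\("([^"]+)"\)'
        spans.update(m.span(1) for m in re.finditer(pattern, text))
    return spans


def recognize_until_stable(text, seed_identifiers):
    """Alternate recognition and list updates until a pass finds nothing new."""
    identifiers = set(seed_identifiers)
    occurrences = set()
    recognizers = (find_listed_names, find_defined_aliases)
    while True:
        found = set()
        for recognize in recognizers:
            found |= recognize(text, identifiers)
        # Keep only spans not already covered by a known occurrence.
        new = {
            (s, e) for (s, e) in found
            if not any(a <= s and e <= b for (a, b) in occurrences)
        }
        if not new:
            return occurrences, identifiers
        occurrences |= new
        # Update the identifier list with the text of each new occurrence.
        identifiers |= {text[s:e] for (s, e) in new}


if __name__ == "__main__":
    sample = 'Acme Corporation ("Acme") signed the lease. Later, Acme paid the deposit.'
    spans, names = recognize_until_stable(sample, {"Acme Corporation"})
    print(sorted(names))
    print(sorted(sample[s:e] for s, e in spans))
```

In this sample the first pass finds the seeded name and its alias, the list update adds the alias, and the second pass then picks up the later standalone use of the alias. A third pass finds nothing new, so the loop ends.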
53 Claims
1-33. (canceled)

34. A method for performing, by at least one processing device, automated document analysis of a document comprising a body of text, the method comprising:
accessing a token among a sequence of tokens constituting the body of the text in the document;
comparing the token with company names included in a company identifier list;
determining, based on the comparison, a potential match between the token and at least one company name in the company identifier list;
analyzing, based on a determination of no match, whether the token constitutes a possessive form of the at least one company name;
determining, based on a determination of no match, whether the token is a synonym or a substitute of the at least one company name;
further determining, based on a determination of no match, whether the token includes a punctuation included in the at least one company name; and
establishing, when the token constitutes at least one of a possessive form, a synonym or a substitute, and a punctuation, a match with the at least one company name.

Dependent claims: 35, 36, 37, 38, 39, 40, 41

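The token-matching steps recited in claim 34 can be sketched roughly as below. This is one possible reading, offered as an assumption-laden illustration only: the synonym table, the `match_token` and `strip_possessive` names, the order of the fallback checks, and the interpretation of the punctuation step (token and name are equal once punctuation is ignored) are all hypothetical and are not taken from the claim text.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_possessive(token):
    """Return the token without a trailing possessive marker, or None if it has none."""
    for marker in ("'s", "\u2019s", "'", "\u2019"):
        if token.endswith(marker) and len(token) > len(marker):
            return token[: -len(marker)]
    return None


def match_token(token, company_names, synonyms):
    """Return (company_name, reason) for a match, or None when nothing matches."""
    names = set(company_names)

    # Direct comparison against the company identifier list.
    if token in names:
        return token, "exact"

    # No direct match: is the token a possessive form of a listed name?
    base = strip_possessive(token)
    if base in names:
        return base, "possessive"

    # Is the token a synonym or substitute of a listed name (hypothetical table)?
    canonical = synonyms.get(token)
    if canonical in names:
        return canonical, "synonym"

    # Assumed reading of the punctuation step: the listed name contains
    # punctuation and equals the token once punctuation is ignored.
    stripped = token.translate(_PUNCT_TABLE)
    for name in sorted(names):
        has_punct = any(ch in string.punctuation for ch in name)
        if has_punct and name.translate(_PUNCT_TABLE) == stripped:
            return name, "punctuation"

    return None


if __name__ == "__main__":
    listed = ["Acme", "Dyna-Corp"]        # hypothetical identifier list
    aliases = {"ACM": "Acme"}             # hypothetical synonym table
    for tok in ["Acme", "Acme's", "ACM", "DynaCorp", "Widget"]:
        print(tok, match_token(tok, listed, aliases))
```

The claim establishes a match when at least one of the three fallback conditions holds. The sketch returns on the first condition that succeeds, which gives the same yes-or-no answer while also reporting which check fired.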
42. A system comprising:
at least one processing device; and
memory operatively connected to the at least one processing device, the memory comprising executable instructions that when executed by the at least one processing device cause the at least one processing device to:
access a token among a sequence of tokens constituting the body of the text in the document;
compare the token with company names included in a company identifier list;
determine, based on the comparison, a potential match between the token and at least one company name in the company identifier list;
analyze, based on a determination of no match, whether the token constitutes a possessive form of the at least one company name;
determine, based on a determination of no match, whether the token is a synonym or a substitute of the at least one company name;
further determine, based on a determination of no match, whether the token includes a punctuation included in the at least one company name; and
establish, when the token constitutes at least one of a possessive form, a synonym or a substitute, and a punctuation, a match with the at least one company name.

Dependent claims: 43, 44, 45, 46, 47, 48, 49

50. A non-transitory computer readable medium comprising executable instructions that when executed by at least one processing device cause the at least one processing device to perform automated document analysis of a document comprising a body of text in which the at least one processing device is caused to:
access a token among a sequence of tokens constituting the body of the text in the document;
compare the token with company names included in a company identifier list;
determine, based on the comparison, a potential match between the token and at least one company name in the company identifier list;
analyze, based on a determination of no match, whether the token constitutes a possessive form of the at least one company name;
determine, based on a determination of no match, whether the token is a synonym or a substitute of the at least one company name;
further determine, based on a determination of no match, whether the token includes a punctuation included in the at least one company name; and
establish, when the token constitutes at least one of a possessive form, a synonym or a substitute, and a punctuation, a match with the at least one company name.

Dependent claims: 51, 52, 53

Specification