Risk analysis using unstructured data
First Claim
1. A system, comprising:
a network interface operable to receive unstructured data from a plurality of data sources, the plurality of data sources comprising a competitor database, a vendor database, and a marketing database, wherein the unstructured data relates to a financial risk of an organization and comprises a plurality of text documents, each text document comprising a plurality of groups of words;
a processor communicatively coupled to the network interface and operable to:
deconstruct each group of words from the unstructured data into individual words;
convert the individual words from each group of words into a plurality of structured forms, each structured form corresponding to a single group of words;
determine a numerical value associated with each individual word according to: a number of times the individual word appears in the group of words and an association of the group of words with a risk experienced by an organization, wherein each structured form is a vector that includes each individual word and the numerical value associated with the individual word;
compare each structured form to another structured form using a Bayesian inference;
categorize the individual words in each structured form into at least one category according to the comparison and the at least one category is selected from a set of categories consisting of organization name, geographical region, organization size, number of employees, number of countries represented, public organization, private organization, regulatory body, industry, and fine amount, the categories indicating the financial risk of the organization; and
quantify the individual words in each structured form according to at least the categorization of the individual words by weighting each individual word.
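The deconstruct, convert, and determine steps of the claim can be illustrated with a minimal sketch. The claim does not say how the word count and the risk association are combined, so multiplying them is an assumption made here for illustration only. The function names `deconstruct` and `to_structured_form`, the `risk_association` parameter, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def deconstruct(group: str) -> list[str]:
    """Split one group of words (e.g. a sentence) into lowercase individual words."""
    return re.findall(r"[a-z0-9']+", group.lower())


def to_structured_form(group: str, risk_association: float) -> dict[str, float]:
    """Build a vector mapping each word to a numerical value.

    Assumption: value = (times the word appears in the group) * risk_association.
    The claim names both inputs but not the formula that combines them.
    """
    counts = Counter(deconstruct(group))
    return {word: count * risk_association for word, count in counts.items()}


# Hypothetical group of words with an assumed risk association of 0.5.
form = to_structured_form(
    "Regulator fines vendor; vendor appeals the fine.", risk_association=0.5
)
print(form)
```

Each structured form here corresponds to a single group of words, matching the claim language; a dictionary stands in for the vector of word and value pairs.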
Abstract
Unstructured data is received from a plurality of sources to facilitate risk analysis. The unstructured data comprises a plurality of bodies of text. Each body of text from the unstructured data is deconstructed into individual terms. The individual terms from each body of text are converted into a structured form. The individual terms in the structured form are categorized according to a comparison of the structured form to another structured form. The individual terms in the structured form are quantified according to at least the categorization of the individual terms.
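The comparison and categorization steps can be sketched as follows. The claims name Bayesian inference for the comparison but give no formula, so this sketch swaps in a multinomial naive Bayes classifier with add-one smoothing as one plausible reading. It assigns a category to a whole structured form rather than to each word, which is a simplification. The reference forms, the helper names `train` and `categorize`, and the word counts are hypothetical; the category labels are taken from the claim's list.

```python
import math
from collections import Counter


def train(labeled_forms):
    """Aggregate word counts per category from (category, {word: count}) pairs."""
    word_counts = {}
    form_counts = Counter()
    for category, form in labeled_forms:
        word_counts.setdefault(category, Counter()).update(form)
        form_counts[category] += 1
    return word_counts, form_counts


def categorize(form, word_counts, form_counts):
    """Return the category with the highest naive Bayes log-posterior for the form."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_forms = sum(form_counts.values())
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior plus add-one smoothed log likelihood of each word.
        score = math.log(form_counts[category] / total_forms)
        for word, n in form.items():
            score += n * math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical reference forms, each already associated with a category.
references = [
    ("regulatory body", {"sec": 2, "regulator": 1}),
    ("fine amount", {"fine": 2, "million": 1}),
    ("industry", {"banking": 1, "retail": 1}),
]
word_counts, form_counts = train(references)
result = categorize({"fine": 1, "million": 1}, word_counts, form_counts)
print(result)  # fine amount
```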
33 Citations
19 Claims
1. A system, comprising:
a network interface operable to receive unstructured data from a plurality of data sources, the plurality of data sources comprising a competitor database, a vendor database, and a marketing database, wherein the unstructured data relates to a financial risk of an organization and comprises a plurality of text documents, each text document comprising a plurality of groups of words;
a processor communicatively coupled to the network interface and operable to:
deconstruct each group of words from the unstructured data into individual words;
convert the individual words from each group of words into a plurality of structured forms, each structured form corresponding to a single group of words;
determine a numerical value associated with each individual word according to: a number of times the individual word appears in the group of words and an association of the group of words with a risk experienced by an organization, wherein each structured form is a vector that includes each individual word and the numerical value associated with the individual word;
compare each structured form to another structured form using a Bayesian inference;
categorize the individual words in each structured form into at least one category according to the comparison and the at least one category is selected from a set of categories consisting of organization name, geographical region, organization size, number of employees, number of countries represented, public organization, private organization, regulatory body, industry, and fine amount, the categories indicating the financial risk of the organization; and
quantify the individual words in each structured form according to at least the categorization of the individual words by weighting each individual word.
Dependent claims: 2, 3, 4, 5, 6
7. Non-transitory computer readable medium comprising logic, the logic, when executed by a processor, operable to:
receive unstructured data from a plurality of data sources, the plurality of data sources comprising a competitor database, a vendor database, and a marketing database, wherein the unstructured data relates to a financial risk of an organization and comprises a plurality of text documents, each text document comprising a plurality of groups of words;
deconstruct each group of words from the unstructured data into individual words;
convert the individual words from each group of words into a plurality of structured forms, each structured form corresponding to a single group of words;
determine a numerical value associated with each individual word according to: a number of times the individual word appears in the group of words and an association of the group of words with a risk experienced by an organization, wherein each structured form is a vector that includes each individual word and the numerical value associated with the individual word;
compare each structured form to another structured form using a Bayesian inference;
categorize the individual words in each structured form into at least one category according to the comparison and the at least one category is selected from a set of categories consisting of organization name, geographical region, organization size, number of employees, number of countries represented, public organization, private organization, regulatory body, industry, and fine amount, the categories indicating the financial risk of the organization; and
quantify the individual words in each structured form according to at least the categorization of the individual words by weighting each individual word.
Dependent claims: 8, 9, 10, 11
12. A method, comprising:
receiving unstructured data from a plurality of data sources, the plurality of data sources comprising a competitor database, a vendor database, and a marketing database, wherein the unstructured data relates to a financial risk of an organization and comprises a plurality of text documents, each text document comprising a plurality of groups of words;
deconstructing, by a processor, each group of words from the unstructured data into individual words;
converting, by the processor, the individual words from each group of words into a plurality of structured forms, each structured form corresponding to a single group of words;
determining, by the processor, a numerical value associated with each individual word according to: a number of times the individual word appears in the group of words and an association of the group of words with a risk experienced by an organization, wherein each structured form is a vector that includes each individual word and a quantification of the individual word;
comparing, by the processor, each structured form to another structured form using a Bayesian inference;
categorizing, by the processor, the individual words in each structured form into at least one category according to the comparison and the at least one category is selected from a set of categories consisting of organization name, geographical region, organization size, number of employees, number of countries represented, public organization, private organization, regulatory body, industry, and fine amount, the categories indicating the financial risk of the organization; and
quantifying, by the processor, the individual words in each structured form according to at least the categorization of the individual words by weighting each individual word.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
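The final quantifying step, weighting each word according to its category, might look like the sketch below. The claims do not give weight values or say how a weight is applied, so the `CATEGORY_WEIGHTS` table, the multiplication, the `default_weight` for uncategorized words, and the sample inputs are all assumptions for illustration.

```python
# Hypothetical per-category weights; the claims name the categories but not the weights.
CATEGORY_WEIGHTS = {
    "fine amount": 3.0,
    "regulatory body": 2.0,
    "industry": 1.0,
}


def quantify(form, word_categories, default_weight=1.0):
    """Scale each word's numerical value by the weight of its assigned category.

    form: {word: numerical value} for one structured form.
    word_categories: {word: category} produced by an earlier categorization step.
    Words without a category (or with an unlisted one) get default_weight.
    """
    return {
        word: value * CATEGORY_WEIGHTS.get(word_categories.get(word), default_weight)
        for word, value in form.items()
    }


# Hypothetical structured form and categorization result.
form = {"fine": 2.0, "sec": 1.0, "the": 1.0}
word_categories = {"fine": "fine amount", "sec": "regulatory body"}
weighted = quantify(form, word_categories)
print(weighted)  # {'fine': 6.0, 'sec': 2.0, 'the': 1.0}
```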
Specification