Methods and systems for knowledge discovery
Abstract
Provided are methods and systems for knowledge discovery utilizing knowledge profiles.
34 Claims
1. A computer-implemented method for textual analysis comprising:
a. determining, by a computer processor, a co-occurrence of a long form and an associated short form of a term in a document;
b. locating, by a computer processor, a plurality of occurrences of the associated short form; and
c. expanding, by a computer processor, the plurality of occurrences of the associated short form with the long form wherein the document has a more accurate representation of frequency of occurrence of the term;
d. receiving a context fingerprint for each of a plurality of concepts;
e. determining an overlap of context fingerprints among the plurality of the concepts;
f. determining a similarity score between the context fingerprints; and
g. predicting that two or more of the plurality of concepts have a relationship, wherein the overlap is above a first threshold and the similarity score is above a second threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 22, 23, 24
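Steps a through c of claim 1 can be illustrated with a minimal sketch. The claim does not say how the long form/short form co-occurrence is detected, so the sketch assumes a common convention: a long form immediately followed by its short form in parentheses, with the short form built from the initial letters of the preceding words. The function names `find_short_forms` and `expand_short_forms` and the sample sentence are hypothetical and are not taken from the patent.

```python
import re


def find_short_forms(text):
    """Step a (assumed heuristic): find 'long form (SF)' co-occurrences.

    Returns a dict mapping each short form to its long form. A pair is
    accepted only when the letters of the short form match the initials
    of the words that immediately precede the parenthesis.
    """
    pairs = {}
    for match in re.finditer(r"\(([A-Z]{2,10})\)", text):
        short = match.group(1)
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        candidate = words[-len(short):]
        if len(candidate) == len(short) and all(
            word[0].upper() == letter for word, letter in zip(candidate, short)
        ):
            pairs[short] = " ".join(candidate)
    return pairs


def expand_short_forms(text, pairs):
    """Steps b and c: locate every short form and replace it with the long form."""
    for short, long_form in pairs.items():
        # Drop the defining parenthetical so the long form is not doubled.
        text = re.sub(rf"\s*\({re.escape(short)}\)", "", text)
        # Replace the remaining standalone occurrences of the short form.
        text = re.sub(rf"\b{re.escape(short)}\b", long_form, text)
    return text


# Hypothetical sample document.
text = "Magnetic resonance imaging (MRI) is common. MRI is safe, and MRI is fast."
pairs = find_short_forms(text)
expanded = expand_short_forms(text, pairs)
term_count_before = text.lower().count("magnetic resonance imaging")
term_count_after = expanded.lower().count("magnetic resonance imaging")
```

In this sample the long form occurs once before expansion and three times afterwards, which is the sense in which the expanded document represents the frequency of the term more accurately.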
8. A system for textual analysis comprising:
a memory configured for storing text data; and
a processor, coupled to the memory, configured for performing steps comprising,
a. determining a co-occurrence of a long form and an associated short form of a term in a document,
b. locating a plurality of occurrences of the associated short form,
c. expanding the plurality of occurrences of the associated short form with the long form wherein the document has a more accurate representation of frequency of occurrence of the term;
d. receiving a context fingerprint for each of a plurality of concepts;
e. determining an overlap of context fingerprints among the plurality of the concepts;
f. determining a similarity score between the context fingerprints; and
g. predicting that two or more of the plurality of concepts have a relationship, wherein the overlap is above a first threshold and the similarity score is above a second threshold.
Dependent claims: 9, 10, 11, 12, 13, 14, 25, 26, 27, 28, 29, 30, 31
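Steps d through g, which recur in each independent claim, can be sketched in the same spirit. The claims do not define the fingerprint representation, the overlap measure, or the similarity score, so everything below is an assumption made for illustration: a context fingerprint is modeled as a dict of context terms to weights, the overlap is the number of shared terms, and the similarity score is cosine similarity. The function names, concept names, weights, and thresholds are all hypothetical.

```python
import math
from itertools import combinations


def overlap(fp_a, fp_b):
    """Step e (assumed measure): number of context terms shared by two fingerprints."""
    return len(fp_a.keys() & fp_b.keys())


def cosine_similarity(fp_a, fp_b):
    """Step f (assumed measure): cosine similarity of two weighted fingerprints."""
    shared = fp_a.keys() & fp_b.keys()
    dot = sum(fp_a[term] * fp_b[term] for term in shared)
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_relationships(fingerprints, first_threshold, second_threshold):
    """Step g: predict a relationship when both measures exceed their thresholds."""
    predicted = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(fingerprints.items(), 2):
        if (
            overlap(fp_a, fp_b) > first_threshold
            and cosine_similarity(fp_a, fp_b) > second_threshold
        ):
            predicted.append((name_a, name_b))
    return predicted


# Step d: made-up fingerprints for three concepts (illustrative values only).
fingerprints = {
    "aspirin": {"pain": 0.9, "inflammation": 0.8, "platelet": 0.5},
    "ibuprofen": {"pain": 0.8, "inflammation": 0.9, "fever": 0.4},
    "insulin": {"glucose": 0.9, "pancreas": 0.7, "pain": 0.1},
}
related = predict_relationships(fingerprints, first_threshold=1, second_threshold=0.5)
```

With these toy inputs only the pair ("aspirin", "ibuprofen") passes both tests: the two fingerprints share two context terms and point in nearly the same direction, while each of the other pairs shares a single term and fails the overlap threshold.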
15. A non-transitory computer-readable storage medium with computer executable instructions embodied thereon for textual analysis, that when executed by a computer processor, causes said computer processor to perform steps comprising:
a. determining a co-occurrence of a long form and an associated short form of a term in a document;
b. locating a plurality of occurrences of the associated short form; and
c. expanding the plurality of occurrences of the associated short form with the long form wherein the document has a more accurate representation of frequency of occurrence of the term;
d. receiving a context fingerprint for each of a plurality of concepts;
e. determining an overlap of context fingerprints among the plurality of the concepts;
f. determining a similarity score between the context fingerprints; and
g. predicting that two or more of the plurality of concepts have a relationship, wherein the overlap is above a first threshold and the similarity score is above a second threshold.
Dependent claims: 16, 17, 18, 19, 20, 21, 32, 33, 34
Specification