Semantic document profiling
Abstract
A method of semantic profiling of documents comprises: receiving a document to be profiled, the document comprising a plurality of terms; for each of at least a portion of the plurality of terms in the document, determining a part of speech and a grammatical function of the term, obtaining senses of the term, selecting a sense as a most likely meaning of the term, and calculating an information value of the term; and generating a semantic profile of the document comprising at least some of the calculated information values.
61 Claims
1. A method of semantic profiling of documents comprising:
receiving a document to be profiled, the document comprising a plurality of terms;
for each of at least a portion of the plurality of terms in the document:
determining a part of speech and a grammatical function of the term, obtaining senses of the term, selecting a sense as a most likely meaning of the term, and calculating an information value of the term; and
generating a semantic profile of the document comprising at least some of the calculated information values.
Dependent claims: 2-16.
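The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy `LEXICON`, the `FUNCTION_WEIGHT` table, the rule-based `grammatical_function` helper, the most-frequent-sense selection, and the use of self-information (`-log2` of a relative frequency) as the "information value" are all assumptions made for illustration, since the claim does not specify how any of these steps is carried out.

```python
import math
from dataclasses import dataclass

# Hypothetical toy lexicon: term -> list of (sense_id, part_of_speech, relative_frequency).
LEXICON = {
    "bank": [("bank.financial", "noun", 0.02), ("bank.river", "noun", 0.005)],
    "loan": [("loan.credit", "noun", 0.01)],
    "approves": [("approve.sanction", "verb", 0.008)],
    "the": [("the.det", "determiner", 0.9)],
}

# Hypothetical weights expressing how much each grammatical function contributes.
FUNCTION_WEIGHT = {"subject": 1.0, "predicate": 0.8, "object": 0.9, "other": 0.1}


@dataclass
class ProfiledTerm:
    term: str
    part_of_speech: str
    function: str
    sense: str
    information_value: float


def grammatical_function(index, pos_tags):
    """Crude stand-in for a parser: a noun before any verb is a subject,
    a noun after a verb is an object, a verb is the predicate."""
    pos = pos_tags[index]
    if pos == "verb":
        return "predicate"
    if pos == "noun":
        return "object" if "verb" in pos_tags[:index] else "subject"
    return "other"


def profile_document(text):
    """Return a list of ProfiledTerm entries, one per term found in LEXICON."""
    terms = [t.lower().strip(".,;:") for t in text.split()]
    known = [t for t in terms if t in LEXICON]  # "at least a portion" of the terms
    # Assumed disambiguation rule: pick the most frequent sense of each term.
    chosen = [max(LEXICON[t], key=lambda s: s[2]) for t in known]
    pos_tags = [pos for _, pos, _ in chosen]
    profile = []
    for i, (term, (sense, pos, freq)) in enumerate(zip(known, chosen)):
        func = grammatical_function(i, pos_tags)
        # Assumed information value: self-information scaled by the function weight.
        info = -math.log2(freq) * FUNCTION_WEIGHT[func]
        profile.append(ProfiledTerm(term, pos, func, sense, info))
    return profile


profile = profile_document("The bank approves the loan.")
for entry in profile:
    print(entry)
```

In this sketch the rare content words ("bank", "approves", "loan") receive high information values and the determiner "the" receives a value close to zero, which is the kind of contrast a semantic profile is meant to capture.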
17. A method for performing document-based searching comprising:
receiving a search document, the search document comprising a plurality of terms;
for each of at least a portion of the plurality of terms in the search document:
determining a part of speech and a grammatical function of the term, obtaining senses of the term, selecting a sense as a most likely meaning of the term, and calculating an information value of the term; and
generating a semantic profile of the search document comprising at least some of the calculated information values; and
accessing a database comprising a plurality of semantic profiles of documents to retrieve documents having semantic profiles that are similar to the semantic profile of the search document, each semantic profile in the database comprising a plurality of information values of terms included in the document.
Dependent claims: 18-37.
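The retrieval step of claim 17 can be sketched as follows. The claim does not say how similarity between profiles is measured; cosine similarity over sense-keyed information values is used here purely as an assumed stand-in. The in-memory `DATABASE` dictionary, the sense identifiers, the numeric values, and the `min_similarity` parameter are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two profiles (dicts of sense -> information value)."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(search_profile, database, min_similarity=0.3):
    """Return (doc_id, score) pairs at or above the threshold, best match first."""
    scored = [
        (doc_id, cosine_similarity(search_profile, doc_profile))
        for doc_id, doc_profile in database.items()
    ]
    hits = [(doc_id, score) for doc_id, score in scored if score >= min_similarity]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical stored profiles, keyed by document identifier.
DATABASE = {
    "doc-finance": {"bank.financial": 5.6, "loan.credit": 6.0},
    "doc-rivers": {"bank.river": 7.6, "flood.overflow": 6.2},
    "doc-credit": {"loan.credit": 6.0, "interest.rate": 5.1},
}

# Hypothetical profile generated from the search document.
search_profile = {"bank.financial": 5.6, "loan.credit": 6.0, "approve.sanction": 5.6}

results = retrieve_similar(search_profile, DATABASE)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Because the profiles are keyed by selected sense rather than by surface word, the river document shares nothing with a search document about financial banks, even though both contain the word "bank".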
38. A method of summarizing a textual document comprising:
calculating an information value for each sentence of the document;
deleting from consideration for a summary sentences having an information value below a first threshold value to form retained sentences;
deleting from the retained sentences non-main clauses having information values below a second threshold value to form retained clauses;
normalizing the retained clauses to declarative form;
deleting modifiers having information values below a third threshold value from the normalized retained clauses to form kernel phrases;
selecting at least a portion of the kernel phrases; and
replacing at least portions of the kernel phrases with terms relating to similar concepts selected from a taxonomic hierarchy.
Dependent claims: 39-61.
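The filtering pipeline of claim 38 can be sketched in Python as below. The sketch assumes the text has already been parsed into sentences, clauses, and words with precomputed information values. The `Word`, `Clause`, and `Sentence` classes, the summing of word values into clause and sentence values, the identity `normalize` placeholder, the top-N selection rule, and the one-level `TAXONOMY` mapping are all hypothetical choices that the claim itself does not dictate.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    info: float
    is_modifier: bool = False


@dataclass
class Clause:
    words: list
    is_main: bool = True

    @property
    def info(self):
        # Assumption: a clause's information value is the sum of its words' values.
        return sum(w.info for w in self.words)


@dataclass
class Sentence:
    clauses: list

    @property
    def info(self):
        return sum(c.info for c in self.clauses)


# Hypothetical taxonomy fragment: specific term -> broader related concept.
TAXONOMY = {"sedan": "car", "terrier": "dog"}


def normalize(clause):
    """Placeholder for normalization to declarative form; input is assumed declarative."""
    return clause


def summarize(sentences, t_sentence, t_clause, t_modifier, max_phrases):
    # Drop sentences below the first threshold.
    retained = [s for s in sentences if s.info >= t_sentence]
    # Drop non-main clauses below the second threshold.
    clauses = [c for s in retained for c in s.clauses
               if c.is_main or c.info >= t_clause]
    normalized = [normalize(c) for c in clauses]
    # Drop modifiers below the third threshold to form kernel phrases.
    kernels = [[w for w in c.words if not (w.is_modifier and w.info < t_modifier)]
               for c in normalized]
    # Assumed selection rule: keep the kernel phrases with the highest total value.
    selected = sorted(kernels, key=lambda k: sum(w.info for w in k),
                      reverse=True)[:max_phrases]
    # Replace terms with related concepts from the taxonomy where one is listed.
    return [" ".join(TAXONOMY.get(w.text, w.text) for w in k) for k in selected]


s1 = Sentence([
    Clause([Word("the", 0.1, True), Word("old", 0.5, True), Word("terrier", 4.0),
            Word("chased", 3.0), Word("the", 0.1, True), Word("red", 2.5, True),
            Word("sedan", 4.5)]),
    Clause([Word("which", 0.2), Word("was", 0.1), Word("parked", 1.0)], is_main=False),
])
s2 = Sentence([Clause([Word("it", 0.2), Word("was", 0.1), Word("sunny", 1.5)])])

summary = summarize([s1, s2], t_sentence=5.0, t_clause=2.0, t_modifier=1.0, max_phrases=1)
print(summary)
```

In this toy run the low-value sentence, the low-value subordinate clause, and the weak modifiers are each removed at their own threshold, and the surviving kernel phrase is generalized through the taxonomy.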
Specification