Finding relevant documents
Abstract
A programmed computer receives one or more documents that contain text that is relevant to a user (“interest documents”). The programmed computer automatically identifies groups of words that match the interest documents. The matching word groups are ranked by a weight that is assigned based on how infrequently a word group matches a reference corpus and how frequently the word group matches one or more interest document(s), in comparison to other word groups. A set of word groups is automatically identified based on ranking, and displayed to a user to select documents from a corpus. Selected documents are displayed to the user, e.g. with one or more groups of words used in selecting the documents.
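The matching step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a "group of words" is a run of one or two consecutive lowercased words and that a "match" is an exact occurrence of that run; the names `extract_groups`, `count_matches`, `interest_docs` and `corpus_docs` are hypothetical and do not come from the patent.

```python
import re
from collections import Counter


def extract_groups(text, max_len=2):
    """Return every run of 1..max_len consecutive words in the text.

    Assumption: punctuation is dropped, so a run may span a sentence boundary.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    groups = []
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            groups.append(" ".join(words[i:i + n]))
    return groups


def count_matches(groups, documents, max_len=2):
    """Count how many times each group occurs across the given documents."""
    occurrences = Counter()
    for doc in documents:
        occurrences.update(extract_groups(doc, max_len))
    # Groups that never occur get a count of 0.
    return {g: occurrences[g] for g in groups}


# Toy data, invented for illustration only.
interest_docs = ["The fitted function ranks word groups. Word groups are counted."]
corpus_docs = ["The cat sat on the mat.", "Word counts are common in the corpus."]

groups = sorted(set(extract_groups(interest_docs[0])))
first_counts = count_matches(groups, interest_docs)   # matches in the interest set
second_counts = count_matches(groups, corpus_docs)    # matches in the reference corpus
```

In this toy data, a group such as "word groups" occurs in the interest document but never in the reference corpus, which is the kind of contrast the weighting described in the abstract is meant to reward.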
20 Claims
1. A computer-implemented method comprising:
automatically extracting a plurality of groups of words from a set comprising a first document;
wherein in the plurality of groups, each group comprises a word;
automatically determining a plurality of first counts of a number of times said each group of words in said plurality matches said set;
automatically determining a plurality of second counts of the number of times said each group of words in said plurality matches a corpus of second documents;
automatically performing function fitting on at least first counts of said plurality of groups of words and corresponding second counts of said plurality of groups of words, to obtain a fitted function;
using at least one processor in automatically comparing a first count of said each group of words in the plurality of first counts to an evaluation of said fitted function at a second count of said each group of words in the plurality of second counts, to obtain a weight of said each group of words; and
automatically storing at least said weight in a computer memory coupled to said at least one processor.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A non-transitory computer-readable storage medium comprising a plurality of instructions, said instructions comprising:
instructions to automatically extract multiple groups of words from a set comprising a first document;
wherein in the multiple groups, each group comprises a word;
instructions to automatically determine a plurality of first counts of a number of times said each group of words matches said set;
instructions to automatically determine a plurality of second counts of the number of times said each group of words matches a corpus of second documents;
instructions to automatically perform function fitting on at least first counts of said multiple groups of words and corresponding second counts of said multiple groups of words, to obtain a fitted function;
instructions to at least one processor to automatically compare a first count of said each group of words in the plurality of first counts to an evaluation of said fitted function at a second count of said each group of words in the plurality of second counts, to obtain a weight of said each group; and
instructions to automatically store at least said weight in a computer memory coupled to said at least one processor.
Dependent claims: 14, 15, 16, 17, 18, 19
20. An apparatus comprising:
means for automatically extracting multiple groups of words from a set comprising a first document;
wherein in the multiple groups, each group comprises a word;
means for automatically determining a plurality of first counts of a number of times said each group of words matches said set;
means for automatically determining a plurality of second counts of the number of times said each group of words matches a corpus of second documents;
means for performing function fitting on at least first counts of said multiple groups of words and corresponding second counts of said multiple groups of words, to obtain a fitted function;
means for automatically comparing a first count of said each group of words in the plurality of first counts to an evaluation of said fitted function at a second count of said each group of words in the plurality of second counts, to obtain a weight of said each group; and
means for automatically storing at least said weight in a computer memory.
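The function-fitting and comparison steps recited in the independent claims can be illustrated with a short Python sketch. The claims do not fix the family of the fitted function or the form of the comparison, so two assumptions are made here purely for illustration: the fitted function is a least-squares straight line on log-scaled counts, and the weight is the difference between a group's log-scaled first count and the line's value at its log-scaled second count. The names `fit_line`, `weigh_groups` and `stored_weights`, and all counts, are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = mean_y - b * mean_x
    return a, b


def weigh_groups(first_counts, second_counts):
    """Weight each group by how far its first count sits above the fitted trend."""
    groups = list(first_counts)
    # Assumption: log1p scaling, so a second count of 0 is still usable.
    xs = [math.log1p(second_counts[g]) for g in groups]
    ys = [math.log1p(first_counts[g]) for g in groups]
    a, b = fit_line(xs, ys)
    # Compare each first count to the fitted function evaluated at the second count.
    return {g: y - (a + b * x) for g, x, y in zip(groups, xs, ys)}


# Invented counts: matches in the interest set (first) and in the corpus (second).
first_counts = {"the": 20, "of": 10, "and": 8, "word": 2,
                "cat": 1, "mat": 1, "sat": 1, "function fitting": 6}
second_counts = {"the": 2000, "of": 1000, "and": 800, "word": 100,
                 "cat": 20, "mat": 5, "sat": 3, "function fitting": 3}

weights = weigh_groups(first_counts, second_counts)
stored_weights = dict(weights)  # stand-in for storing the weights in memory
ranked = sorted(weights, key=weights.get, reverse=True)
```

With these invented counts, "function fitting" is frequent in the interest set but rare in the corpus, so it lies furthest above the fitted line and is ranked first. Other function families or comparisons (a ratio, for example) would fit the same claim language equally well.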
Specification