Text searching and categorization tools
Abstract
A disclosed process accesses text data that is to be mined. The text data includes text snippets. Rules are encoded in a rule base. A search request is submitted to a search request handler. The search request handler applies the rules from the rule base to the text data and associates different labels to respective text snippets in the text data in accordance with the rule base.
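The process summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Rule` class, the `apply_rules` function, the sample rules, and the choice of whole-word, case-insensitive matching are all assumptions made for illustration. Each rule pairs a label with a set of synonyms (match terms), and a snippet containing any match term is associated with that rule's label.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A label plus the synonyms (match terms) that trigger it. Hypothetical type."""
    label: str
    synonyms: list[str] = field(default_factory=list)

    def matches(self, snippet: str) -> bool:
        # Whole-word, case-insensitive matching is an assumption of this sketch;
        # the source does not specify how match terms are compared.
        return any(
            re.search(rf"\b{re.escape(term)}\b", snippet, re.IGNORECASE)
            for term in self.synonyms
        )


def apply_rules(rule_base: list[Rule], snippets: list[str]) -> dict[str, list[str]]:
    """Associate each snippet with the labels of every rule it matches."""
    return {s: [r.label for r in rule_base if r.matches(s)] for s in snippets}


# Illustrative rule base and text data (invented for this example).
rule_base = [
    Rule("billing", ["invoice", "charge", "refund"]),
    Rule("shipping", ["delivery", "package"]),
]
snippets = [
    "My invoice shows a double charge.",
    "The package arrived late.",
    "Great service overall.",
]
labeled = apply_rules(rule_base, snippets)
```

A snippet that matches no rule keeps an empty label list, which is one way a reviewer could spot gaps in the rule base.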
21 Claims
1. Machine-readable media encoded with data, the data being interoperable with a machine to cause:
accessing text data to be mined, the text data including text snippets;
encoding rules in a rule base, the encoding of a given one of the rules including a human user entering, via a computer screen displayed subject matter expert interface, freely typed text representing a given label and the encoding of the given one further including the human user entering, via a computer screen interface, freely typed text representing given synonyms, the given label and the given synonyms defining the given one of the rules at least in part;
submitting a search request to a search request handler;
the search request handler applying the rules from the rule base, including the given one, to the text data and automatically associating different labels to respective text snippets in the text data in accordance with the rule base;
displaying the text snippets and associated labels resulting from the application of the rules from the rule base on the subject matter expert interface; and
after the search request handler has at least once applied the rules, presenting to the human user, on the computer screen displayed subject matter expert interface, a revise option that can be selected by the human user through the subject matter expert interface to indicate a need to further encode the rules in the rule base, and the human user choosing the revise option and entering, via the computer screen displayed subject matter expert interface, freely typed text to thereby revise both the given label and the given synonyms;
wherein the rule input by the human user includes information encoding a rule from among the rules in the rule base to include a label and synonyms including a corresponding set of match terms, where a mined text snippet containing a match term in the corresponding set of match terms is associated with the label.
Dependent claims: 2, 3, 4, 5, 6, 18, 19, 20, 21
7. A method comprising:
accessing text data to be mined, the text data including text snippets;
encoding rules in a rule base, the encoding of a given one of the rules including a user entering, via a computer screen displayed subject matter expert interface, freely typed text representing a given label and the encoding of the given one further including the user entering, via the computer screen displayed subject matter expert interface, freely typed text representing given synonyms, the given label and the given synonyms defining the given one of the rules at least in part;
submitting a search request to a search request handler;
the search request handler applying the rules from the rule base to the text data and associating different labels to respective text snippets in the text data in accordance with the rule base;
displaying the text snippets and associated labels resulting from the application of the rules from the rule base on the subject matter expert interface; and
after the search request handler has at least once applied the rules, presenting to the human user, on the computer screen displayed subject matter expert interface, a revise option that can be selected by the human user through the subject matter expert interface to indicate a need to further encode the rules in the rule base, and the human user choosing the revise option and entering, via the computer screen displayed subject matter expert interface, freely typed text to thereby revise both the given label and the given synonyms;
wherein the rule input by the human user includes information encoding a rule from among the rules in the rule base to include a label and synonyms including a corresponding set of match terms, where a mined text snippet containing a match term in the corresponding set of match terms is associated with the label.
Dependent claims: 8, 9, 10, 11, 12
13. Apparatus comprising:
a computer screen displaying a computer screen interface;
a search request handler to access text data to be mined, the text data including text snippets;
a rule base including rules;
a rule entering interface configured to receive a given one of the rules input by a human user, via a rules selection interface displayed on the computer screen, freely typing text representing a given label and further by the human user, via the computer screen interface, freely typing text representing given synonyms, the given label and the given synonyms defining the given one of the rules at least in part;
the rule selection interface to encode, via the human user interacting with the computer screen, the rules in the rule base;
a search interface to submit a search request to the search request handler; and
an execution mechanism to apply the rules from the rule base, including the given one, to the text data and to associate different labels to respective text snippets in the text data in accordance with the rule base;
a subject matter expert interface to display the text snippets and associated labels resulting from the application of the rules from the rule base; and
a new rule input mechanism preconfigured to present via the computer screen displayed rule selection interface, after the execution mechanism has at least once applied the rules, a revise option that can be selected by the human user through the subject matter expert interface to indicate a need to further encode the rules in the rule base, and to receive, upon the human user choosing the revise option, freely typed text to thereby revise both the given label and the given synonyms;
wherein a rule in the rule base includes a label and synonyms including a corresponding set of match terms, where, when the rule is executed by the execution mechanism, a mined text snippet containing at least one match term in the given set of match terms is associated with the label corresponding to the given set of match terms.
Dependent claims: 14, 15, 16, 17
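The revise step recited in the independent claims (apply the rules at least once, review the labeled snippets, then retype both the label and its synonyms) can be sketched as a small loop. This is an illustrative assumption, not the claimed apparatus: the function names `label_snippets` and `revise_rule`, the dictionary rule base, the sample data, and the case-insensitive substring matching are all hypothetical.

```python
def label_snippets(rule_base, snippets):
    """Return (snippet, labels) pairs; a rule fires when any match term occurs in the snippet."""
    results = []
    for snippet in snippets:
        text = snippet.lower()
        # Case-insensitive substring matching is an assumption of this sketch.
        labels = [
            label
            for label, terms in rule_base.items()
            if any(term.lower() in text for term in terms)
        ]
        results.append((snippet, labels))
    return results


def revise_rule(rule_base, old_label, new_label, new_synonyms):
    """Return a new rule base in which one rule's label and synonyms are both replaced."""
    revised = {k: v for k, v in rule_base.items() if k != old_label}
    revised[new_label] = list(new_synonyms)
    return revised


# Invented sample data: the first pass shows the initial rule labels nothing.
snippets = ["Battery drains overnight", "Screen cracked after a drop"]
rule_base = {"power": ["charger"]}
first_pass = label_snippets(rule_base, snippets)

# The user picks the revise option and retypes the label and the synonyms.
rule_base = revise_rule(rule_base, "power", "battery life", ["battery", "drains", "charger"])
second_pass = label_snippets(rule_base, snippets)
```

Returning a new rule base rather than mutating the old one keeps the earlier pass reproducible, which may help when comparing results before and after a revision.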
Specification