APPARATUS, METHOD AND COMPUTER-ACCESSIBLE MEDIUM FOR EXPLAINING CLASSIFICATIONS OF DOCUMENTS
First Claim
1. A non-transitory computer readable medium including instructions thereon that are accessible by a hardware processing arrangement, wherein, when the processing arrangement executes the instructions, the processing arrangement is configured to at least generate information associated with a classification of at least one document, comprising:
identifying at least one first characteristic of the at least one document;
obtaining at least one second classification of the at least one document after removing the at least one first characteristic of the at least one document; and
generating the information associated with the classification of the at least one document based on the at least one second classification.
Abstract
This disclosure concerns the classification of collections of items such as words, referred to as “document classification,” and more specifically the explanation of a document's classification, where the document can be, for example, a web page or website. It can include an exemplary procedure, system and/or computer-accessible medium for finding explanations, as well as a framework for assessing the procedure's performance. An explanation is defined as a set of words (or, more generally, terms) such that removing the words in this set from the document changes the predicted class from the class of interest. The exemplary procedure, system and/or computer-accessible medium can be applied to the classification of web pages as containing adult content, e.g., to allow advertising on safe web pages only. The explanations can be concise and document-specific, and can provide insight into the reasons for the classification decisions, into the workings of the classification models, and into the business application itself. Other exemplary aspects describe how explaining documents' classifications can assist in improving data quality and model performance.
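The definition of an explanation given in the abstract (a set of words whose removal changes the predicted class) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy linear classifier, the `WEIGHTS` and `THRESHOLD` values, and the smallest-first exhaustive search in the hypothetical `find_explanation` function are not taken from the patent, whose actual search procedure is not shown in this section.

```python
from itertools import combinations

# Hypothetical toy model: per-word weights for a linear "adult content" score.
WEIGHTS = {"explicit": 2.0, "adult": 2.0, "casino": 0.5, "news": -1.0}
THRESHOLD = 0.0  # assumed decision threshold


def classify(words):
    """Return 1 (class of interest) if the summed weight of the distinct
    words exceeds THRESHOLD, otherwise 0."""
    score = sum(WEIGHTS.get(w, 0.0) for w in set(words))
    return 1 if score > THRESHOLD else 0


def find_explanation(words, classify_fn, max_size=3):
    """Search word sets in order of increasing size and return the first set
    whose removal changes the predicted class, or None if no set of at most
    max_size words does so."""
    original = classify_fn(words)
    vocab = sorted(set(words))
    for size in range(1, max_size + 1):
        for candidate in combinations(vocab, size):
            # Second classification: the document with the candidate removed.
            remaining = [w for w in words if w not in candidate]
            if classify_fn(remaining) != original:
                return set(candidate)
    return None


doc = "explicit adult casino news".split()
explanation = find_explanation(doc, classify)
print(explanation)
```

For this toy document no single word flips the prediction, so the search returns the two-word set containing "adult" and "explicit". The exhaustive search grows combinatorially with vocabulary size and is shown only because it makes the definition concrete.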
68 Claims
1. A non-transitory computer readable medium including instructions thereon that are accessible by a hardware processing arrangement, wherein, when the processing arrangement executes the instructions, the processing arrangement is configured to at least generate information associated with a classification of at least one document, comprising:
identifying at least one first characteristic of the at least one document;
obtaining at least one second classification of the at least one document after removing the at least one first characteristic of the at least one document; and
generating the information associated with the classification of the at least one document based on the at least one second classification.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer readable medium including instructions thereon that are accessible by a hardware processing arrangement, wherein, when the processing arrangement executes the instructions, the processing arrangement is configured to at least generate information associated with at least one classification of a collection, comprising:
identifying at least one first characteristic of the collection;
obtaining at least one second classification of the collection after removing the at least one first characteristic of the collection; and
generating the information associated with the classification of the collection based on the at least one second classification.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23-32. (canceled)
33. A method for generating information associated with at least one classification of a collection, comprising:
identifying, with a processing arrangement, at least one first characteristic of the collection;
obtaining at least one second classification of the collection after removing the at least one first characteristic of the collection; and
generating the information associated with the classification of the collection based on the at least one second classification.
Dependent claims: 67
34-54. (canceled)
55. A system configured to at least generate information associated with at least one classification of a collection, comprising:
a processing arrangement configured to:
identify at least one first characteristic of the collection;
obtain at least one second classification of the collection after removing the at least one first characteristic of the collection; and
generate the information associated with the classification of the collection based on the at least one second classification.
Dependent claims: 68
56-66. (canceled)
Specification