Semantic content searching
First Claim
1. A computer-based method for document searching by semantic content, comprising:
receiving a user selection of a first portion of a document, the first portion comprising desired semantic content;
running the document, comprising the selected first portion, through one or more classifiers to identify a first potential target document comprising a second portion;
responsive to determining that the second portion of the first potential target document does not have the desired semantic content:
receiving a user selection of a third portion of the first potential target document that comprises the desired semantic content; and
running the first potential target document, comprising the selected third portion, through the one or more classifiers to identify a second potential target document comprising a fourth portion that has the desired semantic content; and
at least one of:
receiving user input comprising an indication that a second document identified by the one or more classifiers comprises the desired semantic content;
receiving user input comprising an indication that the second document identified by the one or more classifiers does not comprise the desired semantic content;
receiving user input comprising a selected fifth portion of the second document, the selected fifth portion comprising the desired semantic content;
running a plurality of documents through the one or more classifiers until a desired document selection accuracy is reached for the desired semantic content;
running the plurality of documents through the one or more classifiers until a desired number of correct documents are retrieved without an incorrect document being retrieved;
running the plurality of documents through the one or more classifiers until a desired number of documents have been retrieved;
using a second classifier to validate a document retrieved by a first classifier; or
identifying a combination of two or more classifiers that has a desired accuracy rate for retrieving documents, and utilizing the identified combination to retrieve documents for the desired semantic content.
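The last alternative recited in the claim, identifying a combination of two or more classifiers that reaches a desired accuracy rate, might be sketched as follows. This is only an illustration under stated assumptions: the keyword classifiers, the labeled examples, the "all must agree" combining rule, and the names `accuracy`, `all_agree`, and `pick_combination` are hypothetical and do not come from the patent.

```python
from itertools import combinations


def accuracy(predict, labeled):
    """Fraction of (document, label) pairs that predict() gets right."""
    return sum(predict(doc) == label for doc, label in labeled) / len(labeled)


def all_agree(classifiers):
    """Combine classifiers so a document is retrieved only if every one accepts it."""
    return lambda doc: all(c(doc) for c in classifiers)


def pick_combination(classifiers, labeled, target):
    """Return the names of the first combination of two or more classifiers
    whose combined accuracy on the labeled examples reaches target, else None."""
    names = sorted(classifiers)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            combined = all_agree([classifiers[n] for n in combo])
            if accuracy(combined, labeled) >= target:
                return combo
    return None


# Hypothetical keyword classifiers; real classifiers would be learned models.
classifiers = {
    "mentions_jaguar": lambda d: "jaguar" in d.lower(),
    "mentions_cat": lambda d: "cat" in d.lower(),
    "no_dealership": lambda d: "dealership" not in d.lower(),
}

# Hypothetical user feedback: True means the document has the desired content.
labeled = [
    ("The jaguar is a large cat", True),
    ("The Jaguar dealership sells sedans", False),
    ("A house cat sleeps all day", False),
    ("The jaguar hunts at night", True),
]

chosen = pick_combination(classifiers, labeled, target=0.9)
print(chosen)
```

Requiring every classifier in the combination to accept a document also illustrates, in a simplified way, the alternative in which a second classifier validates a document retrieved by a first classifier.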
Abstract
One or more techniques and/or systems are disclosed that provide for document retrieval where a user can identify key attributes of potential target documents that are desirable (e.g., have a particular semantic content for the user). Further, relevant documents that comprise the desired semantic content can be retrieved. Additionally, the user can provide feedback on the retrieved documents, for example, based on key semantic concepts found in the documents, and the input can be used to update the classification. For example, this process can be iterated to improve the retrieval and precision of documents found through machine learning techniques.
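The feedback loop described in the abstract can be illustrated with a minimal sketch. It assumes a toy bag-of-words scorer in place of the unspecified classifiers; the class `FeedbackClassifier`, its methods, the corpus, and the simulated user feedback are all hypothetical and are not taken from the patent.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens; a stand-in for real feature extraction."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class FeedbackClassifier:
    """Scores documents against portions the user marked as desired or undesired."""

    def __init__(self):
        self.positive = Counter()
        self.negative = Counter()

    def add_positive(self, portion):
        self.positive.update(tokens(portion))

    def add_negative(self, portion):
        self.negative.update(tokens(portion))

    def score(self, document):
        vec = Counter(tokens(document))
        return cosine(vec, self.positive) - cosine(vec, self.negative)

    def best_candidate(self, corpus, seen):
        """Highest-scoring document the user has not yet reviewed, or None."""
        candidates = [d for d in corpus if d not in seen]
        return max(candidates, key=self.score) if candidates else None


corpus = [
    "jaguar dealership sells large jaguar sedan",
    "the jaguar is a large cat that hunts in the forest",
    "the leopard is a spotted cat that hunts at night",
]

clf = FeedbackClassifier()
clf.add_positive("large jaguar")  # user-selected portion of a seed document
first = clf.best_candidate(corpus, seen=set())

# Simulated feedback: the user flags the part of the first result that lacks
# the desired semantic content (the animal, not the car).
clf.add_negative("dealership sells sedan")
second = clf.best_candidate(corpus, seen={first})

print(first)
print(second)
```

In this toy corpus the first candidate is the dealership document; after the simulated feedback the classifier ranks the forest document higher, which mirrors the iteration the abstract describes.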
63 Citations
20 Claims
1. A computer-based method for document searching by semantic content (independent claim, recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system, comprising:
one or more processing units; and
memory comprising instructions that when executed by at least some of the one or more processing units, perform a method comprising:
receiving a user selection of a first portion of a document, the first portion comprising desired semantic content;
running the document, comprising the selected first portion, through one or more classifiers to identify a first potential target document comprising a second portion;
responsive to determining that the second portion of the first potential target document does not have the desired semantic content:
receiving a user selection of a third portion of the first potential target document that comprises the desired semantic content; and
running the first potential target document, comprising the selected third portion, through the one or more classifiers to identify a second potential target document comprising a fourth portion that has the desired semantic content; and
at least one of:
receiving user input comprising an indication that a second document identified by the one or more classifiers comprises the desired semantic content;
receiving user input comprising an indication that the second document identified by the one or more classifiers does not comprise the desired semantic content;
receiving user input comprising a selected fifth portion of the second document, the selected fifth portion comprising the desired semantic content;
running a plurality of documents through the one or more classifiers until a desired document selection accuracy is reached for the desired semantic content;
running the plurality of documents through the one or more classifiers until a desired number of correct documents are retrieved without an incorrect document being retrieved;
running the plurality of documents through the one or more classifiers until a desired number of documents have been retrieved;
using a second classifier to validate a document retrieved by a first classifier; or
identifying a combination of two or more classifiers that has a desired accuracy rate for retrieving documents, and utilizing the identified combination to retrieve documents for the desired semantic content.
Dependent claims: 15, 16
17. A non-signal computer readable storage device comprising instructions that when executed, perform a method comprising:
receiving a user selection of a first portion of a document, the first portion comprising desired semantic content;
running the document, comprising the selected first portion, through one or more classifiers to identify a first potential target document comprising a second portion;
responsive to determining that the second portion of the first potential target document does not have the desired semantic content:
receiving a user selection of a third portion of the first potential target document that comprises the desired semantic content; and
running the first potential target document, comprising the selected third portion, through the one or more classifiers to identify a second potential target document comprising a fourth portion that has the desired semantic content; and
at least one of:
receiving user input comprising an indication that a second document identified by the one or more classifiers comprises the desired semantic content;
receiving user input comprising an indication that the second document identified by the one or more classifiers does not comprise the desired semantic content;
receiving user input comprising a selected fifth portion of the second document, the selected fifth portion comprising the desired semantic content;
running a plurality of documents through the one or more classifiers until a desired document selection accuracy is reached for the desired semantic content;
running the plurality of documents through the one or more classifiers until a desired number of correct documents are retrieved without an incorrect document being retrieved;
running the plurality of documents through the one or more classifiers until a desired number of documents have been retrieved;
using a second classifier to validate a document retrieved by a first classifier; or
identifying a combination of two or more classifiers that has a desired accuracy rate for retrieving documents, and utilizing the identified combination to retrieve documents for the desired semantic content.
Dependent claims: 18, 19, 20
Specification