Tailoring question answering system output based on user expertise
First Claim
1. A method, in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions which are executed by the at least one processor and configure the processor to implement a question answering system for tailoring question answering system output based on user expertise, the method comprising:
training an expertise model, comprising:
harvesting a collection of question and answer postings;
labeling questions and answers in the collection with predetermined expertise levels;
determining a set of features associated with text of each question and answer; and
training a machine learning model based on the predetermined expertise levels and the sets of features associated with the text of the questions and answers to form the trained expertise model, wherein the trained expertise model comprises a question partition trained using questions in the collection of question and answer postings and an answer partition trained using answers in the collection of question and answer postings;
receiving, by the question answering system executing a question answering pipeline on the at least one processor of the data processing system, an input question from a questioning user;
determining, by a question and topic analysis stage of the question answering pipeline, a set of features associated with text of the input question, wherein determining the set of features associated with the text of the input question comprises extracting a plurality of features from the text of the input question using an annotation engine pipeline in the data processing system;
obtaining features from the questioning user's posting history within a collection of question and answer postings, wherein the features from the questioning user's posting history include a percentage of the questioning user's posts that are questions versus answers;
determining, by the question answering pipeline, an expertise level of the questioning user based on the set of features associated with the text of the input question and based on at least the percentage of the questioning user's posts that are questions versus answers using the question partition of the trained expertise model;
generating, by a hypothesis generation stage of the question answering pipeline, one or more candidate answers for the input question; and
tailoring, by the question answering system, output of the one or more candidate answers based on the expertise level of the questioning user.
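The training and expertise-determination steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the `Posting` record, the two toy text features (a stand-in for the annotation engine pipeline), the nearest-centroid classifier (a stand-in for the unspecified machine learning model), and the inclusion of the question percentage as a training feature are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Posting:
    """Hypothetical record for one harvested posting."""
    user: str
    kind: str        # "question" or "answer"
    text: str
    expertise: str   # predetermined label, e.g. "novice" or "expert"


def question_percentage(history: List[Posting]) -> float:
    """Fraction of a user's posts that are questions (0.0 to 1.0)."""
    if not history:
        return 0.5  # assumption: an empty history is treated as neutral
    return sum(1 for p in history if p.kind == "question") / len(history)


def text_features(text: str) -> List[float]:
    """Toy stand-in for annotation: average word length and word count."""
    words = text.split()
    if not words:
        return [0.0, 0.0]
    return [sum(len(w) for w in words) / len(words), float(len(words))]


Centroids = Dict[str, List[float]]


def train_partition(rows: List[Tuple[List[float], str]]) -> Centroids:
    """Nearest-centroid stand-in: mean feature vector per expertise label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for feats, label in rows:
        current = sums.get(label, [0.0] * len(feats))
        sums[label] = [s + f for s, f in zip(current, feats)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def train_expertise_model(collection: List[Posting]) -> Dict[str, Centroids]:
    """Train one partition on questions and a separate one on answers."""
    by_user: Dict[str, List[Posting]] = defaultdict(list)
    for p in collection:
        by_user[p.user].append(p)
    rows: Dict[str, list] = {"question": [], "answer": []}
    for p in collection:
        feats = text_features(p.text) + [question_percentage(by_user[p.user])]
        rows[p.kind].append((feats, p.expertise))
    return {kind: train_partition(r) for kind, r in rows.items()}


def estimate_expertise(model: Dict[str, Centroids], question: str,
                       history: List[Posting]) -> str:
    """Classify the questioning user with the question partition only."""
    feats = text_features(question) + [question_percentage(history)]
    centroids = model["question"]
    return min(
        centroids,
        key=lambda lab: sum((c - f) ** 2 for c, f in zip(centroids[lab], feats)),
    )
```

In this sketch the answer partition is trained but not consulted when classifying the questioning user, mirroring the claim language that the user's expertise level is determined using the question partition.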
Abstract
A mechanism is provided in a data processing system for tailoring question answering system output based on user expertise. The mechanism receives an input question from a questioning user and determines a set of features associated with text of the input question. The mechanism determines an expertise level of the questioning user based on the set of features associated with the text of the input question using a trained expertise model. The mechanism generates one or more candidate answers for the input question and tailors output of the one or more candidate answers based on the expertise level of the questioning user.
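The abstract does not state how output is tailored. One possible reading, sketched below purely as an assumption, re-ranks candidate answers so that answers pitched near the questioning user's expertise level come first. The `Candidate` record, the three-level `LEVELS` scale, and the `penalty` weight are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical ordered expertise scale; the source does not enumerate levels.
LEVELS = ["novice", "intermediate", "expert"]


@dataclass
class Candidate:
    """Hypothetical candidate answer from the hypothesis generation stage."""
    text: str
    confidence: float  # pipeline confidence score, assumed to lie in [0, 1]
    level: str         # expertise level the answer is pitched at


def tailor(candidates: List[Candidate], user_level: str,
           penalty: float = 0.2) -> List[Candidate]:
    """Re-rank candidates, penalizing distance from the user's level."""
    user_idx = LEVELS.index(user_level)

    def score(c: Candidate) -> float:
        gap = abs(LEVELS.index(c.level) - user_idx)
        return c.confidence - penalty * gap

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    answers = [
        Candidate("Tune the allocator arena size.", 0.9, "expert"),
        Candidate("Restart the service and retry.", 0.8, "novice"),
        Candidate("Check the memory limits in the config.", 0.5, "intermediate"),
    ]
    for c in tailor(answers, "novice"):
        print(c.level, c.text)
```

Other tailoring strategies, such as filtering out mismatched answers or rewording them, would fit the same interface; the re-ranking shown here is only one option.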
17 Claims
1. A method, in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions which are executed by the at least one processor and configure the processor to implement a question answering system for tailoring question answering system output based on user expertise, the method comprising:
training an expertise model, comprising:
harvesting a collection of question and answer postings;
labeling questions and answers in the collection with predetermined expertise levels;
determining a set of features associated with text of each question and answer; and
training a machine learning model based on the predetermined expertise levels and the sets of features associated with the text of the questions and answers to form the trained expertise model, wherein the trained expertise model comprises a question partition trained using questions in the collection of question and answer postings and an answer partition trained using answers in the collection of question and answer postings;
receiving, by the question answering system executing a question answering pipeline on the at least one processor of the data processing system, an input question from a questioning user;
determining, by a question and topic analysis stage of the question answering pipeline, a set of features associated with text of the input question, wherein determining the set of features associated with the text of the input question comprises extracting a plurality of features from the text of the input question using an annotation engine pipeline in the data processing system;
obtaining features from the questioning user's posting history within a collection of question and answer postings, wherein the features from the questioning user's posting history include a percentage of the questioning user's posts that are questions versus answers;
determining, by the question answering pipeline, an expertise level of the questioning user based on the set of features associated with the text of the input question and based on at least the percentage of the questioning user's posts that are questions versus answers using the question partition of the trained expertise model;
generating, by a hypothesis generation stage of the question answering pipeline, one or more candidate answers for the input question; and
tailoring, by the question answering system, output of the one or more candidate answers based on the expertise level of the questioning user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to implement a question answering system for tailoring question answering system output based on user expertise, wherein the computer readable program further causes the computing device to:
train an expertise model, comprising:
harvesting a collection of question and answer postings;
labeling questions and answers in the collection with predetermined expertise levels;
determining a set of features associated with text of each question and answer; and
training a machine learning model based on the predetermined expertise levels and the sets of features associated with the text of the questions and answers to form the trained expertise model, wherein the trained expertise model comprises a question partition trained using questions in the collection of question and answer postings and an answer partition trained using answers in the collection of question and answer postings;
receive, by the question answering system executing a question answering pipeline on at least one processor of the computing device, an input question from a questioning user;
determine, by a question and topic analysis stage of the question answering pipeline, a set of features associated with text of the input question, wherein determining the set of features associated with the text of the input question comprises extracting a plurality of features from the text of the input question using an annotation engine pipeline in the data processing system;
obtain features from the questioning user's posting history within a collection of question and answer postings, wherein the features from the questioning user's posting history include a percentage of the questioning user's posts that are questions versus answers;
determine, by the question answering pipeline, an expertise level of the questioning user based on the set of features associated with the text of the input question and based on at least the percentage of the questioning user's posts that are questions versus answers using the question partition of the trained expertise model;
generate, by a hypothesis generation stage of the question answering pipeline, one or more candidate answers for the input question; and
tailor, by the question answering system, output of the one or more candidate answers based on the expertise level of the questioning user.
Dependent claims: 10, 11, 12, 13, 14, 15
16. An apparatus comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to implement a question answering system for tailoring question answering system output based on user expertise, wherein the instructions further cause the processor to:
train an expertise model, comprising:
harvesting a collection of question and answer postings;
labeling questions and answers in the collection with predetermined expertise levels;
determining a set of features associated with text of each question and answer; and
training a machine learning model based on the predetermined expertise levels and the sets of features associated with the text of the questions and answers to form the trained expertise model, wherein the trained expertise model comprises a question partition trained using questions in the collection of question and answer postings and an answer partition trained using answers in the collection of question and answer postings;
receive, by the question answering system executing a question answering pipeline on the processor, an input question from a questioning user;
determine, by a question and topic analysis stage of the question answering pipeline, a set of features associated with text of the input question, wherein determining the set of features associated with the text of the input question comprises extracting a plurality of features from the text of the input question using an annotation engine pipeline in the data processing system;
obtain features from the questioning user's posting history within a collection of question and answer postings, wherein the features from the questioning user's posting history include a percentage of the questioning user's posts that are questions versus answers;
determine, by the question answering pipeline, an expertise level of the questioning user based on the set of features associated with the text of the input question and based on at least the percentage of the questioning user's posts that are questions versus answers using the question partition of the trained expertise model;
generate, by a hypothesis generation stage of the question answering pipeline, one or more candidate answers for the input question; and
tailor, by the question answering system, output of the one or more candidate answers based on the expertise level of the questioning user.
Dependent claims: 17
Specification