Community-based Question Answering (CQA) services let members ask questions and have them answered by the community. These services can rapidly build large archives of questions and answers, yet the information in these archives is rarely exploited. This thesis presents a new statistical topic model for Question-Answering archives. The model explicitly captures topic dependency and correlation between questions and answers, and models differences in their vocabulary. The proposed model is applied to the task of Question Answering, and its performance is evaluated on a dataset extracted from the programming website Stack Overflow. Experimental results show that it outperforms the Latent Dirichlet Allocation (LDA) model in retrieving the correct answer for a query question. The model has also been applied to Automatic Tagging, where comparisons with LDA show that it achieves better clustering performance for larger numbers of topics.
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:NSHD.ca#10222/14584 |
Date | 29 February 2012 |
Creators | Zolaktaf Zadeh, Zeinab |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |