1 |
Anomaly-Driven Belief Revision by Abductive Metareasoning
Eckroth, Joshua Ryan. 09 July 2014
No description available.
|
2 |
Self-Reflection on Chain-of-Thought Reasoning in Large Language Models / Självreflektion över Chain-of-Thought-resonerande i stora språkmodeller
Praas, Robert. January 2023
A strong capability of large language models is Chain-of-Thought reasoning. Prompting a model to "think step-by-step" has led to large performance improvements on problems such as planning and question answering, and the extended output provides some evidence about the rationale behind an answer or decision. In search of better, more robust, and more interpretable language model behavior, this work investigates self-reflection in large language models. The research question is: to what extent can feedback from large language models, such as GPT-3.5-Turbo and GPT-4, accurately distinguish between correct and incorrect answers in medical question-answering tasks that use Chain-of-Thought reasoning? Here, self-reflection consists of feedback that large language models give on medical question-answering. GPT-3.5-Turbo and GPT-4 provide zero-shot feedback scores on Chain-of-Thought reasoning over the MedQA (medical question-answering) dataset. The question-answering is evaluated on traits such as being structured, relevant, and consistent. We test whether the feedback scores differ between questions that were answered correctly and questions that were answered incorrectly by Chain-of-Thought reasoning. The differences in feedback scores are tested statistically with the Mann-Whitney U test. Graphical visualization and logistic regressions are used to determine, on a preliminary basis, whether the feedback scores are indicative of whether the Chain-of-Thought reasoning leads to the right answer. The results indicate that, among the reasoning objectives, the feedback models assign higher feedback scores to questions that were answered correctly than to those that were answered incorrectly. Graphical visualization shows potential for reviewing questions with low feedback scores, although the logistic regressions that aimed to predict whether questions were answered correctly mostly defaulted to the majority class. Nonetheless, there appears to be potential for more robust output from self-reflecting language systems.
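The group comparison described in the abstract above can be illustrated with a minimal sketch of a two-sided Mann-Whitney U test. This is not the thesis's code: the function name `mann_whitney_u`, the use of a normal approximation with tie and continuity corrections, and the two score lists are all assumptions made for illustration, and the scores are invented rather than taken from MedQA.

```python
from math import erf, sqrt


def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, two-sided p-value).

    The p-value uses the normal approximation with tie correction and a
    continuity correction; it is only a rough guide for small samples.
    """
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # every value is tied, so there is no evidence of a shift

    z = max((abs(u_x - mean_u) - 0.5) / sqrt(var_u), 0.0)
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))
    return u_x, min(p, 1.0)


# Hypothetical feedback scores (invented for illustration only).
scores_correct = [8, 9, 7, 9, 8, 10, 7, 9]
scores_incorrect = [6, 7, 5, 8, 6, 7]

u, p = mann_whitney_u(scores_correct, scores_incorrect)
print(f"U = {u}, two-sided p = {p:.4f}")
```

A rank-based test suits this setting because feedback scores are ordinal and need not be normally distributed. In practice a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be preferred over a hand-written version.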
|
3 |
Empirically-based self-diagnosis and repair of domain knowledge
Jones, Joshua K. 17 December 2009
In this work, I view incremental experiential learning in intelligent software agents as progressive agent self-adaptation. When an agent produces an incorrect behavior, then it may reflect on, and thus diagnose and repair, the reasoning and knowledge that produced the incorrect behavior. In particular, I focus on the self-diagnosis and self-repair of an agent's domain knowledge. The implementation of systems with the capability to self-diagnose and self-repair involves building both reasoning processes capable of such learning and knowledge representations capable of supporting those reasoning processes. The core issue my dissertation addresses is: what kind of metaknowledge (knowledge about knowledge) may enable the agent to diagnose faults in its domain knowledge? In providing a solution to this issue, the central contribution of this research is a theory of the kind of metaknowledge that enables a system to reason about and adapt its conceptual knowledge. For this purpose, I propose a representation that explicitly encodes metaknowledge in the form of procedures called Empirical Verification Procedures (EVPs). In the proposed knowledge representation, an EVP is associated with each concept within the agent's domain knowledge. Each EVP explicitly semantically grounds the associated concept in the agent's perception, and can thus be used as a test to determine the validity of knowledge of that concept during diagnosis.
I present the formal and empirical evaluation of a system, Augur, that makes use of EVP metaknowledge to adapt its own domain knowledge in the context of a particular subclass of classification problem that I call compositional classification, in which the overall classification task can be broken into a hierarchically organized set of subtasks. I hypothesize that EVP metaknowledge will enable a system to automatically adapt its knowledge in two ways: first, by adjusting the ways that inputs are categorized by a concept, in accordance with semantics fixed by an associated EVP; and second, by adjusting the semantics of concepts themselves when they fail to contribute appropriately to system goals. The latter adaptation is realized by altering the EVP associated with the concept in question. I further hypothesize that the semantic grounding of domain concepts in perception through the use of EVPs will increase the generalization power of a learner that operates over those concepts, and thus make learning more efficient. Beyond the support of these hypotheses, I also present results pertinent to the understanding of learning in compositional classification settings using structured knowledge representations.
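As a rough illustration of the idea in the abstract above, the sketch below attaches a verification procedure to each concept in a small classification hierarchy and uses it to localize a fault. It is not Augur's algorithm: the `Concept` class, the `diagnose` function, the blame rule, and the toy weather concepts are all hypothetical, and each EVP is stubbed as a lookup of an observed value in the percept.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Concept:
    name: str
    # Maps child outputs (or the raw percept, for a leaf) to a predicted value.
    classify: Callable[[List[Any]], Any]
    # Hypothetical EVP: obtains the concept's true value from perception.
    evp: Callable[[Dict[str, Any]], Any]
    children: List["Concept"] = field(default_factory=list)
    last_output: Any = None

    def run(self, percept: Dict[str, Any]) -> Any:
        """Classify bottom-up, caching each node's output for later diagnosis."""
        if self.children:
            inputs = [child.run(percept) for child in self.children]
        else:
            inputs = [percept]
        self.last_output = self.classify(inputs)
        return self.last_output


def diagnose(node: Concept, percept: Dict[str, Any]) -> List[str]:
    """Return names of concepts whose own knowledge appears to be at fault.

    Assumed blame rule: a node that agrees with its EVP is trusted; a node
    that disagrees is blamed only if none of its children can be blamed.
    """
    if node.last_output == node.evp(percept):
        return []
    blamed = [name for child in node.children for name in diagnose(child, percept)]
    return blamed or [node.name]


# Toy hierarchy: "uncomfortable" is composed of "hot" and "humid".
hot = Concept("hot",
              classify=lambda ins: ins[0]["temp"] > 35,  # deliberately bad threshold
              evp=lambda p: p["observed_hot"])
humid = Concept("humid",
                classify=lambda ins: ins[0]["humidity"] > 60,
                evp=lambda p: p["observed_humid"])
root = Concept("uncomfortable",
               classify=lambda ins: ins[0] and ins[1],
               evp=lambda p: p["observed_uncomfortable"],
               children=[hot, humid])

percept = {"temp": 30, "humidity": 80, "observed_hot": True,
           "observed_humid": True, "observed_uncomfortable": True}

root.run(percept)
print(diagnose(root, percept))  # → ['hot']
```

Here the top-level prediction is wrong, but the fault is traced to the `hot` leaf, whose output disagrees with its own EVP, while `humid` passes verification. Repairing the blamed concept, or altering its EVP as the abstract describes, is outside the scope of this sketch.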
|