
‘How can one evaluate a conversational software agent framework?’

This paper presents a critical evaluation framework for a linguistically oriented conversational software agent
(CSA) (Panesar, 2017). The CSA prototype investigates the integration, intersection, and interface of
language, knowledge, speech act constructions (SAC) based on a grammatical object (Nolan, 2014), the
sub-model of beliefs, desires, and intentions (BDI) (Rao and Georgeff, 1995), and dialogue management (DM) for
natural language processing (NLP). A long-standing issue in NLP CSA systems is refining the accuracy of
interpretation so as to provide realistic dialogue in support of human-to-computer communication.
The prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory,
Role and Reference Grammar (RRG) (Van Valin Jr, 2005); (2) an agent cognitive model with two inner models:
(a) a knowledge representation model employing conceptual graphs serialised to the Resource Description Framework
(RDF), and (b) a planning model underpinned by BDI concepts (Wooldridge, 2013), intentionality (Searle,
1983), and rational interaction (Cohen and Levesque, 1990); and (3) a dialogue model employing common
ground (Stalnaker, 2002).
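To make the planning sub-model of phase model (2) concrete, the following is a minimal sketch of a BDI deliberation cycle in Java (the prototype's language). All names here (BdiAgent, perceive, deliberate, the "impossible:" belief convention) are illustrative assumptions, not details taken from the paper:

```java
import java.util.*;

// Hypothetical sketch of a BDI sub-model: beliefs are ground facts,
// desires are candidate goals, and an intention is the goal the agent
// commits to pursuing (in the spirit of rational interaction).
// Class and method names are invented for illustration.
public class BdiAgent {
    private final Set<String> beliefs = new HashSet<>();
    private final List<String> desires = new ArrayList<>();
    private String intention; // the goal the agent has committed to

    public void perceive(String fact) { beliefs.add(fact); }
    public void adoptDesire(String goal) { desires.add(goal); }

    // Deliberation: commit to the first desire not contradicted by beliefs.
    public Optional<String> deliberate() {
        for (String goal : desires) {
            if (!beliefs.contains("impossible:" + goal)) {
                intention = goal;
                return Optional.of(intention);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        BdiAgent agent = new BdiAgent();
        agent.perceive("utterance:book-flight");
        agent.adoptDesire("answer-query");
        System.out.println(agent.deliberate().orElse("no-intention"));
    }
}
```

In a full system, deliberation would of course weigh many beliefs and competing desires; the loop above only shows the commit-to-an-intention step that distinguishes BDI agents from simple reactive ones.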
The evaluation approach for this Java-based prototype and its phase models is multi-pronged, driven by
grammatical testing (English-language utterances), software engineering practice, and agent practice. A set of
evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection,
and integration of all phase models and their inner models. This multi-pronged approach encompasses checking
performance both at internal processing stages within each model and in post-implementation assessments of the
goals of RRG, together with RRG-specific tests.
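The grammatical-testing idea can be sketched as a tiny harness that pairs English utterances with the speech act construction (SAC) label the prototype is expected to produce. The utterances, labels, and the classify() stub below are invented for illustration; the real prototype performs a full RRG parse rather than the surface heuristics shown here:

```java
import java.util.*;

// Illustrative SAC test harness: each utterance in the test set is
// paired with an expected label, and the harness counts matches.
public class SacTestHarness {
    // Stand-in for the prototype's RRG parse + SAC identification;
    // these surface heuristics are placeholders only.
    static String classify(String utterance) {
        if (utterance.endsWith("?")) return "ASK";
        if (utterance.startsWith("Please")) return "REQUEST";
        return "ASSERT";
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("What time is the flight?", "ASK");
        expected.put("Please book the flight.", "REQUEST");
        expected.put("The flight is booked.", "ASSERT");

        int passed = 0;
        for (Map.Entry<String, String> e : expected.entrySet()) {
            if (classify(e.getKey()).equals(e.getValue())) passed++;
        }
        System.out.println(passed + "/" + expected.size() + " utterances passed");
    }
}
```

Grouping such utterance/label pairs per phase model mirrors the abstract's per-model grouping of evaluation criteria.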
The empirical evaluations confirm the CSA as a proof of concept, demonstrating RRG's fitness for purpose in
describing and explaining phenomena, language processing and knowledge, and its computational adequacy.
Conversely, the evaluations identify the complexity of the lower-level computational mappings from natural
language to the agent ontology, with semantic gaps that are further addressed by a lexical bridging
consideration (Panesar, 2017).

Identifier: oai:union.ndltd.org:BRADFORD/oai:bradscholars.brad.ac.uk:10454/18136
Date: 07 October 2020
Creators: Panesar, Kulvinder
Source Sets: Bradford Scholars
Language: English
Type: Conference paper, Published version
Rights: (c) 2018 The Author. Full-text reproduced with author permission.
