Global ETD Search

1	Evaluation of Effective XML Information Retrieval
Pehcevski, Jovan, jovanp@cs.rmit.edu.au
January 2007

XML is being adopted as a common storage format in scientific data repositories, digital libraries, and on the World Wide Web. Accordingly, there is a need for content-oriented XML retrieval systems that can efficiently and effectively store, search and retrieve information from XML document collections. Unlike traditional information retrieval systems where whole documents are usually indexed and retrieved as information units, XML retrieval systems typically index and retrieve document components of varying granularity. To evaluate the effectiveness of such systems, test collections where relevance assessments are provided according to an XML-specific definition of relevance are necessary. Such test collections have been built during four rounds of the INitiative for the Evaluation of XML Retrieval (INEX).

There are many different approaches to XML retrieval; most approaches either extend full-text information retrieval systems to handle XML retrieval, or use database technologies that incorporate existing XML standards to handle both XML presentation and retrieval. We present a hybrid approach to XML retrieval that combines text information retrieval features with XML-specific features found in a native XML database. Results from our experiments on the INEX 2003 and 2004 test collections demonstrate the usefulness of applying our hybrid approach to different XML retrieval tasks.

A realistic definition of relevance is necessary for meaningful comparison of alternative XML retrieval approaches. The three relevance definitions used by INEX since 2002 comprise two relevance dimensions, each based on topical relevance. We perform an extensive analysis of the two INEX 2004 and 2005 relevance definitions, and show that assessors and users find them difficult to understand. We propose a new definition of relevance for XML retrieval, and demonstrate that a relevance scale based on this definition is useful for XML retrieval experiments.

Finding the appropriate approach to evaluate XML retrieval effectiveness is the subject of ongoing debate within the XML information retrieval research community. We present an overview of the evaluation methodologies implemented in the current INEX metrics, which reveals that the metrics follow different assumptions and measure different XML retrieval behaviours. We propose a new evaluation metric for XML retrieval and conduct an extensive analysis of the retrieval performance of simulated runs to show what is measured. We compare the evaluation behaviour obtained with the new metric to the behaviours obtained with two of the official INEX 2005 metrics, and demonstrate that the new metric can be used to reliably evaluate XML retrieval effectiveness.

To analyse the effectiveness of XML retrieval in different application scenarios, we use evaluation measures in our new metric to investigate the behaviour of XML retrieval approaches under the following two scenarios: the ad-hoc retrieval scenario, exploring the activities carried out as part of the INEX 2005 Ad-hoc track; and the multimedia retrieval scenario, exploring the activities carried out as part of the INEX 2005 Multimedia track. For both application scenarios we show that, although different values for retrieval parameters are needed to achieve the optimal performance, the desired textual or multimedia information can be effectively located using a combination of XML retrieval approaches.

Keywords: XML information retrieval; native XML database; hybrid XML retrieval; relevance; application scenarios; INEX
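
As a rough illustration of the element-level retrieval described in the abstract above (indexing and returning document components of varying granularity rather than whole documents), the following Python sketch scores every element of a small XML document against a query. It is not the hybrid approach from the thesis: the sample document, the positional path notation, the `search` function and the term-density score are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, invented for this sketch.
DOC = """<article>
  <title>XML retrieval</title>
  <sec><p>Element retrieval returns components.</p><p>Whole documents are not required.</p></sec>
  <sec><p>Databases store XML natively.</p></sec>
</article>"""


def element_paths(root):
    """Yield (path, element) for every element, using simple positional paths."""
    stack = [(f"/{root.tag}[1]", root)]
    while stack:
        path, elem = stack.pop()
        yield path, elem
        counts = Counter()
        for child in elem:
            counts[child.tag] += 1
            stack.append((f"{path}/{child.tag}[{counts[child.tag]}]", child))


def tokens(text):
    """Lower-case whitespace tokenisation with trailing punctuation removed."""
    return [t.strip(".,;:").lower() for t in text.split() if t.strip(".,;:")]


def search(root, query):
    """Return (score, path) pairs for elements containing at least one query term.

    The score is the fraction of an element's words that match the query,
    an assumed toy measure that favours small, focused elements.
    """
    query_terms = set(tokens(query))
    results = []
    for path, elem in element_paths(root):
        words = tokens(" ".join(elem.itertext()))
        if not words:
            continue
        hits = sum(1 for w in words if w in query_terms)
        if hits:
            results.append((hits / len(words), path))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    for score, path in search(ET.fromstring(DOC), "element retrieval"):
        print(f"{score:.2f}  {path}")
```

Because every element is a candidate answer, the same query can match a paragraph, its enclosing section and the whole article at once. Deciding which of these overlapping components to return, and how to assess them, is the kind of question the relevance definitions and metrics discussed in the abstract address.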
2	Knowledge mining from web searches that do not lead to data accesses, and evaluation of retrieval performance (Εξόρυξη γνώσης από αναζητήσεις στον παγκόσμιο ιστό που δεν καταλήγουν σε προσπελάσεις δεδομένων και αξιολόγηση της απόδοσης ανάκτησης)
Κουμπούρη, Αθανασία
04 December 2012

The lack of user activity on search results was until recently perceived as a sign of user dissatisfaction with retrieval performance, and such inactivity often led to the search being labelled as failed (negative search abandonment). However, recent studies suggest that some searches can be satisfied by the content of the results displayed to the user, without the need to click on any of them (positive search abandonment); they thus emphasize the need to discriminate between successful and failed searches that have no follow-up clicks.

In this work we propose the design and implementation of a methodology for assessing user satisfaction with the results of searches that are not followed by visits to the content of the retrieved results. To achieve this goal we conducted a user study that investigates the intentions behind queries with no follow-up clicks on any of the returned results, and identifies the search tasks that can be accomplished successfully based entirely on the information provided on the results page. Additionally, we developed an instrumented browser, QWC Browser, to record a variety of measures of user activity with web information retrieval systems after query submission.

Building on the widely accepted idea of using user activity as an indicator of implicit relevance judgments, we examined whether there is an association between explicit judgments of user satisfaction and implicit measures of user interest, in order to understand which implicit measures are most strongly associated with user satisfaction. Finally, we used Bayesian modeling techniques to develop predictive models that capture user satisfaction with searches that are not followed by clicks on the retrieved results.

Keywords: Positive search abandonment; Implicit feedback; Explicit feedback; User satisfaction; Retrieval effectiveness; User activity; 025.042 52
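
To make the last step of the abstract above more concrete, the sketch below trains a small Bernoulli naive Bayes classifier that predicts satisfaction for a click-free search from binary implicit measures. Naive Bayes is used here only as one simple Bayesian classifier; the abstract does not say which Bayesian model the thesis used. The three features (scrolled the results page, long dwell time, reformulated the query) and the six training rows are hypothetical toy values invented for this example, not data from the study.

```python
import math

# Hypothetical binary features per abandoned search:
# (scrolled_results, long_dwell, reformulated_query)
TOY_ROWS = [
    (1, 1, 0), (1, 1, 0), (0, 1, 0),   # labelled "satisfied"
    (1, 0, 1), (0, 0, 1), (0, 0, 1),   # labelled "dissatisfied"
]
TOY_LABELS = ["satisfied"] * 3 + ["dissatisfied"] * 3


def train(rows, labels):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing."""
    model = {}
    n_features = len(rows[0])
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = len(members) / len(rows)
        # Smoothed probability that each feature equals 1 within this class.
        probs = [
            (sum(r[j] for r in members) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, probs)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_score(cls):
        prior, probs = model[cls]
        return math.log(prior) + sum(
            math.log(p if x else 1 - p) for p, x in zip(probs, row)
        )
    return max(model, key=log_score)


if __name__ == "__main__":
    model = train(TOY_ROWS, TOY_LABELS)
    print(predict(model, (1, 1, 0)))
    print(predict(model, (0, 0, 1)))
```

In a setting like the one the abstract describes, the explicit satisfaction judgments collected in the user study would play the role of the labels, and the measures logged by the instrumented browser would supply the features.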
