41 |
Entwurf und Implementierung eines Frameworks zur Analyse und Evaluation von Verfahren im Information Retrieval / Design and Implementation of a Framework for the Analysis and Evaluation of Information Retrieval Methods
Wilhelm, Thomas 13 August 2008
This diploma thesis briefly introduces the topic of information retrieval, with a focus on
evaluation and evaluation campaigns. Then, starting from the shortcomings of an
existing retrieval system, a new retrieval framework for the experimental evaluation
of information retrieval approaches is designed and implemented.
The components of the framework are kept abstract enough that various existing
retrieval systems, such as Apache Lucene or Terrier, can be integrated.
A reference implementation for the ImageCLEF Photographic Retrieval
Task of the ImageCLEF track of the Cross Language Evaluation Forum is used to verify
and confirm that the framework works.
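The idea of abstract components that wrap existing engines can be sketched as follows. This is only an illustration under stated assumptions: the names `RetrievalBackend`, `TermOverlapBackend` and `run_experiment` are hypothetical and are not taken from the thesis, and the toy backend stands in for a real adapter around an engine such as Lucene or Terrier.

```python
from abc import ABC, abstractmethod


class RetrievalBackend(ABC):
    """Hypothetical adapter interface; not the thesis's actual API."""

    @abstractmethod
    def index(self, documents: dict[str, str]) -> None:
        """Build an index from a mapping of document id to text."""

    @abstractmethod
    def search(self, query: str, k: int = 10) -> list[str]:
        """Return up to k document ids, best match first."""


class TermOverlapBackend(RetrievalBackend):
    """Toy stand-in for a real engine: ranks by number of shared terms."""

    def __init__(self) -> None:
        self._docs: dict[str, set[str]] = {}

    def index(self, documents: dict[str, str]) -> None:
        self._docs = {
            doc_id: set(text.lower().split())
            for doc_id, text in documents.items()
        }

    def search(self, query: str, k: int = 10) -> list[str]:
        terms = set(query.lower().split())
        scored = [(len(terms & words), doc_id)
                  for doc_id, words in self._docs.items()]
        # Drop documents with no overlap; sort by score, then by id for stability.
        ranked = sorted((s for s in scored if s[0] > 0),
                        key=lambda s: (-s[0], s[1]))
        return [doc_id for _, doc_id in ranked[:k]]


def run_experiment(backend: RetrievalBackend,
                   documents: dict[str, str],
                   queries: dict[str, str],
                   k: int = 10) -> dict[str, list[str]]:
    """Index once, then run every query against the chosen backend."""
    backend.index(documents)
    return {qid: backend.search(text, k) for qid, text in queries.items()}
```

Because `run_experiment` depends only on the abstract interface, swapping in a different backend does not change the experiment code, which is the property the abstract describes.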
|
42 |
中國近現代思想及文學史資料庫檢索系統 / An Information Retrieval and Analysis System for Historical and Literary Documents in Chinese
孫暐, Sun, Wei Unknown Date
Digital humanities has become an important trend in the application of digital technology in recent years. It is an emerging field that combines digital technology with humanities research: information science techniques, applied to large volumes of digitized historical and literary material, help scholars carry out deeper humanities research and uncover phenomena or new interpretations that could not be observed by manual, item-by-item comparison before digitization.
This study builds on the large body of historical and literary material in the Database for the Study of Modern Chinese Thought and Literature to construct a retrieval system that helps scholars analyse historical concepts and phenomena in modern Chinese history more efficiently. With the system, researchers use representative search terms and search conditions to retrieve the set of documents relevant to the topic under study. They then use the PAT tree (Patricia tree) technique together with their own expertise to extract specialised keywords for the topic, and run time-series analyses on keywords or keyword combinations of interest, which lets them view the material from different angles and arrive at new trends and conclusions.
We asked users with backgrounds in historical and literary work to use the system and to rate their satisfaction with it. Most users gave favourable ratings, which indicates that the system can be of some benefit to researchers conducting this kind of study.
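The two analysis steps described above, candidate keyword extraction and time-series analysis of one or more keywords, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the brute-force substring count is a simple stand-in for the PAT tree technique and is not the system's method, and the function names and the `(year, text)` document format are hypothetical.

```python
from collections import Counter


def frequent_substrings(texts, min_len=2, max_len=4, min_count=2):
    """Count every substring of length min_len..max_len across the texts.

    Brute-force stand-in for PAT tree extraction: a PAT tree indexes
    the same repeated strings far more compactly.
    """
    counts = Counter()
    for text in texts:
        for n in range(min_len, max_len + 1):
            for i in range(len(text) - n + 1):
                counts[text[i:i + n]] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def keyword_time_series(documents, keywords):
    """Count, per year, the documents that contain every given keyword.

    documents: iterable of (year, text) pairs (hypothetical format).
    Substring matching is used because Chinese text has no word spaces.
    """
    counts = Counter()
    for year, text in documents:
        if all(keyword in text for keyword in keywords):
            counts[year] += 1
    return dict(sorted(counts.items()))
```

Passing a single keyword gives its yearly document frequency; passing two gives the yearly co-occurrence count, which corresponds to the keyword-combination analysis mentioned in the abstract. The candidate list from `frequent_substrings` would still need the expert filtering the abstract describes.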
|
43 |
Using the informational processing paradigm to design commercial rumour response strategies on the World Wide Web
Howell, Gwyneth Veronica James January 2006
[Truncated abstract] Rumours can lead to unpredictable events: the manner in which an organisation responds to a commercial rumour can alter its reputation, and can affect its profitability as well as, ultimately, its survival. Commercial rumours are now a prominent feature of the business environment. They can emerge from organisational change, pending workforce layoffs, mergers, and changes to management. In addition, commercial rumours can lower morale and undermine productivity. There are several well-known examples of commercial rumours that have been, or continue to be, circulated. Commercial rumours are typically about either a conspiracy or a contamination issue. Conspiracy rumours usually target organisational practices or policies which stakeholders identify as undesirable. This form of rumour is often precipitated by situations where people do not have all the information, for example the rumour about Procter & Gamble being run by the Moonies. Snapple, the soft drink company, was rumoured in 1992 to be supporting the Ku Klux Klan in closing abortion clinics. Contamination rumours are wide-ranging and typically have a revulsion theme, such as McDonald’s "worms in the burger", Pop Rocks candies which exploded in the stomach, and poison in Herron’s paracetamol . . . Marketers suggest that web sites represent the future of marketing communications on the Internet. The key implication of this study for organisations is that, when faced with a negative rumour, specific and selected Web pages can be used to manage how the company’s stakeholders recall the rumour, and organisational stakeholders can be persuaded by the company’s rumour response strategies.
|
44 |
Modelling proactive behaviour of conversational interfaces
L'Abbate, Marcello. Unknown Date
Techn. University, Diss., 2006--Darmstadt.
|
45 |
Adequacy of project based financial management systems of small and medium construction enterprises in Botswana
Ssegawa-Kaggwa, Joseph 10 1900
The thesis documents the findings of a study conducted to develop a project based financial
management system (PBFMS), whose role was viewed as contributing to the successful
delivery of projects and thereby to improved financial performance of small and medium
construction enterprises (SMCEs). In particular, the PBFMS was viewed as a facilitator
(function) for the efficient and effective conduct of the strategic management, project
planning and control processes. Thus an adequate PBFMS was seen as one which facilitates
the efficient and effective delivery of projects with a view to enhancing enterprise
performance. In pursuit of this aim, theory and practices relating to the development, operation
and use of a PBFMS were investigated and analysed from both literature and field work,
and the findings are reported in the thesis. In addition, the actual financial management
systems of SMCEs were investigated to determine the extent to which their attributes match
those of the proposed PBFMS model.
The motivation for the study came from three aspects observed in
Botswana. The first was the frequently documented poor delivery of projects: for a
sustained period of time, projects were being delivered beyond stipulated times, above agreed
cost, and below specified quality. In the worst cases, projects were being abandoned at
various stages of execution before completion. Secondly, the investigation was prompted
by the frequent financial failures of enterprises being recorded in the construction
industry. Thirdly, the conduct of the proprietors of the construction enterprises was also
frequently suspect, particularly in matters relating to financial management.
Thus in pursuing the study, a number of premises were made. Firstly, the financial
management systems of the SMCEs were considered inadequate to fulfil their functions, that
is, they were incapable of facilitating the strategic management, project planning and control
processes. It was also speculated that the management of SMCEs was not committed to the
PBFMS, i.e. it did not participate or get involved, and did not comply with the policies
regarding the planning, development and operation of financial management systems. As a
result, PBFMS were unable to play their role of facilitating the successful delivery of
projects for improved contribution to the financial performance of SMCEs. The second
premise was that the financial models available are either too generic to guide SMCEs in financial
management matters, or their strategic component is not linked to the operational plans needed to
execute the strategy. Those meant for construction enterprises normally
prescribe practices for project planning and control without including the strategic element, or
vice versa. In essence there is a gap in each of the models available for use by the SMCEs. The
results of the thesis contribute to closing this perceived gap in knowledge.
The study aimed at answering two research
questions: (i) Do SMCEs have adequate PBFMS that facilitate the effective delivery of
projects for enhanced financial performance? and (ii) Is there a relationship between the
adequacy of PBFMS and the poor performance of SMCEs? To help answer these two
questions, two hypotheses were formulated, namely: Ho1: The PBFMS of SMCEs are adequate
to facilitate the delivery of projects; and Ho2: The adequacy of the PBFMS is positively
correlated with the performance of SMCEs. To test the two hypotheses a research process was
planned and executed in several steps.
Firstly, a survey strategy using a questionnaire was selected as the most appropriate method
to provide a snapshot of the existence of attributes of PBFMS and to investigate associated
practices relating to their development and operation. The method was considered the most
appropriate and effective for gathering a large amount of data in a short space of time, in line with the
doctoral time frame. Construction enterprises registered with the Public Procurement and
Asset Disposal Board (PPADB) for building and civil work in classes A, B, C and D were
surveyed. The internal quantity surveyor, estimator or accountant was requested to respond
on matters relating to PBFMS on behalf of the SMCEs. The sampling frame of
SMCEs considered for the study was obtained from the two government departments which work
closely with PPADB, the Department of Building and Engineering Services (DBES) and the
Department of Roads (DR). The sample size for each group category (small and medium)
was determined using the Krejcie and Morgan (1970) table. Stratified and systematic random
sampling was used to select the members of the study sample from the
sampling frame. The second step was to design the questionnaire to probe the three aspects
identified as constituting the PBFMS, namely strategic management; project planning and
control; and management commitment. Essentially the questionnaire sought to investigate the knowledge, tools, techniques, practices, opinions and attitudes of those who design, develop,
operate and use the PBFMS in the SMCEs. To ensure a high quality design, the questionnaire
was given to experts in the subject area for comments on its suitability and was
also piloted on four enterprises. The data collected were analysed mainly using SPSS software,
applying various statistical techniques including cross-tabs, ratio analysis, t-tests
and correlational tests.
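The Krejcie and Morgan (1970) table mentioned above is generated from a closed-form formula, so the sample size for any population can be recomputed directly. The sketch below uses the published defaults (chi-square of 3.841 for one degree of freedom at 95% confidence, population proportion 0.5, margin of error 0.05). The function name is hypothetical, and the populations in the test values are table entries, not the sampling-frame sizes of this study, which the abstract does not state.

```python
def krejcie_morgan_sample_size(population, chi_sq=3.841, p=0.5, d=0.05):
    """Required sample size s for a finite population of size N.

    s = X^2 * N * P * (1 - P) / (d^2 * (N - 1) + X^2 * P * (1 - P))
    """
    numerator = chi_sq * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return round(numerator / denominator)
```

In a stratified design such as the one described, the function would be applied once per stratum (small and medium enterprises), using each stratum's own population size.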
A total of 101 completed questionnaires were received, made up of 55% and 46% small and
medium enterprises, respectively. The demographic profile of SMCEs confirmed some of the
expected results; for example, the majority (59%) of the respondents were owner/managers,
confirming the dominance of the owner in SMCEs. The majority of SMCEs (59%) were more than
9 years old, with medium enterprises being more mature (60% older than 9 years) than
small enterprises (49% older than 9 years). The majority (56%) of SMCEs had 10 or more
employees, with medium sized enterprises having more employees (75% with 10 or more)
than small sized enterprises (42% with 10 or more). SMCEs performed more building
work alone (48%) than both building and civil work (48%) or maintenance (11%), and no
enterprise performed civil work alone (0%). The majority of SMCEs (65%) acted as main
contractors as opposed to sub-contractors, though as expected sub-contracting was seen more
in small (20%) than medium (10%) enterprises. Lastly, the public sector (central and local
authorities) provided the majority (65%) of the SMCEs' jobs. However, if parastatals, which are
wholly owned by government, are added, the public sector job market adds up to 73%
(65%+8%).
Testing the two major hypotheses led to the following conclusions. The
results indicated that the first hypothesis was supported, that is, in a majority of SMCEs
operating in Botswana the PBFMS were found to be adequate in facilitating the delivery
of projects. The results were therefore not in agreement with the basic premise made at the
commencement of the study. The finding suggests that SMCEs in Botswana have
adequate systems that support efficient and effective project planning and control.
Secondly, management is committed to the 'welfare' of the PBFMS in terms of complying with
and supporting their development and operation. However, like any human endeavour, the PBFMS have weaknesses; for example, they were found inadequate in facilitating the
strategic management process, including a failure to link that process to the operational process
in order to execute the strategy. They were also found weak in one of the most crucial processes
of project management, that of project control.
The second major investigation showed a weak link between the adequacy of a PBFMS
and performance. The results also indicated that the SMCEs which had
adequate PBFMS performed better than their counterparts. The first result was not
surprising, since the causes of poor performance were shown to rest on three pillars (business
environment, client/representatives and enterprise factors). However, the second result
emphasises that SMCEs with adequate PBFMS posted better performance than their
counterparts with inadequate systems. In this way the role of PBFMS in contributing to better
performance was illustrated by the results.
Some recommendations are proposed on the basis of the findings, together with ways to achieve a deeper
understanding of the subject. Firstly, SMCEs should pay more attention to matters pertaining
to strategic management to ensure a long-term view of their enterprises. Secondly, when a
strategic plan is developed, it must be implemented through operational plans as a means of
executing the strategy. Thirdly, a concerted effort should be made to ensure that projects are
controlled, as it is the only way to achieve sustained profitability and satisfied customers.
Fourthly, to provide a deeper understanding of the subject, a
longitudinal study could be undertaken to yield a more encompassing investigation than a
cross-sectional study, which captured only one business cycle of the industry (a downturn).
Lastly, the study could be replicated in another industry with a profile similar to that of the
construction industry in Botswana, for example in Namibia, and/or the study could include large
enterprises to provide a means of comparing the different profiles of enterprises. / Business Management / D. B. L.
|
46 |
A política de indexação na perspectiva do conhecimento organizacional / Indexing Policy from the Perspective of Organizational Knowledge
Rubi, Milena Polsinelli [UNESP] 13 May 2004
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / The indexing policy should consist of strategies suited to achieving the retrieval objectives of the information system. From the system's point of view, indexing is recognized as the most important of the procedures carried out in information processing, because it conditions the results of search strategies. The indexer's primary function is to understand the document by performing a conceptual analysis that adequately represents its content, so that the index corresponds to the subject searched for by the user. To this end there are indexing manuals, which should reflect the information system's indexing policy and the reality of the indexer's work. However, because the literature on indexing policy is scarce, we sought further input on the topic from the indexer's experience. Our operational objectives are to analyse the indexer's context and to investigate the indexer's knowledge of indexing policy using the methodology of reading as a social event/group verbal protocol. In this way we can pursue our final objective, which is to try to fill a theoretical gap on indexing policy. A first round of data collection was carried out with three indexers from two university libraries in Marília - SP, in the areas of Medicine and Law, which are not subordinate to any larger information system. This collection aimed to verify the applicability of the methodology for the purposes of this dissertation and to serve as a model for how the method is applied and how the data collection is conducted. A second round of data collection was then carried out with two indexers and two managers from two university libraries in the area of Dentistry, considered basic centres by the Sistema de Informação Especializado em Odontologia (SIEO), to which they are subordinate. The results obtained from the analysis...
|
47 |
Recuperação da informação: estudo da usabilidade na base de dados Public Medical (PUBMED) / Information Retrieval: A Usability Study of the Public Medical (PubMed) Database
Coelho, Odete Máyra Mesquita 21 February 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This study investigates how resident doctors understand the process of information retrieval in the Public Medical (PubMed) database, taking into account usability aspects of human-computer interaction, the resources available, and the level of user satisfaction in the search process. The theoretical framework relates the concepts of information and of information for the health field, then addresses information retrieval systems and databases, and enters the field of information architecture in order to evaluate the usability of these information sources. The methodological approach is exploratory research. Its first phase consisted of a heuristic evaluation of the PubMed database interface using the guidelines proposed by Nielsen and Tahir (2002). The results of this analysis show that, although these guidelines were designed for building homepages, thirty-eight of them suit the PubMed interface. It is therefore inferred that these guidelines can be used for the heuristic evaluation of databases in the health field. Regarding the usability of this database, the interface has a well-structured architecture, is friendly and objective, and offers numerous possibilities for searching and retrieving information. The second phase of the empirical study applied prospective usability tests to measure the satisfaction of the database's users.
These tests used a semi-structured questionnaire administered to resident doctors in the Internal Medicine specialty at the Walter Cantídio University Hospital of the Federal University of Ceará, totalling 36% of participants. The results of this phase show good performance and good user satisfaction with PubMed's usability, since it allows users to achieve their research goals effectively and efficiently, even though they do not know all the search and retrieval resources that the database offers.
|
48 |
O vocabulário controlado como instrumento de organização e representação da informação na FINEP / The Controlled Vocabulary as an Instrument for Organizing and Representing Information at FINEP
Almeida, Tatiana de 30 March 2011
The volume of information and the complexity of processing it for retrieval indicate the importance of studies on search tools. This work analyses the FINEP Controlled Vocabulary (VCF) as an instrument for organizing and representing the information of a company that holds a significant collection of information on the funding of research proposals in science, technology and innovation in the country. It presents FINEP as the study environment, in the context of information retrieval systems. The VCF is analysed through a historical and methodological account of the vocabulary's construction, highlighting aspects of its conception and development stages and the main changes over time. The work investigates the feasibility of applying a categorization process to the descriptors in use, as a contribution to the current phase of re-evaluating and restructuring the vocabulary. The results indicate that applying the categorization method to the VCF is feasible, and emphasize the fundamental importance of descriptor definitions as an element of analysis in the categorization process.
|
49 |
A doctor-patient communication tool (DPCT) Ryodoroku application on the web
Bi, Hongwei 01 January 2002
No description available.
|
50 |
A Generic Approach to Component-Level Evaluation in Information Retrieval
Kürsten, Jens 19 November 2012
Research in information retrieval deals with the theories and models that constitute the foundations for any kind of service that provides access or pointers to particular elements of a collection of documents in response to a submitted information need. The specific field of information retrieval evaluation is concerned with the critical assessment of the quality of search systems. Empirical evaluation based on the Cranfield paradigm using a specific collection of test queries in combination with relevance assessments in a laboratory environment is the classic approach to compare the impact of retrieval systems and their underlying models on retrieval effectiveness.
In the past two decades international campaigns, like the Text Retrieval Conference, have led to huge advances in the design of experimental information retrieval evaluations. In general, however, the focus of this system-driven paradigm has remained on the comparison of system results, i.e. retrieval systems are treated as black boxes, and the approach has been criticised for exactly this reason. Recent work on the subject has proposed studying system configurations and their individual components. This thesis proposes a generic approach to the evaluation of retrieval systems at the component level.
The focus of the thesis at hand is on the key components that are needed to address typical ad-hoc search tasks, like finding books on a particular topic in a large set of library records. A central approach in this work is the further development of the Xtrieval framework by the integration of widely-used IR toolkits in order to eliminate the limitations of individual tools. Strong empirical results at international campaigns that provided various types of evaluation tasks confirm both the validity of this approach and the flexibility of the Xtrieval framework.
Modern information retrieval systems contain various components that are important for solving particular subtasks of the retrieval process. This thesis illustrates the detailed analysis of important system components needed to address ad-hoc retrieval tasks. Here, the design and implementation of the Xtrieval framework offers a variety of approaches for flexible system configurations. Xtrieval has been designed as an open system and allows the integration of further components and tools as well as addressing search tasks other than ad-hoc retrieval. This approach ensures that it is possible to conduct automated component-level evaluation of retrieval approaches.
Both the scale and impact of these possibilities for the evaluation of retrieval systems are demonstrated by the design of an empirical experiment that covers more than 13,000 individual system configurations. This experimental set-up is tested on four test collections for ad-hoc search. The results of this experiment are manifold. For instance, particular implementations of ranking models fail systematically on all tested collections. The exploratory analysis of the ranking models empirically confirms the relationships between different implementations of models that share theoretical foundations. The obtained results also suggest that the impact on retrieval effectiveness of most instances of IR system components depends on the test collections that are being used for evaluation. Due to the scale of the designed component-level evaluation experiment, not all possible interactions of the system components under examination could be analysed in this work. For this reason the resulting data set will be made publicly available to the entire research community.
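A component-level experiment of this kind amounts to enumerating every combination of component instances and scoring each resulting configuration on the same topics. The sketch below shows that loop with mean average precision as the effectiveness measure. It is an illustration under stated assumptions, not Xtrieval's API: `evaluate_grid`, the `run_system` callback and the shape of `components` and `qrels` are all hypothetical, and the abstract does not say which measure the thesis used.

```python
from itertools import product


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def evaluate_grid(components, run_system, topics, qrels):
    """Score every combination of component instances.

    components: dict mapping a component name to its candidate instances.
    run_system: callable (config, topic) -> ranked list of document ids.
    qrels:      dict mapping a topic to its set of relevant document ids.
    Returns a dict from configuration tuple to mean average precision.
    """
    names = list(components)
    results = {}
    for combo in product(*(components[name] for name in names)):
        config = dict(zip(names, combo))
        scores = [average_precision(run_system(config, topic), qrels[topic])
                  for topic in topics]
        results[combo] = sum(scores) / len(scores)
    return results
```

The number of configurations is the product of the number of instances per component, which is why a handful of alternatives per component quickly reaches the thousands of configurations reported in the abstract.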
|