281 |
Information Retrieval Using the Constructivist's Approach to Get the Most Out of the Internet
Shukla, Ishani 01 December 2009 (has links)
The constructivist's theory and its application to information retrieval from the Internet were reviewed. The main aim of the study was to devise and test an approach with which the most relevant information could be easily and efficiently extracted from the Internet. The impact of a judicious choice of keywords to retrieve information, according to the particular approach to be implemented, and the importance of speed reading as an additional technique to improve information retrieval were compared and critically analyzed. The study was based on information retrieval from www.google.com and www.images.google.com and focused on real-life examples and goal-directed searches. After a careful selection, the criteria used for evaluation were factors such as data quality, accuracy, integrity, and speed of retrieval. These factors helped to determine how useful the constructivist theory could be in information retrieval if it were applied in combination with speed reading and traditional approaches.
|
282 |
Using reinforcement learning to learn relevance ranking of search queries
Sandupatla, Hareesh 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Web search has become a part of everyday life for hundreds of millions of users
around the world. However, the effectiveness of a user's search depends vitally
on the quality of search result ranking. Even though enormous efforts have been
made to improve the ranking quality, there is still significant misalignment
between search engine ranking and an end user's preference order. This is
evident from the fact that, for many search results on major search and
e-commerce platforms, many users ignore the top ranked results and click on the
lower ranked results. Nevertheless, finding a ranking that suits all the users is a
difficult problem to solve as every user's need is different. So, an ideal ranking is
the one which is preferred by the majority of the users. This emphasizes the need
for an automated approach which improves the search engine ranking dynamically
by incorporating user clicks in the ranking algorithm. In existing search result
ranking methodologies, this direction has not been explored in depth.
A key challenge in using user clicks in search result ranking is that the
relevance feedback that is learnt from click data is imperfect. This is due
to the fact that a user is more likely to click a top ranked result than
a lower ranked result, irrespective of the actual relevance of those results.
This phenomenon is known as position bias, which poses a major difficulty
in obtaining an automated method for dynamic update of search rank orders.
In my thesis, I propose a set of methodologies which incorporate user clicks
for dynamic update of search rank orders. The updates are based on adaptive
randomization of results using a reinforcement learning strategy that treats
user click activity as the reinforcement signal. Beginning at any rank order
of the search results, the proposed methodologies are guaranteed to converge to
a ranking which is close to the ideal rank order. In addition, the use of a reinforcement
learning strategy enables the proposed methods to overcome the position bias phenomenon.
To measure the effectiveness of the proposed method, I perform experiments considering a
simplified user behavior model, which I call the color ball abstraction model.
I evaluate the quality of the proposed methodologies using standard information retrieval
metrics like Precision at n (P@n), Kendall tau rank correlation, Discounted
Cumulative Gain (DCG) and Normalized Discounted Cumulative Gain (NDCG).
The experimental results clearly demonstrate the success of the proposed methodologies.
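The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names are hypothetical, the DCG form assumes the common log2(rank + 1) discount, and the Kendall tau shown is the tau-a variant for two tie-free rankings of the same items.

```python
import math
from itertools import combinations


def precision_at_n(relevances, n):
    """Fraction of the top-n results that are relevant (relevance > 0); assumes n > 0."""
    top = relevances[:n]
    return sum(1 for r in top if r > 0) / n


def dcg(relevances, n=None):
    """Discounted Cumulative Gain, assuming a log2(rank + 1) discount."""
    top = relevances if n is None else relevances[:n]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(top, start=1))


def ndcg(relevances, n=None):
    """DCG divided by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), n)
    return dcg(relevances, n) / ideal if ideal > 0 else 0.0


def kendall_tau(order_a, order_b):
    """Kendall tau-a between two rankings of the same items (no ties assumed)."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    concordant = discordant = 0
    for x, y in combinations(order_a, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0
```

Here `relevances` is a list of graded relevance labels in ranked order, so a learned ranking can be compared against an ideal one by passing both orderings to `kendall_tau` or by checking how close `ndcg` comes to 1.0.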
|
283 |
A Study on Web Search and Analysis based on Typicality / 典型性に基づくWeb検索と分析に関する研究
Tsukuda, Kosetsu 24 September 2014 (has links)
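Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Kō No. 18617 / Jōhaku No. 541 / 新制||情||96 (University Library) / 31517 / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Katsumi Tanaka, Professor Masatoshi Yoshikawa, Professor Sadao Kurohashi / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM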
|
284 |
Improving Library Searches Using Word-Correlation Factors and Folksonomies
Pera, Maria Soledad 18 December 2008 (has links) (PDF)
Libraries, private and public, offer valuable resources to library patrons; however, formulating library queries to retrieve relevant results can be difficult. This occurs because, when using a library catalog, patrons often do not know the exact keywords to include in a query that match the rigid subject terms (chosen by the Library of Congress) or terms in other fields of a desired library catalog record. These improperly formulated queries often translate into a high percentage of failed searches that retrieve irrelevant results or no results at all. This explains why frustrated library patrons nowadays rely on Web search engines to perform their searches first and, upon obtaining the initial information, such as titles, subject areas, or authors, query the library catalog. This searching strategy is evidence of the failure of today's library systems. To solve this problem, we propose an enhanced library system, called EnLibS, which allows partial, similarity matching of (i) tags defined by ordinary users at a folksonomy site which describe the content of books and (ii) keywords in a library query to improve searches on library catalogs. The proposed library system allows patrons to post a query Q with commonly used words and ranks the retrieved results according to their degrees of resemblance to Q. Experimental results show that EnLibS (i) reduces the number of queries that retrieve no results, (ii) obtains high precision in retrieving and accuracy in ranking relevant results, and (iii) achieves a processing time comparable to existing library catalog search engines.
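The idea of partial, similarity-based matching between query keywords and catalog terms or user tags can be sketched roughly as follows. This is not the EnLibS implementation: the thesis uses word-correlation factors, whereas this sketch swaps in the character-based `difflib.SequenceMatcher` ratio from the Python standard library as a stand-in similarity, and the catalog, function names, and threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT the thesis's word-correlation factor."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resemblance(query_words, record_terms):
    """Average, over query words, of the best match among a record's terms."""
    if not query_words or not record_terms:
        return 0.0
    best = [max(word_similarity(q, t) for t in record_terms) for q in query_words]
    return sum(best) / len(best)


def rank_records(query, records, threshold=0.5):
    """Return (title, score) pairs scoring at least `threshold`, best first."""
    words = query.split()
    scored = [(title, resemblance(words, terms)) for title, terms in records.items()]
    kept = [(title, score) for title, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical catalog: title -> subject terms plus user-defined tags.
catalog = {
    "Cooking for Beginners": ["cookery", "recipes", "kitchen"],
    "Garden Design": ["gardening", "landscape", "plants"],
}
results = rank_records("cooking recipe", catalog)
```

Because matching is partial, a query word such as "cooking" can still reach a record indexed under "cookery", which an exact-match catalog search would miss; the threshold controls how loose that matching is allowed to be.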
|
285 |
Metrohelper: A Real-time Web-based System for Metro Incident Detection Using Social Media
Chen, Chih Fang 26 May 2022 (has links)
In recent years, the use of public transit services has increased rapidly, thanks to great progress in network technologies. However, disruptions in modern public transit services have also increased, due to aging infrastructure, non-comprehensive system design, and the need for maintenance. Any disruption in current transit networks can cause serious problems for passengers who use these networks for their daily commutes. Although transit networks are heavily used, most current disruption detection systems either lack network coverage or do not operate in real time. The goal of this thesis was to create a system that leverages Twitter data to help detect service disruptions at an early stage. This work involves a web application comprising a front end, a back end, and a database, along with data mining techniques that obtain tweets from a live Twitter stream related to the Washington Metropolitan Area Transit Authority (WMATA) metro system. The fundamental features of the system include a real-time incidents panel, historical events review, activity search near a specific metro station, and recent news review, which allow people to find more relevant information based on their needs. After the initial functionality was settled, we further developed storytelling and sentiment analysis applications, which give people more comprehensive information about incidents happening around metro stations. Also, with the emergency report we developed, the developer can receive immediate notification when an urgent event occurs. After thoroughly testing the system through case studies on storytelling, sentiment analysis, and emergency reporting, we found the outcomes convincing and trustworthy. / Master of Science / As public transit networks become more and more accessible, people around the world rely on these networks for their daily commutes.
It is clear that service disruptions in these systems will affect passengers severely, especially as more and more people use them. This thesis is dedicated to building a web application that not only allows people to search for the latest information, but also assists in the early detection of disruptions. In this work we have developed a web application with an easy-to-use interface, along with data mining techniques connected to live data from Twitter to identify these disruptions. Our website is a real-time platform that contains a real-time incidents panel, historical events review, activity search near a specific metro station, and recent news review based on the latest tweets and news. By collecting live data from Twitter and various news websites, we further developed storytelling and sentiment analysis features. For storytelling, we applied a machine learning model to cluster related tweets and news; after summarizing and tracking the evolution of these items, we converted them into stories and displayed them on interactive timelines. For sentiment analysis, we integrated a machine learning model that scores the emotional strength of tweets and news and then shows the sentiment of particular items. Additionally, we created an emergency report function, since it is important for the authority to know where and when incidents happen as soon as possible. The system has been well tested through daily case studies, and the results not only match the ground truth but also provide a variety of information.
|
286 |
Retrieval and Evaluation Techniques for Personal Information
Kim, Jinyoung 01 September 2012 (has links)
Providing an effective mechanism for personal information retrieval is important for many applications, and requires different techniques than have been developed for general web search. This thesis focuses on developing retrieval models and representations for personal search, and on designing evaluation frameworks that can be used to demonstrate retrieval effectiveness in a personal environment.
From the retrieval model perspective, personal information can be viewed as a collection of multiple document types, each of which has unique metadata. Based on this perspective, we propose a retrieval model that exploits document metadata and multi-type structure. The proposed retrieval models were found to be effective in other structured document collections, such as movies and job descriptions.
Associative browsing is another search method that can complement keyword search. To support this type of search, we propose a method for building an association graph representation by combining multiple similarity measures based on a user's click patterns. We also present a learning technique for refining the graph structure based on users' clicks.
Evaluating these methods is particularly challenging for personal information due to privacy issues. This thesis introduces a set of techniques that enable realistic and repeatable evaluation of techniques for personal information retrieval. In particular, we describe techniques for simulating test collections and show that game-based user studies can collect more realistic usage data at relatively small cost.
|
287 |
CONCEPT BASED INFORMATION ORGANIZATION AND RETRIEVAL
YARDI, APARNA ARVIND 19 July 2006 (has links)
No description available.
|
288 |
Searching for information on occupational accidents
Chen, Shih-Kwang 11 September 2008 (has links)
No description available.
|
289 |
Bridging the gap: cognitive approaches to musical preference using large datasets
Barone, Michael D. 11 1900 (has links)
Using a large dataset of digital music downloads, this thesis examines the extent to which cognitive-psychology research can generate and predict user behaviours relevant to the distinct fields of computer science and music perception. Three distinct topics are explored. Topic one describes the current difficulties with using large digital music resources for cognitive research and provides a solution by linking metadata through a complex validation process. Topic two uses this enriched information to explore the extent to which extracted acoustic features influence genre preferences, considering personality and mood research; analysis suggests that acoustic features which are pronounced in an individual's preferred genre influence choice when selecting less-preferred genres. Topic three examines whether metrics of music listening behaviour can be derived and validated by social psychological research; results support the notion that user behaviours can be derived and validated using an informed psychological background, and may be more useful than acoustic features for a variety of computational music tasks. A primary motivation for this thesis was to approach interdisciplinary music research in two ways: (1) by utilizing a shared understanding of statistical learning as a theoretical framework underpinning prediction and interpretation; and (2) by providing resources and approaches to the analysis of "big data" that are experimentally valid and psychologically useful. The unique strengths of this interdisciplinary approach, and the weaknesses that remain, are then addressed by discussing refined analyses and future directions. / Thesis / Master of Science (MSc) / This thesis examines whether research from cognitive psychology can be used to inform and predict behaviours germane to computational music analysis, including genre choice, music feature preference, and consumption patterns, from data provided by digital-music platforms.
Specific topics of focus include: information integrity and consistency of large datasets, whether signal processing algorithms can be used to assess music preference across multiple genres, and the degree to which consumption behaviours can be derived and validated using more traditional experimental paradigms. Results suggest that psychologically motivated research can provide useful insights and metrics in the computationally focused area of global music consumption behaviour and digital music analysis. Limitations that remain within this interdisciplinary approach are addressed by providing refined analysis techniques for future work.
|
290 |
A Semantic Web-Based Digital Library Infrastructure to Facilitate Computational Epidemiology
Hasan, S. M. Shamimul 15 September 2017 (has links)
Computational epidemiology generates and utilizes massive amounts of data. There are two primary categories of datasets: reported and synthetic. Reported data include epidemic data published by organizations (e.g., WHO, CDC, other national ministries and departments of health) during and following actual outbreaks, while synthetic datasets comprise spatially explicit synthetic populations, labeled social contact networks, multi-cell statistical experiments, and output data generated from the execution of computer simulation experiments. The discipline of computational epidemiology encounters numerous challenges because of the size, volume, and dynamic nature of both types of datasets.
In this dissertation, we present semantic web-based schemas to organize diverse reported and synthetic computational epidemiology datasets. There are three layers of these schemas: conceptual, logical, and physical. The conceptual layer provides data abstraction by exposing common entities and properties to the end user. The logical layer captures data fragmentation and linking aspects of the datasets. The physical layer covers storage aspects of the datasets. We can create mapping files from the schemas. The schemas are flexible and can grow.
The schemas presented include data linking approaches that can connect large-scale and widely varying epidemic datasets. This linked data leads to an integrated knowledge-base, enabling an epidemiologist to ask complex queries that employ multiple datasets. We demonstrate the utility of our knowledge-base by developing a query bank, which represents typical analyses carried out by an epidemiologist during the course of planning for or responding to an epidemic. By running queries with different data mapping techniques, we demonstrate the performance of various tools. The empirical results show that leveraging semantic web technology is an effective strategy for: reasoning over multiple datasets simultaneously, developing network queries pertinent in an epidemic analysis, and conducting realistic studies undertaken in an epidemic investigation. The performance of queries varies according to the choice of hardware, underlying database, and resource description framework (RDF) engine. We provide application programming interfaces (APIs) on top of our linked datasets, which an epidemiologist can use for information retrieval, without knowing much about underlying datasets. The proposed semantic web-based digital library infrastructure can be highly beneficial for epidemiologists as they work to comprehend disease propagation for timely outbreak detection and efficient disease control activities. / PHD / Computational epidemiology generates and utilizes massive amounts of data, and the field faces numerous challenges because of the volume and dynamic nature of the datasets utilized. There are two primary categories of datasets. The first contains epidemic datasets tracking actual outbreaks of disease, which are reported by governments, private companies, and associated parties. The second category is synthetic data created through computer simulation. We present semantic web-based schemas to organize diverse reported and synthetic computational epidemiology datasets. 
The schemas are flexible in use and scale, and utilize data linking approaches that can connect large-scale and widely varying epidemic datasets. This linked data leads to an integrated knowledge-base, enabling an epidemiologist to ask complex queries that employ multiple datasets. This ability helps epidemiologists better understand disease propagation, for efficient outbreak detection and disease control activities.
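How linking lets one query span reported and synthetic datasets can be sketched with a toy in-memory triple store. This is an illustration only, not the dissertation's schemas or tooling: the identifiers, predicates, values, and function names below are all invented, and a real deployment would more likely use an RDF engine and a query language such as SPARQL.

```python
# Hypothetical (subject, predicate, object) triples; every name is invented.
triples = {
    ("outbreak:1", "locatedIn", "region:A"),      # reported data
    ("outbreak:1", "reportedCases", 120),         # reported data
    ("simulation:7", "models", "region:A"),       # synthetic data
    ("simulation:7", "predictedCases", 150),      # synthetic data
    ("simulation:9", "models", "region:B"),       # synthetic data
}


def match(pattern, store):
    """Yield triples matching a pattern; None acts as a wildcard."""
    for triple in store:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


def simulations_for_outbreak(outbreak, store):
    """Join reported and synthetic data through the shared region node."""
    regions = {obj for _, _, obj in match((outbreak, "locatedIn", None), store)}
    return sorted(
        subj for subj, _, obj in match((None, "models", None), store)
        if obj in regions
    )
```

The shared `region:A` node is what links the two datasets: a question about a reported outbreak can reach the simulations that model the same region without the user knowing how either dataset is stored.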
|