Global ETD Search

11	Grid-Enabled Automatic Web Page Classification Metikurke, Seema Sreenivasamurthy 12 June 2006 Much research has been conducted on the retrieval and classification of web-based information. A major challenge is performance, especially for a classification algorithm that must return results for the large data sets typical of the Web. This thesis describes a grid-enabled approach to automatic web page classification. It first describes the basic approach, which uses a vector space model (VSM), and then an enhancement of that approach through a genetic algorithm (GA). The enhanced approach can efficiently process candidate web pages from a number of web sites and classify them. A prototype is implemented and empirical studies are conducted. The contributions of this thesis are: 1) application of grid computing to improve the performance of web page classification based on both VSM alone and GA combined with VSM; 2) improvement of the VSM classification algorithm by applying a GA that uniquely discovers a set of training web pages while also generating a near-optimal set of parameter values for VSM. Automatic Web Page Classification Vector Space Model Genetic Algorithm Grid Computing Computer Sciences
12	Search Queries in an Information Retrieval System for Arabic-Language Texts Albujasim, Zainab Majeed 01 January 2014 Information retrieval aims to extract from a large collection of data a subset of information that is relevant to a user’s needs. In this study, we are interested in information retrieval in Arabic-language text documents. We focus on the Arabic language, its morphological features that potentially affect the implementation and performance of an information retrieval system, and its unique characters that are absent from the Latin alphabet and require specialized approaches. Specifically, we report on the design, implementation and evaluation of search functionality using the Vector Space Model with several weighting schemes. Our implementation uses the ISRI stemming algorithm as the underlying stemming technique and the general Arabic stop word list for building inverted indices for Arabic-language documents. We evaluate our implementation on a corpus of selected technical papers published in Arabic-language journals. We use the Open Journal Systems (OJS) from the Public Knowledge Project as a repository for the corpus used in the evaluation. We evaluate the performance of our search implementation using a classic recall/precision approach and compare it to one of the default multilingual search functions supported in the OJS. Our experimental analysis suggests that stemming is an effective technique for searches in Arabic-language texts and improves the quality of the information retrieval system. Open Journal Arabic language Vector Space Model information retrieval ranking schemes Databases and Information Systems
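The Vector Space Model search described in this abstract can be illustrated with a minimal sketch. It assumes one particular weighting scheme (raw term frequency times a smoothed inverse document frequency) and whitespace tokenization; the stemming and stop-word removal mentioned in the abstract are omitted, and all function names here are hypothetical rather than taken from the thesis.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization; no stemming or stop-word removal.
    return text.lower().split()


def build_index(docs):
    """Return (list of tf-idf vectors, idf table) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumed weighting scheme: log(N / df) + 1, so every weight is positive.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Return indices of matching documents, best match first."""
    vectors, idf = build_index(docs)
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0.0]
```

Swapping the `idf` expression or the term-frequency factor is how a different weighting scheme would be tried in this sketch; the ranking step stays the same.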
13	Nyckelordssökning : Baserat på Vector Space Model / Keyword search : Based on Vector Space Model Borg, Oskar January 2013 As the amount of information keeps growing, so does the need for easy access to it. This creates a need for an interface that can search the information. This work investigates whether an implementation of the Vector Space Model gives more relevant results than a simpler implementation that is not based on the Vector Space Model. The search runs in a relational database with an inverted index, and the database is populated with data from the internet forum Stack Overflow. A search engine was built that returned two different result lists for each query, and ten users tested it and rated the relevance of the results. The tests showed that the Vector Space Model gives more relevant results, though at a cost in search time. Vector Space Model Inverted index Keyword search Search engine Computer Sciences Datavetenskap (datalogi)
14	Authorship classification using the Vector Space Model and kernel methods Westin, Emil January 2020 Authorship identification is the field of classifying a given text by its author, based on the assumption that authors exhibit unique writing styles. This thesis investigates the semantic shortcomings of the vector space model by constructing a semantic kernel from WordNet, which is evaluated on the problem of authorship attribution. A multiclass SVM classifier is constructed using the one-versus-all strategy and evaluated in terms of precision, recall, accuracy and F1 scores. Results show that using the semantic scores from WordNet degrades performance compared to using a linear kernel. Experiments are run to identify the best feature engineering configurations, showing that removing stopwords has a positive effect on the financial dataset Reuters, while the Kaggle dataset, which consists of short extracts of horror stories, benefits from keeping the stopwords. vector space model semantic kernel support vector machine bag-of-words Probability Theory and Statistics Sannolikhetsteori och statistik
15	Exploring the potentials of a new perspective for a local approach: The Water-Energy-Food Nexus at the Dampalit Stream, the Philippines / 地域アプローチのための新たな展開可能性を求めて：フィリピン・ダンパリット川流域における水・エネルギー・食料連環 Maximilian, Spiegelberg 23 May 2017 京都大学 / 0048 / 新制・課程博士 / 博士(地球環境学) / 甲第20594号 / 地環博第165号 / 新制||地環||33(附属図書館) / 京都大学大学院地球環境学舎環境マネジメント専攻 / (主査)教授星野敏, 教授柴田昌三, 准教授西前出 / 学位規則第4条第1項該当 / Doctor of Global Environmental Studies / Kyoto University / DFAM livelihoods integrated approach Sustainable Development Goals vector space model transformation 450
16	Optimal Dual Frames For Erasures And Discrete Gabor Frames Lopez, Jerry 01 January 2009 Since their discovery in the early 1950s, frames have emerged as an important tool in areas such as signal processing, image processing, data compression and sampling theory, to name a few. The purpose of this dissertation is to investigate dual frames and the ability to find dual frames that are optimal when coping with the problem of erasures in data transmission. In addition, we study a special class of frames that exhibit algebraic structure: discrete Gabor frames. Much work has been done on discrete Gabor frames in R^n, but very little is known about the l^2(Z) case or the l^2(Z^d) case. We establish some basic Gabor frame theory for l^2(Z) and then generalize to the l^2(Z^d) case. Frames Dual Frames Vector Space Hilbert Space Functional Analysis Discrete Gabor Frames Gabor Analysis Mathematics
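For readers unfamiliar with the terms in this abstract, the standard textbook definitions of a frame, a dual frame, and the error caused by erasures can be written as follows. The notation (H, f_i, g_i, A, B, Λ) is assumed here for illustration; the abstract does not state the dissertation's own notation or the precise optimality criterion it uses.

```latex
% Frame condition for a sequence \{f_i\}_{i \in I} in a Hilbert space H,
% with frame bounds 0 < A \le B < \infty:
A \|f\|^2 \;\le\; \sum_{i \in I} |\langle f, f_i \rangle|^2 \;\le\; B \|f\|^2
\quad \text{for all } f \in H.

% \{g_i\}_{i \in I} is a dual frame of \{f_i\}_{i \in I} when it reconstructs every f:
f = \sum_{i \in I} \langle f, f_i \rangle \, g_i .

% If the coefficients indexed by a set \Lambda \subset I are erased in transmission,
% the reconstruction error is
f - \sum_{i \notin \Lambda} \langle f, f_i \rangle \, g_i
  = \sum_{i \in \Lambda} \langle f, f_i \rangle \, g_i .
```

Because a frame generally has many dual frames, one can ask which dual makes this error smallest in some chosen sense, which is the kind of question the abstract refers to as finding optimal dual frames.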
17	Random indexing with Pattern Grammar : Multi-context vector space model that uses linguistics patterns / Random indexing med hjälp av mallgramatik : Multikontextinbäddning av ord som använder lingvistiska mönster Klåvus, Carl Henrik January 2024 This thesis presents an algorithm that incorporates pattern grammar into random indexing to solve three English synonym benchmarks. The solution is benchmarked against a pattern grammar model and a baseline random indexing implementation. The results show a significant improvement on the synonym benchmarks compared to the baseline random indexing implementation. Most language models today focus on vector space models in which the linguistic origins of the information are lost. Even though these algorithms produce good results, it is hard to know where the model learned something. With the help of patterns, we can learn more about how these models work. / Den här uppsatsen presenterar en algoritm som använder sig av mallgrammatik tillsammans med random indexing för att lösa tre synonymtest för engelska. En mallgrammatiksmodell och en referensimplementation av random indexing utvärderades. Resultaten visade en tydlig förbättring på de olika testerna jämfört med referensimplementationen. De flesta språkmodeller idag fokuserar på vektorrepresentationer av språk där det lingvistiska ursprunget hos språket försvinner. Dessa modeller är mycket framgångsrika, men det är svårt att säga något om vad och hur en modell kommit fram till en slutsats. Med hjälp av språkmönster baserade på mallgrammatik kan vi lära oss mer om hur dessa modeller fungerar. Random Indexing synonyms vector space model Random Indexing synonymer vektorrumsmodell Computer Sciences Datavetenskap (datalogi)
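The baseline technique named in this abstract, random indexing, can be sketched in a few lines. This is a generic illustration of plain random indexing, not the thesis's algorithm: the pattern grammar component is not included, and the dimensionality, sparsity, window size, and function names are all assumptions chosen for the example.

```python
import math
import random

DIM = 300      # vector dimensionality (assumed value)
NONZERO = 10   # number of +1/-1 entries per index vector (assumed value)


def index_vector(word, seed=0):
    """Sparse ternary index vector, generated deterministically per word."""
    rng = random.Random(f"{seed}:{word}")
    vec = [0] * DIM
    for k, pos in enumerate(rng.sample(range(DIM), NONZERO)):
        vec[pos] = 1 if k % 2 == 0 else -1
    return vec


def train(sentences, window=2):
    """Build context vectors by summing index vectors of neighbouring words."""
    context = {}
    for sent in sentences:
        for i, word in enumerate(sent):
            vec = context.setdefault(word, [0] * DIM)
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                neighbour = index_vector(sent[j])
                for d in range(DIM):
                    vec[d] += neighbour[d]
    return context


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In a synonym test of the kind the abstract mentions, the candidate whose context vector has the highest cosine similarity to the target word's vector would be chosen as the answer.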
18	Měření vzdáleností mezi stanicemi v IP sítích / Distance measurement between nodes in IP networks Šimák, Jan January 2010 This thesis deals with predicting delay between nodes on the Internet. Accurate delay prediction helps in choosing the nearest Internet neighbor and contributes to effective use of network resources. Delay prediction algorithms reduce unnecessary network load, because fewer latency measurements are needed. The thesis covers the theory of three main algorithms that use coordinate systems: GNP, Vivaldi and Lighthouses. The last of these is the main subject of the thesis and is explored in detail both theoretically and in practice. To verify the accuracy of delay prediction by the Lighthouses algorithm, a simulation application was developed. The application computes node coordinates in a synthetic network using the Lighthouses algorithm. A description of the simulation application and an evaluation of the simulation results form the practical part of this thesis.
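The common idea behind coordinate-based delay prediction can be sketched as follows: once each node has coordinates, the delay between two nodes is estimated from the distance between their coordinates instead of being measured. The sketch assumes a Euclidean space and takes the coordinates as given; how GNP, Vivaldi, or Lighthouses actually compute them is not shown, and the function names are hypothetical.

```python
import math


def predicted_delay(a, b):
    """Estimated delay between two nodes: Euclidean distance of their coordinates."""
    return math.dist(a, b)  # requires Python 3.8+


def relative_error(predicted, measured):
    """Relative prediction error against a measured delay (measured must be > 0)."""
    return abs(predicted - measured) / measured


def nearest_neighbor(node, coords):
    """Pick the node with the smallest predicted delay to `node`."""
    others = (name for name in coords if name != node)
    return min(others, key=lambda name: predicted_delay(coords[node], coords[name]))
```

For example, with coordinates `{"a": (0, 0), "b": (3, 4), "c": (10, 0)}`, node `a` would select `b` as its nearest neighbor without sending any probe to `c`.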
19	Category-theoretic quantitative compositional distributional models of natural language semantics Grefenstette, Edward Thomas January 2013 This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high-dimensional spaces. The problem of compositionality for such models concerns itself with how to produce distributional representations for larger units of text (such as a verb and its arguments) by composing the distributional representations of smaller units of text (such as individual words). This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations. This thesis shows how this approach can be theoretically extended and practically implemented to produce concrete compositional distributional models of natural language semantics. It furthermore demonstrates that such models can perform on par with, or better than, other competing approaches in the field of natural language processing. There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on the syntactic front and semantic front, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The second contribution is to evaluate the models developed from the procedures presented here, showing that they outperform other compositional distributional models present in the literature. The third contribution is to show how using category theory to solve linguistic problems forms a sound basis for research, illustrated by examples of work on this topic, that also suggest directions for future research. 006.3
20	A Rich Context Model : Design and Implementation Sotsenko, Alisa January 2017 The latest mobile devices include a variety of hardware features that allow for richer data collection and services. Numerous sensors, Internet connectivity, and low-energy Bluetooth connectivity to other devices (e.g., smart watches, activity trackers, health data monitoring devices) are just some examples of hardware that provides additional information useful in many application domains. Among others, this information could be used in mobile learning scenarios (for data collection in science education and field trips), in mobile health scenarios (for health data collection and for monitoring patients' health state, changes in health conditions and/or detection of emergency situations), and in personalized recommender systems. This information captures the user's current context, which could help make mobile applications more personalized and deliver a better user experience. Moreover, the context-related information collected by the mobile device and the different applications can be enriched with additional external information sources (e.g., Web Service APIs), which help to describe the user’s context in more detail. The main challenge in context modeling is the lack of generalization at the core of the model, as most existing context models depend on particular application domains or scenarios. We tackle this challenge by conceptualizing and designing a rich generic context model. In this thesis, we present the state of the art of recent approaches used for context modeling and introduce a rich context model as an approach for modeling context in a domain-independent way. Additionally, we investigate whether context information can enhance existing mobile applications by making them sensitive to the user’s current situation. We demonstrate the reusability and flexibility of the rich context model in several case studies. The main contributions of this thesis are: (1) an overview of recent, existing research in context modeling for different application domains; (2) a theoretical foundation of the proposed approach for modeling context in a domain-independent way; (3) several case studies in different mobile application domains. Context modeling rich context model mobile users current context of the user mobile sensors multidimensional vector space model contextualization Computer Systems Datorsystem
