1

Integrating and querying semantic annotations

Chen, Luying January 2014 (has links)
Semantic annotations are crucial components in turning unstructured text into more meaningful and machine-understandable information. Applications that consume such semantically enriched information stand to gain wide benefits from its large-scale acquisition. At present there is a plethora of commercial and open-source services and tools for enriching documents with semantic annotations. Since there has been limited effort to compare such annotators, this study first surveys and compares them along multiple dimensions, including the techniques used and the coverage and quality of the annotations. The overlap and diversity in the capabilities of annotators motivate the need for semantic annotation integration: middleware that produces a unified annotation with improved quality on top of diverse semantic annotators. The integration of semantic annotations raises new challenges, both compared to usual data integration scenarios and to standard aggregation of machine learning tools. A set of approaches to these challenges is proposed that performs ontology-aware aggregation, adapting Maximum Entropy Markov Models to the setting of ontology-based annotations. These approaches are further compared with existing ontology-unaware supervised approaches, ontology-aware unsupervised methods and individual annotators, demonstrating their effectiveness through an overall improvement in all testing scenarios. A middleware system, ROSeAnn, and its corresponding APIs have been developed. This study is also concerned with the availability and usability of semantically rich data; the second focus of the thesis is thus to allow users to query text annotated with different annotators by using both explicit and implicit knowledge. We describe our first step towards this: a query language and a prototype system, QUASAR, which provide a uniform way to query multiple facets of annotated documents. We show how integrating semantic annotations and utilizing external knowledge help increase the quality of query answers over annotated documents.
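As a hedged illustration of the aggregation problem this abstract describes (not ROSeAnn's actual algorithm, which is ontology-aware and based on Maximum Entropy Markov Models), the following sketch combines entity annotations from several hypothetical annotators by simple majority vote over (span, type) tuples:

```python
from collections import Counter

def aggregate_annotations(annotator_outputs, min_votes=2):
    """Combine entity annotations from several annotators by majority vote.

    annotator_outputs: one list per annotator; each annotation is a
    (start, end, entity_type) tuple over the same text. Returns the
    annotations supported by at least `min_votes` annotators.
    """
    votes = Counter()
    for output in annotator_outputs:
        for annotation in set(output):  # one vote per annotator per annotation
            votes[annotation] += 1
    return sorted(a for a, n in votes.items() if n >= min_votes)

# Three hypothetical annotators disagree on the type of one span:
a1 = [(0, 6, "Person"), (10, 16, "City")]
a2 = [(0, 6, "Person"), (10, 16, "City")]
a3 = [(0, 6, "Organisation"), (10, 16, "City")]
print(aggregate_annotations([a1, a2, a3]))
# [(0, 6, 'Person'), (10, 16, 'City')]
```

An ontology-aware aggregator would go further, e.g. treating a "Person" vote and an "Organisation" vote as partially compatible when the ontology relates the two types.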
2

Generation and application of semantic networks from plain text and Wikipedia

Wojtinnek, Pia-Ramona January 2012 (has links)
Natural Language Processing systems crucially depend on the availability of lexical and conceptual knowledge representations. They need to be able to disambiguate word senses and detect synonyms. In order to draw inferences, they require access to hierarchical relations between concepts (dog isAn animal) as well as non-hierarchical ones (gasoline fuels car). Knowledge resources such as lexical databases, semantic networks and ontologies explicitly encode such conceptual knowledge. Traditionally, however, these have been created manually, which is expensive and time-consuming for large resources and cannot provide adequate coverage in specialised domains. In order to alleviate this acquisition bottleneck, statistical methods have been created to acquire lexical and conceptual knowledge automatically from text. In particular, unsupervised techniques have the advantage that they can be easily adapted to any domain, given some corpus on the topic. However, due to sparseness issues, they often require very large corpora to achieve high-quality results. The spectrum of resources and statistical methods has a crucial gap in situations where manually created resources do not provide the necessary coverage and only limited corpora are available. This is the case for real-world domain applications such as an NLP system for processing technical information based on a limited amount of company documentation. We provide a large-scale demonstration that this gap can be filled through the use of automatically generated networks. The corpus is automatically transformed into a network representing the terms or concepts which occur in the text and their relations, based entirely on linguistic tools. The networks structurally lie between the unstructured corpus and the highly structured manually created resources. We show that they can be useful in situations for which neither existing approach is applicable. In contrast to manually created resources, our networks can be generated quickly and on demand. Conversely, they make it possible to achieve higher-quality representations from less text than corpus-based methods, relaxing the requirement for very large-scale corpora. We devise scalable frameworks for building networks from plain text and Wikipedia with varying levels of expressiveness. This work creates concrete networks from the entire British National Corpus, covering 1.2m terms and 21m relations, and a Wikipedia network covering 2.7m concepts. We develop a network-based semantic space model and evaluate it on the task of measuring semantic relatedness. In addition, noun compound paraphrasing is tackled to demonstrate the quality of the indirect paths in the network for describing concept relations. On both evaluations we achieve results competitive with the state of the art. In particular, our network-based methods outperform corpus-based methods, demonstrating the gain created by leveraging the network structure.
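A minimal sketch of the general idea behind network-based relatedness, under the simplifying assumption that relatedness decays with hop distance in the term network (the thesis's semantic space model is considerably richer than this toy graph and measure):

```python
from collections import deque

def shortest_path_length(graph, source, target):
    """Breadth-first search for the hop distance between two terms."""
    seen, frontier = {source}, deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == target:
            return dist
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return None  # terms are not connected

def relatedness(graph, a, b):
    """Map hop distance to a (0, 1] relatedness score."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

# Toy undirected term network (each edge listed in both directions):
net = {
    "dog": {"animal", "leash"},
    "animal": {"dog", "cat"},
    "cat": {"animal"},
    "leash": {"dog"},
}
print(relatedness(net, "dog", "cat"))  # 0.33..., two hops via "animal"
```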
3

Exploiting parallelism in decomposition methods for constraint satisfaction

Akatov, Dmitri January 2010 (has links)
Constraint Satisfaction Problems (CSPs) are NP-complete in general; there are, however, many tractable subclasses obtained by restricting the structure of their underlying hypergraphs. It is a well-known fact, for instance, that CSPs whose underlying hypergraph is acyclic are tractable. Attempts to define “nearly acyclic” hypergraphs led to the definition of various hypergraph decomposition methods. An important member of this class is the hypertree decomposition method, introduced by Gottlob et al. It possesses the property that CSPs falling into this class can be solved efficiently, and that hypergraphs in this class can be recognized efficiently as well. Beyond polynomial tractability, complexity analysis has shown that both of the aforementioned problems lie in the low complexity class LOGCFL and are thus efficiently parallelizable. A parallel algorithm has been proposed for the “evaluation problem”; however, all algorithms for the “recognition problem” presented to date are sequential. The main contribution of this dissertation is an object-oriented programming library, including a task scheduler, which allows the parallelization of a whole range of computational problems satisfying certain complexity-theoretic restrictions. The library merely requires the programmer to implement several classes and methods representing a general alternating algorithm, while the mechanics of the task scheduler remain hidden. In particular, we use this library to create an efficient parallel algorithm which computes hypertree decompositions of a fixed width. Another, more theoretical, result is the definition of a new type of decomposition method, called Balanced Decompositions. Solving CSPs of bounded balanced width and recognizing such hypergraphs is only quasi-polynomial, yet still parallelizable to a certain extent. A complexity-theoretic analysis leads to the definition of a new complexity class hierarchy, called the DC-hierarchy, whose first class, DC1, precisely captures the complexity of solving CSPs of bounded balanced width.
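The acyclicity property this abstract builds on can be tested with the classical GYO reduction, shown below as a compact sequential sketch (this is background to the abstract, not the parallel algorithm or the library developed in the dissertation): a hypergraph is acyclic if and only if repeatedly deleting vertices that occur in exactly one hyperedge and hyperedges contained in other hyperedges empties it.

```python
def is_alpha_acyclic(hyperedges):
    """GYO reduction: acyclic iff the reduction empties the hypergraph."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove vertices occurring in exactly one hyperedge.
        counts = {}
        for e in edges:
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove hyperedges contained in another (or left empty).
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                edges.pop(i)
                changed = True
                break
    return not edges

# An acyclic ("tree-like") hypergraph and a cyclic one:
print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}, {"b", "c", "d"}]))  # True
print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}, {"c", "a"}]))       # False
```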
4

Delay-tolerant data collection in sensor networks with mobile sinks

Wohlers, Felix Ricklef Scriven January 2012 (has links)
Collecting data from sensor nodes to designated sinks is a common and challenging task in a wide variety of wireless sensor network (WSN) applications, ranging from animal monitoring to security surveillance. A number of approaches exploiting sink mobility have been proposed in recent years: some are proactive, in that sensor nodes push their readings to storage nodes from where they are collected by roaming mobile sinks, whereas others are reactive, in that mobile sinks pull readings from nearby sensor nodes as they traverse the sensor network. In this thesis, we point out that deciding which data collection approach is more energy-efficient depends on application characteristics, including the mobility patterns of sinks and the desired latency of collected data. We introduce novel adaptive data collection schemes that are able to automatically adjust to changing sink visiting patterns or data requirements, thereby significantly easing the deployment of a WSN. We illustrate cases where combining proactive and reactive modes of data collection is particularly beneficial. This motivates the design of TwinRoute, a novel hybrid algorithm that can flexibly mix the two collection modes at appropriate levels depending on the application scenario. Our extensive experimental evaluation, which uses synthetic and real-world sink traces, allows us to identify scenario characteristics that suit proactive, reactive or hybrid data collection schemes. It shows that TwinRoute outperforms the pure approaches in most scenarios, achieving desirable tradeoffs between communication cost and timely delivery of sensor data.
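A purely illustrative back-of-the-envelope model of the proactive/reactive tradeoff the abstract describes; the cost parameters and formulas here are made up for the example and are not TwinRoute's actual decision logic:

```python
def proactive_cost(readings_per_day, hops_to_storage, cost_per_hop=1.0):
    """Push every reading to a storage node, regardless of sink visits."""
    return readings_per_day * hops_to_storage * cost_per_hop

def reactive_cost(readings_per_day, sink_visits_per_day,
                  query_overhead=50.0, cost_per_hop=1.0):
    """Buffer locally; each sink visit pays a query-dissemination overhead."""
    return sink_visits_per_day * query_overhead + readings_per_day * cost_per_hop

for visits in (1, 5, 25):
    p = proactive_cost(readings_per_day=100, hops_to_storage=3)
    r = reactive_cost(readings_per_day=100, sink_visits_per_day=visits)
    print(f"{visits:2d} visits/day: proactive={p:.0f}, reactive={r:.0f}")
# Rare sink visits favour reactive collection; frequent visits favour
# proactive collection -- which is why a hybrid can win in between.
```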
5

Display computers

Smith, Lisa Min-yi Chen 16 August 2006 (has links)
A Display Computer (DC) is an everyday object: Display Computer = Display + Computer. The “Display” part is the standard viewing surface found on everyday objects that conveys information or art. The “Computer” is found on the same everyday object; but by its ubiquitous nature, it will be relatively unnoticeable by the DC user, as it is manufactured “in the margins”. A DC may be mobile, moving with us as part of the everyday object we are using. DCs will be ubiquitous: “effectively invisible”, available at a glance, and seamlessly integrated into the environment. A DC should be an example of Weiser’s calm technology: encalming to the user, providing peripheral awareness without information overload. A DC should provide unremarkable computing in support of our daily routines in life. The nbaCub (nightly bedtime ambient Cues utility buddy) prototype illustrates a sample application of how DCs can be useful in the everyday environment of the home of the future. Embedding a computer into a toy, such that the display is the only visible portion, can present many opportunities for seamless and nontraditional uses of computing technology for our youngest user community. A field study was conducted in the home environment of a five-year-old child over ten consecutive weeks as an informal proof of concept of what Display Computers for children can look like and be used for in the near future. The personalized nbaCub provided lightweight, ambient information during the necessary daily routines of preparing for bed (evening routine) and preparing to go to school (morning routine). To further understand the child’s progress towards learning the abstract concepts of time passage and routines, a novel “test by design” activity was included, in which the role of the subject changed to primary designer/director. Final post-testing showed the subject knew both morning and bedtime routines very well and correctly answered seven of eight questions based on abstract images of time passage. Thus, the subject was in the process of learning the more abstract concept of time passage, but was not totally comfortable with the idea at the end of the study.
6

LoCo : a logic for configuration problems

Aschinger, Markus Wolfgang January 2014 (has links)
This thesis deals with the problem of technical product configuration: connect individual components conforming to a component catalogue so as to meet a given objective while respecting certain constraints. Solving such configuration problems is one of the major success stories of applied AI research: in industrial environments they support the configuration of complex products and, compared to manual processes, help to reduce error rates and increase throughput. Practical applications are nowadays ubiquitous and range from configurable cars to the configuration of telephone communication switching units. In the classical definition of a configuration problem the number of components to be used is fixed, while in practice the number of components needed often cannot easily be stated beforehand. Existing knowledge representation (KR) formalisms expressive enough to deal with this dynamic aspect of configuration require both explicit bounds on all generated components and extensive knowledge of the underlying solving algorithms. To date there is still a lack of high-level KR tools able to cope with these demands. In this work we present LoCo, a fragment of classical first-order logic that has been carefully tailored for expressing technical product configuration problems. The core feature of LoCo is that the number of components used in configurations does not have to be finitely bounded explicitly, but instead is bounded implicitly through the axioms. We identify configurations with models of the logic; hence, configuration finding becomes model finding. LoCo serves as a high-level representation language which allows general configuration problems to be modelled in an intuitive and declarative way without requiring knowledge of the underlying solving algorithms; in fact, the specification is automatically translated into low-level executable code. LoCo allows translations into different target languages. We present the language, related algorithms and complexity results, as well as a prototypical implementation via answer-set programming.
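A toy illustration of "configuration finding becomes model finding", using a made-up two-component catalogue and brute-force search rather than LoCo's actual logic or its translation to answer-set programming; the point mirrored here is that the number of components need not be fixed up front, since the constraints bound it implicitly:

```python
from itertools import count

# Made-up catalogue: a "frame" hosts up to 3 "modules"; every module
# must be hosted. The number of frames is not given explicitly -- it is
# bounded implicitly by the hosting constraint.
FRAME_CAPACITY = 3

def configure(modules_needed):
    """Find the smallest model: the fewest frames hosting all modules."""
    for frames in count(1):  # open-ended search over component counts...
        if frames * FRAME_CAPACITY >= modules_needed:
            # ...which terminates because the constraint bounds it.
            return {"frames": frames, "modules": modules_needed}

print(configure(modules_needed=7))  # {'frames': 3, 'modules': 7}
```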
7

Multi-hop localization in cluttered environments

Hussain, Muzammil January 2013 (has links)
Range-based localization is a widely used technique for position estimation in which distances to anchors, nodes with known positions, are measured and the position is estimated analytically. It offers high localization accuracy and simple operation across multiple deployments; examples are the Global Positioning System (GPS) and network-based cellular handset localization. Range-based localization is promising for a range of applications, such as robot deployment in emergency scenarios or monitoring industrial processes. However, the presence of clutter in some of these environments leads to a severe degradation of localization accuracy due to non-line-of-sight (NLOS) signal propagation. Moreover, current NLOS-mitigation techniques in the literature require that NLOS distances constitute only a minority of the total number of distances to anchors. The key ideas proposed in the dissertation are: 1) multi-hop localization offers significant advantages over single-hop localization in NLOS-prone environments; and 2) it is possible to further reduce position errors by carefully placing intermediate nodes among the clutter to minimize multi-hop distances between the anchors and the unlocalized node. We demonstrate that shortest-path-distance (SPD) based multi-hop localization algorithms, namely DV-Distance and MDS-MAP, perform best among competing techniques in NLOS-prone settings. However, with random node placement, these algorithms require large node densities to produce high localization accuracy. To tackle this, we show that the strategic placement of a relatively small number of nodes in the clutter can offer significant benefits. We propose two algorithms for node placement: first, Optimal Placement for DV-Distance (OPDV) focuses on obtaining the optimal positions of the nodes for a known clutter topology; second, Adaptive Placement for DV-Distance (APDV) offers a distributed control technique that carefully moves nodes in the monitored area to achieve localization accuracies close to those achieved by OPDV. We evaluate both algorithms via extensive simulations, and demonstrate the APDV algorithm on a real robotic hardware platform. We finally demonstrate how the characteristics of the clutter topology influence single-hop and multi-hop distance errors, which, in turn, impact the performance of the proposed algorithms.
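A compressed sketch of the DV-Distance idea the abstract evaluates: sum measured per-link ranges along shortest paths to each anchor, then estimate the position by linearised least squares. The topology and range values are invented for illustration, numpy is assumed, and real deployments would add the NLOS error handling the thesis is about:

```python
import heapq
import numpy as np

def shortest_path_distances(links, source):
    """Dijkstra over measured inter-node ranges: links[u] = {v: metres}."""
    dist, heap = {source: 0.0}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in links.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def multilaterate(anchors, ranges):
    """Least squares after subtracting the first anchor's circle equation."""
    (x0, y0), r0 = anchors[0], ranges[0]
    A, b = [], []
    for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
        A.append([2 * (xi - x0), 2 * (yi - y0)])
        b.append(r0**2 - ri**2 + xi**2 - x0**2 + yi**2 - y0**2)
    return np.linalg.lstsq(np.array(A), np.array(b), rcond=None)[0]

# Node "n" reaches three anchors over (possibly multi-hop) paths:
links = {"a1": {"n": 5.0}, "a2": {"m": 3.0}, "m": {"a2": 3.0, "n": 2.0},
         "a3": {"n": 4.0}, "n": {"a1": 5.0, "m": 2.0, "a3": 4.0}}
ranges = [shortest_path_distances(links, a)["n"] for a in ("a1", "a2", "a3")]
print(multilaterate([(0.0, 0.0), (10.0, 0.0), (0.0, 8.0)], ranges))
# least-squares estimate of node "n"'s (x, y) position
```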
8

Applying particle filtering to unsupervised part-of-speech induction

Dubbin, Gregory January 2014 (has links)
Statistical Natural Language Processing (NLP) lies at the intersection of Computational Linguistics and Machine Learning. As linguistic models incorporate more subtle nuances of language and its structure, standard inference techniques can fall behind. One such application is research on the unsupervised induction of part-of-speech tags, which has the potential to improve both our understanding of the plausibility of theories of first language acquisition and NLP applications such as Speech Recognition and Machine Translation. Sequential Monte Carlo (SMC) approaches, i.e. particle filters, are well suited to approximating such models. This thesis seeks to determine whether one application of SMC methods, particle Gibbs sampling, is capable of performing inference in otherwise intractable NLP applications. Specifically, this research analyses the benefits and drawbacks of relying on particle Gibbs to perform unsupervised part-of-speech induction without the flawed one-tag-per-type assumption of similar approaches. Additionally, this thesis explores the effects of type-based supervision with tag dictionaries extracted from annotated corpora or from Wiktionary. The semi-supervised tag dictionary improves the performance of the local Gibbs PYP-HMM sampler enough to nearly match the performance of the particle Gibbs type-sampler. Finally, this thesis also extends the Pitman-Yor HMM tagger of Blunsom and Cohn (2011) to include an explicit model of the lexicon which encodes the tags from which a word type may be generated. This has the effect of both biasing the model to produce fewer tags per type and modelling the tendency for open-class words to be ambiguous between only a subset of the available tags. Furthermore, I extend the type-based particle Gibbs inference algorithm to simultaneously resample the ambiguity class as well as the tags for all tokens of a given word type. The result is a principled probabilistic model of part-of-speech induction that achieves state-of-the-art performance. Overall, the experiments and contributions of this thesis demonstrate the applicability of the particle Gibbs sampler, and particle methods in general, to otherwise intractable problems in NLP.
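A bare-bones particle filter (sequential importance resampling) over a toy two-tag HMM, to illustrate the SMC machinery the abstract refers to; the tag set, transition and emission tables are invented, and the thesis's PYP-HMM and particle Gibbs sampler are substantially more involved:

```python
import random

# Toy 2-tag HMM with made-up parameters; None marks the sentence start.
TAGS = ("N", "V")
TRANS = {None: {"N": 0.6, "V": 0.4}, "N": {"N": 0.3, "V": 0.7},
         "V": {"N": 0.8, "V": 0.2}}
EMIT = {"N": {"dog": 0.7, "runs": 0.1, "barks": 0.2},
        "V": {"dog": 0.05, "runs": 0.55, "barks": 0.4}}

def particle_filter(words, n_particles=500, seed=0):
    """Approximate the posterior over tag sequences, one word at a time."""
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]
    for word in words:
        weights = []
        for p in particles:
            prev = p[-1] if p else None
            tag = rng.choices(TAGS, [TRANS[prev][t] for t in TAGS])[0]
            p.append(tag)  # propagate from the transition prior
            weights.append(EMIT[tag].get(word, 1e-9))  # weight by emission
        # Resample particles in proportion to their emission weights.
        particles = [list(p) for p in rng.choices(particles, weights,
                                                  k=n_particles)]
    return particles

posterior = particle_filter(["dog", "runs", "barks"])
print(posterior.count(["N", "V", "V"]) / len(posterior))
# approximate posterior mass of the most plausible tag sequence
```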
9

Discrete quantum walks and quantum image processing

Venegas-Andraca, Salvador Elías January 2005 (has links)
In this thesis we have focused on two topics: Discrete Quantum Walks and Quantum Image Processing. Our work is a contribution within the field of quantum computation from the perspective of a computer scientist. In the search for new techniques for developing quantum algorithms, there has been increasing interest in studying Quantum Walks, the quantum counterparts of classical random walks. Our work in quantum walks begins with a critical and comprehensive assessment of those elements of classical random walks and discrete quantum walks on undirected graphs that are relevant to algorithm development. We propose a model of discrete quantum walks on an infinite line using pairs of quantum coins under different degrees of entanglement, as well as quantum walkers in different initial state configurations, including superpositions of corresponding basis states. We have found that the probability distributions of such quantum walks take particular forms that differ from the probability distributions of classical random walks. Also, our numerical results show that the symmetry properties of quantum walks with entangled coins have a non-trivial relationship with the corresponding initial states and evolution operators. In addition, we have studied the properties of the entanglement generated between walkers in a family of discrete Hadamard quantum walks on an infinite line with one coin and two walkers. We have found that there is indeed a relation between the amount of entanglement available in each step of the quantum walk and the symmetry of the initial coin state; however, as we show with our numerical simulations, this relation is not straightforward and can, in fact, be counterintuitive. Quantum Image Processing is a blend of two fields: quantum computation and image processing. Our aim has been to promote cross-fertilisation and to explore how ideas from quantum computation could be used to develop image processing algorithms. Firstly, we propose methods for storing and retrieving images using non-entangled and entangled qubits. Secondly, we study a case in which four different values are randomly stored in a single qubit, and show that quantum mechanical properties can, in certain cases, allow better reproduction of the original stored values compared with classical methods. Finally, we briefly note that entanglement may be used as a computational resource to perform hardware-based pattern recognition of geometrical shapes that would otherwise require classical hardware and software.
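The standard single-coin discrete Hadamard walk on a line can be simulated in a few lines of numpy, giving the characteristically non-Gaussian spread the abstract contrasts with classical random walks (the entangled-coin and two-walker settings studied in the thesis generalise this minimal sketch):

```python
import numpy as np

def hadamard_walk(steps):
    """One-coin discrete Hadamard walk on the integer line.

    amp[x, c]: amplitude at position x with coin state c (0 = left, 1 = right).
    """
    size = 2 * steps + 1  # positions -steps .. +steps, origin at index `steps`
    amp = np.zeros((size, 2), dtype=complex)
    # Symmetric initial coin state (|L> + i|R>) / sqrt(2) at the origin.
    amp[steps, 0], amp[steps, 1] = 1 / np.sqrt(2), 1j / np.sqrt(2)
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    for _ in range(steps):
        amp = amp @ H.T                # Hadamard coin flip at every position
        shifted = np.zeros_like(amp)
        shifted[:-1, 0] = amp[1:, 0]   # coin-0 amplitudes move left
        shifted[1:, 1] = amp[:-1, 1]   # coin-1 amplitudes move right
        amp = shifted
    return (abs(amp) ** 2).sum(axis=1)  # probability per position

p = hadamard_walk(100)
print(p.sum())           # ~1.0: unitary evolution preserves probability
print(p.argmax() - 100)  # peaks near +/- steps/sqrt(2), unlike a Gaussian
```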
10

Coordinated search with unmanned aerial vehicle teams

Ward, Paul A. January 2013 (has links)
Advances in mobile robot technology allow an increasing variety of applications to be imagined, including search and rescue, exploration of unknown areas and working with hazardous materials. State-of-the-art robots are able to behave autonomously and without direct human control, using on-board devices to perceive, navigate and reason about the world. Unmanned Aerial Vehicles (UAVs) are particularly well suited to performing advanced sensing tasks, moving rapidly through the environment irrespective of the terrain. Deploying groups of mobile robots offers advantages, such as robustness to individual failures and a reduction in task completion time; however, to operate efficiently these teams require specific approaches that enable the individual agents to cooperate. This thesis proposes coordinated approaches to search scenarios for teams of UAVs. The primary application considered is Wilderness Search and Rescue (WiSaR), although the techniques developed are applicable elsewhere. A novel frontier-based search approach is developed for rotor-craft UAVs, taking advantage of available terrain information to minimise altitude changes during flight. This is accompanied by a lightweight coordination mechanism to enable cooperative behaviour with minimal additional overhead. The concept of a team rendezvous is introduced, at which all team members attend to exchange data; this also provides an ideal opportunity to create a comprehensive team solution for relaying newly gathered data to a base station. Furthermore, the delay between sensing and the acquired data becoming available to mission commanders is analysed, and a technique is proposed for adapting the team to meet a latency requirement. These approaches are evaluated and characterised experimentally through simulation. Coordinated frontier search is shown to outperform greedy-walk methods, reducing redundant sensing coverage using only a minimal coordination protocol. Combining the search, rendezvous and relay techniques provides a holistic approach to the deployment of UAV teams, meeting mission objectives without extensive pre-configuration.
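A minimal sketch of the frontier-detection step that underlies frontier-based search, here on a 2-D occupancy grid with made-up cell values; the thesis's approach additionally weights candidate frontiers by terrain-induced altitude changes and coordinates multiple UAVs:

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2

def find_frontiers(grid):
    """Frontier cells: FREE cells with at least one UNKNOWN 4-neighbour."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= i < rows and 0 <= j < cols and grid[i, j] == UNKNOWN
                   for i, j in neighbours):
                frontiers.append((r, c))
    return frontiers

# 4x4 map: explored free space on the left, unexplored space on the right.
grid = np.array([[0, 0, 2, 2],
                 [0, 0, 2, 2],
                 [0, 1, 2, 2],
                 [0, 0, 0, 2]])
print(find_frontiers(grid))  # [(0, 1), (1, 1), (3, 2)]: the exploration boundary
```

A searching robot (or UAV team) repeatedly picks the cheapest-to-reach frontier, senses there, updates the grid, and recomputes the frontier set until none remain.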
