About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Fundamental Work Toward an Image Processing-Empowered Dental Intelligent Educational System

Olsen, Grace 23 April 2010
Computer-aided education in dental schools is needed to reduce the reliance on human instructors for guidance and feedback as students practice dental procedures. A portable computer-aided educational system with advanced digital image processing capabilities would be less expensive than current computer-aided dental educational systems and would also address some of their limitations. This dissertation outlines the development of novel components for such a system. The research includes the design of a novel image processing technique, the Directed Active Shape Model algorithm, which is used to locate the tooth and drilled preparation in a digital image and to measure the exact size, shape, and location of the drilled preparation relative to the expected preparation. The use of statistical measures taken from the digital images to provide feedback about the smoothness and depth of the dental preparation is also detailed. The research further includes the design and testing of a posture-monitoring component for a portable educational system. Maintaining proper posture is critical for dental practitioners, because poor posture can affect not only the practitioner's health but also the quality of the practitioner's work. The algorithms and techniques designed for the dental education support system could also be applied to computer-aided educational systems for developing procedural skills in many other fields, and to systems that support practicing dentists.
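The abstract gives no implementation detail, but the kind of statistical feedback it describes can be sketched. Below is a minimal, hypothetical illustration (the function name, depth units, and the idea of a calibrated depth map are assumptions, not taken from the dissertation): mean depth compared against an expected value, plus a gradient-based roughness proxy for smoothness.

```python
import numpy as np

def preparation_feedback(depth_map: np.ndarray, expected_depth_mm: float) -> dict:
    """Toy statistical feedback for a drilled-preparation region.

    depth_map: 2-D array of estimated depths (mm) inside the located
    preparation, e.g. derived from image intensity after calibration.
    """
    mean_depth = float(depth_map.mean())
    # Roughness proxy: average local gradient magnitude; a smooth floor
    # yields small differences between neighbouring pixels.
    gy, gx = np.gradient(depth_map)
    roughness = float(np.hypot(gx, gy).mean())
    return {
        "mean_depth_mm": mean_depth,
        "depth_error_mm": mean_depth - expected_depth_mm,
        "roughness": roughness,
    }

# Example: a slightly noisy, nominally 1.5 mm-deep preparation
rng = np.random.default_rng(0)
region = 1.5 + 0.05 * rng.standard_normal((64, 64))
print(preparation_feedback(region, expected_depth_mm=1.5))
```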

Improving Understandability and Uncertainty Modeling of Data Using Fuzzy Logic Systems

Wijayasekara, Dumidu S 01 January 2016
The need for automation, optimality, and efficiency has made modern control and monitoring systems extremely complex and data-abundant. However, the complexity of these systems and the abundance of raw data have reduced the understandability and interpretability of the data, resulting in reduced state awareness of the system. Furthermore, the different levels of uncertainty introduced by sensors and actuators make it difficult to interpret and accurately manipulate such systems. Classical mathematical methods lack the capability to capture human knowledge and increase understandability while modeling such uncertainty. Fuzzy logic has been shown to alleviate both problems by introducing logic based on vague, human-understandable terms. The use of linguistic terms and simple consequential rules increases the understandability of system behavior as well as of data, while the use of vague terms and the modeling of data from non-discrete prototypes enable the modeling of uncertainty. However, recent trends in fuzzy logic research have diverged from the basic concept of understandability, and the high computational cost of robust uncertainty modeling has restricted the use of such fuzzy systems in real-world applications. Thus, the goal of this dissertation is to present algorithms and techniques that improve understandability and uncertainty modeling using fuzzy logic systems. To achieve this goal, the dissertation presents the following major contributions: 1) a novel methodology for generating fuzzy membership functions based on understandability, 2) linguistic summarization of data using if-then consequential rules, and 3) novel shadowed type-2 fuzzy logic systems for uncertainty modeling. Finally, the presented techniques are applied to real-world systems and data to exemplify their relevance and usage.
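To make the fuzzy-logic vocabulary concrete, here is a minimal sketch of triangular membership functions for linguistic terms and a Yager-style degree of truth for a linguistic summary such as "readings are high". The term definitions and the identity quantifier are illustrative assumptions, not the dissertation's actual membership-function generation methodology.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# Hypothetical linguistic terms for a sensor reading in [0, 100]
terms = {
    "low":    lambda x: tri(x, -50, 0, 50),
    "medium": lambda x: tri(x, 25, 50, 75),
    "high":   lambda x: tri(x, 50, 100, 150),
}

def truth_of_summary(data, term, quantifier=lambda p: p):
    """Degree of truth of 'Q readings are TERM' (a Yager-style summary).

    With the identity quantifier this is the average membership; a
    concave/convex quantifier would model 'most' or 'few'.
    """
    return float(quantifier(np.mean(terms[term](np.asarray(data)))))

readings = [62, 71, 80, 55, 90, 66]
print(truth_of_summary(readings, "high"))  # ~0.41
```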

Sledování aktivovanosti objektů v textech / Tracking the Activation of Objects in Texts

Václ, Jan January 2014
The notion of salience in discourse analysis models how the activation of referred objects evolves through the flow of a text. The salience algorithm was defined and briefly tested in earlier research; we present a reproduction of its results on a larger scale using data from the Prague Discourse Treebank 1.0. The results are collected into an accessible form and analyzed both visually and quantitatively in the context of the two main sources of salience: coreference relations and topic-focus articulation. Finally, we experiment with using the salience information in the machine-learning NLP tasks of document clustering and topic modeling.
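The abstract does not reproduce the salience algorithm itself; as a rough illustration of the general idea (activation refreshed on mention and fading otherwise), here is a simplified continuous variant. The boost and decay constants and the update scheme are assumptions, not the algorithm evaluated in the thesis.

```python
from collections import defaultdict

def track_salience(sentences, boost=1.0, decay=0.5):
    """Toy salience tracker: each entity's activation decays per sentence
    and is refreshed whenever a coreference chain mentions it.

    sentences: list of sets of entity ids mentioned in each sentence.
    Returns the per-sentence activation snapshot for every entity seen.
    """
    activation = defaultdict(float)
    history = []
    for mentions in sentences:
        for ent in activation:
            activation[ent] *= decay           # unmentioned entities fade
        for ent in mentions:
            activation[ent] = activation[ent] + boost  # mentions refresh
        history.append(dict(activation))
    return history

discourse = [{"John"}, {"John", "Mary"}, {"Mary"}, {"Mary"}, {"John"}]
for step in track_salience(discourse):
    print(step)
```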

Robust French syntax analysis: reconciling statistical methods and linguistic knowledge in the Talismane toolkit

Urieli, Assaf 17 December 2013
In this thesis we explore robust statistical syntax analysis for French. Our main concern is to explore methods whereby the linguist can inject linguistic knowledge and/or resources into the robust statistical engine in order to improve results for specific phenomena. We first explore the dependency annotation scheme for French, concentrating on certain phenomena. Next, we look into the various algorithms capable of producing this annotation, in particular the transition-based parsing algorithm used in the rest of this thesis. After exploring supervised machine learning algorithms for NLP classification problems, we present the Talismane toolkit for syntax analysis, built within the framework of this thesis and including four statistical modules - sentence boundary detection, tokenisation, POS tagging, and parsing - as well as the various linguistic resources used for the baseline model, including corpora, lexicons, and feature sets. Our first experiments attempt various machine learning configurations in order to identify the best baseline. We then look into improvements made possible by beam search and beam propagation. Next, we present a series of experiments aimed at correcting errors related to specific linguistic phenomena using targeted features. One of our innovations is the introduction of rules that can impose or prohibit certain decisions locally, thus bypassing the statistical model; we explore the usage of rules for errors that the features are unable to correct. Finally, we look into the enhancement of targeted features by large-scale linguistic resources, in particular a semi-supervised approach using a distributional semantic resource.
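Talismane itself is a Java toolkit; the rule mechanism described above can be sketched language-independently. The following hypothetical Python fragment shows how symbolic rules might impose or prohibit a transition before the statistical model is consulted. The configuration format, rule signature, and example rule are illustrative assumptions, not Talismane's actual API.

```python
def choose_transition(config, model_scores, rules):
    """Pick the next parser transition, letting symbolic rules override
    the statistical model (a simplified version of the idea above).

    model_scores: dict mapping transition name -> statistical score.
    rules: callables returning ('impose' | 'prohibit', transition) or None.
    """
    allowed = set(model_scores)
    for rule in rules:
        verdict = rule(config)
        if verdict is None:
            continue
        action, transition = verdict
        if action == "impose":
            return transition            # rule bypasses the model entirely
        if action == "prohibit":
            allowed.discard(transition)  # model may not choose this one
    return max(allowed, key=model_scores.get)

# Hypothetical rule: a coordinating conjunction at the head of the buffer
# must never be shifted onto an empty stack.
def no_shift_conj(config):
    if config["buffer"] and config["buffer"][0]["pos"] == "CC" and not config["stack"]:
        return ("prohibit", "shift")
    return None

config = {"stack": [], "buffer": [{"pos": "CC", "form": "and"}]}
scores = {"shift": 0.9, "left-arc": 0.4, "right-arc": 0.2}
print(choose_transition(config, scores, [no_shift_conj]))  # -> "left-arc"
```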

On the simulation and design of manycore CMPs

Thompson, Christopher Callum January 2015
The progression of Moore's Law has resulted in both embedded and performance computing systems which use an ever-increasing number of processing cores integrated in a single chip. Commercial systems are now available which provide hundreds of cores, and academics have proposed architectures with up to 1024 cores. Embedded multicores are increasingly popular because it is easier to guarantee hard real-time constraints using individual cores dedicated to tasks than with traditional time-multiplexed processing. However, finding the optimal hardware configuration to meet these requirements at minimum cost requires extensive trial-and-error exploration of the design space. This thesis tackles the problems encountered in the design of these large-scale multicore systems by first addressing the problem of fast, detailed micro-architectural simulation. Initially addressing embedded systems, this work exploits the lack of hardware cache-coherence support in many deeply embedded systems to increase the available parallelism in the simulation. Partitioning the NoC and using packet counting and cycle skipping then reduce the amount of computation required to accurately model the NoC interconnect. In combination, these techniques enable simulation speeds significantly higher than the state of the art, while maintaining less error, when compared to real hardware, than any similar simulator. Simulation speeds reach up to 370 MIPS (million target instructions per second), or 110 MHz, which is better than typical FPGA prototypes and approaches final ASIC production speeds, while maintaining an error of only 2.1%, significantly lower than other similar simulators. The thesis continues by scaling the simulator past large embedded systems up to 64-1024-core processors, adding support for coherent architectures using the same packet-counting techniques along with low-overhead context switching to enable the simulation of such large systems with their stricter synchronisation requirements. The new interconnect model was partitioned to enable parallel simulation, further improving simulation speed without sacrificing accuracy. These innovations were leveraged to investigate significant novel energy-saving optimisations to the coherency protocol, processor ISA, and processor micro-architecture. By introducing a new instruction, named wait-on-address, the energy spent during spin-wait-style synchronisation events can be significantly reduced: the core is put into a low-power idle state while the cache line of the indicated address is monitored for coherency action. Upon an update or invalidation (or a traditional timer or external interrupt) the core resumes execution, but the active energy of running the core pipeline and repeatedly accessing the data and instruction caches is effectively reduced to static idle power. The thesis also shows that existing combined software-hardware schemes to track data regions which do not require coherency can adequately address the directory-associativity problem, and introduces a new coherency sharer encoding which reduces the energy consumed by sharer invalidations when sharers are grouped closely together, as would be the case in a system running many tasks with a small degree of parallelism in each.
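As a software-level analogy only (the thesis proposes a hardware instruction, not library code), the contrast between spin-waiting and wait-on-address resembles the difference between polling a flag and blocking until notified:

```python
import threading, time

# Analogy: a spin-wait burns active power re-checking a flag every
# iteration, while a blocking wait - like the proposed wait-on-address
# instruction - idles until the monitored location is updated.
flag = threading.Event()

def spin_wait():                 # the pattern wait-on-address replaces
    while not flag.is_set():
        pass                     # pipeline and cache activity on every loop

def monitored_wait():            # idle until the 'cache line' is touched
    flag.wait()                  # wakes only on the update, like an
                                 # invalidation hitting the monitored line

t = threading.Thread(target=monitored_wait)
t.start()
time.sleep(0.1)
flag.set()                       # the 'coherency action' that wakes the core
t.join()
```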
The research concludes by using the extremely fast simulation speeds developed to produce a large set of training data, collecting various runtime and energy statistics for a wide range of embedded applications on a large, diverse range of potential MPSoC designs. This data was used to train a series of machine-learning models, which were then evaluated on their capacity to predict performance characteristics of unseen workload combinations across the explored MPSoC design space using only two sample simulations, with promising results from some of the machine-learning techniques. The models were then used to produce a ranking of predicted performance across the design space; on average, Random Forest was able to predict a best design within 89% of the runtime performance of the actual best tested design, and better than 93% of the alternative design space. When predicting a weighted metric of energy, delay, and area, Random Forest on average produced results within 93% of the optimum. In summary, this thesis improves upon the state of the art for cycle-accurate multicore simulation, introduces novel energy-saving changes to the ISA and microarchitecture of future multicore processors, and demonstrates the viability of machine-learning techniques to significantly accelerate the design-space exploration required to bring a new manycore design to market.
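As an illustration of that final step, here is a toy version of the design-space ranking idea using scikit-learn's Random Forest. The features, the synthetic runtime model, and the data sizes are placeholders, not the thesis's actual training set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical design-space data: each row encodes an MPSoC configuration
# (cores, cache size, NoC width, ...) plus workload features from sample
# simulations; the target is measured runtime.
rng = np.random.default_rng(1)
X = rng.random((500, 8))                 # stand-in for real simulator output
y = 1.0 / (0.1 + X[:, 0]) + X[:, 3]      # toy runtime model

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:400], y[:400])

# Rank unseen candidate designs by predicted runtime and pick the best.
candidates = X[400:]
predicted = model.predict(candidates)
ranking = np.argsort(predicted)          # ascending: lowest runtime first
print("predicted-best design:", ranking[0], "runtime ~", predicted[ranking[0]])
```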

Computer aided analysis of inflammatory muscle disease using magnetic resonance imaging

Jack, James January 2015
Inflammatory muscle disease (myositis) is characterised by inflammation and a gradual increase in muscle weakness. Diagnosis typically requires a range of clinical tests, including magnetic resonance imaging of the thigh muscles to assess disease severity. In the past, severity has been measured by manually counting the number of muscles affected. In this work, a computer-aided analysis of inflammatory muscle disease is presented to help doctors diagnose and monitor the disease. Methods to quantify the level of oedema and fat infiltration from magnetic resonance scans are proposed, and the disease quantities determined are shown to correlate positively with expert medical opinion. The methods have been designed and tested on a database of clinically acquired T1 and STIR sequences and are shown to be robust despite suboptimal image quality. General background information is first introduced, giving an overview of the medical, technical, and theoretical topics necessary to understand the problem domain, followed by a detailed introduction to the physics of magnetic resonance imaging. A review of important literature from similar and related domains is presented, with valuable insights that are utilised at a later stage. Scans are carefully pre-processed to bring all slices into a common frame of reference, and a number of validation tests are performed with re-scanned subjects to indicate the level of repeatability. The disease quantities, together with statistical features from the T1-STIR joint histogram, are used for automatic classification of disease severity, which is shown to be successful on out-of-sample data for both the oedema and fat-infiltration problems.
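As a hypothetical sketch of the kind of statistical features a T1-STIR joint histogram can yield: the abstract does not list the specific features used, so joint entropy and inter-modality correlation here are illustrative assumptions, and the synthetic intensities stand in for real scan data.

```python
import numpy as np

def joint_histogram_features(t1, stir, bins=32):
    """Toy statistical features from the T1-STIR joint histogram of a
    muscle region, of the kind used as classifier inputs above.
    """
    hist, _, _ = np.histogram2d(t1.ravel(), stir.ravel(), bins=bins)
    p = hist / hist.sum()
    nz = p[p > 0]
    entropy = -np.sum(nz * np.log2(nz))          # joint (Shannon) entropy
    corr = np.corrcoef(t1.ravel(), stir.ravel())[0, 1]  # modality correlation
    return {"joint_entropy": float(entropy), "t1_stir_corr": float(corr)}

rng = np.random.default_rng(2)
t1 = rng.normal(400, 50, (128, 128))             # synthetic T1 intensities
stir = 0.3 * t1 + rng.normal(100, 30, (128, 128))
print(joint_histogram_features(t1, stir))
```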

Collective analysis of multiple high-throughput gene expression datasets

Abu Jamous, Basel January 2015
Modern technologies have resulted in the production of numerous high-throughput biological datasets, but the development of capable computational methods has not kept pace with their generation. Amongst the most popular biological high-throughput datasets are gene expression datasets (e.g. microarray datasets). This work addresses that gap by proposing a suite of computational methods which can analyse multiple gene expression datasets collectively. The focal method in this suite is the unification of clustering results from multiple datasets using external specifications (UNCLES). This method applies clustering to multiple heterogeneous datasets which separately measure the expression of the same set of genes and then combines the resulting partitions according to one of two types of external specification: type A identifies the subsets of genes that are consistently co-expressed in all of the given datasets, while type B identifies the subsets of genes that are consistently co-expressed in one subset of datasets while being poorly co-expressed in another. This extends the types of questions which can be addressed by computational methods, because existing clustering, consensus clustering, and biclustering methods cannot address these objectives. Moreover, to assist in setting some of the parameters required by UNCLES, the M-N scatter plots technique is proposed. These methods, and earlier versions of them, have been validated and applied to numerous real datasets from the biological contexts of budding yeast, bacteria, human red blood cells, and malaria. In collaboration with biologists, these applications have led to various biological insights. In yeast, the role of the poorly understood gene CMR1 in the cell cycle has been further elucidated, and a novel subset of poorly understood yeast genes has been discovered whose expression profile is consistently negatively correlated with the well-known ribosome biogenesis genes. Bacterial data analysis has identified two clusters of negatively correlated genes, and analysis of data from human red blood cells has produced hypotheses regarding the regulation of the pathways producing such cells; malarial data analysis is still at a preliminary stage. Taken together, this thesis provides an original integrative suite of computational methods which scrutinise multiple gene expression datasets collectively to address previously unresolved questions, and presents the results and findings of many applications of these methods to real biological datasets from multiple contexts.
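UNCLES itself combines partitions under external specifications; the following toy sketch captures only the type-A intuition (genes co-clustered in every dataset), using hard k-means in place of the method's actual clustering machinery. All parameters and data are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def consistent_coexpression(datasets, k=5, seed=0):
    """Toy 'type A' analysis: cluster each expression dataset (rows = the
    same genes, columns = dataset-specific conditions) and keep gene pairs
    that share a cluster in every dataset.
    """
    n_genes = datasets[0].shape[0]
    agree = np.ones((n_genes, n_genes), dtype=bool)
    for X in datasets:
        labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X)
        agree &= labels[:, None] == labels[None, :]   # co-clustered here too?
    np.fill_diagonal(agree, False)
    # For each gene, the indices of genes co-clustered with it everywhere
    return [np.flatnonzero(row) for row in agree]

rng = np.random.default_rng(3)
ds = [rng.random((100, 12)), rng.random((100, 8))]    # same 100 genes
partners = consistent_coexpression(ds)
print("genes co-clustered with gene 0 in all datasets:", partners[0])
```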

Fitness Function for a Subscriber

Podapati, Sasidhar January 2017
Mobile communication has become a vital part of modern communication, and with the rise in mobile phone usage the cost of network infrastructure has become a deciding factor. Subscriber mobility patterns have a major effect on the load of a radio cell in the network, so analysis of subscriber mobility data is of utmost priority. This thesis aims to classify the entire dataset provided by Telenor into two main groups, "infrastructure stressing" and "infrastructure friendly", with respect to their impact on the mobile network, and to predict the behaviour of a new subscriber based on his or her MOSAIC group. A heuristic method is formulated to characterise subscribers into three segments based on their mobility, and Tetris optimization is used to reveal the infrastructure-stressing subscribers in the mobile network. All experiments were conducted on subscriber trajectory data provided by the telecom operator; the results reveal that 5 percent of subscribers in the entire dataset are infrastructure stressing. Finally, a classification model is developed and evaluated to label new subscribers as friendly or stressing using the WEKA machine learning tool, with naïve Bayes, k-nearest neighbour, and J48 decision tree as the classification algorithms used to train the model and to find the relations between features in the labelled subscriber dataset.
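The thesis used WEKA (a Java tool); an equivalent experiment can be sketched with scikit-learn, with DecisionTreeClassifier standing in for J48 (WEKA's C4.5 implementation). The synthetic data below is a placeholder for the labelled subscriber features.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder for the labelled subscriber dataset: binary target
# (infrastructure friendly vs. stressing), 10 mobility-derived features.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

for name, clf in [("Naive Bayes", GaussianNB()),
                  ("k-NN", KNeighborsClassifier(n_neighbors=5)),
                  ("Decision tree (J48-like)", DecisionTreeClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=10)   # 10-fold cross-validation
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```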

Real-Time and Data-Driven Operation Optimization and Knowledge Discovery for an Enterprise Information System

Duan, Qing January 2014
An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need to optimize operation scheduling, improve resource utilization, discover useful knowledge, and make data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization and resource allocation, as well as data-driven prediction of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order-dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable to a high volume of orders and provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.

We next discuss the analysis and prediction of different attributes involved in the hierarchical components of an enterprise, starting from the fundamental processes related to real-time prediction. Our process-execution-time and process-status prediction models integrate statistical methods with machine-learning algorithms; in addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, they also provide a probabilistic estimate of the predicted status. An order generally consists of multiple serial and parallel processes, so we next introduce an order-fulfillment prediction model that combines the advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise's late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components; predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use it as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations for effective decisions.
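As a minimal sketch of the component-wise time-series strategy described above (decompose, model each component separately, aggregate the predictions), using a centred-moving-average trend and a repeating seasonal profile. The period, the toy series, and the decomposition details are illustrative assumptions, not the thesis's models.

```python
import numpy as np

def decompose(series, period):
    """Split a series into trend (centred moving average), a repeating
    seasonal profile, and a residual - the decomposition step behind
    component-wise prediction (simplified; edges are approximate).
    """
    kernel = np.ones(period) / period
    trend = np.convolve(series, kernel, mode="same")
    detrended = series - trend
    # Average each position within the period to get the seasonal profile
    seasonal_profile = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.resize(seasonal_profile, len(series))   # tile over the series
    residual = series - trend - seasonal
    return trend, seasonal, residual

# Toy daily series with weekly periodicity: each component could then be
# predicted separately (extrapolate the trend, repeat the seasonal profile)
# and the forecasts aggregated, as done for the service-level metrics.
t = np.arange(28 * 4)
series = 0.05 * t + np.sin(2 * np.pi * t / 7) \
         + np.random.default_rng(4).normal(0, 0.1, t.size)
trend, seasonal, residual = decompose(series, period=7)
print("residual std:", residual.std())
```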
