
Human-Machine Alignment for Context Recognition in the Wild

Bontempelli, Andrea 30 April 2024 (has links)
The premise for AI systems like personal assistants to provide guidance and suggestions to an end-user is to understand, at any moment in time, the personal context that the user is in. The context – where the user is, what she is doing and with whom – allows the machine to represent the world in the user's terms. The context must be inferred from a stream of sensor readings generated by smart devices such as smartphones and smartwatches, and the labels are acquired directly from the user. To perform robust context prediction in this real-world scenario, the machine must handle the egocentric nature of the context, adapt to the changing world and user, and maintain a bidirectional interaction with the user to ensure the user-machine alignment of world representations. To this end, the machine must learn incrementally from the input stream of sensor readings and user supervision. In this work, we: (i) introduce interactive classification in the wild and present knowledge drift (KD), a special form of concept drift occurring due to world and user changes; (ii) develop simple and robust ML methods to tackle these scenarios; (iii) showcase the advantages of each of these methods in empirical evaluations on controlled synthetic and real-world data sets; (iv) design a flexible and modular architecture that combines the methods above to support context recognition in the wild; (v) present an evaluation with real users in a concrete social science use case.
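As a rough illustration of the learning loop the abstract describes, the sketch below shows an incremental classifier that predicts a context from a window of sensor features and asks the user for a label when its confidence is low. It is a minimal sketch assuming scikit-learn; the label set, confidence threshold, and query policy are illustrative placeholders, not the thesis's actual components.

```python
# Minimal sketch of interactive, incremental context classification.
# Assumes scikit-learn; feature extraction and the user-query policy
# are illustrative placeholders, not the thesis's actual components.
import numpy as np
from sklearn.linear_model import SGDClassifier

CONTEXTS = ["home", "work", "commuting", "social"]  # hypothetical label set

clf = SGDClassifier(loss="log_loss")  # supports incremental partial_fit
classes = np.arange(len(CONTEXTS))
initialized = False

def on_sensor_window(x, ask_user):
    """Process one window of sensor features; query the user when unsure."""
    global initialized
    x = x.reshape(1, -1)
    if initialized:
        proba = clf.predict_proba(x)[0]
        pred = int(np.argmax(proba))
        confident = proba[pred] > 0.7  # illustrative confidence threshold
    else:
        pred, confident = None, False
    if not confident:
        y = ask_user()  # bidirectional interaction: request a label
        clf.partial_fit(x, [y], classes=classes)  # adapt to world/user change
        initialized = True
        return y
    return pred
```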

A multi-layered approach to information extraction from tables in biomedical documents

Milosevic, Nikola January 2018 (has links)
The quantity of literature in the biomedical domain is growing exponentially, and it is becoming impossible for researchers to cope with this ever-increasing amount of information. Text mining provides methods that can improve access to information of interest through information retrieval, information extraction and question answering. However, most of these systems focus on information presented in the main body of text while ignoring other parts of the document, such as tables and figures. Tables are a potentially important component of research presentation, as authors often include more detailed information in tables than in the textual sections of a document. Tables allow the presentation of large amounts of information in relatively limited space, owing to their structural flexibility and ability to present multi-dimensional information. Table processing poses specific challenges that table mining systems need to take into account, including the variety of visual and semantic structures in tables, the variety of information presentation formats, and the dense content of table cells. The work presented in this thesis examines a multi-layered approach to information extraction from tables in biomedical documents. We propose a representation model of tables, which describes table structures and how they are read, and a method for disentangling table structure and extracting information. The proposed information extraction method consists of: (1) table detection, (2) functional analysis, (3) structural analysis, (4) semantic tagging, (5) pragmatic analysis, (6) cell selection and (7) syntactic processing and extraction. In order to validate our approach, show its potential and identify remaining challenges, we applied the methodology to two case studies. The aim of the first case study was to extract baseline characteristics of clinical trials (number of patients, age, gender distribution, etc.) from tables. The second case study explored how the methodology can be applied to relationship extraction, examining the extraction of drug-drug interactions. Our method performed functional analysis with a precision of 0.9425, a recall of 0.9428 and an F1-score of 0.9426. Relationships between cells were recognized with a precision of 0.9238, a recall of 0.9744 and an F1-score of 0.9484. The information extraction methodology achieves state-of-the-art performance in table information extraction, recording F1-scores of 0.82-0.93 for demographic data, adverse event and drug-drug interaction extraction, depending on the complexity of the task and the available semantic resources. The presented methodology demonstrates that information can be efficiently extracted from tables in the biomedical literature. Information extraction from tables can be important for enhancing data curation, information retrieval, question answering and decision support systems with information from tables that cannot be found in other parts of the document.
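The seven-stage method lends itself to a simple pipeline structure. The sketch below mirrors the control flow only; each stage body is a hypothetical placeholder, since the real analyses are the subject of the thesis.

```python
# Sketch of the seven-stage table-mining pipeline described above.
# Stage bodies are placeholders; only the control flow follows the text.
from dataclasses import dataclass, field

@dataclass
class Table:
    cells: list                                       # raw grid of cell strings
    functional: dict = field(default_factory=dict)    # header vs. data roles
    structure: dict = field(default_factory=dict)     # navigational relations
    semantics: dict = field(default_factory=dict)     # cell-level tags

def detect_tables(document):       # (1) locate tables in the document
    return [Table(cells=t) for t in document.get("tables", [])]

def functional_analysis(table):    # (2) classify cells as header/stub/data
    table.functional = {"headers": table.cells[:1], "data": table.cells[1:]}
    return table

def structural_analysis(table):    # (3) link data cells to their headers
    table.structure = {"reading_order": "row-major"}
    return table

def semantic_tagging(table):       # (4) tag cells with domain concepts
    table.semantics = {"tags": []}
    return table

def pragmatic_analysis(table):     # (5) infer the table's communicative role
    return table

def select_cells(table, query):    # (6) pick cells relevant to the query
    return [c for row in table.functional["data"] for c in row]

def extract(cells):                # (7) syntactic processing and extraction
    return [{"value": c} for c in cells]

def run_pipeline(document, query):
    results = []
    for t in detect_tables(document):
        t = pragmatic_analysis(semantic_tagging(
            structural_analysis(functional_analysis(t))))
        results.extend(extract(select_cells(t, query)))
    return results
```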

Hierarchical TCP network traffic classification with adaptive optimisation

Wang, Xiaoming January 2010 (has links)
Nowadays, with the increasing deployment of modern packet-switching networks, traffic classification plays an important role in network administration. Identifying what kinds of traffic are transmitted across a network can improve network management in various ways, such as traffic shaping, differentiated services, and enhanced security. By applying different policies to different kinds of traffic, Quality of Service (QoS) can be achieved, with granularity as fine as the flow level. Since illegal traffic can be identified and filtered, network security can also be enhanced by advanced traffic classification. There are various traditional techniques for traffic classification. However, some of them cannot handle traffic generated by applications using non-registered or forged ports, some cannot deal with encrypted traffic, and some require too much computational resource. A more recent technique proposed by other researchers, based on statistical methods, offers an alternative approach: it requires fewer resources, does not rely on ports, and can deal with encrypted traffic. Nevertheless, the performance of statistical classification can be further improved. In this thesis, we aim to optimise network traffic classification based on the statistical approach. Because of the popularity of the TCP protocol, and the difficulties that TCP traffic controls introduce for classification, our work focuses on classifying network traffic carried over TCP. An architecture has been proposed for improving classification performance in terms of accuracy and response time. Experiments have been conducted and their results evaluated to demonstrate the improved performance of the proposed optimised classifier. In our work, network packets are reassembled into TCP flows, the statistical characteristics of the flows are extracted, and the classes of input flows are determined by comparing them with profiled samples. Instead of using a single algorithm to classify all traffic flows, our proposed system employs a series of binary classifiers, which use optimised algorithms to detect different traffic classes separately. A decision-making mechanism deals with conflicting results from the binary classifiers. Machine learning algorithms including k-nearest neighbour, decision trees and artificial neural networks have been considered, together with a non-parametric statistical method, the Kolmogorov-Smirnov test. Besides algorithms, some parameters, such as detection windows and acceptance thresholds, are also optimised locally. This hierarchical architecture gives the traffic classifier more flexibility, higher accuracy and shorter response times.
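To make the detector idea concrete, the sketch below implements one plausible reading of a KS-test-based binary classifier and the decision mechanism over several of them, assuming SciPy. The flow feature (packet sizes), acceptance threshold, and tie-breaking rule are assumptions for illustration, not the thesis's exact design.

```python
# Sketch of one KS-test-based binary detector and a simple decision
# mechanism over several detectors, as the architecture describes.
import numpy as np
from scipy.stats import ks_2samp

class KSBinaryDetector:
    """Flags a flow as a given class if its packet-size sample is
    statistically close to the profiled sample for that class."""
    def __init__(self, traffic_class, profile_sample, alpha=0.05):
        self.traffic_class = traffic_class
        self.profile = np.asarray(profile_sample)
        self.alpha = alpha  # acceptance threshold (locally tunable)

    def score(self, flow_sample):
        # High p-value: no evidence the flow differs from the profile.
        return ks_2samp(flow_sample, self.profile).pvalue

    def matches(self, flow_sample):
        return self.score(flow_sample) > self.alpha

def classify_flow(flow_sample, detectors):
    """Decision mechanism: among detectors that accept the flow,
    pick the one with the highest KS p-value; else 'unknown'."""
    accepted = [(d.score(flow_sample), d.traffic_class) for d in detectors
                if d.matches(flow_sample)]
    return max(accepted)[1] if accepted else "unknown"
```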

Acquisition of Costly Information in Data-Driven Decision Making

Janásek, Lukáš January 2021 (has links)
This thesis formulates and solves an economic decision problem of the acquisition of costly information in data-driven decision making. The thesis assumes an agent predicting a random variable utilizing several costly explanatory variables. Prior to the decision making, the agent learns about the relationship between the random variables utilizing their past realizations. During the decision making, the agent decides which costly variables to acquire and predicts using the acquired variables. The agent's utility consists of the correctness of the prediction and the costs of the acquired variables. To solve the decision problem, the thesis divides the decision process into two parts: acquisition of variables and prediction using the acquired variables. For the prediction, the thesis presents a novel approach for training a single predictive model accepting any combination of acquired variables. For the acquisition, the thesis presents two novel methods using supervised machine learning models: a backward estimation of the expected utility of each variable and a greedy acquisition of variables based on a myopic increase in the expected utility of variables. Next, the thesis formulates the decision problem as a Markov decision process which allows approximating the optimal acquisition via deep...
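A minimal sketch of the two ideas described above, the single predictor over any acquired subset and the greedy myopic acquisition loop, might look as follows. The masking scheme (zero-fill plus indicator features) and the gain estimator interface are assumptions for illustration, not the thesis's exact method.

```python
# Sketch of the greedy, myopic acquisition loop: a single predictor
# accepts any subset of variables (missing ones are masked), and a
# variable is bought only while its estimated utility gain exceeds its cost.
import numpy as np

def predict(model, x, mask):
    """Single model over any acquired subset: zero-fill + mask indicator.
    Assumes the model was trained on 2n features (values + mask)."""
    features = np.concatenate([np.where(mask, x, 0.0), mask.astype(float)])
    return model.predict_proba(features.reshape(1, -1))[0]

def greedy_acquire(model, gain_estimator, x_full, costs):
    n = len(x_full)
    mask = np.zeros(n, dtype=bool)  # nothing acquired yet
    while True:
        gains = [gain_estimator(mask, j) if not mask[j] else -np.inf
                 for j in range(n)]  # myopic expected-utility gains
        j = int(np.argmax(gains))
        if gains[j] <= costs[j]:     # no remaining variable is worth its cost
            break
        mask[j] = True               # acquire variable j
    return predict(model, x_full, mask), mask
```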

Evolutionary Algorithms for Data Transformation

Švec, Ondřej January 2017 (has links)
In this work, we propose a novel method for a supervised dimensionality reduction, which learns weights of a neural network using an evolutionary algorithm, CMA-ES, optimising the success rate of the k-NN classifier. If no activation functions are used in the neural network, the algorithm essentially performs a linear transformation, which can also be used inside of the Mahalanobis distance. Therefore our method can be considered to be a metric learning algorithm. By adding activations to the neural network, the algorithm can learn non-linear transformations as well. We consider reductions to low-dimensional spaces, which are useful for data visualisation, and demonstrate that the resulting projections provide better performance than other dimensionality reduction techniques and also that the visualisations provide better distinctions between the classes in the data thanks to the locality of the k-NN classifier.
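A compact version of this training loop can be written with the pycma package: the flattened projection matrix is the CMA-ES search variable, and the fitness is the cross-validated k-NN error in the projected space. The starting point, step size, and use of cross_val_score are illustrative choices, not the thesis's exact setup.

```python
# Sketch of the CMA-ES metric-learning loop: minimise the k-NN error
# of the data projected by a learned linear transformation.
import cma
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def learn_projection(X, y, out_dim=2, iterations=50):
    in_dim = X.shape[1]

    def fitness(w):
        W = w.reshape(in_dim, out_dim)   # linear transformation
        Z = X @ W                        # project the data
        knn = KNeighborsClassifier(n_neighbors=5)
        acc = cross_val_score(knn, Z, y, cv=3).mean()
        return 1.0 - acc                 # minimise k-NN error

    es = cma.CMAEvolutionStrategy(np.zeros(in_dim * out_dim), 0.5)
    for _ in range(iterations):
        candidates = es.ask()
        es.tell(candidates, [fitness(w) for w in candidates])
    return es.result.xbest.reshape(in_dim, out_dim)
```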

Exploiting Deep Learning and Traffic Models for Freeway Traffic Estimation

Genser, Alexander, Makridis, Michail A., Kouvelas, Anastasios 23 June 2023 (has links)
Emerging sensors and intelligent traffic technologies provide extensive data sets in a traffic network. However, realizing the full potential of such data sets for a unique representation of real-world states is challenging due to data accuracy, noise, and temporal-spatial resolution. Data assimilation is a well-known group of methodological approaches that exploit physics-informed traffic models and data observations to perform short-term predictions of the traffic state in freeway environments. At the same time, neural networks capture high non-linearities, similar to those present in traffic networks. Despite numerous works applying different variants of Kalman filters, the possibility of traffic state estimation with deep-learning-based methodologies is only partially explored in the literature. We present a deep-learning modeling approach to perform traffic state estimation on large freeway networks. The proposed framework is trained on local observations from static and moving sensors and identifies differences between well-trusted data and model outputs. The detected patterns are then used throughout the network, even where no observations are available, to estimate fundamental traffic quantities. The preliminary results of the work highlight the potential of deep learning for traffic state estimation.
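One plausible reading of "identifies differences between well-trusted data and model outputs" is residual learning, sketched below with a small PyTorch network: the model is fitted to the discrepancy between physics-model estimates and sensor observations where sensors exist, then applied network-wide. The architecture and feature choices are assumptions, not the paper's configuration.

```python
# Sketch of the residual idea: a small network learns the discrepancy
# between a physics-based traffic model's estimate and trusted sensor
# observations, then corrects the model where no sensors are available.
import torch
import torch.nn as nn

class DiscrepancyNet(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1))  # predicted model-vs-sensor residual

    def forward(self, x):
        return self.net(x)

def train(model, features, model_estimates, sensor_obs, epochs=100):
    """Fit the network to residuals at locations with trusted sensors."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    residuals = (sensor_obs - model_estimates).unsqueeze(-1)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(features), residuals)
        loss.backward()
        opt.step()
    return model

def estimate_everywhere(model, features, model_estimates):
    """Apply the learned correction throughout the network."""
    with torch.no_grad():
        return model_estimates + model(features).squeeze(-1)
```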

Empowering Children with Hope and Opportunity: School Based Counseling and Social Work Services in Catholic Schools

Mercs, Rhonda Helen 11 August 2022 (has links)
No description available.

Joining implications in formal contexts and inductive learning in a Horn description logic: Extended Version

Kriegel, Francesco 20 June 2022 (has links)
A joining implication is a restricted form of an implication where it is explicitly specified which attributes may occur in the premise and in the conclusion, respectively. A technique for sound and complete axiomatization of the joining implications valid in a given formal context is provided. In particular, a canonical base for the joining implications valid in a given formal context is proposed, which enjoys the property of being of minimal cardinality among all such bases. Background knowledge in the form of a set of valid joining implications can be incorporated. Furthermore, an application to inductive learning in a Horn description logic is proposed, that is, a procedure for sound and complete axiomatization of Horn-M concept inclusions from a given interpretation is developed. A complexity analysis shows that this procedure runs in deterministic exponential time.
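For readers unfamiliar with formal concept analysis, the sketch below spells out the basic definitions in code: a joining implication draws its premise from a set P of admissible premise attributes and its conclusion from a set C of admissible conclusion attributes, and it is valid in a formal context when every object intent containing the premise also contains the conclusion. The tiny context is hypothetical.

```python
# Sketch of validity checking for a joining implication in a formal
# context, following the standard FCA reading: A -> B is valid iff every
# object intent that includes A also includes B.
def is_joining_implication(premise, conclusion, P, C):
    """A joining implication must draw its sides from P and C."""
    return premise <= P and conclusion <= C

def is_valid(context, premise, conclusion):
    """context: dict mapping each object to its set of attributes."""
    return all(conclusion <= intent
               for intent in context.values()
               if premise <= intent)

# Tiny illustrative context (hypothetical):
context = {
    "o1": {"a", "b", "x"},
    "o2": {"a", "x", "y"},
    "o3": {"b", "y"},
}
P, C = {"a", "b"}, {"x", "y"}
impl = ({"a"}, {"x"})
assert is_joining_implication(*impl, P, C)
assert is_valid(context, *impl)  # every object with 'a' also has 'x'
```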

Resist modeling analysis for optical proximity effect correction in optical lithography

Top, Mame Kouna 12 January 2011 (has links)
The progress made in microelectronics responds to the need to reduce production costs and to open new markets. This progress has been possible largely thanks to advances in projection optical lithography, the printing process principally used in integrated circuit (IC) manufacturing. The miniaturization of integrated circuits has been possible only by pushing the limits of optical resolution. However, this miniaturization increases the sensitivity of the pattern transfer, leading to more optical proximity effects at progressively more advanced technology nodes (45 and 32 nm transistor gate sizes). The correction of these optical proximity effects is indispensable in photolithographic processes for advanced technology nodes. Optical proximity correction (OPC) techniques increase the achievable resolution and the pattern transfer fidelity for advanced lithographic generations. Corrections are made on the mask based on OPC models that connect the image on the resist to the changes made on the mask. The reliability of these OPC models is essential for improving pattern transfer fidelity. This thesis analyses and evaluates OPC resist models, which simulate the behavior of the resist after exposure. Data modeling and statistical analysis have been used to study these increasingly empirical resist models. Besides the reliability of model calibration data, we also studied the use of the model creation platforms employed in IC manufacturing and the methodology for creating and validating OPC models. This thesis presents the results of the analysis of OPC resist models and proposes a new methodology for their creation, analysis and validation.

Classification on unbalanced data

Hlosta, Martin Unknown Date (has links)
The topic of this dissertation is classification on imbalanced data, an area of machine learning that aims to address the problems arising when one class is represented in the data far less than the other. The minority class is often of greater importance, and traditional methods, which favor the majority class, do not achieve good results on the minority class. Two application domains motivated the research and led to the identification of two specific, previously unaddressed problems. In the first, a constraint on the minimum required precision on the minority class in computer security led to the formulation of a constrained classification task. I proposed a method that combines a modified version of logistic regression with stochastic algorithms, which always improved the results of logistic regression. The second is the Learning Analytics domain, which motivated the definition of the problem of predicting the achievement of a goal with a specified deadline. The concept of Self-Learning was introduced, in which the model is trained on individuals who achieve the goal early. Because few individuals have achieved the goal at the beginning, the problem is heavily imbalanced, but the imbalance decreases towards the deadline. On the problem of identifying at-risk students at a distance-learning university, it was shown that (1) this concept gives better results than the specified baseline, and (2) imbalance-handling methods that do not take domain information into account did not lead to large improvements. The evaluation showed that domain-knowledge-based methods, in the extended version for Self-Learning, improved classification more than common imbalance-handling methods, and that knowing the cause of the imbalance can lead to better results.
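A simple stand-in for the constrained-classification setting of the first domain is sketched below: fit an ordinary logistic regression, then tune its decision threshold on validation data so that the minority-class precision constraint holds, keeping the threshold with the best recall. This illustrates the problem only, not the combined method proposed in the thesis.

```python
# Sketch of precision-constrained minority-class classification:
# choose the decision threshold that maximises recall subject to a
# minimum precision on the minority (positive) class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

def fit_with_precision_constraint(X_train, y_train, X_val, y_val,
                                  min_precision=0.9):
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(X_train, y_train)
    scores = clf.predict_proba(X_val)[:, 1]
    precision, recall, thresholds = precision_recall_curve(y_val, scores)
    # precision/recall have one more entry than thresholds; drop the last.
    ok = precision[:-1] >= min_precision
    if not ok.any():
        raise ValueError("constraint unsatisfiable on validation data")
    best = np.argmax(np.where(ok, recall[:-1], -1.0))
    return clf, thresholds[best]
```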
