Global ETD Search

1	Multiple Change-Point Detection: A Selective Overview Niu, Yue S., Hao, Ning, Zhang, Heping 11 1900 Very long and noisy sequence data arise in fields ranging from the biological sciences to the social sciences, including high-throughput data in genomics and stock prices in econometrics. Often such data are collected in order to identify and understand shifts in trends, for example, from a bull market to a bear market in finance or from a normal number of chromosome copies to an excessive number of chromosome copies in genetics. Thus, identifying multiple change points in a long, possibly very long, sequence is an important problem. In this article, we review both classical and new multiple change-point detection strategies. Considering the long history of and the extensive literature on change-point detection, we provide an in-depth discussion of a normal mean change-point model from the perspectives of regression analysis, hypothesis testing, consistency and inference. In particular, we present a strategy to gather and aggregate local information for change-point detection that has become the cornerstone of several emerging methods because of its attractive computational and theoretical properties. Binary segmentation; consistency; multiple testing; normal mean change-point model; regression; screening and ranking algorithm
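Binary segmentation, listed among the keywords of the entry above, can be illustrated with a minimal sketch for a piecewise-constant mean. This is not the authors' code: the function names, the CUSUM-style statistic, the assumption of unit noise variance, and the fixed `threshold` are all illustrative choices.

```python
from typing import List, Optional, Tuple


def best_split(x: List[float], lo: int, hi: int) -> Tuple[Optional[int], float]:
    """Find the split point k (lo < k < hi) with the largest CUSUM statistic on x[lo:hi]."""
    n = hi - lo
    if n < 2:
        return None, 0.0
    total = sum(x[lo:hi])
    best_k, best_stat = None, 0.0
    left = 0.0
    for k in range(lo + 1, hi):
        left += x[k - 1]
        n_left, n_right = k - lo, hi - k
        # Difference of the two segment means, scaled so that it has unit
        # variance under the (assumed) unit-variance noise model.
        diff = left / n_left - (total - left) / n_right
        stat = abs(diff) * (n_left * n_right / n) ** 0.5
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


def binary_segmentation(x: List[float], threshold: float,
                        lo: int = 0, hi: Optional[int] = None) -> List[int]:
    """Recursively split x[lo:hi] while the best split exceeds the threshold."""
    if hi is None:
        hi = len(x)
    k, stat = best_split(x, lo, hi)
    if k is None or stat <= threshold:
        return []
    # Search both halves independently and return change points in order.
    return (binary_segmentation(x, threshold, lo, k)
            + [k]
            + binary_segmentation(x, threshold, k, hi))
```

For a noise-free sequence of 20 zeros, 20 fives and 20 zeros, `binary_segmentation(x, threshold=3.0)` returns `[20, 40]`. On real data the threshold has to be calibrated to the noise level and sequence length.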
2	Χρήση θεματικών ταξινομιών για την αυτόματη δημιουργία και οργάνωση εξατομικευμένων καταλόγων διαδικτύου : ένας πρότυπος αλγόριθμος ταξινόμησης / Usage of thematic taxonomy for the automatic creation and organization of specialized network catalogs Κρίκος, Βλάσης 16 May 2007 Personalized web directories (bookmark collections) appeared almost simultaneously with web browsers, and since then all browsers have incorporated simple systems for managing them. By personalized directories we mean the personal collections of web pages that an internet user saves while browsing the World Wide Web. They serve as a "personal web information space" that helps people remember and retrieve interesting web pages. In this work we present a prototype system for managing personalized web directories and define the requirements it must meet in order to be usable and effective. The system has all the capabilities of commercial and prototype bookmark management systems, and in addition offers innovative features that make it unique. We also present in detail a novel ranking algorithm, which ranks pages according to their relevance to the categories to which they belong. We compare this algorithm with PageRank, the popular general-purpose ranking algorithm. Our experiment indicates that the proposed algorithm is better suited than PageRank to ranking pages within thematic categories. Thematic taxonomies; 025.04; Bookmarks; Ranking algorithm; Thematic taxonomy
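For context on the baseline named in the entry above, here is a minimal sketch of PageRank by power iteration. It is not taken from the thesis: the link-dictionary input, the damping factor of 0.85, the fixed iteration count, and the uniform handling of pages without outgoing links are common illustrative assumptions.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Approximate PageRank scores for a graph given as page -> list of linked pages."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                # Each page shares its rank equally among its outgoing links.
                new_rank[t] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank
```

PageRank scores depend only on link structure, not on topical content, which is the kind of limitation a category-relevance ranking such as the one described above would aim to address.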
3	J-model : an open and social ensemble learning architecture for classification Kim, Jinhan January 2012 Ensemble learning is a promising direction of research in machine learning, in which an ensemble classifier achieves better and more robust predictive performance on classification problems by combining other learners. Meanwhile, agent-based systems provide frameworks for sharing knowledge among multiple agents in an open context. This thesis combines multi-agent knowledge sharing with ensemble methods to produce a new style of learning system for open environments. We are now surrounded by many smart objects such as wireless sensors, ambient communication devices, mobile medical devices and even information supplied via other humans. When we coordinate smart objects properly, we can produce a form of collective intelligence from their collaboration. Traditional ensemble methods and agent-based systems have complementary advantages and disadvantages in this context. Traditional ensemble methods show better classification performance, while agent-based systems might not guarantee their classification performance. Traditional ensemble methods work as closed and centralised systems (so they cannot handle classifiers in an open context), while agent-based systems are natural vehicles for classifiers in an open context. We designed an open and social ensemble learning architecture, named J-model, to combine the benefits of the two research domains. The J-model architecture is based on a service choreography approach for coordinating classifiers. Coordination protocols are defined by interaction models that describe how classifiers will interact with one another in a peer-to-peer manner. The peer ranking algorithm recommends the more appropriate classifiers to participate in an interaction model in order to boost the success rate of their interactions. The coordinated participant classifiers recommended by the peer ranking algorithm become an ensemble classifier within J-model. We evaluated J-model's classification performance on 13 UCI machine learning benchmark data sets and on a virtual screening problem as a realistic classification problem. J-model achieved better accuracy than 8 other representative traditional ensemble methods on 9 of the 13 benchmark data sets, and better specificity on 7 benchmark sets. In the virtual screening problem, J-model gave better results than previously published results for 12 out of 16 bioassays. We defined a different interaction model for each specific classification task, and the peer ranking algorithm was used across all the interaction models. Our research contributions to knowledge are as follows. First, we showed that service choreography can be an effective ensemble coordination method for classifiers in an open context. Second, we used interaction models that implement task-specific coordination of classifiers to solve a variety of representative classification problems. Third, we designed the peer ranking algorithm, which is generally and independently applicable to the task of recommending appropriate member classifiers from a classifier pool based on an open pool of interaction models and classifiers. 006.3
4	Detection of IoT Botnets using Decision Trees Meghana Raghavendra (10723905) 29 April 2021 International Data Corporation (IDC) [3] estimates that 152,200 Internet of Things (IoT) devices will be connected to the Internet every minute by the year 2025. This rapid expansion in the use of IoT devices in everyday life increases the attack surface available to cybercriminals. IoT devices are frequently compromised and used to create botnets. However, traditional methods are difficult to apply against IoT botnets, which calls for effective and efficient methods to mitigate such threats. In this work, network snapshots of IoT traffic infected with two botnets, Mirai and Bashlite, are studied. Specifically, the collected datasets include network traffic from 9 different IoT devices such as a baby monitor, doorbells, a thermostat, web cameras, and security cameras. Each dataset consists of 115 stream aggregation feature statistics such as weight, mean, covariance, correlation coefficient, standard deviation, radius, and magnitude with a timeframe decay factor, along with a class label defining the traffic as benign or anomalous. The goal of the research is to identify a suitable machine learning method that can detect IoT botnet traffic accurately and in real time on IoT edge devices with low computation power, in order to form the first line of defense in an IoT network. The initial step is to identify the most important features that distinguish benign from anomalous traffic for IoT devices. Specifically, the Input Perturbation Ranking algorithm [12] with XGBoost [26] is applied to find the 9 most important features among the 115 features. These 9 features can be collected in real time and used as inputs to any detection method. Next, a supervised predictive machine learning method, Decision Trees, is proposed for fast and accurate detection of botnet traffic. The advantage of decision trees over other machine learning methodologies is that they achieve accurate results with low computation time and power. Unlike deep learning methodologies, decision trees can provide a visual representation of the decision-making and detection process, which can be easily translated into explicit security policies in the IoT environment. In the experiments conducted, decision trees detect anomalous traffic with an accuracy of 99.997% and take 59 seconds for training and 0.068 seconds for prediction, which is much faster than the state-of-the-art deep-learning-based detector, Kitsune [4]. Moreover, our results show that decision trees have an extremely low false positive rate of 0.019%. Using the 9 most important features, decision trees can further reduce the processing time while maintaining the accuracy. Hence, decision trees with important features are able to accurately and efficiently detect IoT botnets in real time on a low-performance edge device such as a Raspberry Pi [9]. Computer System Security; Internet of Things; Botnets; Decision Trees; Input Perturbation Ranking Algorithm
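The feature-selection step in the entry above can be sketched in a model-agnostic way. This is a minimal sketch under stated assumptions, not the thesis implementation: the thesis pairs the ranking with XGBoost, whereas here `predict` is any hypothetical callable that maps a feature row to a label, perturbation is done by shuffling one column at a time, and importance is measured as the increase in error rate.

```python
import random
from typing import Callable, List, Sequence


def perturbation_ranking(predict: Callable[[Sequence[float]], int],
                         rows: List[Sequence[float]],
                         labels: List[int],
                         seed: int = 0) -> List[int]:
    """Rank feature indices by how much shuffling each one increases the error rate."""
    rng = random.Random(seed)

    def error_rate(data: List[Sequence[float]]) -> float:
        wrong = sum(1 for row, y in zip(data, labels) if predict(row) != y)
        return wrong / len(labels)

    baseline = error_rate(rows)
    scores = []
    for j in range(len(rows[0])):
        # Shuffle only column j, leaving every other feature untouched.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        perturbed = [list(row[:j]) + [v] + list(row[j + 1:])
                     for row, v in zip(rows, column)]
        scores.append((error_rate(perturbed) - baseline, j))
    # Largest error increase first; the stable sort keeps index order on ties.
    scores.sort(key=lambda s: -s[0])
    return [j for _, j in scores]
```

Keeping the first 9 indices of the returned list would correspond to the reduced feature set described above, which could then be passed to a decision tree or any other detector.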
5	Exploitation informatique des annotations sémantiques automatiques d'Excom pour la recherche d'informations et la navigation / Information Retrieval and Text Navigation through the Exploitation of the Automatic Semantic Annotation of the Excom Engine Atanassova, Iana 14 January 2012 Using the Excom engine for semantic annotation, we have constructed an information retrieval system based on semantic categories from automatic language analyses in order to propose a new approach to text search. The annotations are obtained by the Contextual Exploration method, a knowledge-based linguistic approach that uses markers and disambiguation rules. Queries are formulated according to search viewpoints, which are at the heart of the information retrieval strategy. Our approach uses annotation categories that are organised in linguistic ontologies structured as graphs. In order to provide relevant results to the user, we have designed algorithms for ranking the results and for managing redundancy (paraphrase identification). These algorithms rely mainly on the structure of the linguistic ontologies used for the annotation. We have carried out an evaluation of the relevance of the system results, taking into account the specificity of our approach. We have developed user interfaces that allow the construction of new information products, such as summary sheets that present information extracted and structured according to semantic criteria. This approach also aims to offer tools for strategic monitoring and economic intelligence. Information Retrieval; Semantic Annotation; Contextual Exploration; Ranking of answers; Information Extraction; Ranking algorithm
6	Weightless neural networks for face recognition Khaki, Kazimali M. January 2013 Interfacing with the real world has proved extremely challenging throughout the past 70 years of computer technology development. The problem was initially assumed to be somewhat trivial, as humans are exceptionally skilled at interpreting real-world data such as pictures and sounds. Traditional analytical methods have so far not provided a complete answer to what will be termed pattern recognition. Biological inspiration has motivated pattern recognition researchers since the early days of the subject, and a neural network with self-evolving properties has always been seen as a potential solution. Unlike computer technology, in which successive generations of improved devices have been developed, the neural network approach has been less successful, with major setbacks occurring in its development. However, the fact that natural processing in animals and humans is a voltage-based process, devoid of software, and self-evolving provides an ongoing motivation for pattern recognition in artificial neural networks. This thesis addresses the application of weightless neural networks using a ranking pre-processor to implement general pattern recognition, with specific reference to face processing. The system is evaluated on open source databases in order to obtain a direct comparison of the efficacy of the method; in particular, considerable use is made of the MIT-CBCL face database. The methodology is cost effective in both software and hardware forms, offers real-time video processing, and can be implemented on all computer platforms. The results of this research show significant improvements over published results, and provide a viable commercial methodology for general pattern recognition. 006.42
