1 |
Probabilistic Diagnostic Model for Handling Classifier Degradation in Machine Learning
Gustavo A. Valencia-Zapata (8082655) 04 December 2019
Several studies point out different causes of performance degradation in supervised machine learning. Problems such as class imbalance, overlapping, small disjuncts, noisy labels, and sparseness limit the accuracy of classification algorithms. Although a number of approaches, in the form of either a methodology or an algorithm, try to minimize performance degradation, they have been isolated efforts with limited scope. This research consists of three main parts. First, a novel probabilistic diagnostic model based on identifying the signs and symptoms of each problem is presented. Second, the behavior and performance of several supervised algorithms are studied when training sets have such problems, so that the success of treatments can be estimated across classifiers. Finally, a probabilistic sampling technique based on training set diagnosis is proposed for avoiding classifier degradation.
|
2 |
VePMAD: A Vehicular Platoon Management Anomaly Detection System : A Case Study of Car-following Mode, Middle Join and Exit Maneuvers
Bayaa, Weaam January 2021
Vehicle communication using sensors and wireless channels plays an important role in enabling information exchange. Adding more components to exchange more information with infrastructure has enhanced the capabilities of vehicles and enabled the rise of Cooperative Intelligent Transport Systems (C-ITS). Leveraging such capabilities, further applications such as Cooperative Adaptive Cruise Control (CACC) and platooning were introduced. CACC is an enhancement of Adaptive Cruise Control (ACC). It enables longitudinal automated vehicle control and follows the Constant Time Gap (CTG) strategy, where the distance between vehicles is proportional to the speed. Platooning differs in that it addresses both longitudinal and lateral control. In addition, it adopts the Constant Distance Gap (CDG) control strategy, in which the separation between vehicles does not change with speed. Platooning requires close coupling and accordingly achieves the goals of increased lane throughput and reduced energy consumption. When only a longitudinal controller is used, platooning operates in car-following mode and no Platoon Management Protocol (PMP) is used. On the other hand, when both longitudinal and lateral controllers are used, platooning operates in maneuver mode and coordination between vehicles is needed to perform maneuvers. Exchanging information allows the platoon to make real-time maneuvering decisions. However, none of the aforementioned benefits of platooning can be achieved if the system is vulnerable to misbehavior (i.e., the platoon behaving incorrectly). Most of the work in the literature attributes this misbehavior to malicious actors, where an attacker injects malicious messages. Standardization efforts have sought to develop security services to authenticate and authorize the sender. However, authenticated users equipped with cryptographic primitives can still mount attacks (i.e., falsification attacks), which cannot be detected by standard services such as cryptographic signatures.
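The difference between the two spacing strategies described above can be sketched in a few lines of Python. This is only an illustration of the definitions given in the abstract, not the controller used in the thesis; the function names, the 1.0 s time gap, and the 5.0 m fixed gap are hypothetical values chosen for the example.

```python
def ctg_gap(speed_mps: float, time_gap_s: float) -> float:
    """Desired gap under Constant Time Gap: proportional to speed."""
    return time_gap_s * speed_mps


def cdg_gap(speed_mps: float, fixed_gap_m: float) -> float:
    """Desired gap under Constant Distance Gap: independent of speed."""
    return fixed_gap_m


if __name__ == "__main__":
    # Hypothetical parameters: 1.0 s time gap (CTG), 5.0 m fixed gap (CDG).
    for speed in (10.0, 20.0, 30.0):
        print(speed, ctg_gap(speed, 1.0), cdg_gap(speed, 5.0))
```

Under CTG the gap grows with speed, whereas under CDG it stays fixed, which is why CDG platoons can keep vehicles close together at highway speeds.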
Misbehavior can disturb platoon behavior or even cause collisions. Many Misbehavior Detection Schemes (MDSs) have been proposed in the literature in the context of vehicular ad hoc networks (VANET) and CACC. These MDSs apply algorithms or rules to detect sudden or gradual changes in the kinematic information disseminated by other vehicles. Reusing these MDSs directly during maneuvers can lead to false positives, because they treat changes in kinematic information during the maneuver as an attack. This thesis addresses this gap by designing a new modular framework that can distinguish the maneuvering process from misbehavior by leveraging platoon behavior recognition, that is, recognition of the platoon's mode of operation (e.g., car-following mode or maneuver mode). In addition, it can recognize the maneuver under way (e.g., middle join or exit). Based on the platoon behavior recognition module, the anomaly detection module detects deviations from expected behavior. Unsupervised machine learning, notably a Hidden Markov Model with Gaussian Mixture Model emissions (GMMHMM), is used to learn the nominal behavior of the platoon during different modes and maneuvers. This model is later used by the platoon behavior recognition and anomaly detection modules. The GMMHMM is trained on the nominal behavior of the platoon using multivariate time series representing the kinematic characteristics of the vehicles. Different models are used to detect attacks in different scenarios (e.g., different speeds). Two approaches to anomaly detection are investigated: Viterbi algorithm based anomaly detection and Forward algorithm based anomaly detection. The proposed framework managed to detect misbehavior whether the compromised vehicle was a platoon leader or a follower. Empirical results show very high performance, with the platoon behavior recognition module reaching 100% accuracy.
In addition, it can predict ongoing platoon behavior at early stages and accordingly use the correct model representing the nominal behavior. Forward algorithm based anomaly detection, which relies on computing likelihood, showed better performance, reaching 98% with slight variations in terms of accuracy, precision, recall and F1 score. Different platooning controllers can be resilient to some attacks, so an attack may result in only a slight deviation from nominal behavior. However, the anomaly detection module was able to detect this deviation. / Communication between vehicles using sensors and radio communication plays an important role in enabling information exchange. Adding more components for infrastructure communication improves the vehicles' overall communication capability and enables C-ITS. It also makes it possible to introduce further applications, such as CACC and platooning. CACC is an enhancement of the ACC concept. This technique enables longitudinal automated vehicle control and follows a CTG strategy in which the distance between vehicles is proportional to the speed. Platooning differs in that it handles both longitudinal and lateral control. In addition, it adopts a CDG control strategy in which the distance between vehicles remains unchanged with speed. Platooning requires close coupling between vehicles to achieve the goals of increased lane throughput and reduced energy consumption. When only longitudinal control is active, platooning operates in car-following mode and the PMP is not used. When both longitudinal and lateral controllers are used, the platoon instead operates in maneuver mode, and coordination between vehicles is needed to perform the various maneuvers. The information exchange allows the platoon to maneuver in real time. However, none of the above benefits of platooning can be achieved if the system is vulnerable to misbehavior, that is, if the platoon behaves incorrectly.
The literature attributes this misbehavior to malicious actors, where an attacker injects malicious messages. Standardization efforts have sought to develop security services to authenticate and authorize the sender. Nevertheless, authenticated users equipped with cryptographic primitives can mount falsification attacks that are not detected by standard services such as cryptographic signatures. Misbehavior can disturb the platoon's behavior or even cause collisions, and consequently affect reliability. Many MDSs are described in the literature in relation to VANET and CACC. MDSs use algorithms or rules to detect fast or slow changes in the kinematic information disseminated by other vehicles. Using MDSs directly during maneuvers can lead to false positives, because they will treat changes in kinematic information during the maneuver as an attack. This thesis addresses this gap by designing a modular framework that can distinguish the maneuvering process from misbehavior by using the platoon behavior recognition module to intelligently recognize the platoon mode (e.g., car-following mode or maneuver mode). The framework can also recognize ongoing maneuvers (middle join or exit) and deviations from expected behavior. The module uses an unsupervised machine learning model, GMMHMM, to learn a platoon's nominal behavior during different modes and maneuvers, which is then used for platoon behavior recognition and anomaly detection. The GMMHMM is trained on data from the platoon's nominal behavior in the form of multivariate time series representing the vehicles' kinematic characteristics. Different models are used to detect attacks in different scenarios (e.g., different speeds). Two approaches to anomaly detection are investigated: the Viterbi algorithm and the Forward algorithm.
The proposed system succeeds in detecting the misbehavior whether the compromised vehicle is a platoon leader or a follower. Empirical results show very high performance for the behavior recognition module, which reaches 100%. In addition, it can recognize the platoon's behavior at an early stage. Results with the Forward algorithm for anomaly detection show a performance of 98%, with slight variations across the accuracy, precision, recall and F1 score measures. The anomaly detection module can also detect small deviations in behavior.
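As a rough illustration of likelihood-based detection with the Forward algorithm, the following Python sketch scores a window of observations under a small HMM with Gaussian-mixture emissions and flags the window when its average log-likelihood is low. It is a minimal sketch under stated assumptions, not the thesis implementation: the observations are one-dimensional (the thesis uses multivariate series), the model parameters are hand-picked rather than trained, and the threshold rule and all names (`forward_log_likelihood`, `is_anomalous`, the toy gap values) are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_gmm(x, components):
    """Log-density of a 1-D mixture; components = [(weight, mean, var), ...]."""
    return log_sum_exp(
        [math.log(w) + log_gaussian(x, mu, var) for w, mu, var in components]
    )


def forward_log_likelihood(obs, start, trans, emissions):
    """log P(obs | model) via the Forward recursion in log space.

    Assumes all start and transition probabilities are strictly positive.
    """
    n = len(start)
    alpha = [math.log(start[i]) + log_gmm(obs[0], emissions[i]) for i in range(n)]
    for x in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + math.log(trans[i][j]) for i in range(n)])
            + log_gmm(x, emissions[j])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def is_anomalous(obs, model, threshold):
    """Flag a window whose average per-sample log-likelihood is below threshold."""
    return forward_log_likelihood(obs, *model) / len(obs) < threshold


# Hypothetical two-state model over inter-vehicle gap (metres).
toy_model = (
    [0.5, 0.5],                                   # start probabilities
    [[0.9, 0.1], [0.1, 0.9]],                     # transition matrix
    [[(1.0, 5.0, 0.04)],                          # state 0: tight gap
     [(0.5, 7.0, 0.04), (0.5, 8.0, 0.04)]],       # state 1: opened gap
)
nominal_window = [5.0, 5.1, 4.9, 5.0]
falsified_window = [5.0, 5.1, 2.0, 1.5]

if __name__ == "__main__":
    print(is_anomalous(nominal_window, toy_model, threshold=-5.0))
    print(is_anomalous(falsified_window, toy_model, threshold=-5.0))
```

In practice the threshold would be calibrated on held-out nominal data, and a separate model would be selected per recognized mode or maneuver, as the abstract describes.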
|
3 |
CURE RATE AND DESTRUCTIVE CURE RATE MODELS UNDER PROPORTIONAL ODDS LIFETIME DISTRIBUTIONS
FENG, TIAN January 2019
Cure rate models, introduced by Boag (1949), are widely used for modelling lifetime data involving long-term survivors. Applications of cure rate models can be seen in biomedical science, industrial reliability, finance, manufacturing, demography and criminology. In this thesis, cure rate models are discussed under a competing cause scenario, with the assumption of proportional odds (PO) lifetime distributions for the susceptibles, and statistical inferential methods are then developed based on right-censored data.
In Chapter 2, a flexible cure rate model is discussed by assuming that the number of competing causes for the event of interest follows the Conway-Maxwell (COM) Poisson distribution and that the lifetimes of non-cured or susceptible individuals can be described by a PO model. This provides a natural extension of the work of Gu et al. (2011), who had considered a geometric number of competing causes. Under right censoring, maximum likelihood estimators (MLEs) are obtained by the use of the expectation-maximization (EM) algorithm. An extensive Monte Carlo simulation study is carried out for various scenarios, and model discrimination between some well-known cure models such as geometric, Poisson and Bernoulli is also examined. The goodness-of-fit and model diagnostics of the model are also discussed. A cutaneous melanoma dataset example is used to illustrate the models as well as the inferential methods.
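The competing-cause construction can be made concrete with a short Python sketch. It assumes the standard COM-Poisson form P(M = m) = eta^m / ((m!)^phi Z(eta, phi)), under which the population survival is E[S(t)^M] = Z(eta S(t), phi) / Z(eta, phi) and the cure rate is P(M = 0) = 1 / Z(eta, phi). The function names, the series truncation at 200 terms, and the parameter values are illustrative assumptions, and the per-cause survival value `s` is passed in directly rather than derived from a PO model.

```python
import math


def com_poisson_normalizer(eta, phi, terms=200):
    """Truncated series Z(eta, phi) = sum_j eta^j / (j!)^phi.

    The truncation assumes the series converges (eta < 1 is needed when phi = 0).
    """
    if eta == 0.0:
        return 1.0  # only the j = 0 term survives
    log_eta = math.log(eta)
    return sum(
        math.exp(j * log_eta - phi * math.lgamma(j + 1)) for j in range(terms)
    )


def population_survival(s, eta, phi):
    """Survival of the whole population, given per-cause survival s = S(t)."""
    return com_poisson_normalizer(eta * s, phi) / com_poisson_normalizer(eta, phi)


def cure_rate(eta, phi):
    """Probability of zero competing causes, i.e. the long-term survivor fraction."""
    return 1.0 / com_poisson_normalizer(eta, phi)


if __name__ == "__main__":
    # phi = 1 recovers the Poisson cure model, phi = 0 (eta < 1) the geometric one.
    print(population_survival(0.3, eta=2.0, phi=1.0), cure_rate(2.0, 1.0))
    print(population_survival(0.3, eta=0.5, phi=0.0), cure_rate(0.5, 0.0))
```

As `s` falls to zero for large t, `population_survival` tends to `cure_rate`, which is the plateau that distinguishes cure rate models from ordinary survival models.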
Next, in Chapter 3, the destructive cure rate models, introduced by Rodrigues et al. (2011), are discussed under the PO assumption. Here, the initial number of competing causes is modelled by a weighted Poisson distribution, with special focus on exponentially weighted Poisson, length-biased Poisson and negative binomial distributions. Then, a damage distribution is introduced for the number of initial causes that are not destroyed. An EM-type algorithm for computing the MLEs is developed. An extensive simulation study is carried out for various scenarios, and model discrimination between the three weighted Poisson distributions is also examined. All the models and methods of estimation are evaluated through a simulation study. A cutaneous melanoma dataset example is used to illustrate the models as well as the inferential methods.
In Chapter 4, frailty cure rate models are discussed under a gamma frailty, wherein the initial number of competing causes is described by a Conway-Maxwell (COM) Poisson distribution and the lifetimes of non-cured individuals can be described by a PO model. The detailed steps of the EM algorithm are then developed for this model, and an extensive simulation study is carried out to evaluate the performance of the proposed model and the estimation method. A cutaneous melanoma dataset as well as a simulated dataset are used for illustrative purposes.
Finally, Chapter 5 outlines the work carried out in the thesis and also suggests some problems of further research interest. / Thesis / Doctor of Philosophy (PhD)
|