51

Algoritmy pro detekci anomálií v datech z klinických studií a zdravotnických registrů / Algorithms for anomaly detection in data from clinical trials and health registries

Bondarenko, Maxim January 2018
This master's thesis deals with anomaly detection in data from clinical trials and medical registries. Its purpose is to survey the literature on data quality in clinical trials and to design an algorithm, based on machine learning methods, that detects anomalous records in real clinical data from current or completed clinical trials and medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: importing data from the information system; preprocessing and transforming the imported records, whose variables have differing data types, into numerical vectors; applying well-known statistical methods for outlier detection; and evaluating the quality and accuracy of the algorithm. The algorithm produces a vector of the parameters containing anomalies, which is intended to ease the work of data managers. It is designed to extend the feature set of the information system CLADE-IS with automatic data-quality monitoring through the detection of anomalous records.
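
As a hedged illustration of the pipeline this abstract describes (the actual CLADE-IS implementation is not shown here, and all names below are hypothetical), mixed-type records can be encoded as numerical vectors and screened with a simple robust z-score rule:

```python
# Hypothetical sketch (not the thesis's CLADE-IS code): encode mixed-type
# clinical records as numeric vectors, then flag outliers by robust z-score.
import numpy as np
import pandas as pd

def records_to_vectors(df: pd.DataFrame) -> np.ndarray:
    """One-hot encode categorical columns; numeric columns pass through."""
    encoded = pd.get_dummies(df, dummy_na=True)  # NaN gets its own indicator
    return encoded.to_numpy(dtype=float)

def robust_z_outliers(X: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Flag rows whose robust z-score (median/MAD) exceeds the threshold
    in any column; returns a boolean mask over rows."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad[mad == 0] = 1.0                      # avoid division by zero
    z = 0.6745 * (X - med) / mad             # consistency factor for normality
    return (np.abs(z) > threshold).any(axis=1)

records = pd.DataFrame({
    "age": [34, 36, 35, 120],                # 120 is an implausible entry
    "sex": ["F", "M", "F", "M"],
})
mask = robust_z_outliers(records_to_vectors(records))
print(records[mask])                         # reports the age-120 record
```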
53

Mahalanobis kernel-based support vector data description for detection of large shifts in mean vector

Nguyen, Vu 01 January 2015
Statistical process control (SPC) applies statistics to process control in order to provide higher-quality products and better services. The K chart is one of the many important tools that SPC offers. It is built on Support Vector Data Description (SVDD), a popular data description method inspired by the Support Vector Machine (SVM). Like any SVM-based method, SVDD benefits from a wide variety of kernel choices, which determine the effectiveness of the whole model. Among the most popular choices is the Euclidean distance-based Gaussian kernel, which enables SVDD to obtain a flexible data description, thus enhancing its overall predictive capability. This thesis explores a more robust approach by incorporating a Mahalanobis distance-based kernel (hereinafter the Mahalanobis kernel) into SVDD and compares it with SVDD using the traditional Gaussian kernel. Each method's sensitivity is benchmarked by average run lengths obtained from multiple Monte Carlo simulations, whose data are generated from multivariate normal, multivariate Student's t, and multivariate gamma populations using R, a popular software environment for statistical computing. A case study is also discussed using a real data set from the Halberg Chronobiology Center. Compared to the Gaussian kernel, the Mahalanobis kernel makes SVDD, and thus the K chart, significantly more sensitive to shifts in the mean vector and in the covariance matrix.
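
A minimal sketch of the kernel substitution described above (hypothetical code, not the thesis's): the squared Euclidean distance inside the Gaussian kernel is replaced by a squared Mahalanobis distance computed from the training sample's covariance, and the resulting Gram values plug into the SVDD dual in the usual way.

```python
# Hypothetical sketch: Gaussian vs. Mahalanobis kernel for SVDD.
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Standard RBF kernel based on squared Euclidean distance."""
    d2 = np.sum((x - y) ** 2)
    return np.exp(-d2 / (2 * sigma ** 2))

def mahalanobis_kernel(x, y, cov_inv, sigma=1.0):
    """RBF-style kernel using the squared Mahalanobis distance, with the
    inverse covariance estimated from the training (in-control) data."""
    diff = x - y
    d2 = diff @ cov_inv @ diff
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[4.0, 1.5], [1.5, 1.0]], size=500)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

# A Gram-matrix entry for two in-control points; kernels like this plug
# directly into the SVDD dual problem in place of the Euclidean Gaussian.
print(gaussian_kernel(X[0], X[1]), mahalanobis_kernel(X[0], X[1], cov_inv))
```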
54

變數轉換之離群值偵測 / Detection of Outliers with Data Transformation

吳秉勳, David Wu Unknown Date
Detecting regression outliers is not trivial when there are many of them; classical diagnostic plots can fail to detect them, a phenomenon known as the masking effect. To avoid it, we find the multiple outliers using a highly robust regression estimator, the least median of squares (LMS) estimator, which has the maximal breakdown point of 50%. The LMS estimator is computed with the forward search algorithm, and the estimator found by the forward search is shown to lead to rapid detection of multiple outliers. Furthermore, the results reveal that 100 repeats of a simple forward search from a random starting subset provide sufficiently robust parameter estimates to reveal multiple outliers. The detected outliers are exhibited in a stalactite plot, which shows a stable pattern. For multivariate data, the Mahalanobis distance also suffers from the masking effect; this can be remedied with another highly robust estimator, the minimum volume ellipsoid (MVE) estimator, which likewise has the maximal breakdown point and can also be found by the forward search algorithm. The detected outliers are again displayed in the stalactite plot. The second part of this dissertation transforms regression data so that approximate normality and homogeneity of the residuals can be achieved for subsequent analysis. During the forward search we monitor the score statistic and other diagnostic plots; together they provide a wealth of information about the choice of transformation parameter, along with the effect of individual observations on this statistic.
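
A hedged sketch of the LMS idea (an approximation via random elemental subsets, in the spirit of the abstract's 100 random starting subsets; this is not the author's forward search code, and the flagging threshold below is an illustrative choice):

```python
# Hypothetical sketch of least median of squares (LMS): fit on many random
# p-point subsets, keep the fit minimizing the median squared residual,
# then flag points with large residuals under that robust fit.
import numpy as np

def lms_fit(X, y, n_subsets=100, rng=None):
    """Approximate LMS regression by repeated random elemental subsets."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, p = X.shape
    best_beta, best_med = None, np.inf
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p, replace=False)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        med = np.median((y - X @ beta) ** 2)
        if med < best_med:
            best_beta, best_med = beta, med
    return best_beta, best_med

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = 2 + 3 * X[:, 1] + rng.normal(0, 0.5, n)
y[:20] += 15                       # a cluster of outliers that would mask OLS
beta, med = lms_fit(X, y)
outliers = np.abs(y - X @ beta) > 5 * np.sqrt(med)
print(beta, outliers.sum())        # slope near 3; roughly 20 flagged points
```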
55

Robot navigation in sensor space

Keeratipranon, Narongdech January 2009
This thesis investigates the problem of robot navigation using only landmark bearings. The proposed system allows a robot to move to a ground target location specified by the sensor values observed at this ground target position. The control actions are computed based on the difference between the current landmark bearings and the target landmark bearings. No Cartesian coordinates with respect to the ground are computed by the control system. The robot navigates using solely information from the bearing sensor space. Most existing robot navigation systems require a ground frame (2D Cartesian coordinate system) in order to navigate from a ground point A to a ground point B. The commonly used sensors such as laser range scanners, sonar, infrared, and vision do not directly provide the 2D ground coordinates of the robot. The existing systems use the sensor measurements to localise the robot with respect to a map, a set of 2D coordinates of the objects of interest. It is more natural to navigate between the points in the sensor space corresponding to A and B without requiring the Cartesian map and the localisation process. Research on animals has revealed how insects are able to exploit very limited computational and memory resources to successfully navigate to a desired destination without computing Cartesian positions. For example, a honeybee balances the left and right optical flows to navigate in a narrow corridor. Unlike many other ants, Cataglyphis bicolor does not secrete pheromone trails in order to find its way home but instead uses the sun as a compass to keep track of its home direction vector. The home vector can be inaccurate, so the ant also uses landmark recognition. More precisely, it takes snapshots and compass headings of some landmarks. To return home, the ant tries to line up the landmarks exactly as they were before it started wandering. This thesis introduces a navigation method based on reflex actions in sensor space. The sensor vector is made of the bearings of some landmarks, and the reflex action is a gradient descent with respect to the distance in sensor space between the current sensor vector and the target sensor vector. Our theoretical analysis shows that except for some fully characterized pathological cases, any point is reachable from any other point by reflex action in the bearing sensor space provided the environment contains three landmarks and is free of obstacles. The trajectories of a robot using reflex navigation, like other image-based visual control strategies, do not necessarily correspond to the shortest paths on the ground, because the sensor error is minimized, not the moving distance on the ground. However, we show that the use of a sequence of waypoints in sensor space can address this problem. In order to identify relevant waypoints, we train a Self Organising Map (SOM) from a set of observations uniformly distributed with respect to the ground. This SOM provides a sense of location to the robot, and allows a form of path planning in sensor space. The proposed navigation system is analysed theoretically, and evaluated both in simulation and with experiments on a real robot.
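
A hedged sketch of the reflex action described above, assuming three known landmarks and a finite-difference gradient (the thesis's actual controller is not shown; all constants below are illustrative):

```python
# Hypothetical sketch of reflex navigation in bearing space: the robot
# descends the squared distance between current and target landmark
# bearings, with the gradient estimated by finite differences.
import numpy as np

LANDMARKS = np.array([[0.0, 10.0], [10.0, 0.0], [10.0, 10.0]])  # 3 landmarks

def bearings(pos):
    """Bearing angle from pos to each landmark; the robot's sensor vector."""
    d = LANDMARKS - pos
    return np.arctan2(d[:, 1], d[:, 0])

def sensor_error(pos, target_bearings):
    diff = bearings(pos) - target_bearings
    diff = np.arctan2(np.sin(diff), np.cos(diff))   # wrap to [-pi, pi]
    return np.sum(diff ** 2)

def reflex_step(pos, target_bearings, step=0.05, eps=1e-4):
    """One gradient-descent step in sensor space, via finite differences."""
    grad = np.array([
        (sensor_error(pos + [eps, 0], target_bearings)
         - sensor_error(pos - [eps, 0], target_bearings)) / (2 * eps),
        (sensor_error(pos + [0, eps], target_bearings)
         - sensor_error(pos - [0, eps], target_bearings)) / (2 * eps),
    ])
    return pos - step * grad / (np.linalg.norm(grad) + 1e-12)

pos, target = np.array([1.0, 1.0]), bearings(np.array([6.0, 7.0]))
for _ in range(2000):
    pos = reflex_step(pos, target)
print(pos)   # ends near the target (6, 7) in this obstacle-free setting
```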
56

Studying the effectiveness of dynamic analysis for fingerprinting Android malware behavior / En studie av effektivitet hos dynamisk analys för kartläggning av beteenden hos Android malware

Regard, Viktor January 2019
Android is the second most targeted operating system for malware authors, and countering the development of Android malware requires more knowledge about its behavior. There are two main approaches to analyzing Android malware: static and dynamic analysis. In 2017, a well-labeled dataset named AMD (Android Malware Dataset), consisting of over 24,000 malware samples, was released together with an accompanying study. It is divided into 135 varieties based on similar malicious behavior, retrieved through static analysis of the classes.dex file in each malware's APK, while the labeled features were determined by manual inspection of three samples in each variety. However, static analysis is known to be weak against obfuscation techniques, such as repackaging or dynamic loading, which can be exploited to avoid the analysis. In this study the second approach is utilized, and all malware in the dataset are analyzed at run-time in order to monitor their dynamic behavior. Analyzing malware at run-time has known weaknesses as well, as it can be avoided through, for instance, anti-emulator techniques. Therefore, the study aimed to explore the available sandbox environments for dynamic analysis, study the effectiveness of fingerprinting Android malware using one of these tools, and investigate whether the static features from AMD and the dynamic analysis correlate. This was done by attempting to classify the samples based on similar dynamic features and by calculating the Pearson correlation coefficient (r) for all combinations of features from AMD and the dynamic analysis. The comparison of tools for dynamic analysis showed a need for further development: the most popular tools were released long ago, and their common weakness is a lack of continuous maintenance. Droidbox was chosen as the sandbox environment for this study because of its ease of installation and use and because it is easily adapted for large-scale analysis. Based on the dynamic features extracted with Droidbox, it could be shown that Android malware samples are most similar to the varieties to which they belong. The best of the four investigated metrics for assigning samples to varieties turned out to be cosine similarity, which achieved an accuracy of 83.6% for the entire dataset. This high accuracy indicates a correlation between the dynamic features and the static features on which the varieties are based. Furthermore, the Pearson correlation coefficient confirmed that the manually extracted features used to describe the varieties and the dynamic features are correlated to some extent, which was partially confirmed by manual inspection at the end of the study.
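
As a hedged sketch of the classification step (not the study's code; the feature names and centroid values below are illustrative), each sample can be assigned to the variety whose mean dynamic-feature vector it is most cosine-similar to:

```python
# Hypothetical sketch of variety assignment by cosine similarity: compare a
# sample's dynamic-feature vector against per-variety mean vectors.
import numpy as np

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def classify(sample, centroids):
    """Return the variety whose centroid has the highest cosine similarity."""
    return max(centroids, key=lambda v: cosine_similarity(sample, centroids[v]))

# Toy dynamic-feature counts: [sms_sent, files_written, net_connections]
centroids = {
    "FakeInstaller": np.array([9.0, 1.0, 2.0]),   # SMS-fraud heavy
    "DroidKungFu":   np.array([0.0, 6.0, 8.0]),   # drops files, phones home
}
sample = np.array([7.0, 0.0, 3.0])
print(classify(sample, centroids))                 # -> FakeInstaller
```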
57

A Wide-Area Perspective on Power System Operation and Dynamics

Gardner, Robert Matthew 23 April 2008
Classically, wide-area synchronized power system monitoring has been an expensive task requiring significant investment in utility communications infrastructures for the service of relatively few costly sensors. The purpose of this research is to demonstrate the viability of power system monitoring from very low voltage levels (120 V). Challenging the accepted norms in power system monitoring, the document will present the use of inexpensive GPS time synchronized sensors in mass numbers at the distribution level. In the past, such low level monitoring has been overlooked due to a perceived imbalance between the required investment and the usefulness of the resulting deluge of information. However, distribution level monitoring offers several advantages over bulk transmission system monitoring. First, practically everyone with access to electricity also has a measurement port into the electric power system. Second, internet access and GPS availability have become pedestrian commodities providing a communications and synchronization infrastructure for the transmission of low-voltage measurements. Third, these ubiquitous measurement points exist in an interconnected fashion irrespective of utility boundaries. This work offers insight into which parameters are meaningful to monitor at the distribution level and provides applications that add unprecedented value to the data extracted from this level. System models comprising the entire Eastern Interconnection are exploited in conjunction with a bounty of distribution level measurement data for the development of wide-area disturbance detection, classification, analysis, and location routines. The main contributions of this work are fivefold: the introduction of a novel power system disturbance detection algorithm; the development of a power system oscillation damping analysis methodology; the development of several parametric and non-parametric power system disturbance location methods; new methods of power system phenomena visualization; and the proposal and mapping of an online power system event reporting scheme. / Ph. D.
58

Evaluation of Target Tracking Using Multiple Sensors and Non-Causal Algorithms

Vestin, Albin, Strandberg, Gustav January 2019
Today, the main research field for the automotive industry is to find solutions for active safety. In order to perceive the surrounding environment, tracking nearby traffic objects plays an important role. Validation of the tracking performance is often done in staged traffic scenarios, where additional sensors, mounted on the vehicles, are used to obtain their true positions and velocities. The difficulty of evaluating the tracking performance complicates its development. An alternative approach, studied in this thesis, is to record sequences and use non-causal algorithms, such as smoothing, instead of filtering to estimate the true target states. With this method, validation data for online, causal target tracking algorithms can be obtained for all traffic scenarios without the need for extra sensors. We investigate how non-causal algorithms affect the target tracking performance using multiple sensors and dynamic models of different complexity. This is done to evaluate real-time methods against estimates obtained from non-causal filtering. Two different measurement units, a monocular camera and a LIDAR sensor, and two dynamic models are evaluated and compared using both causal and non-causal methods. The system is tested in two single-object scenarios where ground truth is available and in three multi-object scenarios without ground truth. Results from the two single-object scenarios show that tracking using only a monocular camera performs poorly, since it is unable to measure the distance to objects; here, a complementary LIDAR sensor improves the tracking performance significantly. The dynamic models are shown to have a small impact on the tracking performance, while the non-causal application gives a distinct improvement when tracking objects at large distances. Since the sequence can be reversed, the non-causal estimates are propagated from more certain states, obtained when the target is closer to the ego vehicle. For multiple object tracking, we find that correct associations between measurements and tracks are crucial for improving the tracking performance with non-causal algorithms.
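
A hedged sketch of the non-causal idea on a toy problem (a forward Kalman pass followed by a Rauch-Tung-Striebel backward pass on a 1-D constant-velocity model; all matrices are illustrative, not the thesis's):

```python
# Hypothetical sketch: causal Kalman filtering forward, then a non-causal
# RTS smoothing pass backward, on a 1-D constant-velocity model.
import numpy as np

dt = 0.1
F = np.array([[1, dt], [0, 1]])          # constant-velocity transition
H = np.array([[1.0, 0.0]])               # position-only measurement
Q = 0.01 * np.eye(2)                     # process noise
R = np.array([[0.5]])                    # measurement noise

def kalman_rts(zs):
    x, P = np.zeros(2), np.eye(2)
    xs_f, Ps_f, xs_p, Ps_p = [], [], [], []
    for z in zs:                          # causal forward pass
        xp, Pp = F @ x, F @ P @ F.T + Q
        K = Pp @ H.T @ np.linalg.inv(H @ Pp @ H.T + R)
        x = xp + (K @ (z - H @ xp)).ravel()
        P = (np.eye(2) - K @ H) @ Pp
        xs_f.append(x); Ps_f.append(P); xs_p.append(xp); Ps_p.append(Pp)
    xs = [xs_f[-1]]                       # non-causal backward pass
    for k in range(len(zs) - 2, -1, -1):
        C = Ps_f[k] @ F.T @ np.linalg.inv(Ps_p[k + 1])
        xs.insert(0, xs_f[k] + C @ (xs[0] - xs_p[k + 1]))
    return np.array(xs)

truth = np.cumsum(np.full(100, 1.0 * dt))            # constant 1 m/s target
zs = truth + np.random.default_rng(0).normal(0, 0.7, 100)
smoothed = kalman_rts(zs[:, None])
print(np.mean((smoothed[:, 0] - truth) ** 2))        # lower MSE than raw zs
```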
