About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Classification of Points Acquired by Airborne Laser Systems

Ruhe, Jakob, Nordin, Johan January 2007
For several years, research has been conducted at the Department of Laser Systems of the Swedish Defence Research Agency (FOI) to develop methods for producing high-resolution 3D environment models from data acquired with airborne laser systems. The 3D models serve several purposes in both military and civilian applications, for example mission planning, crisis-management analysis, and infrastructure planning. We have implemented a new format for storing laser point data: instead of storing rasterized images of the data, the new format stores the original location of each point. We have also implemented a new method for detecting outliers, methods for estimating the ground surface, and methods for dividing the remaining data into two classes: buildings and vegetation. We further show that analyzing the points directly, rather than relying only on rasterized images and image-processing algorithms, yields more accurate results, and that these methods can be implemented without increasing the computational complexity.
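The point-based pipeline described above (keep original point locations, estimate the ground surface, label the rest by height) can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: the grid-minimum ground estimate, the cell size, and the height threshold are all assumptions.

```python
import numpy as np

def classify_points(points, cell=1.0, ground_tol=0.3):
    """points: (N, 3) array of x, y, z in original coordinates.
    Returns one label per point: 0 = ground, 1 = above-ground
    (building/vegetation candidate)."""
    pts = np.asarray(points, dtype=float)
    # assign each point to a grid cell without rasterizing the z values
    ij = np.floor(pts[:, :2] / cell).astype(int)
    keys = {tuple(k) for k in ij}
    # crude ground estimate: lowest point in each cell
    ground_z = {k: pts[(ij == k).all(axis=1), 2].min() for k in keys}
    heights = np.array([p[2] - ground_z[tuple(c)] for p, c in zip(pts, ij)])
    return (heights > ground_tol).astype(int)
```

A point well above its cell's lowest return is labeled as non-ground; a real system would refine this with the outlier and surface-estimation steps the abstract mentions.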
32

Inference and Visualization of Periodic Sequences

Sun, Ying August 2011
This dissertation comprises four articles on the inference and visualization of periodic sequences.

In the first article, a nonparametric method is proposed for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV), which complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator.

The second article is the multivariate extension: a CV method for estimating the periods of multiple periodic sequences observed at evenly spaced time points. The basic idea is to borrow information from other correlated sequences to improve estimation of the period of interest. We show that the asymptotic behavior of the bivariate CV is the same as that of the CV for one sequence; for finite samples, however, the better the periods of the other correlated sequences are estimated, the more substantial the improvement.

The third article proposes an informative exploratory tool for visualizing functional data, the functional boxplot, along with its generalization, the enhanced functional boxplot. Based on the center-outwards ordering induced by band depth for functional data, the descriptive statistics of a functional boxplot are the envelope of the 50% central region, the median curve, and the maximum non-outlying envelope. In addition, outliers can be detected by an empirical rule analogous to the classical boxplot's: inflating the 50% central region by a factor of 1.5.

The last article proposes a simulation-based method to adjust functional boxplots for correlation when visualizing functional and spatio-temporal data and detecting outliers. We first investigate the relationship between spatio-temporal dependence and the 1.5-times-the-50%-central-region outlier rule. We then simulate observations without outliers based on a robust estimator of the covariance function of the data, and select the constant factor in the functional boxplot so as to control the probability of correctly detecting no outliers. Finally, we apply the selected factor to the functional boxplot of the original data.
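The "leave-out-one-cycle" CV idea can be illustrated with a small sketch. This is an assumption-laden reconstruction, not the dissertation's method in detail: each observation is predicted by the mean of the other cycles at the same phase, and the candidate period with the smallest prediction error wins (ties resolve to the smallest candidate, loosely mirroring the implicit penalty on multiples).

```python
import numpy as np

def cv_period(y, max_period):
    """Estimate the period of an evenly spaced sequence by cross-validation:
    for each candidate period p, predict each observation by the mean of the
    other cycles at the same phase, and return the p minimizing the error."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    errors = []
    for p in range(2, max_period + 1):
        phase = np.arange(n) % p
        err = 0.0
        for k in range(p):
            vals = y[phase == k]
            if len(vals) < 2:          # not enough cycles to cross-validate
                err = np.inf
                break
            # leave-one-out mean of the other cycles at this phase
            pred = (vals.sum() - vals) / (len(vals) - 1)
            err += ((vals - pred) ** 2).sum()
        errors.append(err / n)
    return int(np.argmin(errors)) + 2
```

For a noiseless sequence with true period 4, candidates 4 and 8 both give zero error, and the tie-break selects 4; with noise, multiples have fewer values per phase and genuinely higher CV error.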
33

Wavelet-based Outlier Detection And Denoising Of Airborne Laser Scanning Data

Akyay, Tolga 01 December 2008
The method of airborne laser scanning, also known as LIDAR, has recently proven to be an efficient way of generating high-quality digital surface and elevation models. In this work, wavelet-based outlier detection and different wavelet-thresholding (wavelet-shrinkage) methods for denoising airborne laser scanning data are discussed. The task is to investigate the effect of wavelet-based outlier detection and to determine which wavelet-thresholding methods provide the best denoising results for post-processing. Data and results are analyzed and visualized using a MATLAB program developed during this work.
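Wavelet shrinkage of the kind compared in this thesis can be sketched with a single-level Haar transform and soft thresholding. The thesis used MATLAB; this numpy version is an illustrative stand-in, and the threshold value is an assumption.

```python
import numpy as np

def haar_denoise(signal, threshold):
    """One-level orthonormal Haar transform, soft-threshold the detail
    coefficients, then invert. Signal length must be even."""
    x = np.asarray(signal, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail coefficients
    # soft shrinkage: move each detail coefficient toward zero by `threshold`
    d = np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)
    out = np.empty_like(x)
    out[0::2] = (a + d) / np.sqrt(2)       # inverse transform
    out[1::2] = (a - d) / np.sqrt(2)
    return out
```

Smooth regions (zero detail coefficients) pass through unchanged, while sharp spikes, such as outlier returns, are attenuated; a multi-level transform would repeat this on the approximation coefficients.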
34

A Comparison Of Some Robust Regression Techniques

Avci, Ezgi 01 September 2009
Robust regression is a commonly required approach in industrial studies in areas such as data mining, quality control and improvement, and finance. The robust regression methods Least Median of Squares, Least Trimmed Squares, M-regression, the MM-method, Least Absolute Deviations, Locally Weighted Scatterplot Smoothing, and Multivariate Adaptive Regression Splines are compared with each other, and with Ordinary Least Squares, under contaminated normal distributions with respect to multiple-outlier detection performance measures. In this comparison, a simulation study is performed by varying parameters such as outlier density, outlier locations along the x-axis, sample size, and the number of independent variables. Multiple-outlier detection is evaluated with respect to the performance measures detection capability, false alarm rate, improved mean square error, and ratio of improved mean square error. Based on this simulation study, the three most competitive methods are then compared on an industrial data set with respect to the coefficient of multiple determination and mean square error.
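Of the methods compared, Least Absolute Deviations is easy to sketch via iteratively reweighted least squares, which downweights large residuals relative to OLS. This is an illustrative implementation, not the study's code; the iteration count and the small `eps` guard are assumptions.

```python
import numpy as np

def lad_fit(x, y, iters=50, eps=1e-8):
    """Least Absolute Deviations regression by iteratively reweighted least
    squares: each observation gets weight 1/|residual|, so outliers are
    progressively downweighted. Returns [intercept, slope, ...]."""
    X = np.column_stack([np.ones(len(x)), np.asarray(x, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting point
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(y - X @ beta), eps)
        Xw = X * w[:, None]                          # weighted design matrix
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)   # weighted normal equations
    return beta
```

On data with a single gross outlier, the LAD fit stays close to the line through the clean points, whereas OLS would be pulled noticeably toward the outlier.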
35

Improving Data Quality: Development and Evaluation of Error Detection Methods

Lee, Nien-Chiu 25 July 2002
High-quality data are essential to decision support in organizations. However, estimates have shown that 15-20% of the data within an organization's databases can be erroneous. Some databases contain large numbers of errors, posing a serious potential problem if they are used for managerial decision-making. To improve data quality, data-cleaning efforts are needed and have been initiated by many organizations. Broadly, data quality problems fall into three categories: incompleteness, inconsistency, and incorrectness. Among these, incorrectness is the major source of low-quality data, so this research focuses on error detection for improving data quality. In this study, we developed a set of error detection methods based on the semantic constraint framework: uniqueness detection, domain detection, attribute value dependency detection, attribute domain inclusion detection, and entity participation detection. Empirical evaluation showed that some of the proposed techniques (e.g., uniqueness detection) achieved low miss rates and low false alarm rates. Overall, our error detection methods together identified around 50% of the errors introduced by subjects during the experiments.
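Two of the proposed detectors, uniqueness detection and domain detection, can be sketched directly. The record format and the shape of the constraint specification below are assumptions for illustration, not the paper's interface.

```python
def detect_errors(records, unique_keys, domains):
    """Flag records that violate uniqueness constraints (a key value seen
    more than once) or domain constraints (a value outside the allowed
    range/set). Returns a list of (record_index, field, error_kind)."""
    errors = []
    seen = {k: set() for k in unique_keys}
    for i, rec in enumerate(records):
        for k in unique_keys:                     # uniqueness detection
            v = rec.get(k)
            if v in seen[k]:
                errors.append((i, k, "duplicate"))
            else:
                seen[k].add(v)
        for k, allowed in domains.items():        # domain detection
            if k in rec and not allowed(rec[k]):
                errors.append((i, k, "domain"))
    return errors
```

For example, a repeated `id` or an `age` outside 0-120 would each produce one flagged violation; the dependency and participation detectors the abstract lists would follow the same constraint-checking pattern.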
36

Toward accurate and efficient outlier detection in high dimensional and large data sets

Nguyen, Minh Quoc 22 April 2010
An efficient method to compute local density-based outliers in high-dimensional data is proposed. We show that this type of outlier is present in any subset of the data set. This property is used to partition the data set into random subsets, compute the outliers locally within each subset, and then combine the outliers from the different subsets, so that local density-based outliers can be computed efficiently.

Another challenge in high-dimensional outlier detection is that outliers are often suppressed when the majority of dimensions do not exhibit outlying behavior. A contribution of this work is a filtering method in which outlier scores are computed in sub-dimensions: low sub-dimensional scores are filtered out, and the high scores are aggregated into the final score. This aggregation with filtering eliminates the effect of accumulating small deviations across many dimensions, so the outliers are identified correctly.

In some cases, sets of outliers that form micro-patterns are more interesting than individual outliers. These micro-patterns are anomalous with respect to the dominant patterns in the data set, and detecting them raises two challenges. The first is that anomalous patterns are often masked by the dominant patterns under existing clustering techniques, a common approach being to cluster the data set with the k-nearest-neighbor algorithm; this work introduces the adaptive nearest neighbor and the concept of the dual-neighbor to detect micro-patterns more accurately. The second challenge is to compute the anomalous patterns very fast. Our contribution is to compute the patterns based on the correlation between attributes: the correlation implies that the data can be partitioned into groups on each attribute to learn candidate patterns within the groups. A feature-based method is thus developed that computes these patterns efficiently.
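The random-subset strategy for local outliers can be sketched as follows. The k-NN-distance score used here is a generic stand-in for the thesis's local density score, and the subset count and k are assumptions.

```python
import numpy as np

def subset_outlier_scores(X, n_subsets=4, k=3, seed=0):
    """Partition the rows of X into random subsets, compute a k-NN distance
    score locally within each subset, and collect the scores by original
    index. Avoids any all-pairs computation over the full data set."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    scores = np.zeros(len(X))
    for part in np.array_split(idx, n_subsets):
        sub = X[part]
        d = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=2)
        d.sort(axis=1)                         # column 0 is distance to self
        scores[part] = d[:, 1:k + 1].mean(axis=1)  # mean k-NN distance
    return scores
```

Because a density-based outlier remains outlying in any subset containing it, the point far from the cluster gets the highest score no matter which random subset it lands in.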
38

Robust techniques for regression models with minimal assumptions / M.M. van der Westhuizen

Van der Westhuizen, Magdelena Marianna January 2011
Good-quality management decisions often rely on the evaluation and interpretation of data. One of the most popular ways to investigate possible relationships in a given data set is to fit models to the data. Regression models are often employed to assist with decision-making, and can also be used for optimization and prediction. The success of a regression model, however, relies heavily on assumptions made by the model builder, and the model may also be influenced by the presence of outliers; a more robust model, not as easily affected by outliers, is necessary for making accurate interpretations of the data. In this research study, robust techniques for regression models with minimal assumptions are explored. Mathematical programming techniques such as linear programming, mixed-integer linear programming, and piecewise linear regression are used to formulate a nonlinear regression model. Outlier detection and smoothing techniques are included to address the robustness of the model and to improve predictive accuracy. The performance of the model is tested by applying it to a variety of data sets and comparing the results to those of other models. The results of the empirical experiments are also presented in this study. / Thesis (M.Sc. (Computer Science))--North-West University, Potchefstroom Campus, 2011.
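One ingredient named above, piecewise linear regression, can be sketched with a fixed-breakpoint hinge term fitted by least squares. The thesis formulates its model with LP/MILP techniques and selects breakpoints as part of the optimization; this known-breakpoint, least-squares version is only an illustrative simplification.

```python
import numpy as np

def piecewise_linear_fit(x, y, breakpoint):
    """Fit a continuous two-piece linear model with a fixed breakpoint b:
    y ≈ c0 + c1*x + c2*max(x - b, 0). The slope is c1 before the
    breakpoint and c1 + c2 after it."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones_like(x), x, np.maximum(x - breakpoint, 0.0)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def piecewise_predict(coef, x, breakpoint):
    x = np.asarray(x, dtype=float)
    return coef[0] + coef[1] * x + coef[2] * np.maximum(x - breakpoint, 0.0)
```

The hinge basis keeps the fitted curve continuous at the breakpoint while allowing the slope to change, which is what makes such models useful for data with regime changes.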
40

Using data mining methods in the creation of neurules (Χρήση μεθόδων εξόρυξης δεδομένων στη δημιουργία νευρωκανόνων)

Angelopoulos, Nikolaos (Αγγελόπουλος, Νικόλαος) 03 November 2011
In this thesis we present an alternative policy for splitting a non-separable training set used for the production of neurules. The existing method produced neurules from non-linearly-separable training sets by breaking them into two subsets based on the "distance" between patterns, often leading to multiple representations of the same knowledge. This thesis investigates splitting a non-separable training set into k subsets using clustering methods, where k can either be an input to the process or be computed dynamically from a specific range of values. The second splitting strategy (dynamic k-modes) appears to give the best results, while the first (modified k-modes) gives results comparable to the existing method for small values of k. Moreover, both splitting strategies can be combined with an outlier detection method that removes from the initial training set isolated examples that deviate most from the rest, thus improving their performance.
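The split-after-outlier-removal pipeline can be sketched numerically. The thesis applies k-modes variants to training examples; this stand-in uses plain k-means on numeric data, and the z-score outlier rule and farthest-point initialization are assumptions for illustration.

```python
import numpy as np

def remove_outliers_then_split(X, k, z=2.5, iters=20):
    """Drop examples whose mean distance to the rest exceeds the average by
    more than z standard deviations, then split the remainder into k groups
    with a plain k-means loop (a numeric stand-in for the k-modes variants)."""
    X = np.asarray(X, dtype=float)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    m = D.sum(axis=1) / (len(X) - 1)          # mean distance to the others
    keep = m <= m.mean() + z * m.std()        # remove isolated examples
    Xk = X[keep]
    # farthest-point initialization keeps the sketch deterministic
    centers = [Xk[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(Xk - c, axis=1) for c in centers], axis=0)
        centers.append(Xk[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):                    # standard k-means iterations
        lab = np.linalg.norm(Xk[:, None] - centers[None], axis=2).argmin(axis=1)
        centers = np.array([Xk[lab == j].mean(axis=0) if (lab == j).any()
                            else centers[j] for j in range(k)])
    return keep, lab
```

Removing the isolated examples first keeps them from dragging a cluster center toward themselves, which mirrors the motivation for combining outlier detection with the splitting strategies above.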