31

Efficient Algorithms for Mining Data Streams

Boedihardjo, Arnold Priguna 06 September 2010 (has links)
Data streams are ordered sets of values that are fast, continuous, mutable, and potentially unbounded. Examples of data streams include the pervasive time series which span domains such as finance, medicine, and transportation. Mining data streams requires approaches that are efficient, adaptive, and scalable. For several stream mining tasks, knowledge of the data's probability density function (PDF) is essential to deriving usable results. Providing an accurate model for the PDF benefits a variety of stream mining applications, and its successful development can have far-reaching impact on the general discipline of stream analysis. Therefore, this research focuses on the construction of efficient and effective approaches for estimating the PDF of data streams. In this work, kernel density estimators (KDEs) are developed that satisfy the stringent computational stipulations of data streams, model unknown and dynamic distributions, and enhance the estimation quality of complex structures. Contributions of this work include: (1) theoretical development of the local region based KDE; (2) construction of a local region based estimation algorithm; (3) design of a generalized local region approach that can be applied to any global bandwidth KDE to enhance estimation accuracy; and (4) application extension of the local region based KDE to multi-scale outlier detection. Theoretical development includes the formulation of the local region concept to effectively approximate the computationally intensive adaptive KDE. This work also analyzes key theoretical properties of the local region based approach, including (amongst others) its expected performance, an alternative local region construction criterion, and its robustness under evolving distributions. Algorithmic design includes the development of a specific estimation technique that reduces the time/space complexities of the adaptive KDE. In order to accelerate mining tasks such as outlier detection, an integrated set of optimizations is proposed for estimating multiple density queries. Additionally, the local region concept is extended to an efficient algorithmic framework that can be applied to any global bandwidth KDE. The combined solution can significantly improve estimation accuracy while retaining overall linear time/space costs. As an application extension, an outlier detection framework is designed which can effectively detect outliers within multiple data scale representations. / Ph. D.
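The computational obstacle this thesis targets is that the classical adaptive (variable-bandwidth) KDE needs a pilot density estimate at every sample, giving quadratic cost. As a rough illustration of the baseline estimator being approximated — not the thesis's local-region algorithm — here is a minimal Abramson-style adaptive KDE sketch; the bimodal stream and pilot bandwidth h0 are hypothetical:

```python
import numpy as np

def gaussian_kde(x_eval, data, h):
    """Fixed-bandwidth Gaussian KDE evaluated at x_eval."""
    u = (x_eval[:, None] - data[None, :]) / h
    return np.mean(np.exp(-0.5 * u**2), axis=1) / (h * np.sqrt(2 * np.pi))

def adaptive_kde(x_eval, data, h0):
    """Abramson-style adaptive KDE: samples in sparse regions get wider
    kernels. This quadratic-cost estimator is what local-region methods
    approximate at linear cost."""
    pilot = gaussian_kde(data, data, h0)        # pilot density at each sample
    g = np.exp(np.mean(np.log(pilot)))          # geometric mean of pilot values
    h_i = h0 * np.sqrt(g / pilot)               # per-sample bandwidths
    u = (x_eval[:, None] - data[None, :]) / h_i[None, :]
    k = np.exp(-0.5 * u**2) / (h_i[None, :] * np.sqrt(2 * np.pi))
    return k.mean(axis=1)

rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 500), rng.normal(6, 0.3, 500)])
grid = np.linspace(-4, 8, 200)
density = adaptive_kde(grid, stream, h0=0.4)
```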
32

Relating forced climate change to natural variability and emergent dynamics of the climate-economy system

Kellie-Smith, Owen January 2010 (has links)
This thesis is in two parts. The first part considers a theoretical relationship between the natural variability of a stochastic model and its response to a small change in forcing. Over a large enough scale, both the real climate and a climate model are characterised as stochastic dynamical systems. The dynamics of the systems are encoded in the probabilities that the systems move from one state into another. When the systems’ states are discretised and listed, then transition matrices of all these transition probabilities may be formed. The responses of the systems to a small change in forcing are expanded in terms of the eigenfunctions and eigenvalues of the Fokker-Planck equations governing the systems’ transition densities, which may be estimated from the eigenvalues and eigenvectors of the transition matrices. Smoothing the data with a Gaussian kernel improves the estimate of the eigenfunctions, but not the eigenvalues. The significance of differences in two systems’ eigenvalues and eigenfunctions is considered. Three time series from HadCM3 are compared with corresponding series from ERA-40 and the eigenvalues derived from the three pairs of series differ significantly. The second part analyses a model of the coupled climate-economic system, which suggests that the pace of economic growth needs to be reduced and the resilience to climate change needs to be increased in order to avoid a collapse of the human economy. The model condenses the climate-economic system into just three variables: a measure of human wealth, the associated accumulation of greenhouse gases, and the consequent level of global warming. Global warming is assumed to dictate the pace of economic growth. Depending on the sensitivity of economic growth to global warming, the model climate-economy system either reaches an equilibrium or oscillates in century-scale booms and busts.
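The transition-matrix construction in the first part can be made concrete: discretize the series into states, count one-step transitions, and read the slow modes off the matrix's eigenvalues. The sketch below uses assumed details (decile bins, an AR(1) surrogate series) rather than the thesis's HadCM3/ERA-40 data:

```python
import numpy as np

def transition_matrix(series, n_bins=10):
    """Discretize a scalar series into decile bins and estimate the one-step
    transition probability matrix P[i, j] = Pr(next state j | state i)."""
    edges = np.quantile(series, np.linspace(0, 1, n_bins + 1))
    states = np.clip(np.digitize(series, edges[1:-1]), 0, n_bins - 1)
    counts = np.zeros((n_bins, n_bins))
    for s, t in zip(states[:-1], states[1:]):
        counts[s, t] += 1
    return counts / counts.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
x = np.zeros(5000)                      # AR(1) surrogate for a climate series
for k in range(1, len(x)):
    x[k] = 0.9 * x[k - 1] + rng.normal()
P = transition_matrix(x)
eigvals = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
# The leading eigenvalue is 1; the next ones encode the slowly decaying
# modes whose eigenfunctions enter the response expansion.
print(eigvals[:3])
```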
33

Efficacité, généricité et praticabilité de l'attaque par information mutuelle utilisant la méthode d'estimation de densité par noyau / Efficiency, genericity and practicability of kernel-based mutual information analysis

Carbone, Mathieu 16 March 2015 (has links)
Nowadays, side-channel attacks are easy to mount yet powerful against cryptographic implementations, posing a serious threat to the security of cryptosystems. Indeed, the execution of a cryptographic algorithm unavoidably leaks information about the internal data it manipulates through side channels (time, temperature, power consumption, electromagnetic emanations, etc.), some of which are sensitive (i.e., dependent on the secret key), and an attacker can exploit them to recover that key. One of the most important steps of a side-channel attack is to quantify the dependency between the measured leakage and an assumed leakage model; to do so, a statistical tool, also called a distinguisher, is used to derive an estimate of the secret key. A plethora of distinguishers has been proposed in the literature. This thesis focuses on the attack that uses mutual information (MI) as its distinguisher, the so-called Mutual Information Analysis (MIA). First, we address one of its major practical issues: estimating the MI index, which itself requires estimating the underlying densities. Our investigations rely on a popular non-parametric technique that makes minimal assumptions: kernel density estimation (KDE). A bandwidth selection scheme based on an adaptivity criterion, specific to the side-channel setting, is proposed. An in-depth analysis is then conducted to provide a guideline for making MIA optimal and generic with respect to the bandwidth, and to establish which attack context (related to the statistical moment carrying the leakage) is most favorable to MIA. Second, we address the high computational cost of kernel-based MIA (closely tied to the bandwidth) by evaluating the Dual-Tree algorithm, which allows fast evaluation of pairwise kernel functions. We also show experimentally that MIA in the frequency domain is effective and fast when combined with an accurate frequency-domain leakage model. Additionally, we suggest an extension of an existing method for detecting leakage carried by higher-order statistical moments.
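As a rough sketch of the core computation, the following code estimates the mutual information between measured leakage and a Hamming-weight leakage model via a fixed-bandwidth Gaussian KDE, then scores every key hypothesis. The identity "S-box", fixed bandwidth, and simulated traces are placeholders — a real attack substitutes the AES S-box, and choosing the bandwidth well is exactly the problem this thesis studies:

```python
import numpy as np

HW = np.array([bin(v).count("1") for v in range(256)])  # Hamming weights
SBOX = np.arange(256)   # placeholder; a real attack uses the AES S-box

def mi_kde(leakage, model, h=0.5):
    """Mutual information I(L; M) using a Gaussian KDE for p(l|m) and p(l)."""
    grid = np.linspace(leakage.min() - 3 * h, leakage.max() + 3 * h, 200)
    def kde(samples):
        u = (grid[:, None] - samples[None, :]) / h
        return np.mean(np.exp(-0.5 * u**2), axis=1) / (h * np.sqrt(2 * np.pi))
    p_l, dl, mi = kde(leakage), grid[1] - grid[0], 0.0
    for m in np.unique(model):
        mask = model == m
        p_lm = kde(leakage[mask])
        valid = (p_lm > 0) & (p_l > 0)
        mi += mask.mean() * np.sum(p_lm[valid] * np.log(p_lm[valid] / p_l[valid])) * dl
    return mi

def mia(traces, plaintexts):
    """Score every key hypothesis; the correct key should maximize the MI."""
    return np.array([mi_kde(traces, HW[SBOX[plaintexts ^ k]]) for k in range(256)])

rng = np.random.default_rng(2)
pts = rng.integers(0, 256, 1000)
traces = HW[SBOX[pts ^ 42]] + rng.normal(0, 1, 1000)   # simulated leakage, key 42
scores = mia(traces, pts)   # with the real AES S-box, argmax cleanly recovers 42
```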
34

Resampling Evaluation of Signal Detection and Classification: With Special Reference to Breast Cancer, Computer-Aided Detection and the Free-Response Approach

Bornefalk Hermansson, Anna January 2007 (has links)
The first part of this thesis is concerned with trend modelling of breast cancer mortality rates. By using an age-period-cohort model, the relative contributions of period and cohort effects are evaluated once the unquestionable existence of the age effect is controlled for. The result of such a modelling gives indications in the search for explanatory factors. While this type of modelling is usually performed with 5-year period intervals, the use of 1-year period data, as in Paper I, may be more appropriate.

The main theme of the thesis is the evaluation of the ability to detect signals in x-ray images of breasts. Early detection is the most important tool to achieve a reduction in breast cancer mortality rates, and computer-aided detection systems can be an aid for the radiologist in the diagnosing process.

The evaluation of computer-aided detection systems includes the estimation of distributions. One way of obtaining estimates of distributions when no assumptions are at hand is kernel density estimation, or the adaptive version thereof that smoothes to a greater extent in the tails of the distribution, thereby reducing spurious effects caused by outliers. The technique is described in the context of econometrics in Paper II and then applied together with the bootstrap in the breast cancer research area in Papers III-V.

Here, estimates of the sampling distributions of different parameters are used in a new model for free-response receiver operating characteristic (FROC) curve analysis. Compared to earlier work in the field, this model benefits from the advantage of not assuming independence of detections in the images, and in particular, from the incorporation of the sampling distribution of the system's operating point.

Confidence intervals obtained from the proposed model with different approaches with respect to the estimation of the distributions and the confidence interval extraction methods are compared in terms of coverage and length of the intervals by simulations of lifelike data.
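The resampling machinery behind Papers III-V can be illustrated with a minimal percentile-bootstrap sketch; the per-image scores and the mean statistic below are hypothetical stand-ins for the FROC operating-point quantities the thesis actually resamples:

```python
import numpy as np

def bootstrap_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic:
    resample with replacement, recompute, and take empirical quantiles."""
    rng = np.random.default_rng(seed)
    stats = np.array([
        statistic(rng.choice(data, size=len(data), replace=True))
        for _ in range(n_boot)
    ])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# Hypothetical per-image detection scores; the statistic is their mean.
scores = np.random.default_rng(3).normal(1.2, 0.5, 80)
lo, hi = bootstrap_ci(scores, np.mean)
```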
35

STATISTICS IN THE BILLERA-HOLMES-VOGTMANN TREESPACE

Weyenberg, Grady S. 01 January 2015 (has links)
This dissertation is an effort to adapt two classical non-parametric statistical techniques, kernel density estimation (KDE) and principal components analysis (PCA), to the Billera-Holmes-Vogtmann (BHV) metric space for phylogenetic trees. This adaptation gives a more general framework for developing and testing various hypotheses about apparent differences or similarities between sets of phylogenetic trees than currently exists. For example, while the majority of gene histories found in a clade of organisms are expected to be generated by a common evolutionary process, numerous other coexisting processes (e.g. horizontal gene transfers, gene duplication and subsequent neofunctionalization) will cause some genes to exhibit a history quite distinct from the histories of the majority of genes. Such “outlying” gene trees are considered to be biologically interesting, and identifying these genes has become an important problem in phylogenetics. The R software package kdetrees, developed in Chapter 2, contains an implementation of the kernel density estimation method. The primary theoretical difficulty involved in this adaptation concerns the normalization of the kernel functions in the BHV metric space. This problem is addressed in Chapter 3. In both chapters, the software package is applied to both simulated and empirical datasets to demonstrate the properties of the method. A few first theoretical steps in the adaptation of principal components analysis to the BHV space are presented in Chapter 4. It becomes necessary to generalize the notion of a set of perpendicular vectors in Euclidean space to the BHV metric space, but there is some ambiguity about how best to proceed. We show that convex hulls are one reasonable approach to the problem. The Nye PCA algorithm provides a method of projecting onto arbitrary convex hulls in BHV space, providing the core of a modified PCA-type method.
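The outlier-scoring idea behind kdetrees can be sketched from a pairwise distance matrix alone: each tree gets a leave-one-out kernel score, and trees with unusually low scores are flagged. The sketch below assumes BHV geodesic distances have already been computed by external software, uses an illustrative bandwidth, and ignores the kernel-normalization subtlety that Chapter 3 addresses:

```python
import numpy as np

def kde_outlier_scores(dist, h):
    """Leave-one-out kernel score for each tree, given a symmetric pairwise
    distance matrix (e.g. BHV geodesic distances). Low scores mark trees far
    from the bulk of the sample: candidate outlying gene trees."""
    k = np.exp(-0.5 * (dist / h) ** 2)   # Gaussian kernel applied to distances
    np.fill_diagonal(k, 0.0)             # leave-one-out: drop self-contribution
    return k.sum(axis=1) / (len(dist) - 1)

# Hypothetical distance matrix for 6 gene trees; tree 5 sits far from the rest.
rng = np.random.default_rng(4)
d = rng.uniform(0.1, 0.3, (6, 6)); d = (d + d.T) / 2; np.fill_diagonal(d, 0)
d[5, :5] = d[:5, 5] = 1.5
scores = kde_outlier_scores(d, h=0.3)
outliers = np.argsort(scores)[:1]        # lowest-scoring tree(s) flagged
```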
36

An Analysis Tool for Flight Dynamics Monte Carlo Simulations

Restrepo, Carolina 1982- 16 December 2013 (has links)
Spacecraft design is inherently difficult due to the nonlinearity of the systems involved as well as the expense of testing hardware in a realistic environment. The number and cost of flight tests can be reduced by performing extensive simulation and analysis work to understand vehicle operating limits and identify circumstances that lead to mission failure. A Monte Carlo simulation approach that varies a wide range of physical parameters is typically used to generate thousands of test cases. Currently, the data analysis process for a fully integrated spacecraft is mostly performed manually on a case-by-case basis, often requiring several analysts to write additional scripts in order to sort through the large data sets. There is no single method that can be used to identify these complex variable interactions in a reliable and timely manner as well as be applied to a wide range of flight dynamics problems. This dissertation investigates the feasibility of a unified, general approach to the process of analyzing flight dynamics Monte Carlo data. The main contribution of this work is the development of a systematic approach to finding and ranking the most influential variables and combinations of variables for a given system failure. Specifically, a practical and interactive analysis tool that uses tractable pattern recognition methods to automate the analysis process has been developed. The analysis tool has two main parts: the analysis of individual influential variables and the analysis of influential combinations of variables. This dissertation describes in detail the two main algorithms used: kernel density estimation and nearest neighbors. Both are non-parametric density estimation methods that are used to analyze hundreds of variables and combinations thereof to provide an analyst with insightful information about the potential cause for a specific system failure. Examples of dynamical systems analysis tasks using the tool are provided.
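A minimal version of the single-variable analysis might compare, for each dispersed input, the density of failed runs against the density of successful runs and rank variables by how much the two differ. The inputs, failure rule, and L1 discrepancy below are illustrative assumptions, not the tool's exact scoring:

```python
import numpy as np

def kde_1d(grid, samples, h):
    """Fixed-bandwidth Gaussian KDE on a grid."""
    u = (grid[:, None] - samples[None, :]) / h
    return np.mean(np.exp(-0.5 * u**2), axis=1) / (h * np.sqrt(2 * np.pi))

def rank_variables(X, failed, h=0.3):
    """Rank Monte Carlo input variables by the L1 distance between the
    per-variable densities of failed and successful runs: the variable whose
    failure-conditioned distribution shifts most is ranked most influential."""
    scores = []
    for j in range(X.shape[1]):
        grid = np.linspace(X[:, j].min(), X[:, j].max(), 200)
        f_fail = kde_1d(grid, X[failed, j], h)
        f_pass = kde_1d(grid, X[~failed, j], h)
        scores.append(np.sum(np.abs(f_fail - f_pass)) * (grid[1] - grid[0]))
    return np.argsort(scores)[::-1]       # most influential first

rng = np.random.default_rng(5)
X = rng.normal(size=(5000, 4))            # dispersed simulation inputs
failed = X[:, 2] > 1.0                    # failures driven by variable 2
print(rank_variables(X, failed))          # variable 2 should rank first
```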
38

Measure of Dependence for Length-Biased Survival Data

Bentoumi, Rachid January 2017 (has links)
In epidemiological studies, subjects with disease (prevalent cases) differ from the newly diseased (incident cases): they tend to survive longer due to sampling bias, and related covariates will also be biased. Methods for regression analyses have recently been proposed to measure the potential effects of covariates on survival. The goal is to extend the dependence measure of Kent (1983), based on information gain, to the context of length-biased sampling. To estimate the information gain and dependence measure for length-biased data, we propose two different methods: kernel density estimation with a regression procedure, and parametric copulas. We assess the consistency of all proposed estimators. Algorithms detailing how to generate length-biased data under both the kernel-density-with-regression and parametric-copula approaches are given. Finally, the performance of the estimated information gain and dependence measure under length-biased sampling is demonstrated through simulation studies.
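One way to generate length-biased data, sketched below, is size-biased acceptance sampling: candidates drawn from the incident-case distribution are accepted with probability proportional to their length. The Exponential(1) survival model and the bound of 10 are hypothetical choices, not the thesis's exact algorithm:

```python
import numpy as np

def length_biased_sample(draw, n, bound, seed=0):
    """Draw from the length-biased version of a positive distribution:
    accept a candidate x with probability x / bound (size bias), where
    bound dominates the support of the candidate distribution."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        x = draw(rng)
        if rng.uniform() < x / bound:
            out.append(x)
    return np.array(out)

# Incident survival times ~ Exponential(1), capped so the bound holds; the
# length-biased (prevalent-case) sample is visibly shifted to longer lives.
draw = lambda rng: min(rng.exponential(1.0), 10.0)
prevalent = length_biased_sample(draw, 2000, bound=10.0)
print(prevalent.mean())   # ≈ E[X^2]/E[X] = 2 for Exp(1), vs. 1 unbiased
```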
39

American Black Bears (Ursus americanus) of the Paunsaugunt Plateau: Movements and Habitat Use

Dungan, Rebekah Adriana Castro 02 December 2019 (has links)
Concerns over human-bear conflict and questions about the ecology of the Paunsaugunt Plateau's population of black bears (Ursus americanus) arose due to their visitation to popular recreation sites. Greater insight about bears and their habitat use provides a foundation for conflict mitigation and effective management decisions. Between 2014 and 2017, seventeen black bears (11 female, 6 male) were fitted with global positioning system (GPS) radio-collars so that we could track their locations, daily activity patterns, and ambient temperatures. By analyzing bear locations, we calculated annual and seasonal home ranges for 16 bears, including 25 den sites. Home ranges typically consisted of three dominant vegetation types: Utah juniper, ponderosa pine, and Douglas fir. I used mixed effects models to better understand den site selection and found that slope (27.87 ± 2.03) was the most significant factor (p < 0.001). I also used mixed effects models to understand black bear selection of annual and seasonal home ranges. Predictor variables with the greatest effect (p < 0.001) were elevation (2419.99 ± 1.35) and aspect (138.44 ± 0.64), with coefficients of 1.128 and -1.483 respectively. Male annual home ranges (327.20 km2 ± 133.58 km2) were significantly larger (p = 0.035) than female home ranges (175.10 km2 ± 55.37 km2). However, annual home ranges for both sexes were larger than seasonal home ranges during hyperphagia (p = 0.003) or mating (p = 0.004), between which there was no difference (p = 0.451). Individual home ranges overlapped for most bears, consistent with their non-territorial nature. I found that bears avoided roads and lower elevations, while showing a preference for sloping terrain throughout the non-denning period. Paunsaugunt black bear home ranges are larger than any other black bear home ranges reported in the literature. We determined weekly average distances and directions of travel for all bears; for two bears, one male and one female, we determined daily averages and directions, and nine bears provided daily averages for 12 seasonal units across all four years. Activity data indicate the typical crepuscular pattern seen in bear populations that lack human habituation. Identifying core use areas and potential den sites aids understanding of black bear ecology and is useful when making decisions about how to plan infrastructure and educate the public. This research indicates that Paunsaugunt black bears avoid human activity; however, continued research is needed to determine specific interactions between bears and anthropogenic influences.
40

Identifying Untapped Potential: A Geospatial Analysis of Florida and California’s 2009 Recycled Water Production

Archer, Jana E., Luffman, Ingrid, Joyner, T. Andrew, Nandi, A. 01 June 2019 (has links)
Increased water demand attributed to population expansion, together with reduced freshwater availability caused by saltwater intrusion and drought, may lead to water shortages. These may be addressed, in part, by the use of recycled water. Spatial patterns of recycled water use in Florida and California during 2009 were analyzed to detect gaps in distribution and identify potential areas for expansion. Databases of recycled water products and distribution centers for both states were developed by combining the 2008 Clean Water Needs Survey database with Florida's 2009 Reuse Inventory and California's 2009 Recycling Survey, respectively. Florida had over twice as many distribution centers (n = 426) as California (n = 228) and produced a larger volume of recycled water (674.85 vs. 597.48 mgd; 1 mgd = 3.78 ML/d). Kernel density estimation showed that distribution is concentrated in central Florida (Orlando and Tampa), in California's Central Valley region (Fresno and Bakersfield), and around major cities in California. Areas for growth were identified in the panhandle and southern regions of Florida, and in northern, southwestern, and coastal California. Recycled water is an essential component of integrated water management, and broader adoption of recycled water will increase water conservation in water-stressed coastal communities by allocating recycled water for purposes that once used potable freshwater.
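The density surface used in this kind of analysis can be sketched in a few lines: a 2-D Gaussian KDE over facility coordinates, whose peaks mark clusters of distribution centers and whose troughs suggest gaps for expansion. The coordinates and bandwidth below are hypothetical:

```python
import numpy as np

def kde_2d(grid_x, grid_y, points, h):
    """2-D Gaussian kernel density surface over facility locations.
    High values flag clusters of distribution centers; low values flag
    candidate gaps for expanding recycled-water infrastructure."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    dx = gx[..., None] - points[:, 0]
    dy = gy[..., None] - points[:, 1]
    k = np.exp(-0.5 * (dx**2 + dy**2) / h**2)
    return k.sum(axis=-1) / (len(points) * 2 * np.pi * h**2)

# Hypothetical projected coordinates (km) of distribution centers.
rng = np.random.default_rng(6)
centers = np.vstack([rng.normal([50, 50], 5, (30, 2)),   # dense urban cluster
                     rng.uniform(0, 100, (10, 2))])       # scattered facilities
surface = kde_2d(np.linspace(0, 100, 120), np.linspace(0, 100, 120), centers, h=8)
```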
