31. Empirical Mode Decomposition for Noise-Robust Automatic Speech Recognition. Wu, Kuo-hao. 25 August 2010.
In this thesis, a novel technique based on the empirical mode decomposition (EMD) methodology is proposed and examined for improving the noise robustness of automatic speech recognition systems. EMD analysis generalizes Fourier analysis to nonlinear and non-stationary time functions, in our case the speech feature sequences. We use the intrinsic mode functions (IMFs), which include sinusoidal functions as special cases, obtained from the EMD analysis in the post-processing of the log-energy feature. We evaluate the proposed method on the Aurora 2.0 and Aurora 3.0 databases. On Aurora 2.0, we obtain a 44.9% overall relative improvement over the baseline on the mismatched (clean-training) tasks; on Aurora 3.0, the results show an overall improvement of 49.5% over the baseline on the high-mismatch tasks. These results show that the proposed method leads to significant improvement.
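To make the EMD step concrete, below is a minimal sketch of the sifting procedure and its use for smoothing a log-energy contour. This is a generic illustration, not the thesis's implementation; the stopping rule, the number of IMFs, and the toy signal are all assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_imf(x, max_iter=50, tol=0.05):
    """Extract one intrinsic mode function (IMF) by sifting."""
    h = x.copy()
    for _ in range(max_iter):
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 3 or len(minima) < 3:
            break  # too few extrema to build envelopes
        t = np.arange(len(h))
        upper = CubicSpline(maxima, h[maxima])(t)
        lower = CubicSpline(minima, h[minima])(t)
        mean_env = (upper + lower) / 2.0
        h = h - mean_env
        # stop when the mean envelope is small relative to the signal
        if np.sum(mean_env ** 2) / (np.sum(h ** 2) + 1e-12) < tol:
            break
    return h

def emd(x, n_imfs=4):
    """Decompose x into IMFs plus a residual trend."""
    imfs, residual = [], x.copy()
    for _ in range(n_imfs):
        imf = sift_imf(residual)
        imfs.append(imf)
        residual = residual - imf
    return imfs, residual

# Toy use: smooth a noisy log-energy contour by dropping the first
# (highest-frequency) IMF, which mostly carries additive noise.
log_energy = np.cumsum(np.random.randn(200)) * 0.1 + np.sin(np.linspace(0, 6, 200))
imfs, trend = emd(log_energy, n_imfs=3)
denoised = log_energy - imfs[0]
```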
32. Guaranteed Verification of Finite Element Solutions of Heat Conduction. Wang, Delin. May 2011.
This dissertation addresses the accuracy of a-posteriori error estimators for finite element solutions of problems with high orthotropy, especially for cases where rather coarse meshes are used, as is often the case in engineering computations. We present sample computations which indicate a lack of robustness of all standard
residual estimators with respect to high orthotropy. The investigation shows that the main culprit behind the lack of robustness of residual estimators is the coarseness
of the finite element meshes relative to the thickness of the boundary and interface layers in the solution.
With the introduction of an elliptic reconstruction procedure, a new error estimator based on the solution of the elliptic reconstruction problem is developed to estimate the exact error, measured in the space-time C-norm, for both semi-discrete and fully discrete finite element solutions of linear parabolic problems. For a fully discrete solution, a temporal error estimator is also introduced to evaluate the discretization error in time. In addition, the implicit Neumann subdomain residual estimator for elliptic equations, which involves the solution of a local residual problem, is combined with the elliptic reconstruction procedure to carry out a-posteriori error estimation for the linear parabolic problem. Numerical examples are presented to illustrate the superconvergence properties of the elliptic reconstruction and the performance of the bounds based on the space-time C-norm.

The results show that, in the case of the L^2 norm, there is no superconvergence in the elliptic reconstruction for linear elements when the solution is smooth, and no superconvergence for elements of any order when the solution is singular; in the case of the energy norm, the superconvergence in the elliptic reconstruction always holds. The research also shows that the performance of the bounds based on the space-time C-norm is robust, and that for fully discrete finite element solutions the bounds for the temporal error are sharp.
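For reference, a common definition of the elliptic reconstruction (in the style of Makridakis and Nochetto) is sketched below; the precise construction used in this dissertation may differ in its details.

```latex
% Semi-discrete solution u_h(t) \in V_h of a linear parabolic problem:
%   (\partial_t u_h, v_h) + a(u_h, v_h) = (f, v_h)  for all  v_h \in V_h.
% Let A_h : V_h \to V_h be the discrete elliptic operator, defined by
%   (A_h v_h, w_h) = a(v_h, w_h)  for all  w_h \in V_h.
% The elliptic reconstruction \tilde{u}(t) \in V is defined, pointwise in
% time, as the exact solution of the elliptic problem
\[
  a\bigl(\tilde{u}(t), v\bigr) = \bigl(A_h u_h(t), v\bigr)
  \qquad \text{for all } v \in V,
\]
% so that u_h(t) is the finite element approximation of \tilde{u}(t).
% Standard elliptic a-posteriori estimators then control
% \tilde{u}(t) - u_h(t), while \tilde{u} - u is handled by parabolic
% (time-discretization) arguments.
```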
33. Learning with high-dimensional noisy data. Chen, Yudong. 25 September 2013.
Learning an unknown parameter from data is a problem of fundamental importance across many fields of engineering and science. Rapid development in information technology allows large amounts of data to be collected. The data are often highly non-uniform and noisy, sometimes subject to gross errors and even direct manipulation. The data explosion also highlights the importance of the so-called high-dimensional regime, where the number of variables may exceed the number of samples. Extracting useful information from such data requires high-dimensional learning algorithms that are robust to noise. However, standard algorithms for the high-dimensional regime are often brittle to noise, and the suite of techniques developed in robust statistics is often inapplicable to large, high-dimensional data. In this thesis, we study the problem of robust statistical learning in high dimensions from noisy data. Our goal is to better understand the behavior and effects of noise in high-dimensional problems, and to develop algorithms that are statistically efficient, computationally tractable, and robust to various types of noise. We forge into this territory by considering three important sub-problems.

We first look at the problem of recovering a sparse vector from a few linear measurements, where both the response vector and the covariate matrix are subject to noise. Both stochastic and arbitrary noise are considered. We show that standard approaches are inadequate in these settings. We then develop robust, efficient algorithms that provably recover the support and values of the sparse vector under different noise models and require minimal knowledge of the nature of the noise.

Next, we study the problem of recovering a low-rank matrix from partially observed entries, with some of the observations arbitrarily corrupted. We consider the entry-wise corruption setting, where no row or column has too many entries corrupted, and provide performance guarantees for a natural convex relaxation approach. Our unified guarantees cover both randomly and deterministically located corruptions, and improve upon existing results. We then turn to the column-wise corruption case, where all observations from some columns are arbitrarily contaminated. We propose a new convex optimization approach and show that it simultaneously identifies the corrupted columns and recovers unobserved entries in the uncorrupted columns.

Lastly, we consider the graph clustering problem, i.e., arranging the nodes of a graph into clusters such that there are relatively dense connections inside the clusters and sparse connections across different clusters. We propose a semi-random Generalized Stochastic Blockmodel for clustered graphs and develop a new algorithm based on convexified maximum-likelihood estimators. We provide theoretical performance guarantees which recover, and sometimes improve on, all existing results for the classical stochastic blockmodel, the planted k-clique model, and the planted coloring models. We extend our algorithm to the case where the clusters are allowed to overlap with each other, and provide a theoretical characterization of the algorithm's performance. A further extension is studied for graphs that may change over time. We develop new approaches to incorporate the time dynamics and show that they can identify stable overlapping communities in real-world time-evolving graphs.
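As a concrete illustration of the low-rank-plus-corruptions setting, here is a minimal sketch of a standard convex-relaxation solver (nuclear norm plus l1, solved by an augmented-Lagrangian iteration). It is a generic method in the spirit of the abstract, not the thesis's algorithm, and the parameter choices are assumptions.

```python
import numpy as np

def svd_shrink(X, tau):
    """Singular value thresholding: prox operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft_thresh(X, tau):
    """Entry-wise soft thresholding: prox operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def robust_decompose(M, lam=None, mu=1.0, n_iter=200):
    """Split M into low-rank L plus sparse corruptions S, i.e. solve
    minimize ||L||_* + lam * ||S||_1  subject to  L + S = M."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(n_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)
        S = soft_thresh(M - L + Y / mu, lam / mu)
        Y = Y + mu * (M - L - S)  # dual update on the constraint L + S = M
    return L, S

# Toy check: a rank-2 matrix with a few gross entry-wise corruptions.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 50))
S0 = np.zeros((50, 50))
idx = rng.choice(2500, size=50, replace=False)
S0.flat[idx] = 10 * rng.standard_normal(50)
L_hat, S_hat = robust_decompose(L0 + S0)
print(np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))  # small if recovery works
```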
34. Separating data from metadata for robustness and scalability. Wang, Yang. 9 February 2015.
When building storage systems that aim to simultaneously provide robustness, scalability, and efficiency, one faces a fundamental tension: higher robustness typically incurs higher costs and thus hurts both efficiency and scalability. My research shows that an approach to storage system design based on a simple principle, separating data from metadata, can yield systems that elegantly and effectively address that tension in a variety of settings.

One observation motivates our approach: much of the cost paid by many strong protection techniques is incurred to detect errors. This observation suggests an opportunity: if we can build a low-cost oracle to detect errors and identify correct data, it may be possible to reduce the cost of protection without weakening its guarantees. This dissertation shows that metadata, if carefully designed, can serve as such an oracle and help a storage system protect its data with minimal cost.

This dissertation shows how to apply this idea effectively in three very different systems: Gnothi, a storage replication protocol that combines the high availability of asynchronous replication and the low cost of synchronous replication for small-scale block storage; Salus, a large-scale block store with unprecedented guarantees in terms of consistency, availability, and durability in the face of a wide range of server failures; and Exalt, a tool to emulate a large storage system with 100 times fewer machines.
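The following toy sketch illustrates the metadata-as-oracle principle: strongly protected hashes identify correct data, so the bulk store can be protected cheaply. It is purely illustrative and is not the design of Gnothi, Salus, or Exalt; all class and method names are hypothetical.

```python
import hashlib

class MetadataOracle:
    """Strongly protected metadata: block id -> hash of correct data.
    In a real system this small map would be replicated with a strong
    protocol; the bulk data can then be stored far more cheaply."""
    def __init__(self):
        self._hashes = {}

    def record(self, block_id: str, data: bytes) -> None:
        self._hashes[block_id] = hashlib.sha256(data).hexdigest()

    def is_correct(self, block_id: str, data: bytes) -> bool:
        # The cheap check that identifies correct data on the read path.
        return self._hashes.get(block_id) == hashlib.sha256(data).hexdigest()

class Store:
    """Weakly protected bulk storage (may lose or corrupt blocks)."""
    def __init__(self, oracle: MetadataOracle):
        self.oracle = oracle
        self.blocks = {}

    def write(self, block_id: str, data: bytes) -> None:
        self.oracle.record(block_id, data)
        self.blocks[block_id] = data

    def read(self, block_id: str) -> bytes:
        data = self.blocks.get(block_id)
        if data is None or not self.oracle.is_correct(block_id, data):
            # Error detected cheaply; trigger recovery (e.g., from a replica).
            raise IOError(f"block {block_id} lost or corrupted")
        return data

store = Store(MetadataOracle())
store.write("b1", b"hello")
store.blocks["b1"] = b"hellO"  # simulate silent corruption in the bulk store
try:
    store.read("b1")
except IOError as e:
    print(e)  # the metadata oracle catches the corruption
```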
35. Engineering Complex Systems with an Emphasis on Robustness: Utility-Based Analysis with Focus on Robustness. Baxter, Benjamin Andrew. 16 December 2013.
Engineered system complexity continues to increase rapidly, concurrent with the requirement for the engineered system to be robust. Robustness is often considered a critical attribute of complex engineered systems, but an exact definition of robustness is not agreed upon within the systems engineering community. The lack of a clear definition makes it difficult to develop or utilize a quantitative measure of robustness. A formal measure of robustness may not be considered necessary, but without a specific measure it is impossible to communicate the desired level of robustness, to measure how various options affect robustness, and to quantify tradeoffs between robustness and other engineering parameters.

The objective of this research is to examine robustness and how it can be attained in systems engineering. To accomplish this objective, definitions from several scientific communities are examined to develop the meaning of robustness. While definitions differ between and even within communities, a key attribute is present in each: a robust system needs to maintain its core functions in the presence of internal and external changes. A key implication of this attribute is that each function within a system has its own measure of robustness.

Any discussion of robustness in engineering must examine Robust Design, which uses variance as its measure of robustness. The Robust Design method has the adverse characteristic of forcing preferences upon the designer: examining the mean-variance approach through utility theory shows that it imposes an increasingly risk-averse position on the designer. This position may not be compatible with the designer's true risk attitude, causing issues when applying the method.

To contend with this issue, a novel utility-based approach is suggested. The approach focuses on generating functional models of the proposed systems, which provide the designer with insight into which perturbations are relevant to the system and its subsystems. Additionally, this approach incorporates utility theory to allow designers to convey their own preferences, while incorporating steps to ensure the final design is robust.
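The claim about mean-variance and risk attitude can be made concrete with a standard decision-theoretic identity, assumed here as background rather than taken from the thesis:

```latex
% Exponential (CARA) utility with risk-aversion coefficient \gamma > 0:
%   u(x) = -e^{-\gamma x}.
% If the performance measure X is Gaussian, X \sim \mathcal{N}(\mu, \sigma^2),
% the certainty equivalent of the design is
\[
  \mathrm{CE}(X) \;=\; \mu \;-\; \frac{\gamma}{2}\,\sigma^{2},
\]
% so ranking designs by a mean-variance criterion \mu - k\sigma^2 is the
% same as imposing one fixed risk attitude, \gamma = 2k, on the designer,
% which may not match the designer's true preferences.
```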
36. Accurate and robust algorithms for microarray data classification. Hu, Hong. January 2008.
Microarray data classification is used primarily to predict unseen data using a model built on categorized existing microarray data. One of the major challenges is that microarray data contain a large number of genes with a small number of samples. This high-dimensionality problem has prevented many existing classification methods from directly dealing with this type of data. Moreover, the small number of samples increases the overfitting problem of classification, leading to lower classification accuracy. Another major challenge is the uncertain quality of microarray data. Microarray data contain various, often high, levels of noise, and such data lead to unreliable and low-accuracy analysis on top of the high-dimensionality problem. Most current classification methods are not robust enough to handle these types of data properly.

Our research focuses on accuracy and noise resistance, or robustness. Our approach is to design a robust classification method for microarray data. We propose an algorithm called diversified multiple decision trees (DMDT), which makes use of a set of unique trees in the decision committee. The DMDT method increases the diversity of ensemble committees and therefore enhances accuracy by avoiding overlapping genes among alternative trees.

We also examine strategies to eliminate noisy data. Our method ensures there are no overlapping genes among alternative trees in an ensemble committee, so a noisy gene included in the committee can affect one tree only; the other trees in the committee are not affected at all. This design increases the robustness of microarray classification in terms of resistance to noisy data, and therefore reduces the instability caused by overlapping genes in current ensemble methods. The effectiveness of gene selection methods for improving the performance of microarray classification methods is also discussed.

We conclude that the proposed DMDT method substantially outperforms other well-known ensemble methods, such as bagging, boosting, and random forests, in terms of accuracy and robustness. DMDT is more tolerant to noise than Cascading-and-Sharing trees (CS4), particularly with increasing levels of noise in the data. The results also indicate that some classification methods are insensitive to gene selection, while others depend on particular gene selection methods to improve their classification performance.
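A minimal sketch of the disjoint-gene committee idea follows. It illustrates the "no overlapping genes" principle, not the thesis's DMDT algorithm; the ranking criterion (ANOVA F-score), the committee sizes, and the helper names are assumptions.

```python
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.tree import DecisionTreeClassifier

def train_disjoint_tree_committee(X, y, n_trees=5, genes_per_tree=20):
    """Train a committee of decision trees on disjoint gene subsets.
    Genes are ranked once (here by ANOVA F-score) and dealt out so that
    no gene appears in more than one tree; a noisy gene can therefore
    affect at most one committee member."""
    f_scores, _ = f_classif(X, y)
    ranked = np.argsort(f_scores)[::-1]  # best genes first
    committee = []
    for i in range(n_trees):
        genes = ranked[i * genes_per_tree:(i + 1) * genes_per_tree]
        tree = DecisionTreeClassifier(random_state=i).fit(X[:, genes], y)
        committee.append((genes, tree))
    return committee

def committee_predict(committee, X):
    """Majority vote over the committee (integer class labels assumed)."""
    votes = np.stack([t.predict(X[:, g]) for g, t in committee])
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```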
37. Robustness of Social-ecological System Under Global Change: Insights from Community Irrigation and Forestry Systems. January 2015.
Social-ecological systems (SES) are replete with hard and soft human-made components (or infrastructures) that are consciously designed to perform specific functions valued by humans. How these infrastructures mediate human-environment interactions is thus a key determinant of many sustainability problems in present-day SES. This dissertation examines how some designed aspects of physical and social infrastructures influence the robustness of SES under global change. Due to the fragility of rural livelihood systems, locally managed common-pool resource systems that depend on infrastructure, such as irrigated agriculture and community forestry, are of particular importance for this sustainability question. This dissertation presents three studies that explore the robustness of communal irrigation and forestry systems to economic or environmental shocks.

The first study examined how the design of irrigation infrastructure affects the robustness of system performance to an economic shock. Using a stylized dynamic model of an irrigation system as a testing ground, this study shows that changes in infrastructure design can induce fundamental changes in qualitative system behavior (i.e., regime shifts) as well as altered robustness characteristics. The second study explored how connectedness among social units (a kind of social infrastructure) influenced the post-failure transformations of large-N forest commons under economic globalization. Using inferential statistics, the second study argues that some attributes of the social connectedness that helped system robustness in the past made the system more vulnerable to undesirable transformations in the current era. The third study explored how to guide adaptive management of SES for greater robustness under uncertainty. This study used an existing laboratory behavioral experiment in which human subjects tackle a decision problem on collective management of an irrigation system under environmental uncertainty. The contents of group communication and the decisions of individuals were analyzed to understand how configurations of learning-by-doing and other adaptability-related conditions may be causally linked to robustness under environmental uncertainty. The results show that robust systems are characterized by two conditions: active learning-by-doing through outer-loop processes, i.e., frequent updating of the shared assumptions or goals that underlie specific group strategies, and frequent monitoring and reflection on past outcomes.
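To illustrate the kind of stylized irrigation model the first study describes, here is a toy resource-infrastructure simulation in which a design parameter determines whether an economic shock triggers a regime shift. Every equation, parameter, and name here is an invented illustration, not the dissertation's model.

```python
import numpy as np

def simulate(maintenance_share, design_efficiency, years=200,
             shock_year=100, shock_factor=0.5):
    """Stylized irrigation SES: infrastructure I delivers water, water
    produces income, and a share of income is reinvested in maintenance.
    All parameters are illustrative, not taken from the dissertation."""
    I, income_path = 1.0, []
    for t in range(years):
        water = design_efficiency * I          # delivery depends on design
        income = 10.0 * water / (1.0 + water)  # saturating returns to water
        if t >= shock_year:
            income *= shock_factor             # economic shock (e.g., price drop)
        I += maintenance_share * income - 0.2 * I  # reinvestment minus decay
        I = max(I, 0.0)
        income_path.append(income)
    return np.array(income_path)

# Two designs with viable pre-shock performance can differ sharply in
# whether the system recovers or collapses after the shock (a regime shift).
for eff in (0.7, 1.6):
    path = simulate(maintenance_share=0.05, design_efficiency=eff)
    print(eff, round(path[-1], 3))  # ~0 for the fragile design, >0 otherwise
```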
38. Influence diagnostics for a family of regression models for rates and proportions data. Alencar, Francisco Hildemar Calixto de. 3 February 2016.
There are situations in statistical modeling in which the variable of interest is continuous and restricted to the open interval (0, 1), such as rates and proportions. These variables typically exhibit asymmetry and heteroscedasticity, making the normal linear model inappropriate. After studying different strategies for modeling such variables, Kieschnick and McCullough (2003) recommended the beta regression model. However, Hahn (2008) and García et al. (2011) observed that the beta distribution is not appropriate when extreme events occur, that is, events in the tails of the distribution. To obtain greater flexibility in the beta regression model, Bayes et al. (2012) proposed the rectangular beta regression model, based on the rectangular beta distribution introduced by Hahn (2008). This model includes as particular cases the beta regression model of Ferrari and Cribari-Neto (2004) and the variable-dispersion beta regression model of Smithson and Verkuilen (2006).

This dissertation evaluates the use of the Kullback-Leibler and χ² divergences, as well as the Kullback-Leibler, χ², Bhattacharyya, Hellinger, triangular, and harmonic-mean stochastic distances and the L1-norm distance, for detecting atypical observations in the beta and rectangular beta regression models. To this end, we conducted a Monte Carlo simulation study in which both models were fitted under the Bayesian approach. In this study, the χ² divergence proved more efficient than the other measures at detecting atypical observations. Atypical points were introduced in both the dependent and the regressor variables. Finally, we present an application using the AIS (Australian Institute of Sport) dataset.
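As an illustration of divergence-based case-deletion diagnostics of the kind evaluated here, the sketch below estimates the Kullback-Leibler divergence between the full posterior and each case-deleted posterior from MCMC output, using the standard CPO identity. It is a generic sketch under the assumption of conditionally independent observations, not the dissertation's code; `loglik_matrix` is a hypothetical input.

```python
import numpy as np

def kl_influence(loglik_matrix):
    """Case-deletion influence from posterior samples.
    loglik_matrix[s, i] = log f(y_i | theta_s) for posterior draw s.
    Uses the identity (for the Bayesian case-deletion posterior):
      KL(full || deleted_i) = E[log f(y_i|theta)] + log E[1 / f(y_i|theta)],
    with expectations over the full posterior; the second term equals
    -log CPO_i, estimated by the harmonic-mean formula."""
    elog = loglik_matrix.mean(axis=0)                 # E[log f(y_i|theta)]
    # log E[exp(-loglik)] computed stably via a log-sum-exp shift:
    m = (-loglik_matrix).max(axis=0)
    log_inv_cpo = m + np.log(np.mean(np.exp(-loglik_matrix - m), axis=0))
    return elog + log_inv_cpo  # large values flag potentially atypical points

# Usage sketch (hypothetical): after fitting a beta regression by MCMC,
# fill loglik_matrix with the pointwise log-densities and rank points:
# kl = kl_influence(loglik_matrix)
# suspects = np.argsort(kl)[::-1][:5]
```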
39. Check Your Other Door: Creating Backdoor Attacks in the Frequency Domain. Hammoud, Hasan Abed Al Kader.
Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications, ranging from image classification and facial recognition to medical image analysis and real-time object detection. As DNN models become more sophisticated and complex, the computational cost of training them becomes a burden. For this reason, outsourcing the training process has been the go-to option for many DNN users. Unfortunately, this comes at the cost of vulnerability to backdoor attacks. These attacks aim at establishing hidden backdoors in the DNN such that it performs well on clean samples but outputs a particular target label when a trigger is applied to the input. Current backdoor attacks generate triggers in the spatial domain; however, as we show in this work, it is not the only domain to exploit, and one should always "check the other doors". To the best of our knowledge, this work is the first to propose a pipeline for generating a spatially dynamic (changing) and invisible (low-norm) backdoor attack in the frequency domain. We show the advantages of utilizing the frequency domain for creating undetectable and powerful backdoor attacks through extensive experiments on various datasets and network architectures. Unlike most spatial-domain attacks, frequency-based backdoor attacks can achieve high attack success rates with low poisoning rates and little to no drop in performance, while remaining imperceptible to the human eye. Moreover, we show that the backdoored models (poisoned by our attacks) are resistant to various state-of-the-art (SOTA) defenses, and so we contribute two possible defenses that successfully mitigate the attack. We conclude the work with some remarks regarding a network's learning capacity and the capability of embedding a backdoor attack in the model.
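A minimal sketch of what a frequency-domain trigger can look like follows. It is a generic illustration of the idea (perturbing mid-frequency FFT coefficients with a fixed pseudo-random pattern), not the thesis's pipeline; the band, magnitude, and seed are assumptions.

```python
import numpy as np

def add_frequency_trigger(image, magnitude=2.0, band=(8, 16), seed=0):
    """Poison one grayscale image (h x w float array in [0, 255]) by
    perturbing mid-frequency FFT coefficients. A fixed pseudo-random
    pattern acts as the trigger; the spatial-domain change is spread
    over the whole image (spatially dynamic) and has low norm."""
    rng = np.random.default_rng(seed)        # fixed seed -> fixed trigger
    F = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    cy, cx = h // 2, w // 2
    yy, xx = np.ogrid[:h, :w]
    r = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
    mask = (r >= band[0]) & (r < band[1])    # mid-frequency annulus
    phase = np.exp(2j * np.pi * rng.random(F.shape))
    F = F + magnitude * mask * phase         # low-norm additive trigger
    poisoned = np.real(np.fft.ifft2(np.fft.ifftshift(F)))
    return np.clip(poisoned, 0.0, 255.0)

# Poison a small fraction of the training set and relabel to the target:
# for i in poison_indices: X[i] = add_frequency_trigger(X[i]); y[i] = target
```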
40. Masking countermeasures against HO-DPA: security evaluation and enhancement by specific mask encodings. Maghrebi, Houssem. 21 December 2012.
Electronic circuits designed with standard computer-aided design methods offer poor resistance to physical attacks. Among the most formidable physical attacks are side channel attacks, such as timing attacks or DPA, which record a physical quantity (time, power consumption) leaked by the circuit while it computes; this information can be exploited to recover the secrets used in encryption or signature computations. Several methods for hardening circuits against side channel attacks have been proposed, falling into two categories: hiding countermeasures (or differential logic), which aim to make the leakage constant and thus statically independent of the secrets, and masking countermeasures, which aim to make the leakage random and thus statistically independent of the secrets. Masking is the less complex of the two and the easier to implement, since it can be applied at the algorithmic level as well as the logic level; ideally, the designer is thus freed from the manual place-and-route that static countermeasures require. In return, masking is the target of second-order, and even higher-order, attacks that recover the secret by attacking several variables simultaneously.

Masking consists in splitting the sensitive variables of cryptographic algorithms into random shares (the masked data and the random mask), so that knowledge of a subset of the shares gives no information about the sensitive data itself. Higher-order side channel attacks can nonetheless defeat masking schemes by combining the shares so as to cancel, at least partially, the effect of the mask. The overall goal of this thesis is to give a deep analysis of higher-order attacks and to improve the robustness of masking schemes. The first part of the thesis focuses on higher-order attacks: we propose three novel distinguishers, and theoretical and experimental results show the advantages of these attacks when applied to a masking countermeasure. The second part is devoted to a formal security evaluation of hardware masking schemes; we propose a new side channel metric that jointly covers attack efficiency and leakage estimation. In the last part, we propose three novel masking schemes that remain more efficient than state-of-the-art masking: they remove, or at least reduce, the dependency between the leakage and the sensitive variable when the leakage function is known (e.g., the Hamming weight or Hamming distance leakage models). The new solutions have been evaluated within a security framework, demonstrating excellent resistance against higher-order attacks.
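To ground the terminology, here is a toy sketch of first-order Boolean masking and of why a higher-order attack must combine shares. It is a textbook illustration, not one of the thesis's masking schemes; the Hamming-weight leakage model and the centered-product combining are standard assumptions.

```python
import secrets

def masked_xor_demo(secret_byte: int, trials: int = 10000):
    """First-order Boolean masking: split x into two shares (m, x ^ m)
    with a fresh random mask m per execution. Each share alone is
    uniformly distributed, so single-point (first-order) leakage is
    independent of x; a second-order attack must combine the leakage
    of BOTH shares."""
    hw = lambda v: bin(v).count("1")  # Hamming-weight leakage model
    first_order, second_order = [], []
    for _ in range(trials):
        m = secrets.randbelow(256)        # fresh mask per execution
        masked = secret_byte ^ m          # share 1 (share 2 is m itself)
        first_order.append(hw(masked))    # leakage of one share only
        # Centered-product combining of the two shares' leakages,
        # the classic second-order DPA preprocessing step:
        second_order.append((hw(masked) - 4) * (hw(m) - 4))
    return sum(first_order) / trials, sum(second_order) / trials

# The first-order mean is ~4 for every secret (no information), while
# the combined second-order statistic varies with the secret's weight:
for x in (0x00, 0xFF):
    print(hex(x), masked_xor_demo(x))  # ~(4.0, +2.0) vs ~(4.0, -2.0)
```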