  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Moment Matching and Modal Truncation for Linear Systems

Hergenroeder, AJ 24 July 2013
While moment matching can effectively reduce the dimension of a linear, time-invariant system, it can simultaneously fail to improve the stable time-step for the forward Euler scheme. In the context of a semi-discrete heat equation with spatially smooth forcing, the high frequency modes are virtually insignificant. Eliminating such modes dramatically improves the stable time-step without sacrificing output accuracy. This is accomplished by modal filtration, whose computational cost is relatively palatable when applied following an initial reduction stage by moment matching. A bound on the norm of the difference between the transfer functions of the moment-matched system and its modally-filtered counterpart yields an intelligent choice for the mode of truncation. The dual-stage algorithm disappoints in the context of highly nonnormal semi-discrete convection-diffusion equations. There, moment matching can be ineffective in dimension reduction, precluding a cost-effective modal filtering step.
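The time-step effect described in this abstract can be illustrated with a minimal sketch that is not taken from the thesis. It assumes the textbook 1D heat equation on the unit interval with Dirichlet boundary conditions, discretized by central differences on `n` interior points; the function name `stable_dt` and the parameters `n` and `kept_modes` are hypothetical. For that matrix the eigenvalues are known in closed form, and forward Euler is stable when the step does not exceed 2/|lambda| for the fastest retained mode.

```python
import math


def stable_dt(n: int, kept_modes: int) -> float:
    """Largest stable forward Euler step when only the `kept_modes`
    lowest-frequency modes of the 1D discrete Laplacian are retained.

    Assumes the standard second-difference matrix on n interior points
    of (0, 1), whose eigenvalues are
        lambda_k = -(4 / h**2) * sin(k * pi * h / 2)**2,  k = 1..n,
    with h = 1 / (n + 1).
    """
    if not 1 <= kept_modes <= n:
        raise ValueError("kept_modes must lie between 1 and n")
    h = 1.0 / (n + 1)
    fastest = (4.0 / h**2) * math.sin(kept_modes * math.pi * h / 2.0) ** 2
    # Forward Euler on u' = lambda * u (lambda < 0) is stable for dt <= 2 / |lambda|.
    return 2.0 / fastest


if __name__ == "__main__":
    n = 200
    print("all modes:    ", stable_dt(n, n))
    print("10 modes kept:", stable_dt(n, 10))
```

Keeping every mode ties the step to roughly h**2 / 2, whereas discarding the high-frequency modes relaxes the bound by orders of magnitude, which is the improvement the abstract attributes to modal filtration.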
132

重疊法應用於蛋白質質譜儀資料 / Overlap Technique on Protein Mass Spectrometry Data

徐竣建, Hsu, Chun-Chien Unknown Date
Cancer has been the leading cause of death in Taiwan for the past 24 years. Because patients treated at an early stage have a higher survival rate, early detection, diagnosis, and treatment can significantly reduce mortality. The data used in this study are protein mass spectrometry data sets acquired with the Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS) technique, comprising two high-dimensional data sets: one for prostate cancer and one for head and neck cancer. Because of the large number of variables, such data place a heavy burden on storage and computing time, and the raw data are not easy to analyze. The purpose of this thesis is therefore to find a feasible method that reduces the dimension of the data while minimizing the classification error rate, in the hope of improving the accuracy of cancer case classification. The study is divided into an experimental group and a control group. In the experimental group, dimension reduction is first performed by Principal Component Analysis (PCA), classification is then carried out by a Support Vector Machine (SVM), and finally the Overlap method is applied to reduce classification errors. For comparison, the control group applies the SVM directly for classification. The empirical results indicate that the Overlap method yields a significant improvement in the prostate cancer case, but not in the head and neck cancer case. We also examine the mass range of the protein mass spectrometry data to check whether the range suggested by experts is consistent with the results of the analysis. In the prostate cancer data, important information appears to be hidden outside the suggested mass range; in the head and neck cancer data, the region outside the suggested range contributes nothing substantial to the analysis.
133

Semiparametric Structure Guided by Prior Knowledge with Applications in Economics / Durch Vorwissen gesteuerte semiparametrische Struktur mit wirtschaftswissenschaftlichen Anwendungen

Scholz, Michael 08 April 2011
No description available.
134

Inférence statistique en grande dimension pour des modèles structurels. Modèles linéaires généralisés parcimonieux, méthode PLS et polynômes orthogonaux et détection de communautés dans des graphes. / Statistical inference for structural models in high dimension. Sparse generalized linear models, PLS through orthogonal polynomials and community detection in graphs

Blazere, Melanie 01 July 2015
This thesis falls within the context of high-dimensional statistical data analysis. We now have access to an ever-increasing amount of information, and the major challenge lies in our ability to explore huge amounts of data and to infer their dependency structures. The purpose of this thesis is to study, and provide theoretical guarantees for, certain methods that estimate dependency structures in high-dimensional data. The first part of the thesis is devoted to sparse models and Lasso-type methods. After presenting the main results on this topic in Chapter 1, we generalize the Gaussian case to general exponential models. The major contribution of this part, presented in Chapter 2, consists of oracle inequalities for a Group Lasso procedure applied to generalized linear models. These results show that the estimator performs well under certain conditions on the model; they are illustrated in the case of the Poisson model. The second part returns to the linear regression model, still in high dimension, but the sparsity assumption is replaced by the existence of a low-dimensional structure underlying the data. We focus in particular on the PLS method, which seeks an optimal decomposition of the predictors given a response vector. We recall the foundations of the method in Chapter 3. The major contribution of this part is an explicit analytical expression, for PLS, of the dependency structure that links the predictors to the response. The next two chapters illustrate the power of this formula through new theoretical results on PLS. The third and last part is dedicated to modelling structures through graphs, and especially to community detection. After reviewing the state of the art, we turn our attention to spectral clustering, a method that partitions the nodes of a graph on the basis of a similarity matrix. We propose an adaptation of this method based on an $l_1$-type penalty and illustrate it on simulations.
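Lasso-type estimators and other l1-penalized methods such as those named in this abstract share one basic building block: the soft-thresholding operator, which is the proximal operator of the l1 norm. The following is a minimal sketch of that operator only, not of the Group Lasso or the penalized spectral clustering procedure studied in the thesis; the function name `soft_threshold` and the parameter `lam` are hypothetical.

```python
def soft_threshold(values, lam):
    """Apply the proximal operator of lam * ||.||_1 entry by entry.

    Each entry is shrunk toward zero by lam, and entries whose absolute
    value is at most lam become exactly zero, which is how an l1 penalty
    produces sparse estimates.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    shrunk = []
    for v in values:
        if v > lam:
            shrunk.append(v - lam)
        elif v < -lam:
            shrunk.append(v + lam)
        else:
            shrunk.append(0.0)
    return shrunk


if __name__ == "__main__":
    # Small entries are set to zero; large ones are shrunk by lam.
    print(soft_threshold([3.0, -0.5, 1.0, -2.5], lam=1.0))
```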
135

Aplicação de superfícies seletivas em frequência para melhoria de resposta de arranjos de antenas planares

Almeida Filho, Valdez Aragão de 12 March 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This work shows how applying frequency selective surfaces (FSS) to planar antenna arrays offers an attractive alternative for obtaining desired radiation characteristics by changing radiation parameters of the arrays, such as bandwidth, gain, and directivity. In addition to analyzing these parameters, the mutual coupling between the elements of the array is studied. For this purpose, a microstrip antenna array with two patch elements, fed by a feed network, was designed. A further modification to the array was the use of a truncated ground plane, with the aim of increasing the bandwidth and miniaturizing the array elements. To study the behavior of frequency selective surfaces applied to antenna arrays, three layouts are proposed. The first uses the FSS as a superstrate (above the array). The second uses the FSS as a reflector element (below the array). The third places the array between two FSS layers, one above and one below. Numerical and experimental results for each of the proposed configurations are presented in order to validate the research.
136

Méthodes de réduction de dimension pour la construction d'indicateurs de qualité de vie / Dimension reduction methods to construct quality of life indicators

Labenne, Amaury 20 November 2015
The purpose of this thesis is to develop and propose new dimension reduction methods for constructing composite quality of life indicators at the municipal scale. The statistical methodology developed emphasizes the multidimensionality of the quality of life concept, with particular attention to the treatment of mixed data (quantitative and qualitative variables) and the introduction of environmental conditions. We opt for a variable clustering approach and for a multi-table method (multiple factor analysis for mixed data). These two methods make it possible to build composite indicators, which we propose as a measure of living conditions at the municipal scale. To make the resulting composite indicators easier to interpret, a bootstrap-based variable selection method is introduced in multiple factor analysis. Finally, we propose hclustgeo, a method for clustering observations that incorporates geographical proximity constraints in order to better capture the spatial nature of the phenomena involved.
137

Multivariate analysis of high-throughput sequencing data / Analyses multivariées de données de séquençage à haut débit

Durif, Ghislain 13 December 2016
The statistical analysis of Next-Generation Sequencing (NGS) data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research in this manuscript concerns hybrid dimension reduction methods that rely on both compression (representation of the data in a lower-dimensional space) and variable selection. Developments are made in two settings: sparse Partial Least Squares (PLS) regression for supervised classification, and sparse matrix factorization for unsupervised exploration. In both, the main focus is the reconstruction and visualization of the data. First, we present a new sparse PLS approach, based on an adaptive sparsity-inducing penalty, that is suitable for logistic regression to predict the label of a discrete outcome. Such a method is used, for instance, for prediction (fate of patients or specific type of unidentified single cells) based on gene expression profiles. The main issue in this framework is to account for the response in order to discard irrelevant variables. We highlight the direct link between the derivation of the algorithms and the reliability of the results. Then, motivated by questions regarding single-cell data analysis, we propose a flexible model-based approach for the factorization of count matrices that accounts for over-dispersion as well as zero-inflation (both characteristic of single-cell data), for which we derive an estimation procedure based on variational inference. In this scheme, we consider probabilistic variable selection based on a spike-and-slab model suitable for count data. The value of our procedure for data reconstruction, visualization, and clustering is illustrated by simulation experiments and by preliminary results on single-cell data analysis. All proposed methods are implemented in two R packages, "plsgenomics" and "CMF", based on high performance computing.
138

Homogenization of reaction-diffusion problems with nonlinear drift in thin structures

Raveendran, Vishnu January 2022
We study the question of periodic homogenization of a variably scaled reaction-diffusion equation with non-linear drift of polynomial type. The non-linear drift was derived as the hydrodynamic limit of a totally asymmetric simple exclusion process (TASEP) for a population of interacting particles crossing a domain with an obstacle. We consider three different geometries: (i) a bounded domain crossed by a finitely thin flat composite layer; (ii) a bounded domain crossed by an infinitely thin flat composite layer; (iii) an unbounded composite domain. For the thin-layer cases, we consider our reaction-diffusion problem endowed with slow or moderate drift. Using energy-type estimates as well as concepts such as thin-layer convergence and two-scale convergence, we derive homogenized evolution equations and the corresponding effective model parameters. Special attention is paid to the derivation of the effective transmission conditions across the separating limit interfaces. As a special scaling, the problem with large drift is treated separately for an unbounded composite domain. Because of the imposed large drift, this nonlinearity is expected to explode in the limit of a vanishing scaling parameter. To deal with this special case, we employ two-scale formal homogenization asymptotics with drift to derive the corresponding upscaled model equations as well as the structure of the effective transport tensors. Finally, we use Schauder's fixed point theorem as well as monotonicity arguments to study the weak solvability of the upscaled model posed in the unbounded domain. This study aims to contribute to the theoretical understanding needed when designing thin composite materials that are resistant to slow, moderate, and high velocity impacts.
139

Advances on Dimension Reduction for Multivariate Linear Regression

Guo, Wenxing January 2020
Multivariate linear regression methods are widely used statistical tools in data analysis. They were developed for settings in which several response variables are studied simultaneously, and the aim is to study the relationship between predictor variables and response variables through the regression coefficient matrix. Rapid improvements in information technology have produced large amounts of large-scale data, but have also brought great challenges in data processing. When dealing with high-dimensional data, classical least squares estimation is not applicable in multivariate linear regression analysis. In recent years, several approaches have been developed to deal with high-dimensional data problems, among which dimension reduction is one of the main ones. In some of the literature, random projection methods have been used to reduce dimension in large datasets. In Chapter 2, a new random projection method, with low-rank matrix approximation, is proposed to reduce the dimension of the parameter space in the high-dimensional multivariate linear regression model. Some statistical properties of the proposed method are studied, and explicit expressions are derived for the accuracy loss of the method with Gaussian random projection and orthogonal random projection. These expressions are precise rather than bounds up to constants. In multivariate regression analysis, reduced rank regression is also a dimension reduction method, and it has become an important tool because of its simplicity, computational efficiency, and good predictive performance. In practical situations, however, the performance of the reduced rank estimator is not satisfactory when the predictor variables are highly correlated or the signal-to-noise ratio is small. To overcome this problem, in Chapter 3 we incorporate matrix projections into the reduced rank regression method and develop reduced rank regression estimators based on random projection and orthogonal projection in high-dimensional multivariate linear regression models. We also propose a consistent estimator of the rank of the coefficient matrix and obtain prediction performance bounds for the proposed estimators based on mean squared errors. Envelope methods have also become popular in recent years for reducing estimative and predictive variation in multivariate regression; they comprise a class of methods that improve efficiency without changing the traditional objectives. Variable selection is the process of selecting a subset of relevant variables for use in model construction. Its purpose is to avoid the curse of dimensionality, simplify models so that they are easier to interpret, shorten training time, and reduce overfitting. In Chapter 4, we combine envelope models with a group variable selection method to propose an envelope-based sparse reduced rank regression estimator in high-dimensional multivariate linear regression models, and we establish its consistency, asymptotic normality, and oracle property. Tensor data are in frequent use today in a variety of fields in science and engineering, and processing them is a practical but challenging problem. Recently, the prevalence of tensor data has led to several tensor versions of envelope models. In Chapter 5, we incorporate the envelope technique into tensor regression analysis and propose a partial tensor envelope model, which leads to a parsimonious version of tensor response regression when some predictors are of special interest; consistency and asymptotic normality of the coefficient estimators are proved. The proposed method achieves significant gains in efficiency over the standard tensor response regression model in the estimation of the coefficients for the selected predictors. Finally, in Chapter 6, we summarize the work carried out in the thesis and suggest some problems of further research interest. / Dissertation / Doctor of Philosophy (PhD)
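The Gaussian random projection mentioned in this abstract can be sketched generically as follows. This is not the estimator proposed in the thesis (which combines projection with low-rank approximation of the coefficient matrix); it only shows the basic step of mapping p-dimensional vectors to k dimensions with a matrix of i.i.d. N(0, 1/k) entries, a scaling under which squared norms are preserved in expectation. The function names and the parameters `k`, `p`, and `seed` are hypothetical.

```python
import random


def gaussian_projection_matrix(k, p, seed=0):
    """Return a k x p matrix (list of rows) with i.i.d. N(0, 1/k) entries."""
    rng = random.Random(seed)
    scale = 1.0 / k ** 0.5  # standard deviation, so the variance is 1/k
    return [[rng.gauss(0.0, scale) for _ in range(p)] for _ in range(k)]


def project(matrix, x):
    """Map a length-p vector x to the k-dimensional vector matrix @ x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in matrix]


if __name__ == "__main__":
    p, k = 1000, 50
    projection = gaussian_projection_matrix(k, p, seed=42)
    x = [1.0] * p
    y = project(projection, x)
    # The two squared norms should be of similar size, not identical.
    print("original squared norm: ", sum(v * v for v in x))
    print("projected squared norm:", sum(v * v for v in y))
```

An orthogonal random projection, the other variant named in the abstract, would additionally orthonormalize the rows of the matrix; that step is omitted here to keep the sketch dependency-free.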
140

Regroupement de textes avec des approches simples et efficaces exploitant la représentation vectorielle contextuelle SBERT

Petricevic, Uros 12 1900
Clustering is an unsupervised task that consists of gathering similar elements into the same group and different elements into distinct groups. Text clustering is performed by representing texts in a vector space and studying their similarity in this space. The best results are obtained using neural models that fine-tune contextual embeddings in an unsupervised manner. However, these techniques can require a significant amount of training time, and their performance has not been compared with simpler techniques that do not require training neural models. In this master's thesis, we propose a study of the current state of the field. First, we study the best evaluation metrics for text clustering. Then, we evaluate the state-of-the-art methods and take a critical look at their training protocol. We also analyze some implementation choices in text clustering, such as the choice of clustering algorithm, similarity measure, contextual embeddings, and unsupervised fine-tuning of the contextual embeddings. Finally, we test the combination of contextual embeddings with techniques that do not require training, such as data preprocessing, dimensionality reduction, and Tf-idf inclusion. Our experiments reveal some shortcomings in the state of the art regarding the choice of evaluation metrics and the training protocol. Furthermore, we show that simple techniques yield results better than or similar to those of sophisticated methods that require training neural models. Our experiments are evaluated on eight benchmark datasets from different domains.
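Two of the training-free ingredients named in this abstract, Tf-idf weighting and a similarity measure in vector space, can be sketched with the standard library alone. This is an illustrative sketch, not the thesis pipeline: it does not use SBERT embeddings, and the function names, the whitespace tokenizer, and the smoothed idf formula are assumptions chosen for simplicity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Uses raw term counts and a smoothed inverse document frequency,
    idf(t) = log((1 + N) / (1 + df(t))) + 1, so every weight is positive.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = [
        "sparse regression with lasso penalty",
        "lasso penalty for sparse models",
        "heat equation time step",
    ]
    vecs = tfidf_vectors(docs)
    print("doc 0 vs doc 1:", cosine_similarity(vecs[0], vecs[1]))
    print("doc 0 vs doc 2:", cosine_similarity(vecs[0], vecs[2]))
```

A clustering algorithm would then group documents using these pairwise similarities; in the setting the abstract describes, the Tf-idf information is combined with contextual embeddings rather than used alone.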
