1 
Greedy Strategies for Convex MinimizationNguyen, Hao Thanh 16 December 2013 (has links)
We have investigated two greedy strategies for finding an approximation to the minimum of a convex function E, defined on a Hilbert space H. We have proved convergence rates for a modification of the orthogonal matching pursuit and its weak version under suitable conditions on the objective function E. These conditions involve the behavior of the moduli of smoothness and the modulus of uniform convexity of E.

2 
The selection of compounds for screening in pharmaceutical researchHarper, Gavin January 1999 (has links)
No description available.

3 
A Practical Comprehensive Approach to PMU Placement for Full ObservabilityAltman, James Ross 27 March 2008 (has links)
In recent years, the placement of phasor measurement units (PMUs) in electric transmission systems has gained much attention. Engineers and mathematicians have developed a variety of algorithms to determine the best locations for PMU installation. But often these placement algorithms are not practical for real systems and do not cover the whole process. This thesis presents a strategy that is practical and addresses three important topics: system preparation, placement algorithm, and installation scheduling. To be practical, a PMU strategy should strive for full observability, work well within the heterogeneous nature of power system topology, and enable system planners to adapt the strategy to meet their unique needs and system configuration. Practical considerations for the three placement topics are discussed, and a specific strategy based on these considerations is developed and demonstrated on real transmission system models. / Master of Science

4 
Vehicle sensorbased pedestrian position identification in V2V environmentHuang, Zhi 03 December 2016 (has links)
Indiana UniversityPurdue University Indianapolis (IUPUI) / This thesis presents a method to accurately determine the location and amount of pedestrians detected by different vehicles equipped with a Pedestrian Autonomous Emergency Braking (PAEB) system, taking into consideration the inherent inaccuracy of the pedestrian sensing from these vehicles. In the thesis, a mathematical model of the pedestrian information generated by the PAEB system in the V2V network is developed. The GreedyMedoids clustering algorithm and constrained hierarchical clustering are applied to recognize and reconstruct actual pedestrians, which enables a subject vehicle to approximate the number of the pedestrians and their estimated locations from a larger number of pedestrian alert messages received from many nearby vehicles through the V2V network and the subject vehicle itself. The proposed methods determines the possible number of actual pedestrians by grouping the nearby pedestrians information broadcasted by different vehicles and considers them as one pedestrian. Computer simulations illustrate the effectiveness and applicability of the proposed methods. The results are more integrated and accurate information for vehicle Autonomous Emergency Braking (AEB) systems to make better decisions earlier to avoid crashing into pedestrians.

5 
Greedy structure learning of Markov Random FieldsJohnson, Christopher Carroll 04 November 2011 (has links)
Probabilistic graphical models are used in a variety of domains to capture and represent general dependencies in joint probability distributions. In this document we examine the problem of learning the structure of an undirected graphical model, also called a Markov Random Field (MRF), given a set of independent and identically distributed (i.i.d.) samples. Specifically, we introduce an adaptive forwardbackward greedy algorithm for learning the structure of a discrete, pairwise MRF given a high dimensional set of i.i.d. samples. The algorithm works by greedily estimating the neighborhood of each node independently through a series of forward and backward steps. By imposing a restricted strong convexity condition on the structure of the learned graph we show that the structure can be fully learned with high probability given $n=\Omega(d\log (p))$ samples where $d$ is the dimension of the graph and $p$ is the number of nodes. This is a significant improvement over existing convexoptimization based algorithms that require a sample complexity of $n=\Omega(d^2\log(p))$ and a stronger irrepresentability condition. We further support these claims with an empirical comparison of the greedy algorithm to nodewise $\ell_1$regularized logistic regression as well as provide a real data analysis of the greedy algorithm using the Audioscrobbler music listener dataset. The results of this document provide an additional representation of work submitted by A. Jalali, C. Johnson, and P. Ravikumar to NIPS 2011. / text

6 
Greedy algorithms for multichannel sparse recoveryDeterme, JeanFrançois 16 January 2018 (has links)
During the last decade, research has shown compressive sensing (CS) to be a promising theoretical framework for reconstructing highdimensional sparse signals. Leveraging a sparsity hypothesis, algorithms based on CS reconstruct signals on the basis of a limited set of (often random) measurements. Such algorithms require fewer measurements than conventional techniques to fully reconstruct a sparse signal, thereby saving time and hardware resources. This thesis addresses several challenges. The first is to theoretically understand how some parameters—such as noise variance—affect the performance of simultaneous orthogonal matching pursuit (SOMP), a greedy support recovery algorithm tailored to multiple measurement vector signal models. Chapters 4 and 5 detail novel improvements in understanding the performance of SOMP. Chapter 4 presents analyses of SOMP for noiseless measurements; using those analyses, Chapter 5 extensively studies the performance of SOMP in the noisy case. A second challenge consists in optimally weighting the impact of each measurement vector on the decisions of SOMP. If measurement vectors feature unequal signaltonoise ratios, properly weighting their impact improves the performance of SOMP. Chapter 6 introduces a novel weighting strategy from which SOMP benefits. The chapter describes the novel weighting strategy, derives theoretically optimal weights for it, and presents both theoretical and numerical evidence that the strategy improves the performance of SOMP. Finally, Chapter 7 deals with the tendency for support recovery algorithms to pick support indices solely for mapping a particular noise realization. To ensure that such algorithms pick all the correct support indices, researchers often make the algorithms pick more support indices than the number strictly required. Chapter 7 presents a support reduction technique, that is, a technique removing from a support the supernumerary indices solely mapping noise. The advantage of the technique, which relies on crossvalidation, is that it is universal, in that it makes no assumption regarding the support recovery algorithm generating the support. Theoretical results demonstrate that the technique is reliable. Furthermore, numerical evidence proves that the proposed technique performs similarly to orthogonal matching pursuit with crossvalidation (OMPCV), a stateoftheart algorithm for support reduction. / Doctorat en Sciences de l'ingénieur et technologie / info:eurepo/semantics/nonPublished

7 
Greedy Representative Selection for Unsupervised Data AnalysisHelwa, Ahmed Khairy Farahat January 2012 (has links)
In recent years, the advance of information and communication technologies has allowed the storage and transfer of massive amounts of data. The availability of this overwhelming amount of data stimulates a growing need to develop fast and accurate algorithms to discover useful information hidden in the data. This need is even more acute for unsupervised data, which lacks information about the categories of different instances.
This dissertation addresses a crucial problem in unsupervised data analysis, which is the selection of representative instances and/or features from the data. This problem can be generally defined as the selection of the most representative columns of a data matrix, which is formally known as the Column Subset Selection (CSS) problem. Algorithms for column subset selection can be directly used for data analysis or as a preprocessing step to enhance other data mining algorithms, such as clustering. The contributions of this dissertation can be summarized as outlined below.
First, a fast and accurate algorithm is proposed to greedily select a subset of columns of a data matrix such that the reconstruction error of the matrix based on the subset of selected columns is minimized. The algorithm is based on a novel recursive formula for calculating the reconstruction error, which allows the development of time and memoryefficient algorithms for greedy column subset selection. Experiments on real data sets demonstrate the effectiveness and efficiency of the proposed algorithms in comparison to the stateoftheart methods for column subset selection.
Second, a kernelbased algorithm is presented for column subset selection. The algorithm greedily selects representative columns using information about their pairwise similarities. The algorithm can also calculate a Nyström approximation for a large kernel matrix based on the subset of selected columns. In comparison to different Nyström methods, the greedy Nyström method has been empirically shown to achieve significant improvements in approximating kernel matrices, with minimum overhead in run time.
Third, two algorithms are proposed for fast approximate kmeans and spectral clustering. These algorithms employ the greedy column subset selection method to embed all data points in the subspace of a few representative points, where the clustering is performed. The approximate algorithms run much faster than their exact counterparts while achieving comparable clustering performance.
Fourth, a fast and accurate greedy algorithm for unsupervised feature selection is proposed. The algorithm is an application of the greedy column subset selection method presented in this dissertation. Similarly, the features are greedily selected such that the reconstruction error of the data matrix is minimized. Experiments on benchmark data sets show that the greedy algorithm outperforms stateoftheart methods for unsupervised feature selection in the clustering task.
Finally, the dissertation studies the connection between the column subset selection problem and other related problems in statistical data analysis, and it presents a unified framework which allows the use of the greedy algorithms presented in this dissertation to solve different related problems.

8 
Greedy Representative Selection for Unsupervised Data AnalysisHelwa, Ahmed Khairy Farahat January 2012 (has links)
In recent years, the advance of information and communication technologies has allowed the storage and transfer of massive amounts of data. The availability of this overwhelming amount of data stimulates a growing need to develop fast and accurate algorithms to discover useful information hidden in the data. This need is even more acute for unsupervised data, which lacks information about the categories of different instances.
This dissertation addresses a crucial problem in unsupervised data analysis, which is the selection of representative instances and/or features from the data. This problem can be generally defined as the selection of the most representative columns of a data matrix, which is formally known as the Column Subset Selection (CSS) problem. Algorithms for column subset selection can be directly used for data analysis or as a preprocessing step to enhance other data mining algorithms, such as clustering. The contributions of this dissertation can be summarized as outlined below.
First, a fast and accurate algorithm is proposed to greedily select a subset of columns of a data matrix such that the reconstruction error of the matrix based on the subset of selected columns is minimized. The algorithm is based on a novel recursive formula for calculating the reconstruction error, which allows the development of time and memoryefficient algorithms for greedy column subset selection. Experiments on real data sets demonstrate the effectiveness and efficiency of the proposed algorithms in comparison to the stateoftheart methods for column subset selection.
Second, a kernelbased algorithm is presented for column subset selection. The algorithm greedily selects representative columns using information about their pairwise similarities. The algorithm can also calculate a Nyström approximation for a large kernel matrix based on the subset of selected columns. In comparison to different Nyström methods, the greedy Nyström method has been empirically shown to achieve significant improvements in approximating kernel matrices, with minimum overhead in run time.
Third, two algorithms are proposed for fast approximate kmeans and spectral clustering. These algorithms employ the greedy column subset selection method to embed all data points in the subspace of a few representative points, where the clustering is performed. The approximate algorithms run much faster than their exact counterparts while achieving comparable clustering performance.
Fourth, a fast and accurate greedy algorithm for unsupervised feature selection is proposed. The algorithm is an application of the greedy column subset selection method presented in this dissertation. Similarly, the features are greedily selected such that the reconstruction error of the data matrix is minimized. Experiments on benchmark data sets show that the greedy algorithm outperforms stateoftheart methods for unsupervised feature selection in the clustering task.
Finally, the dissertation studies the connection between the column subset selection problem and other related problems in statistical data analysis, and it presents a unified framework which allows the use of the greedy algorithms presented in this dissertation to solve different related problems.

9 
Placement de graphes de tâches de grande taille sur architectures massivement multicoeurs / Mapping of large task network on manycore architectureBerger, KarlEduard 08 December 2015 (has links)
Ce travail de thèse de doctorat est dédié à l'étude d'un problème de placement de tâches dans le domaine de la compilation d'applications pour des architectures massivement parallèles. Ce problème vient en réponse à certains besoins industriels tels que l'économie d'énergie, la demande de performances pour les applications de type flots de données synchrones. Ce problème de placement doit être résolu dans le respect de trois critères: les algorithmes doivent être capable de traiter des applications de tailles variables, ils doivent répondre aux contraintes de capacités des processeurs et prendre en compte la topologie des architectures cibles. Dans cette thèse, les tâches sont organisées en réseaux de communication, modélisés sous forme de graphes. Pour évaluer la qualité des solutions produites par les algorithmes, les placements obtenus sont comparés avec un placement aléatoire. Cette comparaison sert de métrique d'évaluation des placements des différentes méthodes proposées. Afin de résoudre à ce problème, deux algorithmes de placement de réseaux de tâches de grande taille sur des architectures clusterisées de processeurs de type manycoeurs ont été développés. Ils s'appliquent dans des cas où les poids des tâches et des arêtes sont unitaires. Le premier algorithme, nommé Taskwise Placement, place les tâches une par une en se servant d'une notion d'affinité entre les tâches. Le second, intitulé Subgraphwise Placement, rassemble les tâches en groupes puis place les groupes de tâches sur les processeurs en se servant d'une relation d'affinité entre les groupes et les tâches déjà affectées. Ces algorithmes ont été testés sur des graphes, représentants des applications, possédant des topologies de types grilles ou de réseaux de portes logiques. Les résultats des placements sont comparés avec un algorithme de placement, présent dans la littérature qui place des graphes de tailles modérée et ce à l'aide de la métrique définie précédemment. Les cas d'application des algorithmes de placement sont ensuite orientés vers des graphes dans lesquels les poids des tâches et des arêtes sont variables similairement aux valeurs qu'on peut retrouver dans des cas industriels. Une heuristique de construction progressive basée sur la théorie des jeux a été développée. Cet algorithme, nommé Regret Based Approach, place les tâches une par une. Le coût de placement de chaque tâche en fonction des autres tâches déjà placées est calculée. La phase de sélection de la tâche se base sur une notion de regret présente dans la théorie des jeux. La tâche qu'on regrettera le plus de ne pas avoir placée est déterminée et placée en priorité. Afin de vérifier la robustesse de l'algorithme, différents types de graphes de tâches (grilles, logic gate networks, seriesparallèles, aléatoires, matrices creuses) de tailles variables ont été générés. Les poids des tâches et des arêtes ont été générés aléatoirement en utilisant une loi bimodale paramétrée de manière à obtenir des valeurs similaires à celles des applications industrielles. Les résultats de l'algorithme ont également été comparés avec l'algorithme TaskWise Placement, qui a été spécialement adapté pour les valeurs non unitaires. Les résultats sont également évalués en utilisant la métrique de placement aléatoire. / This Ph.D thesis is devoted to the study of the mapping problem related to massively parallel embedded architectures. This problem arises from industrial needs like energy savings, performance demands for synchronous dataflow applications. This problem has to be solved considering three criteria: heuristics should be able to deal with applications with various sizes, they must meet the constraints of capacities of processors and they have to take into account the target architecture topologies. In this thesis, tasks are organized in communication networks, modeled as graphs. In order to determine a way of evaluating the efficiency of the developed heuristics, mappings, obtained by the heuristics, are compared to a random mapping. This comparison is used as an evaluation metric throughout this thesis. The existence of this metric is motivated by the fact that no comparative heuristics can be found in the literature at the time of writing of this thesis. In order to address this problem, two heuristics are proposed. They are able to solve a dataflow process network mapping problem, where a network of communicating tasks is placed into a set of processors with limited resource capacities, while minimizing the overall communication bandwidth between processors. They are applied on task graphs where weights of tasks and edges are unitary set. The first heuristic, denoted as Taskwise Placement, places tasks one after another using a notion of task affinities. The second algorithm, named Subgraphwise Placement, gathers tasks in small groups then place the different groups on processors using a notion of affinities between groups and processors. These algorithms are tested on tasks graphs with grid or logic gates network topologies. Obtained results are then compared to an algorithm present in the literature. This algorithm maps task graphs with moderated size on massively parallel architectures. In addition, the random based mapping metric is used in order to evaluate results of both heuristics. Then, in a will to address problems that can be found in industrial cases, application cases are widen to tasks graphs with tasks and edges weights values similar to those that can be found in the industry. A progressive construction heuristic named Regret Based Approach, based on game theory, is proposed. This heuristic maps tasks one after another. The costs of mapping tasks according to already mapped tasks are computed. The process of task selection is based on a notion of regret, present in game theory. The task with the highest value of regret for not placing it, is pointed out and is placed in priority. In order to check the strength of the algorithm, many types of task graphs (grids, logic gates networks, seriesparallel, random, sparse matrices) with various size are generated. Tasks and edges weights are randomly chosen using a bimodal law parameterized in order to have similar values than industrial applications. Obtained results are compared to the Task Wise placement, especially adapted for nonunitary values. Moreover, results are evaluated using the metric defined above.

10 
Estimation non paramétrique de densités conditionnelles : grande dimension, parcimonie et algorithmes gloutons. / Nonparametric estimation of sparse conditional densities in moderately large dimensions by greedy algorithms.Nguyen, MinhLien Jeanne 08 July 2019 (has links)
Nous considérons le problème d’estimation de densités conditionnelles en modérément grandes dimensions. Beaucoup plus informatives que les fonctions de régression, les densités condi tionnelles sont d’un intérêt majeur dans les méthodes récentes, notamment dans le cadre bayésien (étude de la distribution postérieure, recherche de ses modes...). Après avoir rappelé les problèmes liés à l’estimation en grande dimension dans l’introduction, les deux chapitres suivants développent deux méthodes qui s’attaquent au fléau de la dimension en demandant : d’être efficace computation nellement grâce à une procédure itérative gloutonne, de détecter les variables pertinentes sous une hypothèse de parcimonie, et converger à vitesse minimax quasioptimale. Plus précisément, les deux méthodes considèrent des estimateurs à noyau bien adaptés à l’estimation de densités conditionnelles et sélectionnent une fenêtre multivariée ponctuelle en revisitant l’algorithme glouton RODEO (Re gularisation Of Derivative Expectation Operator). La première méthode ayant des problèmes d’ini tialisation et des facteurs logarithmiques supplémentaires dans la vitesse de convergence, la seconde méthode résout ces problèmes, tout en ajoutant l’adaptation à la régularité. Dans l’avantdernier cha pitre, on traite de la calibration et des performances numériques de ces deux procédures, avant de donner quelques commentaires et perspectives dans le dernier chapitre. / We consider the problem of conditional density estimation in moderately large dimen sions. Much more informative than regression functions, conditional densities are of main interest in recent methods, particularly in the Bayesian framework (studying the posterior distribution, find ing its modes...). After recalling the estimation issues in high dimension in the introduction, the two following chapters develop on two methods which address the issues of the curse of dimensionality: being computationally efficient by a greedy iterative procedure, detecting under some suitably defined sparsity conditions the relevant variables, while converging at a quasioptimal minimax rate. More precisely, the two methods consider kernel estimators welladapted for conditional density estimation and select a pointwise multivariate bandwidth by revisiting the greedy algorithm RODEO (Regular isation Of Derivative Expectation Operator). The first method having some initialization problems and extra logarithmic factors in its convergence rate, the second method solves these problems, while adding adaptation to the smoothness. In the penultimate chapter, we discuss the calibration and nu merical performance of these two procedures, before giving some comments and perspectives in the last chapter.

Page generated in 0.0854 seconds