Global ETD Search

61	Decaimento dos autovalores de operadores integrais gerados por núcleos positivos definidos / Decay rates for eigenvalues of integral operators generated by positive definite kernels. Jose Claudinei Ferreira. 11 February 2008. First, we study some classical results from the theory of positive definite kernels, along with some related results. Second, we study Mercer's theorem and some of its generalizations and consequences, including a characterization of the Fourier transform of a positive definite kernel on R^m × R^m, m ≥ 1. Special attention is given to kernels whose domain is a non-compact subset of R^m × R^m, since the other cases are treated extensively in the literature. Finally, we apply these results to the analysis of decay rates for eigenvalues of integral operators generated by positive definite kernels. Keywords: Eigenvalues, Integral operators, Mercer theory, Positive definite kernels.
62	Decaimento dos autovalores de operadores integrais gerados por séries de potências / Eigenvalue decay of integral operators generated by power series. Douglas Azevedo Sant'Anna. 25 February 2013. The main goal of this work is to describe the eigenvalue decay of integral operators generated by power series kernels, under general assumptions on the coefficients of the series representing the generating kernel. The analysis is twofold. First, we consider generating kernels defined on the unit sphere in R^{m+1}, and then extend the analysis to the unit ball of the same space. Second, aiming primarily at the case in which the kernel is defined on the unit sphere in C^{m+1}, we treat a more general setting in which the kernel is given by a series of L^2(X, u)-orthogonal functions, where (X, u) is an arbitrary measure space. Keywords: Eigenvalues, Integral operators, Power series kernels.
63	Benchmarking a DSP processor / Benchmarking av en DSP processor. Lennartsson, Per; Nordlander, Lars. January 2002. This Master's thesis describes the benchmarking of a DSP processor. Benchmarking means measuring performance in some way. In this report, we focus on the number of instruction cycles needed to execute certain algorithms. The algorithms used in the benchmark are all very common in signal processing today. The results obtained in this thesis have been compared with benchmarks for other processors performed by Berkeley Design Technology, Inc. (BDTI). The algorithms were programmed in assembly code and then executed on the instruction set simulator. After that, we proposed changes to the instruction set, with the aim of reducing the execution time of the algorithms. The benchmark results show that our processor is at the same level as the ones tested by BDTI. A more experienced programmer would probably be able to reduce the cycle count further, especially for some of the more complex benchmarks. Keywords: Benchmarking, DSP processor, digital signal processing, algorithm kernels, DSP, Computer Engineering.
64	Compression guidée par automate et noyaux rationnels / Compression guided by automata and rational kernels. Amarni, Ahmed. 11 May 2015. Due to the growth of data, compression algorithms are now crucial. We address the problem of finding compression algorithms that are optimal with respect to a given Markov source. To this end, we extend the classical Huffman algorithm. First, Huffman is applied locally at each state of the Markov source, and we give the efficiency obtained for this algorithm. To improve the efficiency further and bring it close to optimal, we give another algorithm, also applied locally at each state of the Markov source, which this time encodes the factors starting from those states in such a way that the probability of each factor is a power of 1/2 (the Huffman algorithm being optimal if and only if every symbol to be encoded has a probability that is a power of 1/2). As an outlook, we give a further algorithm (restricted to the compression of the star) for encoding an expression with multiplicities, with a view to encoding a complete expression in future work. Kernels are popular methods for measuring the similarity between words for classification and learning. We generalize the definition of rational kernels in order to apply kernels to the comparison of languages. We study this generalization for the factor and subsequence kernels and prove that these kernels are defined for parameters chosen in an appropriate interval.
We give different methods to build weighted transducers which compute these kernels. Keywords: Compression, Huffman, Weighted automata, Rational kernels, Transducers.
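The parenthetical claim in entry 64, that Huffman coding meets the entropy bound exactly when every symbol probability is a power of 1/2, can be checked numerically. The following is a minimal sketch of the classical symbol-level Huffman construction only, not of the per-state Markov-source algorithms of the thesis; the function names and the two example distributions are assumptions chosen for illustration.

```python
import heapq
import math
from itertools import count


def huffman_code_lengths(probs):
    """Return {symbol: code length} for a binary Huffman code over `probs`."""
    if len(probs) == 1:
        return {next(iter(probs)): 1}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {s: 0}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every leaf below gets one level deeper.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]


def expected_length(probs, lengths):
    """Average number of code bits per symbol."""
    return sum(p * lengths[s] for s, p in probs.items())


def entropy(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


# Hypothetical example distributions (not taken from the thesis).
dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # all powers of 1/2
skewed = {"a": 0.7, "b": 0.2, "c": 0.1}                 # not powers of 1/2

for name, dist in (("dyadic", dyadic), ("skewed", skewed)):
    lengths = huffman_code_lengths(dist)
    print(name, expected_length(dist, lengths), entropy(dist))
```

For the dyadic distribution the average code length and the entropy coincide, while for the skewed one the Huffman code spends more bits per symbol than the entropy, which is the gap that grouping symbols into factors with dyadic probabilities aims to close.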
65	Efficient Dynamic Automatic Memory Management And Concurrent Kernel Execution For General-Purpose Programs On Graphics Processing Units. Pai, Sreepathi. 11 1900. Modern supercomputers now use accelerators to achieve their performance, with the most widely used accelerator being the Graphics Processing Unit (GPU). However, achieving the performance potential of systems that combine a GPU and CPU is an arduous task which could be made easier with the assistance of the compiler or runtime. In particular, exploiting two features of GPU architectures (distributed memory and concurrent kernel execution) is critical to achieving good performance, but in current GPU programming systems, programmers must exploit them manually. This can lead to poor performance. In this thesis, we propose automatic techniques that: i) perform data transfers between the CPU and GPU, ii) allocate resources for concurrent kernels, and iii) schedule concurrent kernels efficiently without programmer intervention. Most GPU programs access data in GPU memory for performance. Manually inserting data transfers that move data to and from this GPU memory is an error-prone and tedious task. In this work, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale data accesses and uses a runtime to initiate transfers as necessary. This avoids the redundant transfers exhibited by all other existing automatic memory management proposals for general-purpose programs. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers.
Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only (ADSM) and compiler-only (OpenMPC) proposals, it also transfers 2.2x to 13.3x less data on average. Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 suite. Current GPUs therefore allow concurrent execution of kernels to improve utilization. We study concurrent execution of GPU kernels using multiprogrammed workloads on current NVIDIA Fermi GPUs. On two-program workloads from Parboil2 we find concurrent execution is often no better than serialized execution. We identify lack of control over resource allocation to kernels as a major serialization bottleneck. We propose transformations that convert CUDA kernels into elastic kernels which permit fine-grained control over their resource usage. We then propose several elastic-kernel aware runtime concurrency policies that offer significantly better performance and concurrency than the current CUDA policy. We evaluate our proposals on real hardware using multiprogrammed workloads constructed from benchmarks in the Parboil2 suite. On average, our proposals increase system throughput (STP) by 1.21x and improve the average normalized turnaround time (ANTT) by 3.73x for two-program workloads over the current CUDA concurrency implementation. Recent NVIDIA GPUs use a FIFO policy in their thread block scheduler (TBS) to schedule thread blocks of concurrent kernels.
We show that FIFO leaves performance to chance, resulting in significant loss of performance and fairness. To improve performance and fairness, we propose use of the Shortest Remaining Time First (SRTF) policy instead. Since SRTF requires an estimate of runtime (i.e. execution time), we introduce Structural Runtime Prediction, which uses the grid structure of GPU programs for predicting runtimes. Using a novel Staircase model of GPU kernel execution, we show that kernel runtime can be predicted by profiling only the first few thread blocks. We evaluate an online predictor based on this model on benchmarks from ERCBench and find that predictions made after the execution of a single thread block are between 0.48x and 1.08x of actual runtime. We implement the SRTF policy for concurrent kernels that uses this predictor and evaluate it on two-program workloads from ERCBench. SRTF improves STP by 1.18x and ANTT by 2.25x over FIFO. Compared to MPMax, a state-of-the-art resource allocation policy for concurrent kernels, SRTF improves STP by 1.16x and ANTT by 1.3x. To improve fairness, we also propose SRTF/Adaptive, which controls resource usage of concurrently executing kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12x, ANTT by 2.23x and Fairness by 2.95x compared to FIFO. Overall, our implementation of SRTF achieves STP to within 12.64% of Shortest Job First (SJF, an oracle optimal scheduling policy), bridging 49% of the gap between FIFO and SJF. Keywords: GPGPU, Automatic Memory Management, Concurrent Kernel, Graphics Processing Unit (GPU), Elastic Kernels, Computer and Information Science.
66	Modelo matemático para o estudo do efeito Allee sobre a dispersão de plantas por agentes e em meios heterogêneos / Mathematical model for the study of the Allee effect on the dispersal of plants by agents and in heterogeneous environments. Lou Vega, Salvador, 1972-. 04 May 2013. Advisor: Wilson Castro Ferreira Junior. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Matemática Estatística e Computação Científica. We present an integro-difference model for the dispersal of a plant, which couples reproductive dynamics subject to an Allee effect with dispersal dynamics in a heterogeneous environment.
We propose a diffusion and settling model to derive theoretical dispersal kernels that represent the seed dispersal pattern generated by frugivorous birds in a heterogeneous environment. The dispersal kernel derived through the model is able to reproduce the aggregated seed dispersal pattern generated by frugivorous birds under field conditions. As for the reproductive dynamics, we consider an Allee effect due to pollen limitation, which reduces seed production. We introduce the Allee effect through a probability function that depends on the local plant density. The plant expansion behavior is analyzed, and the average expansion speed is estimated. The model shows a pulsed invasion, which we attribute to the Allee effect and the plant dispersal behavior. Degree: Doctorate in Applied Mathematics. Keywords: Allee effect, Dispersal kernels, Plants - Dispersal.
67	Architecture-aware Algorithm Design of Sparse Tensor/Matrix Primitives for GPUs. Nisa, Israt. 02 October 2019. No description available. Keywords: Computer Science.
68	With or without context: Automatic text categorization using semantic kernels. Eklund, Johan. January 2016. In this thesis text categorization is investigated in four dimensions of analysis: theoretically as well as empirically, and as a manual as well as a machine-based process. In the first four chapters we look at the theoretical foundation of subject classification of text documents, with a certain focus on classification as a procedure for organizing documents in libraries. A working hypothesis used in the theoretical analysis is that classification of documents is a process that involves translations between statements in different languages, both natural and artificial. We further investigate the close relationships between structures in classification languages and the order relations and topological structures that arise from classification. A classification algorithm that gets a special focus in the subsequent chapters is the support vector machine (SVM), which in its original formulation is a binary classifier in linear vector spaces, but has been extended to handle classification problems for which the categories are not linearly separable. To this end the algorithm utilizes a category of functions called kernels, which induce feature spaces by means of high-dimensional and often non-linear maps. For the empirical part of this study we investigate the classification performance of semantic kernels generated by different measures of semantic similarity. One category of such measures is based on the latent semantic analysis and random indexing methods, which generate term vectors by using co-occurrence data from text collections. Another semantic measure used in this study is pointwise mutual information.
In addition to the empirical study of semantic kernels we also investigate the performance of a term weighting scheme called divergence from randomness, which has hitherto received little attention within the area of automatic text categorization. The result of the empirical part of this study shows that the semantic kernels generally outperform the “standard” (non-semantic) linear kernel, especially for small training sets. A conclusion that can be drawn with respect to the investigated datasets is therefore that semantic information in the kernel in general improves its classification performance, and that the difference between the standard kernel and the semantic kernels is particularly large for small training sets. Another clear trend in the result is that the divergence from randomness weighting scheme yields a classification performance surpassing that of the common tf-idf weighting scheme. Keywords: automatic text categorization, subject classification, machine learning, computational linguistics, support vector machines, semantic kernels, term weighting, divergence from randomness.
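The difference between the "standard" linear kernel and a semantic kernel described in entry 68 can be illustrated with a toy example. The following is a minimal sketch, assuming that a semantic kernel is formed by projecting bag-of-words vectors through a matrix of term vectors before taking the inner product; the vocabulary, the term vectors and the function names are hypothetical and are not taken from the thesis.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def project(doc, term_vectors):
    """Map a bag-of-words vector into the space spanned by the term vectors."""
    dims = len(term_vectors[0])
    return [
        sum(doc[t] * term_vectors[t][d] for t in range(len(doc)))
        for d in range(dims)
    ]


def linear_kernel(x, y):
    """Standard (non-semantic) kernel: documents match only on shared terms."""
    return dot(x, y)


def semantic_kernel(x, y, term_vectors):
    """Inner product after projection, so related terms contribute to similarity."""
    return dot(project(x, term_vectors), project(y, term_vectors))


# Hypothetical vocabulary: index 0 = "car", 1 = "automobile", 2 = "banana".
# Hypothetical term vectors, e.g. as a co-occurrence based method might produce.
TERM_VECTORS = [
    [1.0, 0.0],  # car
    [0.8, 0.6],  # automobile (close to "car")
    [0.0, 1.0],  # banana
]

doc_car = [1, 0, 0]         # a document containing only "car"
doc_automobile = [0, 1, 0]  # a document containing only "automobile"

print(linear_kernel(doc_car, doc_automobile))
print(semantic_kernel(doc_car, doc_automobile, TERM_VECTORS))
```

The two documents share no terms, so the linear kernel sees them as unrelated, whereas the semantic kernel assigns them a positive similarity because their terms have similar vectors. Because the projected kernel is an inner product in the projected space, it remains positive semi-definite and can be used in an SVM.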
69	Multi-objective ROC learning for classification. Clark, Andrew Robert James. January 2011. Receiver operating characteristic (ROC) curves are widely used for evaluating classifier performance, having been applied to e.g. signal detection, medical diagnostics and safety critical systems. They allow examination of the trade-offs between true and false positive rates as misclassification costs are varied. Examination of the resulting graphs and calculation of the area under the ROC curve (AUC) allows assessment of how well a classifier is able to separate two classes and allows selection of an operating point with full knowledge of the available trade-offs. In this thesis a multi-objective evolutionary algorithm (MOEA) is used to find classifiers whose ROC graph locations are Pareto optimal. The Relevance Vector Machine (RVM) is a state-of-the-art classifier that produces sparse Bayesian models, but is unfortunately prone to overfitting. Using the MOEA, hyper-parameters for RVM classifiers are set, optimising them not only in terms of true and false positive rates but also a novel measure of RVM complexity, thus encouraging sparseness, and producing approximations to the Pareto front. Several methods for regularising the RVM during the MOEA training process are examined and their performance evaluated on a number of benchmark datasets, demonstrating they possess the capability to avoid overfitting whilst producing performance equivalent to that of the maximum likelihood trained RVM. A common task in bioinformatics is to identify genes associated with various genetic conditions by finding those genes useful for classifying a condition against a baseline. Typically, datasets contain large numbers of gene expressions measured in relatively few subjects. As a result of the high dimensionality and sparsity of examples, it can be very easy to find classifiers with near perfect training accuracies but which have poor generalisation capability.
Additionally, depending on the condition and treatment involved, evaluation over a range of costs will often be desirable. An MOEA is used to identify genes for classification by simultaneously maximising the area under the ROC curve whilst minimising model complexity. This method is illustrated on a number of well-studied datasets and applied to a recent bioinformatics database resulting from the current InChianti population study. Many classifiers produce “hard”, non-probabilistic classifications and are trained to find a single set of parameters, whose values are inevitably uncertain due to limited available training data. In a Bayesian framework it is possible to ameliorate the effects of this parameter uncertainty by averaging over classifiers weighted by their posterior probability. Unfortunately, the required posterior probability is not readily computed for hard classifiers. In this thesis an Approximate Bayesian Computation Markov Chain Monte Carlo algorithm is used to sample model parameters for a hard classifier using the AUC as a measure of performance. The ability to produce ROC curves close to the Bayes optimal ROC curve is demonstrated on a synthetic dataset. Due to the large numbers of sampled parametrisations, averaging over them when rapid classification is needed may be impractical and thus methods for producing sparse weightings are investigated. 519.6
70	Um Compilador para a linguagem RS distribuída / A compiler for distributed RS language. Librelotto, Giovani Rubert. January 2001. The RS language is intended for programming centralized reactive kernels. Such kernels are responsible for all the logic of a reactive system: handling the input signals, carrying out the reactions and generating the output signals. Because the language was originally conceived to handle only centralized processes, distribution was not a concern. The main objective of this work is to present the features introduced in a new version of the RS language and compiler that make it possible to execute distributed programs. Besides enabling the execution of distributed reactive systems, the new version adds extensions to the RS language that were already foreseen at its creation, such as inhibiting signals, mutual exclusion and concurrence rules, the possibility of firing more than one rule at the same instant, and lexical cleanup of RS source code. These modifications were carried out through a new compiler, called Compiler RS 5.0. The implemented prototype generates three code formats: the standard format of the RS language (the automata and the corresponding rules), C code for simulating the automata (for both distributed and non-distributed programs), and files in the portable OC format, a standard object-code format for reactive languages. Distributing and implementing the RS language required the creation of a new MDX communication kernel, which is responsible for communication among the RSD automata. This was done in three parts: the first defines a formal model with the changes needed for the RS language to work in a distributed way, the second presents the design of the new MDX kernel, and the third presents the implementation in C and MDX of the automata generated by Compiler RS 5.0. Finally, application examples of the new language are presented, showing the importance and the contribution of this work both to the RS language and to the programming of synchronous reactive systems. Keywords: Compilers, Distributed processing, RS language, Reactive systems, Reactive kernels, Real-time systems, Parallel, Distributed programming, MDX.

Search results