About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

A dynamic scheduling runtime and tuning system for heterogeneous multi and many-core desktop platforms / Um sistema de escalonamento dinâmico e tuning em tempo de execução para plataformas desktop heterogêneas de múltiplos núcleos

Binotto, Alécio Pedro Delazari January 2011 (has links)
A modern personal computer can now be considered a one-node heterogeneous cluster that simultaneously processes several applications' tasks. It can be composed of asymmetric Processing Units (PUs), such as the multi-core Central Processing Unit (CPU), the many-core Graphics Processing Units (GPUs) - which have become one of the main co-processors contributing to high-performance computing - and other PUs. In this way, a powerful heterogeneous execution platform is built on a desktop for data-intensive calculations. In the perspective of this thesis, distributing an application's workload over the PUs plays a key role in improving performance and exploiting this heterogeneity. This is challenging, since the execution cost of a task on a PU is non-deterministic and can be affected by a number of parameters not known a priori, such as the size of the problem domain and the precision of the solution, among others. Within this scope, this doctoral research introduces a context-aware runtime and performance-tuning system based on a compromise between reducing the execution time of applications - through appropriate dynamic scheduling of high-level tasks - and the cost of computing that scheduling, applied to a platform composed of a CPU and GPUs. The approach combines a first-scheduling model based on an off-line task performance profile benchmark with a runtime model that keeps track of tasks' real execution times and efficiently schedules new instances of the high-level tasks dynamically over the CPU/GPU execution platform. To this end, a set of heuristics is proposed for scheduling tasks over one CPU and one GPU, together with a generic and efficient scheduling strategy that considers several processing units. The proposed approach is applied in a case study using a CPU-GPU execution platform to compute iterative solvers for systems of linear equations, using a stencil code specially designed to exploit the characteristics of modern GPUs. The solution uses the number of unknowns as the main parameter for the assignment decision. By scheduling tasks to both the CPU and the GPU, a performance gain of 21.77% is achieved in comparison with the static assignment of all tasks to the GPU (as done by current programming models, such as OpenCL and CUDA for Nvidia), with a scheduling error of only 0.25% compared to exhaustive search.
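As a rough illustration of the kind of hybrid offline/online scheduling the abstract describes, the sketch below keeps a profile table keyed by processing unit and problem size and refines it with measured runtimes; the cost model, the bucketing of the number of unknowns, and all names are illustrative assumptions, not the thesis's actual implementation.

import time

class ProfileScheduler:
    # Hypothetical sketch of profile-based CPU/GPU scheduling; not the
    # thesis's code. estimates maps (pu, task_kind, size_bucket) -> seconds,
    # initially filled from an off-line benchmark (the "first scheduling").
    def __init__(self, offline_profile):
        self.estimates = dict(offline_profile)

    def _bucket(self, unknowns):
        # The thesis uses the number of unknowns as the main scheduling
        # parameter; rounding it down to a power of two is one simple way
        # to index a profile table.
        return 1 << max(unknowns.bit_length() - 1, 0)

    def choose_pu(self, task_kind, unknowns):
        b = self._bucket(unknowns)
        costs = {pu: self.estimates.get((pu, task_kind, b), float("inf"))
                 for pu in ("cpu", "gpu")}
        return min(costs, key=costs.get)

    def run(self, task_kind, unknowns, impls):
        # impls: {"cpu": callable, "gpu": callable} for this task kind.
        pu = self.choose_pu(task_kind, unknowns)
        start = time.perf_counter()
        result = impls[pu](unknowns)
        elapsed = time.perf_counter() - start
        # Online model: blend the measured time into the estimate so future
        # instances of the same high-level task see up-to-date costs.
        key = (pu, task_kind, self._bucket(unknowns))
        self.estimates[key] = 0.7 * self.estimates.get(key, elapsed) + 0.3 * elapsed
        return result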
52

Variedades afins e aplicações / Affine varieties and applications

Diego Ponciano de Oliveira Lima 03 August 2013 (has links)
In this work, we consider affine varieties in a vector space in order to analyze and understand the geometric behavior of the solution sets of systems of linear equations, of the solutions of second-order linear ordinary differential equations arising from the mathematical modeling of systems, and so on. We characterize affine varieties in vector spaces as a vector subspace translated by any vector belonging to the affine variety, and we compare the geometric representations of the solution sets of the problem situations cited above with these characteristics.
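A one-line worked example of the characterization described above: for a consistent system Ax = b with particular solution x_p, the solution set is the kernel of A translated by x_p,

\[
  S = \{\, x : Ax = b \,\} = x_p + \ker A = \{\, x_p + v : Av = 0 \,\}.
\]

The same affine structure appears for a second-order linear ODE y'' + p(t)y' + q(t)y = g(t): its solution set is y_p + span{y_1, y_2}, where y_1 and y_2 form a basis of solutions of the associated homogeneous equation.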
54

[en] RESULTS OF AMBROSETTI-PRODI TYPE FOR NON-SELFADJOINT ELLIPTIC OPERATORS / [pt] RESULTADOS DO TIPO AMBROSETTI-PRODI PARA OPERADORES ELÍTICOS NÃO AUTO-ADJUNTOS

ANDRE ZACCUR UCHOA CAVALCANTI 13 April 2018 (has links)
[en] The celebrated Ambrosetti-Prodi theorem studies perturbations of the Dirichlet Laplacian by a nonlinear function jumping over the principal eigenvalue of the operator. Various extensions of this landmark result were obtained for self-adjoint operators, in particular by Berger-Podolak in 1975, who gave a geometrical description of the solution set. In this thesis we show that similar theorems are valid for non-self-adjoint operators. We employ techniques based on the maximum principle, which even let us obtain new results in the self-adjoint setting. In particular, we show that the semilinear operator is a global fold. As a consequence, we obtain an exact count of solutions for these operators even when the perturbation is non-smooth.
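For orientation, one common statement of the classical (self-adjoint) Ambrosetti-Prodi setup is the following; the exact hypotheses and the non-self-adjoint generalization studied in the thesis may differ.

\[
  -\Delta u = f(u) + t\,\varphi_1 \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega,
\]
\[
  \lim_{s \to -\infty} f'(s) < \lambda_1 < \lim_{s \to +\infty} f'(s) < \lambda_2,
\]

where \lambda_1 < \lambda_2 are the first two Dirichlet eigenvalues and \varphi_1 > 0 is the principal eigenfunction. The classical conclusion is that a critical value t^* splits the parameter line into ranges where the problem has exactly zero, one, or two solutions - precisely the geometry of a fold.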
55

Didaktické přístupy k výuce některých témat v matematice na základní škole v řeči učitelů / Didactic approaches to the teaching of some mathematical topics at the primary school in teachers' discourse

Vencovská, Jaroslava January 2017 (has links)
The aim of the thesis was, through a new analysis of interviews with teachers of mathematics, to describe the didactic practices teachers use when teaching selected topics (namely proportions, linear equations, divisibility, percentages, symmetry, and the Pythagorean theorem) and to compare them with the practices reported in textbooks and other literature. First, teaching methods, teaching forms, and M. Hejný's mechanism of concept development are presented. Based on the analysis of more than thirty interviews, it was found that teachers use the usual didactic practices but also create their own methods and procedures. These methods and techniques are presented for each critical topic separately in the fourth chapter of the thesis. Furthermore, a content analysis of selected textbooks is given for each topic. The practices teachers actually use in their classrooms form the main result of the work.
56

An Investigation of the Effect of Using Twitter by High School Mathematics Students Learning Linear Equations in Algebra 1

Vilchez, Manuel 28 March 2016 (has links)
The purpose of this quasi-experimental study was to investigate the effect of using Twitter by high school mathematics students learning linear equations in Algebra 1. The study used ninth-grade Algebra 1 classes that were learning linear equations for 18 school days, with a nonequivalent control group design (a pretest-posttest quasi-experimental design). The research hypotheses were tested using a factorial analysis of covariance (ANCOVA) with the pretest score on linear equations as the covariate. The control group had three classes (n = 73) and the experimental group had three classes (n = 78). The experimental group received tweets on a daily basis as students learned linear equations; the tweets contained mathematical content, classroom logistics, or both, while the control group received the same information in class. The quantitative findings show that overall Twitter use - content tweets, logistics tweets, and tweets containing both - did not have a statistically significant effect on the mean linear equations posttest score. Second, the study looked at students' performance on various subtopics throughout the unit: the ANCOVA showed no statistically significant differences between the control and experimental groups on most of the quizzes, although statistically significant differences were found on Quiz #2 and Quiz #4 among the logistics groups. Third, the experimental group took a 10-item survey intended to capture students' opinions of using Twitter while learning course content in Algebra 1. It can be concluded from the survey results that students had, for the most part, a positive attitude towards using Twitter as part of learning mathematics in high school. In conclusion, the use of Twitter is not likely to increase students' mean posttest linear equations score, but the survey findings did show that it might increase student motivation. The results of this study contribute to the literature by investigating the effects of using Twitter in high school Algebra 1.
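For readers unfamiliar with the design, below is a minimal sketch of an ANCOVA with a pretest covariate; the column names, file name, and the use of statsmodels are assumptions for illustration, not the study's actual analysis code.

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical layout: one row per student with columns
# group ("control"/"experimental"), pre (pretest), post (posttest).
df = pd.read_csv("algebra1_scores.csv")

# Posttest score modeled from group membership, adjusting for the
# pretest covariate.
model = smf.ols("post ~ C(group) + pre", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # Type II ANCOVA table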
57

Scaling Context-Sensitive Points-To Analysis

Nasre, Rupesh 02 1900 (has links) (PDF)
Pointer analysis is one of the key static analyses performed during compilation. The efficiency of several compiler optimizations and transformations depends directly on the scalability and precision of the underlying pointer analysis. Despite recent advances, an efficient and scalable context-sensitive inclusion-based pointer analysis is still lacking. In this work, we propose four novel techniques to improve the scalability of context-sensitive points-to analysis for C/C++ programs. First, we develop an efficient way of storing approximate points-to information using a multi-dimensional Bloom filter (multibloom). By making use of fast hash functions and exploiting the spatial locality of the points-to information, our multibloom-based points-to analysis offers significant savings in both analysis time and memory requirement. Since the representation never resets any bit in the multibloom, no points-to information is ever lost, and the analysis is sound, though approximate. This allows a client to trade off a minimal amount of precision for huge savings (two orders of magnitude less) in memory requirement. By making use of multiple random and independent hash functions, the algorithm also achieves high precision and runs, on average, 2× faster than Andersen's points-to analysis. Using Mod/Ref analysis as a client, we illustrate that the precision is above 98% of that of Andersen's analysis. Second, we devise a sound randomized algorithm that processes a group of constraints in a less precise but efficient manner and the remaining constraints in a more precise manner. By randomly choosing different groups of constraints across different runs, the analysis produces different points-to information, each of which is guaranteed to be sound. By joining the results of a few runs, the analysis obtains an approximation very close to that of the more precise analysis while remaining efficient in terms of analysis time. We instantiate this technique to develop a randomized context-sensitive points-to analysis. By varying the level of randomization, a client of points-to analysis can trade off minimal precision (less than 5%) for a large gain in efficiency (over 50% reduction in analysis time). We also develop an adaptive version of the randomized algorithm that carefully varies the randomization across different runs to achieve maximum benefit in terms of analysis time and precision, without pre-setting the randomization percentage and the number of runs. Third, we transform the points-to analysis problem into finding a solution to a system of linear equations. Making novel use of prime factorization, we illustrate how to transform complex points-to constraints into a set of linear equations and transform the solution back into a points-to solution. We prove that our algorithm is sound and show that our technique is 1.8× faster than Andersen's analysis for large benchmarks. Finally, we observe that the order in which points-to constraints are processed plays a vital role in the algorithm's efficiency. We prove that finding an optimal ordering for computing the fixpoint solution is NP-hard. We then propose a greedy heuristic that prioritizes constraints based on the amount of points-to information they compute. This results in a dynamic ordering of constraint evaluation, which in turn results in skewed evaluation of constraints, where each constraint is evaluated repeatedly and a different number of times within a single iteration.
Our prioritized analysis achieves, on average, an improvement of 33% over Andersen's points-to analysis. We illustrate that our algorithms help scale state-of-the-art pointer analyses, and we believe the techniques developed will be useful for other program analyses and transformations.
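The multibloom idea in the first contribution can be pictured with a short sketch: one bit row per independent hash function, bits that are set but never reset, and a membership test that may report false positives but never false negatives, which is what keeps the analysis sound. The parameters and hashing scheme below are assumptions for illustration, not the thesis's design.

from hashlib import blake2b

class MultiBloom:
    def __init__(self, num_hashes=4, num_bits=1 << 20):
        # One bit array per independent hash function.
        self.rows = [bytearray(num_bits // 8) for _ in range(num_hashes)]
        self.num_bits = num_bits

    def _indices(self, pointer, target):
        key = ("%s->%s" % (pointer, target)).encode()
        for row in range(len(self.rows)):
            # A distinct salt per row gives independent hash functions.
            h = blake2b(key, digest_size=8, salt=bytes([row] * 8))
            yield row, int.from_bytes(h.digest(), "big") % self.num_bits

    def add_points_to(self, pointer, target):
        # Bits are only ever set, never reset: information is never lost.
        for row, i in self._indices(pointer, target):
            self.rows[row][i // 8] |= 1 << (i % 8)

    def may_point_to(self, pointer, target):
        # False means "definitely not in the set"; True may be a false
        # positive, so clients get a sound over-approximation.
        return all(self.rows[row][i // 8] & (1 << (i % 8))
                   for row, i in self._indices(pointer, target))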
58

Akcelerace numerického výpočtu vedení tepla v tuhých tělesech v inverzních úlohách / Acceleration of numerical computation of heat conduction in solids in inverse tasks

Ondruch, Tomáš January 2019 (has links)
The master's thesis deals with possible ways of accelerating the numerical computations that arise in problems related to heat conduction in solids. The thesis summarizes the basic characteristics of heat transfer phenomena, with emphasis on heat conduction. Theoretical principles of the control volume method are used to convert a direct heat conduction problem into a sparse linear system. Relevant fundamentals from the field of inverse heat conduction problems are presented, with reference to the intensive computation of direct problems of this kind. Numerical methods well suited to solving direct heat conduction problems are described. Remarks on the practical implementation of time-efficient computations are made in relation to a two-dimensional heat conduction model. The results for several tested methods are compared and discussed with respect to the computational times obtained.
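As a generic illustration of how such a discretization turns one time step of a direct heat conduction problem into a sparse linear solve, consider the backward-Euler sketch below; the grid, coefficients, boundary handling, and solver are textbook choices, not the thesis's particular model.

import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

# Illustrative parameters: n cells, diffusivity alpha, grid step dx, time step dt.
n, alpha, dx, dt = 100, 1e-4, 1e-2, 0.1
r = alpha * dt / dx**2

# Second-difference operator with homogeneous Dirichlet boundaries.
L = diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(n, n))
A = (identity(n) - r * L).tocsc()  # backward Euler: (I - r L) T_new = T_old

T = np.zeros(n)
T[n // 2] = 100.0  # initial hot spot
for _ in range(50):
    T = spsolve(A, T)  # each implicit time step is one sparse linear solve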
59

Řešení problému nejmenších čtverců s maticemi o proměnlivé hustotě nenulových prvků / Least-squares problems with sparse-dense matrices

Riegerová, Ilona January 2020 (has links)
The least-squares problem (LS problem) is the approximation problem of solving systems of linear algebraic equations that are, for some reason, contaminated by errors. The existence and uniqueness of solutions, and the methods for computing them, are known for the various types of matrices by which such systems are represented. Typically the matrices are sparse and of huge dimensions, but in practice we also very often encounter problems whose matrices have varying density of nonzero entries, meaning sparse matrices with one or more dense rows. Here we analyze methods for solving this LS problem. They are usually based on splitting the problem into a dense part and a sparse part that are solved separately; the sparse part may then lose the full-column-rank property that most methods require. We therefore pay special attention to approaches that deal with this difficulty.
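One standard approach from the general literature (not necessarily the exact method of the thesis) makes the splitting explicit: stack the sparse rows A_s above the k dense rows A_d and treat the dense block as a low-rank correction,

\[
  \min_x \left\| \begin{pmatrix} A_s \\ A_d \end{pmatrix} x - \begin{pmatrix} b_s \\ b_d \end{pmatrix} \right\|_2,
  \qquad
  \left( A_s^{\mathsf T} A_s + A_d^{\mathsf T} A_d \right) x = A_s^{\mathsf T} b_s + A_d^{\mathsf T} b_d,
\]

so that x can be obtained from a factorization of the sparse part A_s^T A_s plus a rank-k update (for instance via the Sherman-Morrison-Woodbury formula). This is exactly where a rank-deficient A_s, i.e. the loss of full column rank mentioned above, causes trouble.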
60

A Numerical Investigation Of The Canonical Duality Method For Non-Convex Variational Problems

Yu, Haofeng 07 October 2011 (has links)
This thesis presents a theoretical and numerical investigation of the canonical duality theory, which has recently been proposed as an alternative to the classic and direct methods for non-convex variational problems. These non-convex variational problems arise in a wide range of scientific and engineering applications, such as phase transitions, post-buckling of large deformed beam models, nonlinear field theory, and superconductivity. The numerical discretization of these non-convex variational problems leads to global minimization problems in a finite-dimensional space. The primary goal of this thesis is to apply the newly developed canonical duality theory to two non-convex variational problems: a modified version of Ericksen's bar and a problem of Landau-Ginzburg type. The canonical duality theory is investigated numerically and compared with classic numerical methods. Both advantages and shortcomings of the canonical duality theory are discussed. A major component of this critical numerical investigation is a careful sensitivity study of the various approaches with respect to changes in parameters, boundary conditions, and initial conditions. / Ph. D.
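A generic double-well model problem of the type named above (an Ericksen-bar / Landau-Ginzburg-like functional) illustrates where the non-convexity comes from; the exact functionals studied in the thesis may differ.

\[
  \min_u \; J(u) = \int_0^1 \left( \tfrac{1}{2}\big(u'(x)^2 - 1\big)^2 - f(x)\,u(x) \right) dx, \qquad u(0) = 0.
\]

The stored-energy density W(p) = (p^2 - 1)^2 / 2 has two wells at p = ±1, so any straightforward discretization of J produces a finite-dimensional global minimization problem with many local minima - the setting in which canonical duality is proposed as an alternative to the classic and direct methods.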
