
High-Performance Accurate and Approximate Multipliers for FPGA-Based Hardware Accelerators

Ullah, Salim; Rehman, Semeen; Shafique, Muhammad; Kumar, Akash 07 February 2023
Multiplication is one of the most widely used arithmetic operations in a variety of applications, such as image/video processing and machine learning. FPGA vendors provide high-performance multipliers in the form of DSP blocks. These multipliers are not only limited in number and fixed in location on the FPGA, but can also create additional routing delays and may prove inefficient for smaller bit-width multiplications. Therefore, FPGA vendors additionally provide optimized soft IP cores for multiplication. In this work, however, we argue that these soft multiplier IP cores for FPGAs still need better designs to provide high performance and resource efficiency. Toward this, we present generic area-optimized, low-latency accurate and approximate softcore multiplier architectures that exploit the underlying architectural features of FPGAs, i.e., lookup table (LUT) structures and fast carry chains, to reduce the overall critical path delay (CPD) and resource utilization of multipliers. Compared to the Xilinx LogiCORE IP multiplier, our proposed unsigned and signed accurate architectures provide up to 25% and 53% reductions in LUT utilization, respectively, for different multiplier sizes. Moreover, our unsigned approximate multiplier architectures achieve up to a 51% reduction in CPD with an insignificant loss in output accuracy compared with the LogiCORE IP. For illustration, we have deployed the proposed multiplier architectures in accelerators used in image and video applications and evaluated them for area and performance gains. Our library of accurate and approximate multipliers is open source and available at https://cfaed.tu-dresden.de/pd-downloads to fuel further research and development in this area, facilitate reproducible research, and thereby enable a new research direction for the FPGA community.
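To make the accuracy-for-resources trade-off concrete, the following is a minimal behavioral sketch, in C, of one common approximation idea: truncating the low-order partial-product columns of an 8×8 multiplier. The cut-off K, the function names, and the error metrics are illustrative assumptions; the thesis's actual designs operate at the level of FPGA LUTs and carry chains and are not reproduced here.

```c
#include <stdio.h>
#include <stdint.h>

/* Behavioral sketch of a truncation-based approximate multiplier:
 * partial products in the K least-significant columns are dropped,
 * which in hardware shortens the carry chains that dominate delay,
 * at the cost of a bounded underestimation error. (Illustrative
 * only; not the thesis's LUT/carry-chain-level designs.) */
#define K 4  /* number of truncated low-order columns (assumed) */

static uint32_t approx_mul8(uint8_t a, uint8_t b) {
    uint32_t sum = 0;
    for (int i = 0; i < 8; i++) {          /* partial-product rows */
        if (!((a >> i) & 1)) continue;
        for (int j = 0; j < 8; j++) {      /* partial-product columns */
            if (!((b >> j) & 1)) continue;
            if (i + j >= K)                /* keep only columns >= K */
                sum += 1u << (i + j);
        }
    }
    return sum;
}

int main(void) {
    double err = 0.0, maxrel = 0.0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++) {
            uint32_t exact  = (uint32_t)a * (uint32_t)b;
            uint32_t approx = approx_mul8((uint8_t)a, (uint8_t)b);
            err += (double)(exact - approx);  /* truncation never overshoots */
            if (exact) {
                double rel = (double)(exact - approx) / exact;
                if (rel > maxrel) maxrel = rel;
            }
        }
    printf("mean error: %.2f, max relative error: %.4f\n",
           err / (256.0 * 256.0), maxrel);
    return 0;
}
```

Exhaustively sweeping all 65536 input pairs, as the sketch does, is how the accuracy of such small approximate multipliers is typically characterized before deploying them in an accelerator.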

Client-server based statistical computing

Lehmann, Heiko 18 May 2004
Many statistical problems today require computational assistance. The approach presented in this thesis combines the capabilities of the statistical software environment XploRe with the advantages of a distributed client/server application and the opportunities offered by the Internet. To make the client accessible to a large community of users, it is implemented in Java. The result is a statistical package, usable via the World Wide Web, that can be used like a traditional statistical software package but without the need to install the entire software on the user's computer. This thesis gives an overview of the required software environment and explains the general structure of the XploRe Quantlet Client/Server architecture. It also presents applications into which this architecture has already been integrated.
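As a rough illustration of the thin-client pattern described above, the sketch below sends a command to a remote compute server over TCP and prints the reply. The actual client is written in Java and speaks the XploRe Quantlet protocol, which is not reproduced here; the C socket code, host, port, and command syntax are purely illustrative assumptions.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Hedged sketch of the thin-client idea: ship a command to a remote
 * statistics server and display the reply, so nothing beyond the
 * client itself is installed locally. Host, port, and the command
 * string are assumptions for illustration. */
int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in srv = {0};
    srv.sin_family = AF_INET;
    srv.sin_port = htons(8099);                    /* assumed port */
    inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);

    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) != 0) {
        perror("connect");                         /* no server running */
        return 1;
    }
    const char *cmd = "x = normal(100); summarize(x)\n"; /* assumed syntax */
    write(fd, cmd, strlen(cmd));                   /* request: a command */

    char reply[4096];
    ssize_t n = read(fd, reply, sizeof reply - 1); /* response: results */
    if (n > 0) { reply[n] = '\0'; fputs(reply, stdout); }
    close(fd);
    return 0;
}
```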

Interactive in situ visualization of large volume data

Gupta, Aryaman 10 January 2024
Three-dimensional volume data is routinely produced, at increasingly high spatial resolution, in computer simulations and image acquisition tasks. In situ visualization, the visualization of an experiment or simulation while it is running, enables new modes of interaction, including simulation steering and experiment control. These can give the scientist a deeper understanding of the underlying phenomena, but they require interactive visualization with smooth viewpoint changes and zooming to convey depth perception and spatial understanding. As the size of the volume data increases, however, it becomes increasingly challenging to achieve interactive visualization with smooth viewpoint changes. This thesis presents an end-to-end solution for interactive in situ visualization based on novel extensions to the Volumetric Depth Image (VDI) representation. VDIs are view-dependent, compact representations of volume data that can be rendered faster than the original data. Novel methods are proposed for generating VDIs on large data and for rendering them faster; together, they enable interactive in situ visualization with smooth viewpoint changes and zooming for large volume data.

Generating a VDI involves decomposing the volume rendering integral along rays into segments that store composited color and opacity, forming a representation much smaller than the volume data. This thesis introduces a technique to automatically determine the sensitivity parameter that governs the decomposition of rays, eliminating the need for manual parameter tuning. Further, a method is proposed for sort-last parallel generation and compositing of VDIs on distributed computers, enabling their in situ generation alongside distributed numerical simulations. A low-latency architecture is proposed for sharing data and hardware resources with a running simulation, and the resulting VDI can be streamed for interactive visualization.

A novel raycasting method is proposed for rendering VDIs. Properties of perspective projection are exploited to simplify the intersection of rays with the view-dependent segments contained within the VDI, and spatial smoothness in the volume data is leveraged to minimize memory accesses. Benchmarks show that the method significantly outperforms existing methods for rendering VDIs and achieves responsive frame rates at High Definition (HD) display resolutions near the viewpoint of generation. Further, a method is proposed to subsample the VDI for preview rendering, maintaining high frame rates even for large viewpoint deviations. The quality and performance of the approach are analyzed on multiple datasets, and the contributions are provided as extensions of established open-source tools. The thesis concludes with a discussion of the strengths, limitations, and future directions of the proposed approach.
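A minimal sketch of the central idea, decomposing a ray into composited segments, is given below in C. The front-to-back "over" compositing is standard; the one-dimensional ray, the segment-termination rule based on accumulated color, and all names are illustrative assumptions rather than the thesis's actual criterion.

```c
#include <stdio.h>

/* Minimal sketch of decomposing the volume-rendering integral along a
 * single ray into composited segments, as in a Volumetric Depth Image
 * (VDI). A segment is closed once its accumulated colour exceeds a
 * sensitivity threshold; this rule and the 1-D setup are assumptions. */

#define MAX_SEGS 16

typedef struct {
    float color, alpha;  /* colour and opacity composited within the segment */
    int   begin, end;    /* sample range [begin, end) along the ray */
} Segment;

/* Front-to-back "over" compositing of n samples into segments. */
static int build_segments(const float *color, const float *alpha, int n,
                          float sensitivity, Segment *out) {
    int nsegs = 0;
    Segment cur = {0.0f, 0.0f, 0, 0};
    for (int i = 0; i < n; i++) {
        float t = 1.0f - cur.alpha;           /* transparency left in segment */
        cur.color += t * alpha[i] * color[i];
        cur.alpha += t * alpha[i];
        cur.end = i + 1;
        if (cur.color > sensitivity && nsegs < MAX_SEGS) {
            out[nsegs++] = cur;               /* close segment, start new one */
            cur.color = 0.0f;
            cur.alpha = 0.0f;
            cur.begin = i + 1;
        }
    }
    if (cur.end > cur.begin && nsegs < MAX_SEGS)
        out[nsegs++] = cur;                   /* flush the partial tail */
    return nsegs;
}

int main(void) {
    float color[8] = {0.2f, 0.8f, 0.9f, 0.1f, 0.0f, 0.0f, 0.7f, 0.6f};
    float alpha[8] = {0.1f, 0.5f, 0.6f, 0.1f, 0.0f, 0.0f, 0.4f, 0.5f};
    Segment segs[MAX_SEGS];
    int n = build_segments(color, alpha, 8, 0.3f, segs);
    for (int i = 0; i < n; i++)
        printf("segment [%d,%d): color %.3f alpha %.3f\n",
               segs[i].begin, segs[i].end, segs[i].color, segs[i].alpha);
    return 0;
}
```

Because each segment stores already-composited values, a renderer only needs the few segments a viewing ray crosses, which is what makes the representation both compact and fast to render.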

Neurocognitive evidence for cultural recycling of cortical maps in numerical cognition

Knops, André 06 March 2015
A plethora of evidence supports the idea of a core system in the parietal cortex (PC) of the human brain that enables us to approximately process numerical information: the approximate number system (ANS). By synthesizing nine experimental studies in four parts, I show how abstract mathematical competencies are linked to the ANS and the PC. The hypothesis is that human mathematics builds on foundational concepts (space, number) by progressively co-opting cortical areas whose prior organization fits the new cultural demands. In part one, the operational momentum effect demonstrates that (non-)symbolic approximate calculation partly relies on the ANS, and that mental arithmetic co-opts evolutionarily older cortical systems in the PC: low-level perceptual processes such as saccades lead to spatial patterns of activation in posterior parts of the PC that are predictive of patterns during abstract approximate calculation. This is interpreted in terms of cultural recycling of cortical maps for cognitive purposes that go beyond the evolutionary scope of a given region. Part two investigates the consequences of the parietal implementation of numerical magnitude information. Akin to other visual properties processed in the PC, this may grant numerical information privileged, non-conscious access to the cognitive system, even under a crowding regime. The interference between spatial and numerical information can likewise be interpreted as a consequence of a representational and cortical overlap. Part three elucidates the grounding of mental arithmetic abilities in the ANS and argues that the association between the ANS and symbolic arithmetic is mediated by numerical ordering abilities, which in turn rely on neural circuits in right-hemispheric prefrontal cortex. In part four, I argue that the involvement of approximate calculation in high-level symbolic calculation remains elusive due to a number of technical issues with stimulus-inherent numerical features.

Efficient Parallel Sorting and Data Redistribution Methods for Particle Codes on Distributed Memory Systems

Hofmann, Michael 16 April 2012
Particle simulations represent a class of data- and compute-intensive simulation applications used in various fields of science and industrial research. The high computational cost of the solution methods employed and the large amounts of data required to model realistic problems make the use of parallel computing indispensable. Distributed-memory parallel computers are a widespread architecture for this purpose, in which a large number of compute nodes work in parallel and exchange data over an interconnection network. Computing the interactions between particles is often the dominant cost of a particle simulation and is performed with fast solution methods such as the Barnes-Hut algorithm or the Fast Multipole Method. Efficient parallel implementations of these algorithms require sorting the particles by their spatial positions. This sorting is necessary both for efficient access to the particle data and as part of optimizations that increase the locality of memory accesses, minimize communication, and improve the load balancing of parallel computations. This dissertation addresses the development of an efficient parallel sorting method and the communication operations it requires for data redistribution in particle simulations. To this end, a variety of existing parallel sorting methods for distributed memory are analyzed and compared with the requirements of particle simulation applications. Particular challenges arise with respect to the partitioning of particle data across distributed memory, the weighting of the data to be sorted for improved load balancing, the handling of duplicate keys, and the availability and use of memory-efficient communication operations. To meet these requirements, a new parallel sorting method is developed and integrated into the application codes under consideration. In addition, a new in-place algorithm for the MPI_Alltoallv communication operation is presented, which significantly reduces the memory consumption of the data redistribution required within the parallel sort. The behavior of all methods developed is studied both in isolation and in practical use within several application codes, on a range of parallel computers, including highly scalable systems.
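As a hedged illustration of the redistribution step discussed above, the following C/MPI sketch buckets local particle keys by destination rank and exchanges them with the standard, out-of-place MPI_Alltoallv. The separate receive buffer allocated here is precisely the memory overhead the thesis's in-place algorithm avoids; the simple key-range bucketing is an illustrative assumption, not the thesis's partitioning method.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { N = 1000 };                 /* local particles per process */
    int *keys = malloc(N * sizeof(int));
    srand(rank + 1);
    for (int i = 0; i < N; i++) keys[i] = rand() % (size * 100);

    /* count, then pack, particles per destination (key range -> rank) */
    int *scnt = calloc(size, sizeof(int)), *sdsp = malloc(size * sizeof(int));
    int *rcnt = malloc(size * sizeof(int)), *rdsp = malloc(size * sizeof(int));
    for (int i = 0; i < N; i++) scnt[keys[i] / 100]++;
    sdsp[0] = 0;
    for (int p = 1; p < size; p++) sdsp[p] = sdsp[p-1] + scnt[p-1];

    int *sbuf = malloc(N * sizeof(int)), *pos = malloc(size * sizeof(int));
    for (int p = 0; p < size; p++) pos[p] = sdsp[p];
    for (int i = 0; i < N; i++) sbuf[pos[keys[i] / 100]++] = keys[i];

    /* exchange counts, compute the receive layout, then move the data */
    MPI_Alltoall(scnt, 1, MPI_INT, rcnt, 1, MPI_INT, MPI_COMM_WORLD);
    rdsp[0] = 0;
    int total = rcnt[0];
    for (int p = 1; p < size; p++) { rdsp[p] = rdsp[p-1] + rcnt[p-1]; total += rcnt[p]; }

    int *rbuf = malloc(total * sizeof(int));   /* the extra buffer avoided in-place */
    MPI_Alltoallv(sbuf, scnt, sdsp, MPI_INT,
                  rbuf, rcnt, rdsp, MPI_INT, MPI_COMM_WORLD);

    printf("rank %d received %d particles\n", rank, total);
    MPI_Finalize();
    return 0;
}
```

For particle counts near the memory limit of a node, the doubled buffer space in this standard pattern is exactly what motivates an in-place all-to-all.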

The Effects of Accumulated Forms of Feedback with a Computer-Based Learning Program for Word Problems among Fifth- and Sixth-Grade Children: An Empirical Study on Computer-Assisted Learning

Abdelaal, Sabry Mohamed Ismail Attia 28 January 2005
No description available.

Towards Next Generation Sequential and Parallel SAT Solvers

Manthey, Norbert 08 January 2015
This thesis focuses on improving SAT solving technology. The improvements address two major subjects: sequential SAT solving and parallel SAT solving. To better understand sequential SAT algorithms, the abstract reduction system Generic CDCL is introduced, with which the soundness of solving techniques can be modeled. Next, the conflict-driven clause learning algorithm is extended with three techniques, local look-ahead, local probing, and all-UIP learning, that allow more global reasoning during search. These techniques improve the performance of the sequential SAT solver Riss. Then, the formula simplification techniques bounded variable addition, covered literal elimination, and an advanced cardinality constraint extraction are introduced. With these techniques, the reasoning of the overall SAT solving tool chain becomes stronger than plain resolution. Applying these three techniques in the formula simplification tool Coprocessor before solving a formula with Riss improves the performance further. Due to the increasing number of cores in CPUs, the scalable parallel SAT solving approach iterative partitioning has been implemented in Pcasso for the multi-core architecture. Related work on parallel SAT solving has been studied to extract ideas that can improve Pcasso. Besides parallel formula simplification with bounded variable elimination, the major extension is the extended clause sharing level based clause tagging, which builds the basis for conflict-driven node killing. The latter makes it possible to better identify unsatisfiable search space partitions. Another improvement combines scattering and look-ahead into a superior search space partitioning function. In combination with Coprocessor, the introduced extensions increase the performance of the parallel solver Pcasso, and the implemented system turns out to scale on multi-core architectures. Hence, iterative partitioning is interesting for future parallel SAT solvers. The implemented solvers participated in international SAT competitions. In 2013 and 2014, Pcasso showed a good performance, and Riss in combination with Coprocessor won several first, second, and third prizes, including two Kurt Gödel medals. The introduced algorithms thus advance modern SAT solving technology.
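For orientation, the sketch below shows, in C, the chronological backtracking search with early conflict detection that conflict-driven clause learning builds upon; clause learning, unit propagation, look-ahead, and all the simplification techniques named above are deliberately omitted, and the three-clause formula is an illustrative assumption.

```c
#include <stdio.h>

/* Baseline backtracking SAT search with early conflict detection.
 * CDCL replaces the chronological backtracking seen here with clause
 * learning and non-chronological backjumping. Formula assumed for
 * illustration: (x1 v -x2) & (-x1 v x3) & (x2 v -x3). */
#define NVARS 3
#define NCLAUSES 3

static const int clauses[NCLAUSES][3] = {
    { 1, -2, 0}, {-1, 3, 0}, { 2, -3, 0}   /* 0 terminates a clause */
};

/* assign[v]: 0 = unassigned, 1 = true, -1 = false */
static int falsified(const int *assign) {
    for (int c = 0; c < NCLAUSES; c++) {
        int sat = 0, undef = 0;
        for (int j = 0; clauses[c][j]; j++) {
            int lit = clauses[c][j], v = lit > 0 ? lit : -lit;
            if (assign[v] == 0) undef = 1;
            else if (assign[v] == (lit > 0 ? 1 : -1)) { sat = 1; break; }
        }
        if (!sat && !undef) return 1;   /* clause can no longer be true */
    }
    return 0;
}

static int solve(int *assign, int var) {
    if (falsified(assign)) return 0;    /* conflict: prune this branch */
    if (var > NVARS) return 1;          /* complete and consistent */
    for (int val = 1; val >= -1; val -= 2) {   /* decide true, then false */
        assign[var] = val;
        if (solve(assign, var + 1)) return 1;
    }
    assign[var] = 0;                    /* undo decision, backtrack */
    return 0;
}

int main(void) {
    int assign[NVARS + 1] = {0};
    if (solve(assign, 1)) {
        printf("SAT:");
        for (int v = 1; v <= NVARS; v++) printf(" x%d=%d", v, assign[v] > 0);
        printf("\n");
    } else {
        printf("UNSAT\n");
    }
    return 0;
}
```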

Using the Media as a Means to Develop Students’ Statistical Concepts

Kemp, Marian 02 May 2012
In this era of increasingly fast communication, people are exposed to quantitative information from national and international sources through a range of media, including newspapers, magazines, television, radio, podcasts, YouTube, and other parts of the Internet. Contexts include health statistics, environmental issues, traffic statistics, wars, gun laws, and so on. It is becoming more and more important that citizens are able to critically read and interpret this information, and doing so requires an understanding of statistical concepts. Research has shown that students are motivated and engaged in learning through the use of authentic, real-life tasks. The media provide current information that can be used to develop both students' awareness of how social issues are constructed and vital statistical concepts. This paper proposes that secondary school students' application of a model for statistical analysis to material taken from media sources enhances their understanding of statistical concepts. This model, called the Five Step Framework, is described and exemplified for the particular context of opinion polling.

Paper&Pencil Skills in the 21st Century, a Dichotomy?

Meissner, Hartwig; Diephaus, Annabella 07 May 2012
There is a worldwide development, or better said, a non-development: we teach paper-and-pencil skills in primary schools almost exactly as we did 30, 50, or 100 years ago. To this day, primary school teachers spend more than 100 classroom hours teaching and drilling old-fashioned algorithms, even though in daily life and business everybody uses a calculator. Why do we waste so much of our children's time teaching them things they will not need later on? We see an emotional dichotomy: despite the results of many research projects in many countries, there is still the fear that using calculators in the primary grades will harm mental arithmetic and estimation skills. To explain and overcome that fear, we reflect more carefully on the nature of number sense and of paper-and-pencil skills. We find that the development of number sense is an intuitive and unconscious mental process, while the ability to obtain an exact calculation result is trained logically and consciously. To overcome the above dichotomy, we must resolve the hidden dichotomy of number sense versus precise calculation results. We need a new balance. Different types of examples are given to show how we can further the development of number sense in a technology-dominated curriculum.

Efficient Broadcast for Multicast-Capable Interconnection Networks

Siebert, Christian 30 September 2006
The broadcast function MPI_Bcast() from the MPI-1.1 standard is one of the most heavily used collective operations in the message-passing programming paradigm. This diploma thesis uses a feature called multicast, supported by several network technologies (such as Ethernet and InfiniBand), to create an efficient MPI_Bcast() implementation, especially for large communicators and small message sizes. A preceding analysis of existing real-world applications leads to an algorithm that performs well not only on synthetic benchmarks but even better for a wide class of parallel applications. The resulting broadcast has been implemented for the open-source MPI library Open MPI using IP multicast. The results show that the new broadcast is almost always better than existing point-to-point implementations once the number of MPI processes exceeds 8 nodes. The performance gain reaches a factor of 4.9 on 342 nodes, because the new algorithm scales practically independently of the number of involved processes.
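Since the thesis changes only the transport underneath the collective, the application-facing call stays the standard one. Below is a minimal, hedged usage example in C; the message contents, size, and root rank are illustrative.

```c
#include <mpi.h>
#include <stdio.h>

/* Usage sketch of the collective whose implementation the thesis
 * replaces with an IP-multicast-based algorithm inside Open MPI.
 * The call itself is unchanged; only the transport underneath
 * differs, so existing applications benefit without modification. */
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char msg[64] = {0};
    if (rank == 0)                       /* root fills the small message */
        snprintf(msg, sizeof msg, "hello from root");

    /* one call: every process in MPI_COMM_WORLD receives the data */
    MPI_Bcast(msg, sizeof msg, MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("rank %d got: %s\n", rank, msg);
    MPI_Finalize();
    return 0;
}
```

Small messages on large communicators, as in this example, are exactly the regime where a single multicast packet beats a logarithmic tree of point-to-point sends.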
