Global ETD Search

1	mustafa_ali_dissertation.pdf Mustafa Fayez Ahmed Ali (14171313) 30 November 2022
Energy efficient machine learning accelerator design
Electrical circuits and systems Digital processor architectures Deep learning machine learning-based VLSI circuit
2	EFFICIENT AND PRODUCTIVE GPU PROGRAMMING Mengchi Zhang (13109886) 28 July 2022
Productive programmable accelerators, like GPUs, have been developed for generations to support programming features. The ever-increasing performance improves the usability of programming features on GPUs, and these programming features further ease the porting of code and data structures from CPU to GPU. However, GPU programming features, such as function calls or runtime polymorphism, have not been well explored or optimized.
I identify efficient and productive GPU programming as a potential area to exploit. Although many programming paradigms are well studied and efficiently supported on CPU architectures, their performance on novel accelerators, like GPUs, has never been studied, evaluated, and optimized. For instance, programming with functions is a commonplace paradigm that gives software modularity and simplifies code through reuse. A large amount of work has been proposed to alleviate function-calling overhead on CPUs; however, few papers have discussed its deficiencies on GPUs. Polymorphism, on the other hand, amplifies an object’s behaviors at runtime. A body of work targets efficient polymorphism on CPUs, but no work has discussed this feature in GPU contexts.
In this dissertation, I discuss these two programming features on GPU architectures. First, I performed the first study to identify the deficiency of GPU polymorphism. I created micro-benchmarks to evaluate virtual function overhead in controlled settings and the first GPU polymorphic benchmark suite, ParaPoly, to investigate real-world scenarios. The micro-benchmarks indicated that the virtual function overhead is usually negligible but can cause up to a 7x slowdown. Virtual functions in ParaPoly show a geometric mean of 77% overhead on GPUs compared to the function’s inlined version. Second, I proposed two novel techniques that determine an object’s type only by its address pointer to improve GPU polymorphism. The first technique, Coordinated Object Allocation and function Lookup (COAL), is a software-only technique that uses the object’s address to determine its type. The second technique, TypePointer, needs hardware modification to embed the object’s type information into its address pointer. COAL achieves 80% and 6% improvements, and TypePointer achieves 90% and 12%, over contemporary CUDA and our type-based SharedOA, respectively.
Considering the growth of GPU programs, function calls are becoming a pervasive paradigm on GPUs. I also identified the overhead of excessive register spilling with function calls on GPUs. To diminish this cost, I proposed a novel Massively Multithreaded Register Windowing technique with Variable Size Register Window and Register-Conscious Warp Scheduling. Our techniques improve the performance of representative workloads by a geometric mean of 1.18x with only 1.8% hardware storage overhead.
Digital processor architectures Distributed systems and algorithms Operating systems Programming languages GPU Programmability Function Virtual Function Polymorphism Object-Oriented Programming
3	The Design of an Oncology Knowledge Base from an Online Health Forum Omar Ramadan (12446526) 22 April 2022
Knowledge base completion is an important task that allows scientists to reason over knowledge bases and discover new facts. In this thesis, a patient-centric knowledge base is designed and constructed using medical entities and relations extracted from the health forum r/cancer. The knowledge base stores information in binary relation triplets. It is enhanced with an is-a relation that is able to represent the hierarchical relationship between different medical entities. An enhanced Neural Tensor Network that utilizes the frequency of occurrence of relation triplets in the dataset is then developed to infer new facts from the enhanced knowledge base. The results show that when the enhanced inference model uses the enhanced knowledge base, a higher accuracy (73.2%) and recall@10 (35.4%) are obtained. In addition, this thesis describes a methodology for knowledge base and associated inference model design that can be applied to other chronic diseases.
Digital processor architectures Knowledge Base Completion Oncology Knowledge Base Neural Tensor Network Computer Engineering
4	SENSOR FUSION IN NEURAL NETWORKS FOR OBJECT DETECTION Sheetal Prasanna (12447189) 12 July 2022
Object detection is an increasingly popular tool used in many fields, especially in the development of autonomous vehicles. The task of object detection involves localizing objects in an image, constructing a bounding box to determine the presence and location of each object, and classifying each object into its appropriate class. Object detection applications are commonly implemented using convolutional neural networks along with the construction of feature pyramid networks to extract data.
Another commonly used technique in the automotive industry is sensor fusion. Each automotive sensor (camera, radar, and lidar) has its own advantages and disadvantages. Fusing two or more sensors together and using the combined information is a popular method of balancing the strengths and weaknesses of each independent sensor. Using sensor fusion within an object detection network has been found to be an effective method of obtaining accurate models. Accurate detection and classification of images is a vital step in the development of autonomous vehicles or self-driving cars.
Many studies have proposed methods to improve neural networks or object detection networks. Some of these techniques involve data augmentation and hyperparameter optimization. This thesis achieves the goal of improving a camera and radar fusion network by implementing various techniques within these areas. Additionally, a novel idea of integrating a third sensor, the lidar, into an existing camera and radar fusion network is explored in this research work.
The models were trained on the nuScenes dataset, one of the biggest automotive datasets available today. Using the concepts of augmentation, hyperparameter optimization, sensor fusion, and annotation filters, the CRF-Net was trained to achieve an accuracy score that was 69.13% higher than the baseline.
Digital processor architectures Computer vision Object detection Sensor Fusion Nuscenes Machine Learning Autonomous Vehicles Radar Lidar Camera Computer Engineering Computer Vision
5	MULTI-SOURCE AND SOURCE-PRIVATE CROSS-DOMAIN LEARNING FOR VISUAL RECOGNITION Qucheng Peng (12426570) 12 July 2022
Domain adaptation is one of the hottest directions in solving the annotation insufficiency problem of deep learning. General domain adaptation is not consistent with practical scenarios in industry. In this thesis, we focus on the following two concerns.
The first is that labeled data are generally collected from multiple domains. In other words, multi-source adaptation is a more common situation. Simply extending single-source approaches to multi-source cases could cause sub-optimal inference, so specialized multi-source adaptation methods are essential. The main challenge in the multi-source scenario is a more complex divergence situation. Not only does the divergence between the target and each source play a role, but the divergences among distinct sources matter as well. However, the significance of maintaining consistency among multiple sources has not received enough attention in previous work. In this thesis, we propose an Enhanced Consistency Multi-Source Adaptation (EC-MSA) framework to address it from three perspectives. First, we mitigate feature-level discrepancy by cross-domain conditional alignment, narrowing the class-wise divergence between each source and the target domain. Second, we enhance multi-source consistency via dual mix-up, diminishing the disagreements among different sources. Third, we deploy a target distilling mechanism to handle the uncertainty of target prediction, aiming to provide high-quality pseudo-labeled target samples to benefit the previous two aspects. Extensive experiments are conducted on several common benchmark datasets and demonstrate that our model outperforms the state-of-the-art methods.
The second is that data privacy and security are necessary in practice. That is, we hope to keep the raw data stored locally while still obtaining a satisfactory model. In such a case, the risk of data leakage greatly decreases. Therefore, it is natural for us to combine the federated learning paradigm with domain adaptation. Under the source-private setting, the main challenge is to expose information from the source domain to the target domain while making sure that the communication process is safe enough. In this thesis, we propose a method named Fourier Transform-Assisted Federated Domain Adaptation (FTA-FDA) to alleviate the difficulties in two ways. We apply the Fast Fourier Transform to the raw data and transfer only the amplitude spectra during communication. Then frequency-space interpolations between the two domains are conducted, minimizing the discrepancies while ensuring contact between them and keeping the raw data safe. In addition, we perform prototype alignment by using the model weights together with target features, aiming to reduce the discrepancy at the class level. Experiments on Office-31 demonstrate the effectiveness and competitiveness of our approach, and further analyses show that our algorithm can help protect privacy and security.
Digital processor architectures Transfer learning (TL) Domain Adaptation Deep Learning Theory image classification methods Computer Engineering
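The amplitude-only exchange described in the FTA-FDA abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the 1-D signals, the naive DFT (standing in for a Fast Fourier Transform), the function names, and the linear mixing weight `lam` are all assumptions made for illustration. The idea shown is that one party shares only an amplitude spectrum, and the other interpolates it with its own amplitudes while keeping its own phase.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def amplitude_spectrum(x):
    """The only quantity assumed to be communicated: magnitudes, no phase."""
    return [abs(c) for c in dft(x)]


def mix_amplitude(local_signal, received_amplitude, lam):
    """Interpolate local and received amplitudes; keep the local phase.

    lam = 0.0 reproduces the local signal, lam = 1.0 adopts the received
    amplitudes entirely. The linear rule is an illustrative assumption.
    """
    mixed = []
    for c, a_recv in zip(dft(local_signal), received_amplitude):
        amp = (1.0 - lam) * abs(c) + lam * a_recv
        mixed.append(cmath.rect(amp, cmath.phase(c)))
    # For real inputs the mixed spectrum stays conjugate-symmetric,
    # so the imaginary parts are only rounding noise.
    return [v.real for v in idft(mixed)]


if __name__ == "__main__":
    local = [1.0, 2.0, 0.0, 1.0]   # stays on the local side
    other = [0.5, 0.5, 2.0, 1.0]   # raw data that is never sent
    shared = amplitude_spectrum(other)
    print(mix_amplitude(local, shared, 0.5))
```

Because the phase never leaves its owner, the receiving side cannot invert the shared amplitudes back into the raw signal on its own, which is the intuition behind sending amplitude spectra only.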
6	COMPARING PSO-BASED CLUSTERING OVER CONTEXTUAL VECTOR EMBEDDINGS TO MODERN TOPIC MODELING Samuel Jacob Miles (12462660) 26 April 2022
Efficient topic modeling is needed to support applications that aim at identifying main themes from a collection of documents. In this thesis, a reduced vector embedding representation and particle swarm optimization (PSO) are combined to develop a topic modeling strategy that is able to identify representative themes from a large collection of documents. Documents are encoded using a reduced, contextual vector embedding from a general-purpose pre-trained language model (sBERT). A modified PSO algorithm (pPSO) that tracks particle fitness on a dimension-by-dimension basis is then applied to these embeddings to create clusters of related documents. The proposed methodology is demonstrated on three datasets across different domains. The first dataset consists of posts from the online health forum r/Cancer. The second dataset is a collection of NY Times abstracts and is used to compare the proposed model to LDA. The third is a standard benchmark dataset for topic modeling which consists of a collection of messages posted to 20 different news groups. It is used to compare state-of-the-art generative document models (i.e., ETM and NVDM) to pPSO. The results show that pPSO is able to produce interpretable clusters. Moreover, pPSO is able to capture both common topics and emergent topics. The topic coherence of pPSO is comparable to that of ETM, and its topic diversity is comparable to that of NVDM. The assignment parity of pPSO on a document completion task exceeded 90% for the 20News-Groups dataset. This rate drops to approximately 30% when pPSO is applied to the same Skip-Gram embedding, derived from a limited, corpus-specific vocabulary, that is used by ETM and NVDM.
Digital processor architectures Particle Swarm Optimization Algorithm Topic Modelling Vector Embedding Natural Language Processing Computer Engineering
7	EXPLORATION OF NOVEL EDUCATIONAL TOOLS BASED ON VISUALIZATION Abel Andres Reyes Angulo (11237160) 06 August 2021
The dynamics of how teaching is performed have changed abruptly in the past few years. Even before the COVID-19 pandemic, class modalities were changing. Instructors were adopting new modalities for lectures, like online and hybrid classes, and the use of collaborative resources was becoming more popular over time. The current situation was just a catalyst for a change that had already started, which is the beginning of a new era for education.
This new education era implies new areas of study and the implementation of tools that promote an efficient learning process by adapting to everything involved in this change. Science, technology, engineering, and mathematics (STEM) education and healthcare are fields with noticeable demand for professionals in industry around the world. Therefore, preparing more people academically in these areas is a high priority. New learning tools that complement these fields must adopt new technologies and reflect the fact that this is a digital era. Emergent specialities like artificial intelligence and data science are traditionally taught at the university level, due to the complexity of some concepts and the background needed to develop skills related to these areas. However, with the current technology available, tools can be used as complementary learning resources for complex subjects. Visualization helps users learn by engaging the sense of sight and making evident things that are hard to illustrate with words or numbers. Therefore, visualization-based educational software could be the new kind of tool needed for these emergent specialities in this new educational era. Features like interactivity, gaming, and multimedia resources can help make these tools more robust and complete.
In this work, the implementation of novel educational tools based on visualization for emergent specialization areas, like machine learning in STEM and pathophysiology in healthcare, was explored. This work summarizes the implementation of three different projects to illustrate its general purpose, showing the relevance of the mentioned areas and proposing educational tools based on visualization, adapting the proposal to each speciality and keeping in mind different target populations. The projects related to each of the proposed tools include the analysis used to elaborate the content within the tool, the review of the software development, and the testing sessions to identify strengths and weaknesses of the tools. The tools are intended to be designed as frameworks in such a way that the deliverable content can be customized over time and cover different educational needs.
Digital processor architectures Educational tools Nursing education Machine Learning education Software development Computer Engineering
8	DATA-CENTRIC DECISION SUPPORT SYSTEM FRAMEWORK FOR SELECTED APPLICATIONS Xiang Gu (11090106) 15 December 2021
Web and digital technologies have grown continuously over the past five years. The data generated by Internet of Things (IoT) devices are heterogeneous, increasing data storage and management difficulties. This thesis developed user-friendly data management system frameworks in the local environment and on a cloud platform. The two frameworks were applied to two applications in the industrial field: an agriculture informatics system and a personal healthcare management system. The systems are capable of information management and two-way communication through a user-friendly interface.
Data communications Digital processor architectures Framework Design Informatics System User-friendly Data Management System Amazon Web Services MongoDB Computer Engineering Data Communications
9	Quantum Emulation with Probabilistic Computers Shuvro Chowdhury (14030571) 31 October 2022
The recent groundbreaking demonstrations of quantum supremacy in the noisy intermediate-scale quantum (NISQ) computing era have triggered intense activity in establishing finer boundaries between classical and quantum computing. In this dissertation, we use established techniques based on quantum Monte Carlo (QMC) to map quantum problems into probabilistic networks where the fundamental unit of computation, the p-bit, is inherently probabilistic and can be tuned to fluctuate between ‘0’ and ‘1’ with a desired probability. We can view this mapped network as a Boltzmann machine whose states each represent a Feynman path leading from an initial configuration of q-bits to a final configuration. Each such path, in general, has a complex amplitude, ψ, which can be associated with a complex energy. The real part of this energy can be used to generate samples of Feynman paths in the usual way, while the imaginary part is accounted for by treating the samples as complex entities, unlike ordinary Boltzmann machines where samples are positive. This mapping of a quantum circuit onto a Boltzmann machine with complex energies should be particularly useful in view of the advent of special-purpose hardware accelerators known as Ising Machines, which can obtain a very large number of samples per second through massively parallel operation. We also demonstrate this acceleration using a recently used quantum problem, speeding up its QMC simulation by a factor of ∼1000× compared to a highly optimized CPU program. Although this speed-up has been demonstrated using a graph-colored architecture on an FPGA, we project another ∼100× improvement with an architecture that utilizes clockless analog circuits. We believe that this will contribute significantly to the growing efforts to push the boundaries of the simulability of quantum circuits with classical/probabilistic resources and to compare them with NISQ-era quantum computers.
Digital processor architectures Nanoelectronics Quantum computation Quantum Computing Probabilistic Computing p-bit qubit quantum circuits Shor's algorithm transverse field Ising Hadamard Boltzmann machine Metropolis–Hastings algorithm Suzuki-Trotter Transform Partition Function Heisenberg Model
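The p-bit described in the abstract above, a unit that fluctuates between 0 and 1 with a tunable probability, can be illustrated in software with a minimal sketch. The sigmoid mapping from an input `bias` to the probability of reading 1 is an assumption made here for illustration; the abstract only states that the probability can be tuned. The names `pbit_sample` and `empirical_rate` are hypothetical.

```python
import math
import random


def pbit_sample(bias, rng):
    """Return one 0/1 reading; P(1) = 1 / (1 + exp(-bias)) (assumed response)."""
    p_one = 1.0 / (1.0 + math.exp(-bias))
    return 1 if rng.random() < p_one else 0


def empirical_rate(bias, n_samples, seed=0):
    """Fraction of 1s over n_samples readings of a single p-bit."""
    rng = random.Random(seed)
    return sum(pbit_sample(bias, rng) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    for bias in (-2.0, 0.0, 2.0):
        print(bias, empirical_rate(bias, 20000))
```

With zero bias the unit reads 1 about half the time, and a positive or negative bias pushes it toward 1 or 0. In a network, each p-bit's bias would be computed from the states of its neighbours, which is how a Boltzmann machine is sampled.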
10	Internet of Things Architecture Design and Implementation for Immersive Interfaces Javier Belmonte (9193829) 09 September 2022
The coming of the Internet of things (IoT) has enabled manufacturers, teachers, machine operators, makers, and researchers to design and use new workflows, fabricate parts efficiently and effectively, and interact with systems and devices in ways that were not possible before.
These networked systems have changed the way in which input is received from and data is output to humans. Context-awareness and autonomy are characteristics of these devices that result in automated processes, faster production times, and more intuitive interfaces. Direct manipulation is an intuitive and natural form of human-computer interaction (HCI) that enables its users to learn easily and quickly.
In this thesis, an Internet of things architecture is designed and implemented to enable control and data visualization in machines and devices through immersive interfaces using direct manipulation. The proposed architecture and interfaces are tested and validated on three different categories of systems: systems that need to be modified to be IoT ready, systems that are IoT ready, and systems that have not yet been constructed. For the latter case, a custom system has been made to evaluate and test the whole architecture and its implementation. The knowledge acquired in developing this architecture and the design rationale behind the development of immersive interfaces are summarized and presented as a series of guidelines and recommendations for IoT systems manufacturers to follow to include immersive interfaces in their designs.
Digital processor architectures IoT Industry 4.0 Immersive Interfaces Teleoperation Human Computer Interaction (HCI) Mechanical Engineering Computer Engineering
