Global ETD Search

631	Code Generation for Accelerating Data Flow : Enhancing Pentaho Data Integration Performance / Kodgenerering för snabbare dataflöden : Prestandaförbättring av Pentaho Data Integrations Svensson, Alexander January 2023 Pentaho Data Integration, also known as Kettle, is a no-code ETL tool. Implemented in Java, it lets users build data flow structures in a graphical user interface and store them as XML files, which can then be edited or executed. In some applications, the current execution method does not provide satisfactory performance. To speed up execution, we propose a Java code generator that works by analyzing the existing XML setup and Kettle’s existing source code. We also conduct some exploratory work with Apache Hop, another Kettle-based ETL tool, and provide comparative insights. Our analysis demonstrates the potential for significant speed improvements, with execution times reduced by 60% or more. We consider this method’s challenges and limitations and propose solutions to overcome them. Overall, our research contributes to the field of no-code programming by highlighting the potential of code generation for optimizing performance in data engineering processes. Computer Systems Datorsystem Communication Systems Kommunikationssystem Övrig annan teknik
632	Minimizing the clock drift in partially synchronized heterogeneous TSN networks Yusuf, Balqis January 2023 The new generation of embedded systems will increase interaction between the environment, people, and autonomous devices. This will increase their need for communication, particularly communication that meets real-time requirements. To address the real-time requirements of embedded systems, a communication network capable of providing high bandwidth, low latency, and deterministic behaviour is necessary. Time-Sensitive Networking (TSN), a set of standards developed by the IEEE 802.1 TSN Task Group that provides deterministic service over standard Ethernet, is an attractive option for achieving this. TSN leverages the advantages of IEEE Ethernet standards, including low hardware cost, high bandwidth, and deterministic behaviour. TSN uses time synchronization, traffic shaping, strict priority, and resource reservation mechanisms to provide a reliable and deterministic network environment suitable for real-time applications. However, for these mechanisms to work and for TSN to achieve high performance, the network must be fully synchronized. In this thesis, we aim to integrate existing legacy devices into a TSN network without incorporating TSN functionality into them, as implementing all TSN standards requires significant investments in time, financial resources, and infrastructure upgrades. However, as the legacy devices don’t have TSN capabilities and cannot implement TSN synchronization protocols, they cannot synchronize with the TSN switches, which causes adverse effects such as clock drift between the TSN switches and the legacy end-stations. We therefore aim to minimize the clock drift in the partially synchronized heterogeneous network, allowing researchers and organizations to benefit from adopting TSN in a legacy network without facing those issues. 
To address the clock drift that occurs between the legacy end-stations and the TSN switches, we implemented a single solution that combines the solutions proposed in previous work [9], using the Drift Detector (DD) and the Centralised Network Configuration (CNC) element. The DD measures and calculates the difference between the expected and actual reception times of the messages from the receiver end-station. The CNC then uses the variation values detected by the DD to modify the TSN schedule and updates the network with the new period. In this way, we could minimize the negative consequences caused by partial synchronization in the network. Computer Systems Datorsystem Communication Systems Kommunikationssystem Telecommunications Telekommunikation Computer Engineering Datorteknik
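The drift-detection idea in entry 632 can be illustrated with a minimal sketch. It assumes periodic messages with known expected reception times, and it assumes the schedule is corrected by adding the average per-message drift to the nominal period. The function names, the microsecond units, and the correction rule are illustrative assumptions, not the thesis's implementation.

```python
from statistics import mean

def reception_offsets(expected_us, actual_us):
    """Offset (actual - expected) of each message, in microseconds."""
    return [a - e for e, a in zip(expected_us, actual_us)]

def drift_per_period(offsets_us):
    """Average growth of the offset from one message to the next."""
    steps = [b - a for a, b in zip(offsets_us, offsets_us[1:])]
    return mean(steps) if steps else 0.0

def adjusted_period(nominal_period_us, offsets_us):
    """Hypothetical correction: stretch or shrink the period by the drift."""
    return nominal_period_us + drift_per_period(offsets_us)

# Example: a legacy sender whose clock runs 2 us slow per 1000 us period.
expected = [0, 1000, 2000, 3000, 4000]
actual = [0, 1002, 2004, 3006, 4008]
offsets = reception_offsets(expected, actual)
new_period = adjusted_period(1000, offsets)
print(offsets, new_period)
```

Averaging the step between consecutive offsets, rather than using a single offset, keeps one late message from dominating the estimate.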
633	An Integrated Room Booking and Access Control System for Public Spaces Kamil, Jaffar, Amer, Mohamed January 2023 Public spaces, especially educational institutions like universities, encounter challenges with their room booking and access control systems. These challenges commonly manifest as overlapping bookings and unauthorized entry. The latter issue, unauthorized access, specifically stems from inadequate integration between the respective systems. This bachelor thesis introduces a proof-of-concept for a cohesive room booking and access control system to address these issues. The proposed solution encompasses two mobile applications, one as the room reservation platform and the other as the access control mechanism. By integrating the management of bookings and access control, this proof-of-concept aims to overcome the prevalent shortcomings in existing systems. Halmstad University's IT department was consulted during the requirement definition phase to ensure a comprehensive understanding of the common problems, their underlying causes, and possible solutions. The proposed system utilizes common technologies such as NodeJS, Android Studio, and PostgreSQL. Additionally, Mobile BankID is integrated as a unique feature for secure user authentication, providing a trusted and widely-accepted method to verify users' identities. The final results were tested in a simulated environment and indicate that the developed system satisfies the initial requirements, addressing the problems of double bookings and unauthorized access identified during the consultation with the IT department. Integrated Systems Room Booking System Access Control System Bank ID Mobile Applications Computer Systems Datorsystem
634	A systematic study of the class imbalance problem in convolutional neural networks Buda, Mateusz January 2017 In this study, we systematically investigate the impact of class imbalance on the classification performance of convolutional neural networks and compare frequently used methods to address the issue. Class imbalance refers to a significantly different number of examples among classes in a training set. It is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. We define and parameterize two representative types of imbalance, i.e. step and linear. Using three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, we investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks, since the overall accuracy metric is associated with notable difficulties in the context of imbalanced data. 
Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental and increases with the extent of imbalance and the scale of a task; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that totally eliminates the imbalance, whereas undersampling can perform better when the imbalance is only removed to some extent; (iv) thresholding should be applied to compensate for prior class probabilities when the overall number of properly classified cases is of interest; (v) as opposed to some classical machine learning models, oversampling does not necessarily cause overfitting of convolutional neural networks. / In this study, we systematically investigate the effect of class imbalance on the classification performance of convolutional networks and compare common methods for addressing the problem. Class imbalance refers to a considerable disparity in the number of examples per class in a training set. It is a common problem that has been studied extensively in machine learning, but systematic research in deep learning is severely limited. We define and parameterize two representative types of imbalance, step and linear. Using three datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, we investigate the effects of imbalance on classification and carry out an extensive comparison of several methods for addressing the problem: oversampling, undersampling, two-phase training, and thresholding for prior class probabilities. Our main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC), adjusted for multi-class tasks, since overall accuracy is associated with notable difficulties in the context of imbalanced data. 
Based on the results of the experiments, we conclude that (i) the effect of class imbalance on classification performance is detrimental and increases with the amount of imbalance and the scale of the task; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas undersampling can perform better when the imbalance is only removed to some extent; (iv) thresholding should be applied to compensate for prior probabilities when the total number of correctly classified cases is of interest; (v) in contrast to some classical machine learning models, oversampling does not necessarily cause overfitting of convolutional networks. Class Imbalance Convolutional Neural Networks Deep Learning Image Classification Computer Systems Datorsystem
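Two of the methods compared in entry 634, oversampling up to full balance and thresholding that compensates for prior class probabilities, can be sketched in a few lines. This is a generic illustration under stated assumptions: random duplication is assumed as the oversampling strategy, and thresholding is assumed to divide each predicted class probability by that class's training-set prior and renormalize. The function names are hypothetical and not taken from the thesis.

```python
import random
from collections import Counter, defaultdict

def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

def threshold_by_prior(probs, priors):
    """Divide each class probability by its prior, then renormalize."""
    scores = [p / q for p, q in zip(probs, priors)]
    total = sum(scores)
    return [s / total for s in scores]

# Example: 6 examples of class 0 and 2 of class 1.
xs = list(range(8))
ys = [0] * 6 + [1] * 2
balanced_x, balanced_y = oversample(xs, ys)
print(Counter(balanced_y))

# A network trained on the imbalanced set predicts [0.7, 0.3];
# the training priors were [0.75, 0.25].
adjusted = threshold_by_prior([0.7, 0.3], [0.75, 0.25])
print(adjusted)
```

In the example, the raw prediction favours class 0, but after dividing by the priors class 1 receives the higher score, which is the effect the compensation is meant to have.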
635	Performance comparison between OOD and DOD with multithreading in games Wingqvist, David, Wickström, Filip January 2022 Background. The frame rate of a game is important for both the end-user and the developer. Maintaining at least 60 FPS in a PC game is the current standard, and the demand for efficient game applications is rising. Currently, the industry standard within programming is to use Object-Oriented Design (OOD). But with the trend toward larger games, this frame rate might not be maintainable using OOD. A design pattern that mitigates this is Data-Oriented Design (DOD), which focuses on utilizing the CPU and memory efficiently. These design patterns differ in how they handle the data associated with them. Objectives. In this thesis, two games were created, each in two versions that used either OOD or DOD. The first game had multithreading included. Since new hardware utilizes several CPU cores, this thesis compares both single-threaded and multithreaded versions of these design patterns. Methods. Experiments were made to measure the execution time and cache misses on the CPU. Each experiment started with a baseline that was gradually increased to stress the systems under test. Results. The results gathered from the experiments showed that the sections of the code that used DOD were significantly faster than those that used OOD. DOD also had a better affinity with multithreading and, in certain parts, achieved up to 13 times the speed of OOD under equivalent conditions. In the special case comparison, DOD proved to be faster than OOD even though it had larger objects. Conclusions. DOD was shown to be significantly faster in execution time, with fewer cache misses, than OOD. Using multithreading with DOD proved to be the most efficient. Game development C++ Execution time CPU cache OpenMP Computer Systems Datorsystem
636	Virtual reality to evaluate UAV based street lights for improving traffic safety Flemark, Adam, Paulander, Axel January 2022 This thesis covers the entire development process of Skara Skyddsängel's virtual reality test platform. The platform is meant to support their efforts in creating a lighting drone service that assists pedestrians and cyclists in areas lacking proper lighting infrastructure while at the same time promoting sustainable modes of transportation. Unreal Engine 4 was used since it provides all the tools needed to create the desired product. All changeable parameters requested by the supervisor are fully implemented, such as drone altitude, light angle, and light intensity. Beyond the required parameters, some additional miscellaneous features, such as ambient sound, are also implemented to further increase the immersion. The result is a functioning test platform with an environment resembling a rural road in Skara that can assist the Skara Skyddsängel project in accelerating its testing. / This bachelor's report covers the complete development process of Skara Skyddsängel's virtual reality test platform. The platform is created to assist Skara Skyddsängel in its work to develop and evaluate lighting drones for pedestrians and cyclists in areas where traditional lighting infrastructure is not cost-effective. By offering alternative infrastructure, the project strives toward increased use of more environmentally friendly modes of transport. All of the project leader's requirements are implemented, such as drone altitude, light angle, and light intensity. In addition, some other features are implemented, such as ambient sound, to make the environment more realistic. The final product is a working test platform with an environment resembling a rural road outside Skara. 
This platform will help Skara Skyddsängel run tests to find optimal drone settings. Skara Skyddsängel Virtual Reality Lighting Drone Infrastructure Testing Platform Virtual Reality Belysning Drönare Infrastruktur Testplattform Computer Systems Datorsystem
637	Agile regression system testing / Agilt systemregressionstest Nordvall, Andreas January 2012 This report describes the work on automating the testing of nodes at CCS (Common Control System) at Ericsson. The goal was to configure nodes with the latest build and run the tests every three hours. This process is to be fully automatic, without user input. The existing configuration tool CICC (Core Integration node Control Center) is to be used for configuration. Before work started, fault reports were analyzed; the analysis indicated that creating a use case for testing restarts should reduce some faults. The first step was to automate the configuration tool CICC. To schedule the testing, the continuous integration tool Jenkins was used. But Jenkins can’t by itself run CICC or interpret the result. Therefore a wrapper layer was implemented. When the wrapper is finished, it stores the results of the configuration run in an XML (eXtensible Markup Language) file, which Jenkins reads. Results can then be seen in Jenkins through a web interface. If there were any failures during configuration or testing, the failed step will have an error message. The project shows that automation is possible. Automating the testing reduces the time for correcting errors because they are more likely to be found early in the process. Before implementing this project in production, some improvements should be made. The most significant improvement is running the configuration and testing of the nodes in parallel, in order to make the time limit for configuration and testing less of an issue. / This report describes the work on automating the testing of nodes at CCS at Ericsson. The goal was to configure the nodes every three hours with binaries compiled from the latest source code and then test them. This is to happen fully automatically, without the user's help, and the configuration is to use the existing configuration tool CICC. 
Before the work began, fault reports were to be analyzed to see whether there was anything to gain from the automation. The task was solved by first looking at the fault reports and establishing that there was room for improvement, mainly concerning restarts. After that, CICC, which had previously been run via a GUI, was automated. To schedule configuration and testing, the test tool Jenkins was used. Jenkins uses a so-called wrapper script that runs CICC and the test cases. The wrapper script also takes care of error handling and then writes the result of the run to an XML file that is read by Jenkins. The results of the tests can then be viewed in Jenkins via a web interface. There, the results of the wrapper script run and of the tests can be seen; if there were any errors, there are error messages giving the reason for the error. Failed tests are also shown. The project shows that with automated testing performed more often, more faults can be found earlier and therefore fixed faster. Before the work is used in production, improvements should be made, for example running the configuration and testing of different nodes in parallel in the wrapper script, in order to meet the time limit when there are several nodes. agile regression system testing test agilt systemregressionstest test utveckling Computer Systems Datorsystem
638	Causal Discovery for Time Series : Based on Continuous Optimization Nouri, Ali January 2023 Causal discovery is an important field of study that seeks to understand the underlying relationships between variables in a system. The goal of causal discovery is to discover the causal relationships from observational data and determine the direction of influence between variables. This information is crucial for making informed decisions and predictions about the behavior of complex systems. This thesis investigates the application of continuous optimization-based causal discovery methods for time series data that exhibit temporal dependencies. The research focuses on finding the optimal method for discovering causal relationships in real-world data, which can include nonlinear associations. Additionally, this thesis evaluates and compares three causal discovery methods, namely NTS-NOTEARS (a neural network-based approach), DYNOTEARS (an optimization-based approach), and the newly developed LTS-NOTEARS (an optimization-based approach). After a thorough examination, NTS-NOTEARS was determined to be the optimal method due to its impressive performance and the potential for further uncertainty analysis. The NTS-NOTEARS method was subjected to extensive testing using various data sets and showed high accuracy, robustness to changes in sample size and number of nodes, and reliability in terms of uncertainty. In conclusion, this thesis provides a comprehensive analysis of the application of continuous optimization-based causal discovery methods for time series data. The research focuses on finding the optimal method for discovering causal relationships in real-world data and introduces a novel measure for analyzing model uncertainty in neural network-based methods. The thesis also presents a novel adaptation of established causal discovery methods to examine delayed causation and generate Structural Vector Auto-regressive (SVAR) models. 
After extensive testing and evaluation, the NTS-NOTEARS method was determined to be the preferred method, due to its high accuracy, robustness, and reliability. Signal Processing Signalbehandling Computer Systems Datorsystem Övrig annan teknik
639	A Multi-camera based Next Best View Approach for Semantic Scene Understanding Persson, Anton January 2023 Robots are becoming more common; robotics has gone from bleeding-edge technology to an everyday topic that families discuss around the dinner table. The number of robots in industry is growing, which means that the demand and need for robots to understand the environment they are working in is also growing. The standard method for a robot to gather information about a scene involves moving to different pre-determined poses from which it can view and analyze the scene. However, this approach does not consider the topology of the scene that the robot should explore. This thesis aims to create a two-dimensional approach to determine the next best view (2D-NBV) from which to view and explore the scene, introduced in the method section. The 2D-NBV method converts a point cloud of the scene to an elevation map. A segmentation network is used to get the positions of pre-trained objects. The positions are then used to generate a 2D Gaussian kernel heatmap of the scene. Using the 2D elevation and Gaussian maps, the NBV pose is then calculated. The NBV pose is then converted back to a 6D pose that the robot moves to in order to capture a new point cloud and register it to the scene. The 2D-NBV method is compared to a baseline and a state-of-the-art method. The baseline method captures four different point clouds from pre-determined positions and registers them together. The state-of-the-art method finds a point of interest and declares a set of view candidates on a sphere around the point. Ray casting is used to find the pose with the highest information gain. This pose is set as the NBV for the robot to move to. 
The goal of this thesis is for the method to perform better than the baseline method, described further in the method section. The evaluation metric used in this thesis is how well the different methods could estimate the bounding boxes of pre-trained items using an off-the-shelf semantic scene segmentation method. Six scenes of varying difficulty were constructed to test the methods. The results showed that the 2D-NBV method successfully complemented the scene with information about its empty cells. The 2D-NBV method outperforms the state-of-the-art method on occluded scenes. Overall, the 2D-NBV method performed just as well as the baseline. The reason that the 2D-NBV method did not outperform the baseline is seen as a consequence of the information loss going from 3D to 2D. Next Best View NBV Semantic Scene Understanding Robotics Robotics Robotteknik och automation Computer Systems Datorsystem
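The heatmap step mentioned in entry 639 (object positions turned into a 2D Gaussian kernel heatmap) can be sketched as follows. This is a generic illustration only: the grid size, the `sigma` value, summing one unnormalized Gaussian per detected object, and the `peak` helper are assumptions, and the sketch does not reproduce how the thesis combines the heatmap with the elevation map to compute the NBV pose.

```python
import math

def gaussian_heatmap(width, height, centers, sigma=1.5):
    """Sum one unnormalized 2D Gaussian per object position (cx, cy)
    over a width x height grid; returns rows indexed as heat[y][x]."""
    heat = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                heat[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    return heat

def peak(heat):
    """Grid cell (x, y) with the highest heat value."""
    cells = ((x, y) for y in range(len(heat)) for x in range(len(heat[0])))
    return max(cells, key=lambda c: heat[c[1]][c[0]])

# Example: a single detected object at the centre of a 5 x 5 grid.
heat = gaussian_heatmap(5, 5, [(2, 2)])
print(peak(heat))
```

Each object contributes a value of 1.0 at its own cell and smoothly decaying values around it, so nearby objects reinforce each other in the summed map.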
640	The Effect Background Traffic in VPNs has on Website Fingerprinting / Påverkan av bakgrundstrafik i VPN-tunnlar vid mönsterigenkänningsattacker mot webbplatser Rehnholm, Gustav January 2023 Tor and VPNs are used by many to be anonymous and circumvent censorship on the Internet. Therefore, traffic analysis attacks that enable adversaries to link users to their online activities are a severe threat. One such attack is Website Fingerprinting (WF), which analyses patterns in the encrypted traffic from and to users to identify website visits. To better understand the extent to which WF can identify patterns in VPN traffic, a deeper exploration is needed of how much WF attacks are affected by background traffic in VPNs, that is, traffic in the stream that the adversary does not wish to classify. This thesis explores how different background traffic types affect WF on VPN traffic. It does so by using existing VPN datasets and combining them into datasets which simulate a VPN tunnel where both foreground and background traffic is sent simultaneously. The aim is to explore how different kinds of background traffic affect known state-of-the-art WF attacks using Deep Learning (DL). Background traffic does affect DL-based WF attacks, but the impact on accuracy is relatively small compared to the bandwidth overhead: 200 % overhead reduces the accuracy from roughly 95 % to 70 %. WF attacks can be trained without any background traffic, as long as the overhead of the background traffic is smaller than 2 %, without any impact on accuracy. WF attacks can also be trained with background traffic from applications other than those they are tested on, as long as the applications produce similar traffic patterns. For example, traffic from different pre-recorded streaming applications like Netflix and YouTube is similar enough, but not traffic from pre-recorded and live streaming applications such as Twitch. 
Also, having access to the size of the packets makes WF attacks better than if the size is obscured, probably making VPNs more vulnerable than Tor to WF attacks. Thesis artefacts are available at: https://github.com/gustavRehnholm/wf-vpn-bg / Tor and VPNs are used by many to provide anonymity and circumvent censorship on the Internet. Therefore, traffic analysis attacks that make it possible for attackers to link users to their online activities are a serious threat. One such attack is Website Fingerprinting (WF), which analyses patterns in the encrypted traffic between the user and the relay with the goal of identifying website visits. To better understand the extent to which WF can identify patterns in VPN tunnels, a deeper investigation is needed into how much WF attacks are affected by background traffic in VPN tunnels, that is, traffic in the VPN tunnel that the WF attacker is not trying to classify. The goal of this thesis is to investigate how background traffic, in different combinations, affects WF on VPN tunnels. This is done by using existing VPN datasets and combining them into datasets that simulate a VPN tunnel where both foreground and background traffic are sent simultaneously. The aim is to explore how different types of background traffic affect known WF attacks that use deep learning. Background traffic does affect deep-learning-based WF attacks, but the impact on WF accuracy is relatively small compared to the overhead required: 200 % overhead reduces accuracy from roughly 95 % to 70 %. WF attacks can be trained without background traffic, as long as the overhead of the background traffic is less than 2 %, without affecting accuracy. WF attacks can also be trained with background traffic from applications other than those they are tested on, as long as the applications produce similar traffic patterns. 
For example, traffic from different pre-recorded streaming applications such as Netflix and YouTube is similar enough, but not traffic from pre-recorded and live-streaming applications such as Twitch. It is also clear that having access to packet size makes the classifier better, which probably makes VPNs more vulnerable than Tor. Thesis artefacts are available at the following site: https://github.com/gustavRehnholm/wf-vpn-bg Website Fingerprinting Virtual Private Network Deep Learning Mönsterigenkänning Virtuellt Privat Nätverk Djup Inlärning Computer Systems Datorsystem
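The dataset construction described in entry 640, combining foreground and background captures into one simulated tunnel, can be sketched as a timestamp merge. The sketch assumes each trace is a time-sorted list of `(timestamp, size)` pairs and that "overhead" means background bytes as a percentage of foreground bytes. Both the trace format and that definition are assumptions made for illustration; the thesis artefacts linked in the entry should be consulted for the actual procedure.

```python
import heapq

def merge_traces(foreground, background):
    """Interleave two time-sorted packet traces by timestamp and tag
    every packet with its origin ('fg' or 'bg')."""
    fg = ((t, size, "fg") for t, size in foreground)
    bg = ((t, size, "bg") for t, size in background)
    return list(heapq.merge(fg, bg))

def overhead_percent(merged):
    """Background bytes as a percentage of foreground bytes
    (assumed definition of bandwidth overhead)."""
    fg_bytes = sum(size for _, size, tag in merged if tag == "fg")
    bg_bytes = sum(size for _, size, tag in merged if tag == "bg")
    return 100.0 * bg_bytes / fg_bytes

# Example: a short website visit mixed with heavier background traffic.
foreground = [(0.00, 500), (0.10, 1500)]
background = [(0.05, 1000), (0.20, 3000)]
tunnel = merge_traces(foreground, background)
print([tag for _, _, tag in tunnel], overhead_percent(tunnel))
```

A classifier that only sees the merged stream has no origin tags; they are kept here only so the overhead can be computed.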
