Global ETD Search

351	MDX on Hadoop : A case study on OLAP for Big Data Stengård, Jakob January 2015 Online Analytical Processing (OLAP) is a method used for analyzing data within business intelligence and data mining, using n-dimensional hypercubes. These cubes store the aggregates of multiple dimensions of the data, and can traditionally be computed from a dimensional relational model in SQL databases, known as a star schema. Multidimensional expressions (MDX) are a type of query commonly used by BI tools to query OLAP cubes. This thesis investigates ways to conduct online OLAP-like queries against a dimensional relational model based in a Hadoop cluster. In the evaluation, Hive-on-Spark and Hive-on-Tez and various formats have been compared. The most significant conclusions are that Hive-on-Tez delivers better performance than Hive-on-Spark, and that the ORC format seems to be the best-performing format. It could not be demonstrated that response times under 20 seconds could be achieved for all queries with the given setup and dataset, or that the order of input data significantly affects the performance of the ORC format. Scaling seems fairly linear for a cluster of 3 nodes. It also could not be demonstrated that Hive indexes or bucketing improve performance. Computer and Information Sciences Data- och informationsvetenskap
352	Efficient and Reliable Filesystem Snapshot Distribution Vosandi, Lauri January 2015 Linux is a portable operating system kernel devised by Linus Torvalds, and it can be used in conjunction with other userspace utilities such as GNU to build a free and open-source operating system for a multitude of target applications. While Linux-based operating systems have made significant progress on servers and embedded systems, there is still much room for improvement on workstations and laptops. Up to now, deploying a Linux-based operating system has been an error-prone, time-consuming process, usually specific to a particular distribution of Linux. Linux-based operating systems also have a reputation for being overly complex for a novice computer user to set up, and even though laptops are now available with Ubuntu pre-installed [1], installing a Linux-based operating system on an arbitrary device is troublesome due to the lack of native support for certain hardware components. In this thesis Butterknife, a provisioning suite based on the B-tree file system (Btrfs) and Linux Containers (LXC), is presented. Butterknife can be used to significantly reduce the deployment time of a customized Linux-based operating system. Butterknife makes use of LXC to prepare a template of the root filesystem and Btrfs snapshotting to save the state of the template. The Btrfs send/receive mechanism is then used to transfer the root filesystem to the target machine. Post-deployment scripts are then used to configure the root filesystem for a particular deployment, optionally retaining hostname, domain membership, configuration management keys, etc. The current implementation of Butterknife uses HTTP(S) and multicast for transport, and various peer-to-peer scenarios are discussed in Section 6 – Conclusions and Future Work. In addition to provisioning, Butterknife makes use of Btrfs incremental snapshots to implement differential upgrades.
This approach is especially attractive for mobile devices, embedded systems and the Internet of Things, where software upgrades have to be delivered in a guaranteed manner. Butterknife brings additional value to the already existing ecosystem by bridging the gap between the stock installation medium and configuration management. Computer and Information Sciences Data- och informationsvetenskap
353	From Discovery to Purchase: Improving the User Experience for Buyers in eCommerce Hahn, Jasper January 2015 The Internet has revolutionized many areas of our lives. New forms of exchanging and retrieving information, doing business and communicating in general have been made possible by the Internet and have developed rapidly since its creation. In an age of nearly ubiquitous access to the Internet, with a majority of the western world actively using social media, retail markets have changed, too. But compared to the rapidly changing services in other sectors, retail businesses have only converted an existing model to a new technology rather than coming up with a new one. Social Commerce is an approach that wants to change that. It takes into account lessons learned from social media and shifting marketing strategies and tries to create a better shopping experience for customers while giving brands and fashion influencers a new platform to engage with them. This thesis project uses literature from different fields such as interaction design, online marketing and fashion, along with user interviews, to identify the most important aspects that will lead towards a more social online shopping experience, particularly in fashion. It is conducted in collaboration with the local start-up Apprl (www.apprl.com) and includes an implementation part in which the most promising identified features are realized as part of the agile development process within the company. The field of social commerce promises to radically change the way we buy things online, and Apprl is one of many examples trying to make that happen. Computer and Information Sciences Data- och informationsvetenskap
354	Radio Resource Management Algorithms for D2D Communications With Limited Channel State Information Zhao, Peiyue January 2015 Network-assisted Device-to-Device (D2D) communication has the potential benefits of increasing system capacity, energy efficiency and achievable peak rates while reducing the end-to-end latency. To realize these gains, recent works have proposed power control (PC) and resource allocation (RA) schemes that show near-optimal performance in terms of spectral or energy efficiency. Unfortunately, these schemes assume perfect and instantaneous access to either large-scale or small-scale channel state information (CSI) at some central entity. This assumption does not hold in practical implementations, and we therefore investigate the performance of D2D communications with limited CSI. First, we analyze existing power control (PC), mode selection (MS) and resource allocation (RA) approaches in terms of the required input parameters, focusing on large-scale fading. Then we build a model in a system simulator to capture the impact of CSI unavailability or CSI errors on the performance of PC, MS and RA algorithms. Through simulations, we find that with proper algorithms, the system gains continuously from having more CSI knowledge. In particular, with additional CSI, the newly implemented Binary Power Control and Matching Allocation increase the throughput considerably, with low complexity and proper fairness between the D2D layer and the cellular layer. Furthermore, we investigate the impact of errors in the channel gains. Simulation results demonstrate that a given user may suffer or benefit from the errors; the system performance, however, is insensitive to small-scale errors. Numerical results also show that errors with an asymmetric range have a relatively more notable impact than symmetric errors. Computer and Information Sciences Data- och informationsvetenskap
355	Ett program för att upptäcka tekniska begränsningar utifrån utdata från exekverade prestandatester / An application to detect technical limitations by using data from performance test executions Östberg, Mikael January 2015 Före lansering av ny mjukvara så är det viktigt att veta mjukvarans begränsningar innan användarna gör det. Det kan vara både en tidskrävande och svår uppgift. En erfaren testare kan luta sig tillbaka på erfarenhet som pekpinne vart man kan börja titta på den stora mängden data från ett lasttest och avgöra dess begränsningar. Den här studien föreslår ett program som förenklar proceduren genom att upptäcka flaskhalsar med hjälp av schematiska regeldefinitioner som gör det möjligt att anpassa detektionsbeteendet utefter domänen. Kombinerat med välkända algoritmer från signalbehandling som lägger märke till förändringar i alla typer av rå data kan föreslå vilken typ av begränsning som finns i systemet. Pålitligheten av programmet testas med hjälp av fyra olika experiment som använder rå data som innehåller flaskhalsar för CPU, minne eller nätverk eller inget. Resultaten föreslår att programmets pålitlighet motiverar fortsatta studier eftersom den gissar rätt väldigt ofta vilken typ av begränsning som finns i systemet när något sådant är på plats. Dock är resultaten för det fjärde experimentet när ingen flaskhals finns i systemet riktigt dåliga vilket föreslår att ett annat sätt att upptäcka avsaknaden av begränsningar behövs. Experimentet visar att metoden kan användas för att bygga tillägg eller funktioner som assisterar oerfarna lasttestare för väldigt simpla begränsningar. / Before launching new software it is imperative to know the limits of the application before the users do. It can be both a time-consuming and a difficult task. A seasoned performance tester may rely on experience to know where to start looking in the large amounts of data from performance tests to detect its limits. 
This study implements and tests the reliability of an application that applies a model to simplify the procedure of detecting bottlenecks with the help of a schema defining metric connections specific to a target application domain. Combined with well-known signal-processing algorithms for detecting changes in any raw data, the application can suggest what type of bottleneck is present in a system. The reliability of the application is assessed by four types of experiments carried out to detect the bottleneck from raw data containing bottlenecks of the types CPU, memory or network, or no bottleneck at all. The results suggest that the application's reliability motivates further study, since it presents a very strong ratio of correct guesses when a bottleneck is present within a system. However, the results for the fourth experiment, where no bottleneck is present in the system, are very poor, suggesting that a different model for detecting the absence of bottlenecks is needed. The experiment shows that the suggested method can be used to build add-ons or features that may assist inexperienced performance testers with very simple bottlenecks. Computer and Information Sciences Data- och informationsvetenskap
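Illustrative note on record 355: the abstract refers to well-known signal-processing algorithms for detecting changes in raw data but does not name them. As a minimal sketch of what such a detector can look like, the example below uses a one-sided CUSUM test, which is an assumption chosen for illustration and not necessarily the method used in the thesis. The function name, the `drift` and `threshold` parameters and the sample values are all hypothetical.

```python
def cusum_change_index(samples, baseline, drift=0.5, threshold=5.0):
    """Return the index of the first sample at which a one-sided CUSUM
    statistic exceeds `threshold`, or None if no upward change is found.

    `baseline` is the expected level of the metric before any change;
    `drift` is the slack subtracted from each deviation so that small
    fluctuations do not accumulate. All names here are illustrative.
    """
    cumulative = 0.0
    for index, value in enumerate(samples):
        # Accumulate positive deviations from the baseline, never below zero.
        cumulative = max(0.0, cumulative + (value - baseline - drift))
        if cumulative > threshold:
            return index
    return None


# Hypothetical CPU-utilisation series: stable at 10, then jumping to 20.
cpu_series = [10, 10, 10, 10, 10, 20, 20, 20, 20, 20]
change_at = cusum_change_index(cpu_series, baseline=10)
```

A rule schema of the kind the abstract describes could then map the metric in which a change is detected (CPU, memory, network) to a suggested bottleneck type; that mapping is not sketched here because the abstract gives no details of it.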
356	Utvärdering av Golang för högpresterande radioaccessystem / Evaluation of Golang for High Performance Scalable Radio Access Systems Forsby, Filip, Persson, Martin January 2015 Den ökande mobildataanvändningen sätter större press på de nuvarande teknologier som används i dagens radioaccessystem och nya lösningar behövs för att tillfredsställa de nya kraven. Det är därför viktigt att utvärdera uppkommande teknologier, vilket även inkluderar programmeringsspråk, för att bestämma dess lämplighet för den här typen av användningsområde. Golang är ett nytt programmeringsspråk som ännu inte har blivit utvärderat. Den här rapporten har som syfte att genomföra en utvärdering av Golang för användning i högpresterande skalbara radioaccessystem. För att göra detta utvecklades en applikation från en redan existerande modell skriven i Erlang, och de två implementationerna testades och jämfördes med specifika nyckelvärden i åtanke. Resultaten visar att Golang presterar bra och har potential att vara en god kandidat för framtida system. Däremot visar sig språket att inte vara helt moget och saknar viktig funktionalitet och behöver vidareutvecklas för att bli väl lämpat för denna specifika applikation. / Increasing mobile data traffic puts pressure on the current technologies used in today’s radio access units, and new solutions are needed in order to cope with the greater demands. It is therefore important to evaluate emerging technologies, including programming languages, to determine their suitability for this field of application. Golang is one of these new programming languages and has not yet been evaluated. The purpose of this thesis is to evaluate Golang for use in a high-performance scalable radio access system. To do this, an application is developed from an already existing model written in Erlang, and the two implementations are compared and benchmarked with specific key aspects in mind. 
The results show that Golang performs well and has the potential to be a good candidate for future systems. However, the language is found to be not yet fully mature: it lacks important required functionality and needs further development in order to be fully suitable for this specific application. Computer and Information Sciences Data- och informationsvetenskap
357	Tidsberoende restider för Vehicle Routing Problem : MED OPTAPLANNER Andersson, Johan, Leborg, Sebastian January 2015 Trafikstockning är ett vanligt förekommande problem i storstäder och för med sig förseningar och extra kostnader för transportföretag. Vehicle Routing Problem är ett kombinatoriskt optimeringsproblem som ämnar hitta lägsta kostnaden att besöka en mängd kunder med flera fordon. Här beskrivs ett sätt att använda och förbättra ruttplanering i Stockholm med Vehicle Routing Problem och OptaPlanner genom att införa tidsberoende restider. Modeller till Vehicle Routing Problem har skapats där kostnaden mellan kunderna kvantifierades genom att mäta distans fågelvägen, fasta restider inom trafiknätet och tidsberoende restider i trafiknätet. Jämförelser visade en tydlig förbättring hos modellerna som utgick från trafiknätet, jämfört med sträckan fågelvägen. Modellen med tidsberoende restider visade en marginell förbättring gentemot den med fasta restider. Denna relativt lilla förbättring kan förklaras genom de heuristiker som har använts. / Traffic congestion is a common problem in urban areas, which results in delays and increased costs for transport companies. The Vehicle Routing Problem is a combinatorial optimization problem that aims to find the lowest cost of visiting multiple clients with a fleet of vehicles. This report describes how route planning in Stockholm can be improved by introducing time-dependent travel times when optimizing the Vehicle Routing Problem with OptaPlanner. Models for the Vehicle Routing Problem have been created in which the cost between clients is quantified in three ways: as straight-line distance, as fixed travel times in the traffic network, and as time-dependent travel times in the traffic network. Results showed a clear improvement for the models that used costs based on the traffic network, as compared to the model where the distance is measured along a straight line. 
The model with time-dependent travel times showed a marginal improvement over the model with fixed travel times. The rather small improvement may be due to the heuristics that have been used. Computer and Information Sciences Data- och informationsvetenskap
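Illustrative note on record 357: the three cost models named in the abstract (straight-line distance, fixed travel times, time-dependent travel times) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the thesis's OptaPlanner implementation. The average speed, the congestion profile and all function names are hypothetical, and the fixed and time-dependent models here are derived from a distance value for brevity, whereas the thesis measures travel times within the traffic network.

```python
import math


def haversine_km(a, b):
    """Straight-line (great-circle) distance in km between (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def fixed_travel_minutes(distance_km, speed_kmh=40.0):
    """Fixed travel time: the same cost regardless of departure time (assumed speed)."""
    return 60.0 * distance_km / speed_kmh


# Hypothetical congestion profile: (start hour, multiplier on the fixed travel time).
CONGESTION_PROFILE = [(0, 1.0), (7, 1.6), (9, 1.1), (16, 1.5), (18, 1.0)]


def congestion_factor(hour):
    """Return the multiplier of the last profile interval that starts at or before `hour`."""
    factor = CONGESTION_PROFILE[0][1]
    for start_hour, multiplier in CONGESTION_PROFILE:
        if hour >= start_hour:
            factor = multiplier
    return factor


def time_dependent_minutes(distance_km, depart_hour, speed_kmh=40.0):
    """Time-dependent travel time: the fixed time scaled by congestion at departure."""
    return fixed_travel_minutes(distance_km, speed_kmh) * congestion_factor(depart_hour % 24)
```

In a solver, the cost of an arc between two clients would then depend on when the vehicle leaves the first client, so the same route can be cheaper or more expensive depending on the order and timing of the visits.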
358	Störningar i ett trådlöst lokalt nätverk - 6LoWPANs påverkan på wM-Bus / Disturbance in a wireless local network - 6LoWPANs interference on wM-Bus Salomäki, Ville January 2015 Två trådlösa nätverksstandarder, Wireless Meter-Bus och 6LoWPAN använder frekvenser som ligger inom samma område. De används i samma nätverk för att skicka mätvärden. Det finns en risk att de skickar samtidigt. Uppgiften med detta arbete är att ta reda på hur stor risken är, med en förenklad modell av verkligheten. Wireless Meter-Bus är en standard för trådlös mätaravläsning och 6LoWPAN är en trådlös nätverksstandard som använder sig av IPv6. En formel skapas som kan räkna ut sannolikheten för paketförlust. Tester utförs för att mäta störningar på olika 6LoWPAN-kanaler. Sedan testas formeln med både fysiskt test och med simulering. Det fysiska testet kommer ganska nära det väntade värdet, men lite osäkerhet återstår. Simuleringen får däremot värden väldigt nära det väntade värdet. / Wireless Meter-Bus and 6LoWPAN use frequencies in the same range. They are used in the same network to send measurement data. There is a possibility that they disturb each other when they send at the same time. The task of this project is to find out how big the risk of this is, using a simplified model of reality. Wireless Meter-Bus is a standard for wireless meter reading and 6LoWPAN is a wireless network standard that uses IPv6. A formula for calculating the probability of packet error is created. Tests are performed to measure how much interference different 6LoWPAN channels cause. The formula is then tested with both physical tests and simulation. The physical test gets fairly close to the expected value, but some uncertainty remains. The simulation, however, gets very close to the expected value. Computer and Information Sciences Data- och informationsvetenskap
359	Utveckling av testsystem för detektorer / Development of test systems for detectors Rabbinson, Mirdad January 2015 Detta examensarbete har utförts hos Philips Microdose i Solna och handlar om att undersöka detektorer som återsänts för reparation från kunder. Detektorerna är en del av mammografistativet och är en värdefull elektronisk enhet. I händelse att dessa kommer tillbaka från kund vill Philips Microdose kunna reparera dessa och kunna säkerställa att dessa detektorer uppfyller lika hög kvalitet som nytillverkade enheter. Detta examensarbete har undersökt följande frågeställningar: • Kan ett system byggas som testar dessa detektorer? • Vad behövs det för att testa olika modeller av detektorer? • Vilken typ av fel kan säkerställas? • Vilka typer av tester ska utföras och under hur lång tid? • Finns det möjlighet att långtidstesta helt obemannat? • Är detta arbete lönsamt? Examensarbetet visar också förslag på att förbättra och säkerställa kvalitet genom att specificera instruktioner för olika typer av tester för detektorer. / This master thesis, carried out at Philips Microdose in Solna, investigates detectors that customers have sent back for repair. The detectors are valuable electronic devices that are part of a mammography instrument. When detectors are returned by customers, Philips Microdose wants to be able to repair them and to ensure that they meet the same quality as newly manufactured units. This thesis examines the following questions: • Can a system be built to test these detectors? • What is needed to test different models of detectors? • What types of hidden defects can be detected? • What kinds of tests should be carried out and for how long? • Is it possible to run long-term tests completely unmanned? • Is this work profitable? The thesis also presents proposals for improving and ensuring quality by specifying instructions for the different types of detector tests. Computer and Information Sciences Data- och informationsvetenskap
360	A Secure and Reliable Platform for Storing and Processing Genomic Data on Hadoop Sedar, Roshan January 2014 Since 2007, the cost of sequencing a whole human genome has decreased by roughly half every 4 months. As of 2014, whole genome sequencing would cost only 1,000 dollars, and, as such, Next-Generation Sequencing (NGS) machines are now a source of Big Data: the Illumina HiSeq X Ten can produce up to 20 PB of data per year. The dominant open-source platform for storing and processing Big Data is Apache Hadoop. However, Hadoop does not support user identity natively, and, as genomic data is sensitive data, there are no existing solutions for multi-tenancy that meet the needs of organizations to securely store and process genomic data. In this thesis, we address the problem of how to enable Biobank users to securely store, access, and share genomic data in Hadoop. The proposed solution is based on leveraging security support in the J2EE framework and on constraining access to Hadoop through a web application built in this project. However, HTTP(S) limits the size of files that can be transferred into web applications, and we address the follow-on problem of how to enable users to efficiently, easily, and securely copy genomic data into Hadoop. Our prototype demonstrates how Hadoop can be secured to support sensitive data, and how Big Data can be securely transported over HTTP. Computer and Information Sciences Data- och informationsvetenskap
