Global ETD Search

341	Automatic Log Analysis System Integration : Message Bus Integration in a Machine Learning Environment Svensson, Carl January 2015 (has links) Ericsson is one of the world's largest providers of communications technology and services. Reliable networks are important to deliver services that live up to customers' expectations. Tests are frequently run on Ericsson's systems in order to identify stability problems in their networks. These tests are not always completely reliable. The logs produced by these tests are gathered and analyzed to identify abnormal system behavior, especially abnormal behavior that the tests might not have caught. To automate this analysis process, a machine learning system, called the Awesome Automatic Log Analysis Application (AALAA), is used at Ericsson's Continuous Integration Infrastructure (CII)-department to identify problems within the large logs produced by automated Radio Base Station test loops and processes. AALAA is currently operable in two versions using different distributed cluster computing platforms: Apache Spark and Apache Hadoop. However, it needs improvements in its machine-to-machine communication to make this process more convenient to use. In this thesis, message communication has successfully been implemented in the AALAA system. The result is a message bus deployed in RabbitMQ that is able to successfully initiate model training and abnormal log identification through requests, and to handle a continuous flow of result updates from AALAA. / Ericsson är en av världens största leverantörer av kommunikationsteknologi och tjänster. Tillförlitliga nätverk är viktigt att tillhandahålla för att kunna leverera tjänster som lever upp till kundernas förväntningar. Tester körs därför ofta i Ericssons system med syfte att identifiera stabilitetsproblem som kan uppstå i nätverken. 
Dessa tester är inte alltid helt tillförlitliga, producerade testloggar samlas därför in och analyseras för att kunna identifiera onormalt beteende som testerna inte lyckats hitta. För att automatisera denna analysprocess har ett maskininlärningssystem utvecklats, Awesome Automatic Log Analysis Application (AALAA). Detta system används i Ericssons Continuous Integration Infrastructure (CII)-avdelning för att identifiera problem i stora loggar som producerats av automatiserade Radio Base Station tester. AALAA är för närvarande funktionellt i två olika versioner av distribuerad klusterberäkning, Apache Spark och Apache Hadoop, men behöver förbättringar i sin maskin-till-maskin-kommunikation för att göra dem enklare och effektivare att använda. I denna avhandling har meddelandekommunikation implementerats som kan kommunicera med flera olika moduler i AALAA. Resultatet är en meddelandebuss implementerad i RabbitMQ som kan initiera träning av modeller och identifiering av onormala loggar på begäran, samt hantera ett kontinuerligt flöde av resultatuppdateringar från pågående beräkningar. Big Data Machine learning Message passing Machine-to-machine communication Big Data Maskininlärning Meddelandesändning Maskin-till-maskin kommunikation Communication Systems Kommunikationssystem
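The request-and-update flow described in entry 341 can be illustrated with a minimal sketch. It assumes a single worker and uses Python's standard-library `queue` module as an in-process stand-in for a broker; it is not RabbitMQ and not the AALAA implementation. The request names (`train`, `identify`), the file name, and the "ERROR" rule are hypothetical placeholders.

```python
import queue
import threading

# Hypothetical request types; the abstract only states that requests can
# initiate model training and abnormal-log identification.
TRAIN, IDENTIFY, STOP = "train", "identify", "stop"


def worker(requests: queue.Queue, results: queue.Queue) -> None:
    """Consume requests in FIFO order and publish a stream of result updates."""
    while True:
        kind, payload = requests.get()
        if kind == STOP:
            results.put(("done", None))
            break
        if kind == TRAIN:
            results.put(("status", f"training started on {payload}"))
            results.put(("status", f"training finished on {payload}"))
        elif kind == IDENTIFY:
            # Placeholder rule standing in for a trained model (assumption).
            abnormal = [line for line in payload if "ERROR" in line]
            results.put(("abnormal", abnormal))


def run_demo() -> list:
    """Send two requests and collect every update until the worker stops."""
    requests, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(requests, results))
    thread.start()
    requests.put((TRAIN, "loop-42.log"))
    requests.put((IDENTIFY, ["boot ok", "ERROR: cell down", "idle"]))
    requests.put((STOP, None))
    updates = []
    while True:
        update = results.get()
        if update[0] == "done":
            break
        updates.append(update)
    thread.join()
    return updates
```

Keeping requests and result updates on separate queues lets the requester keep reading updates while the worker is still busy. A real broker would add persistence, routing, and multiple consumers on top of this pattern.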
342	A scalable database for a remote patient monitoring system Mukhammadov, Ruslan January 2013 (has links) Today one of the fast growing social services is the ability for doctors to monitor patients in their residences. The proposed highly scalable database system is designed to support a Remote Patient Monitoring system (RPMS). In an RPMS, a wide range of applications are enabled by collecting health related measurement results from a number of medical devices in the patient’s home, parsing and formatting these results, and transmitting them from the patient’s home to specific data stores. Subsequently, another set of applications will communicate with these data stores to provide clinicians with the ability to observe, examine, and analyze these health related measurements in (near) real-time. Because of the rapid expansion in the number of patients utilizing RPMS, it is becoming a challenge to store, manage, and process the very large number of health related measurements that are being collected. The primary reason for this problem is that most RPMSs are built on top of traditional relational databases, which are inefficient when dealing with this very large amount of data (often called “big data”). This thesis project analyzes scalable data management to support RPMSs, introduces a new set of open-source technologies that efficiently store and manage any amount of data which might be used in conjunction with such a scalable RPMS based upon HBase, implements these technologies, and as a proof of concept, compares the prototype data management system with the performance of a traditional relational database (specifically MySQL). This comparison considers both a single node and a multi node cluster. The comparison evaluates several critical parameters, including performance, scalability, and load balancing (in the case of multiple nodes). The amount of data used for testing input/output (read/write) and data statistics performance is 1, 10, 50, 100, and 250 GB. 
The thesis presents several ways of dealing with large amounts of data and develops & evaluates a highly scalable database that could be used with a RPMS. Several software suites were used to compare both relational and non-relational systems and these results are used to evaluate the performance of the prototype of the proposed RPMS. The results of benchmarking show that MySQL is better than HBase in terms of read performance, while HBase is better in terms of write performance. Which of these types of databases should be used to implement a RPMS is a function of the expected ratio of reads and writes. Learning this ratio should be the subject of a future thesis project. / En av de snabbast växande sociala tjänsterna idag är möjligheten för läkare att övervaka patienter i sina bostäder. Det beskrivna, mycket skalbara databassystemet är utformat för att stödja ett sådant Remote Patient Monitoring-system (RPMS). I ett RPMS kan flertalet applikationer användas med hälsorelaterade mätresultat från medicintekniska produkter i patientens hem, för att analysera och formatera resultat, samt överföra dem från patientens hem till specifika datalager. Därefter kommer ytterligare en uppsättning program kommunicera med dessa datalager för att ge kliniker möjlighet att observera, undersöka och analysera dessa hälsorelaterade mått i (nära) realtid. På grund av den snabba expansionen av antalet patienter som använder RPMS, är det en utmaning att hantera och bearbeta den stora mängd hälsorelaterade mätningar som samlas in. Den främsta anledningen till detta problem är att de flesta RPMS är inbyggda i traditionella relationsdatabaser, som är ineffektiva när det handlar om väldigt stora mängder data (ofta kallat "big data"). Detta examensarbete analyserar skalbar datahantering för RPMS, och inför en ny uppsättning av teknologier baserade på öppen källkod som effektivt lagrar och hanterar godtyckligt stora datamängder. 
Dessa tekniker används i en prototypversion (proof of concept) av ett skalbart RPMS baserat på HBase. Implementationen av det designade systemet jämförs mot ett RPMS baserat på en traditionell relationsdatabas (i detta fall MySQL). Denna jämförelse ges för både en ensam nod och flera noder. Jämförelsen utvärderar flera kritiska parametrar, inklusive prestanda, skalbarhet, och lastbalansering (i fallet med flera noder). Datamängderna som används för att testa läsning/skrivning och statistisk prestanda är 1, 10, 50, 100 respektive 250 GB. Avhandlingen presenterar flera sätt att hantera stora mängder data och utvecklar samt utvärderar en mycket skalbar databas, som är lämplig för användning i RPMS. Flera mjukvaror för att jämföra relationella och icke-relationella system används för att utvärdera prototypen av de föreslagna RPMS och dess resultat. Resultaten av dessa jämförelser visar att MySQL presterar bättre än HBase när det gäller läsprestanda, medan HBase har bättre prestanda vid skrivning. Vilken typ av databas som bör väljas vid en RMPS-implementation beror därför på den förväntade kvoten mellan läsningar och skrivningar. Detta förhållande är ett lämpligt ämne för ett framtida examensarbete. Big data database performance scalability load balancing Remote Patient Monitoring System Big data databas prestanda skalbarhet lastbalansering Remote Patient Monitoring System Communication Systems Kommunikationssystem
343	Service Innovation and Business Models : A Case Study of A Small Swedish ICT Company / Serviceinnovation och affärsmodeller : En fallstudie av ett mindre företag inom ICT-industrin Wendel, Alexander January 2013 (has links) Innovation has become increasingly important to a company’s competitive advantage in recent years, and the importance of services has grown as well. Information and Telecommunication Technologies (ICT) have come to play a supporting role in almost every type of industry. The ICT market changes continuously and at a very high pace. To cope with these changes, companies in the IT and software industry need to keep their solutions constantly up to date. This thesis provides a case study of Digital Marketing AB, a small company in the IT industry that delivers tools for planning, sending, and analyzing digital marketing campaigns. Digital Marketing AB operates in a rapidly changing market. As new technologies emerge, existing technologies become well known and low-cost versions of them appear in the market, eroding revenues from more differentiated services. Furthermore, small companies that lack the financial resources of bigger actors must rely on other types of strengths. Companies also need to make sure that they can sell the new technology in a way that is attractive to their customers and at the same time profitable for the company. In other words, they need to integrate the new technology into a business model. The thesis concludes that Digital Marketing AB needs to develop new technology with a specific target customer group in mind, and to work together with its customers to develop an attractive and competitive business model. Furthermore, the thesis concludes that the design of the business model will determine the success of adopting a new technology. 
Other issues that arise in the design of the business model concern how to package and position the new technology. / Under de senaste åren har innovation blivit ett allt viktigare bidrag till ett företags konkurrensfördelar. Betydelsen av tjänster har dessutom ökat. IT och telekommunikation (ICT) har kommit att spela en viktig roll i nästan alla typer av industrier. Denna marknad ändras mycket snabbt och kontinuerligt. För att bemöta dessa förändringar måste företag som är aktiva inom IT- och mjukvaruindustrin ständigt hålla sina lösningar uppdaterade. Detta examensarbete består av en fallstudie utförd på ett litet företag aktivt i IT-branschen, referat till som Digital Marketing AB. Företaget levererar ett system för att planera, sända och analysera digitala marknadsföringskampanjer. Digital Marketing AB konkurrerar på en marknad som förändras i mycket hög takt. Då nya teknologier växer fram blir de existerande lösningarna kända vilket ger utrymme för lågkostnadsalternativ som eroderar intäkter från mer differentierade tjänster. Om dessa företag vars intäkter eroderas dessutom är mindre företag som inte har samma finansiella resurser som de större företagen, måste de förlita sig på andra typer av styrkor. Företag måste även se till att kunna sälja tekniken de producerar på ett sätt som är attraktivt för kunden, men som samtidigt är lönsamt för företaget. De måste integrera sin teknik i en affärsmodell. Examensarbetet visar på att Digital Marketing AB bör utveckla sin affärsmodell dedicerad åt en specifik målgrupp, och dessutom göra det tillsammans med potentiella kunder för att affärsmodellen skall bli attraktiv och konkurrenskraftig. Dessutom visar arbetet på att beroende på hur affärsmodellen utformas, kommer att avgöras hur pass framgångsrik affärsmodellen kommer att vara. Andra frågor som uppstår i samband med utvecklingen av affärsmodellen har att göra med hur tekniken skall paketeras och positioneras. 
Big Data Business Model Digital Business Models Digital Economy ICT Industry Service Innovation Big Data Affärsmodell Digitala affärsmodeller Digital Ekonomi ICT-industri Serviceinnovation Economics and Business Ekonomi och näringsliv
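344	Revisorns förändring i linje med digitaliseringen / Auditor's change in line with digitalization Frick, Victor, Källroos, William, Lindberg, Niclas January 2020 (has links) All industries are affected by digitalisation, and so are the auditing industry and the auditor's professional role. The function of auditing is to create confidence in the business sector, so it is important to understand how digitalisation affects the field. How does the professional role need to change, and how does education need to develop, to keep up? What are the positive and possibly negative consequences of digitalisation? This has been discussed for a long time, and many believe that digitalisation will have a major impact, which is why further studies are needed in the area. It was this lack of knowledge that gave rise to the research question adopted in this work. The purpose of the study is to explore and gain a deeper understanding of the impact of digitalisation on the auditor's professional role. The study uses a qualitative research design based on an abductive approach, and the empirical material was collected through semi-structured interviews. The study shows that the auditing profession is strongly affected by digitalisation. Most central is the gain in efficiency, as manual work has been replaced by automated processes. A further change is that access to larger amounts of data provides a basis for deeper analyses, which gives a better overall picture. This creates room for more value-adding work for the client company, and also for detecting irregularities. The new tools require more knowledge and the ability to absorb new technology, and a new role has emerged as a result: the IT auditor. Digitalisation and automation free up time, allowing auditors to extend their offering with advisory services. Digitalisation also brings disadvantages: automation removes the opportunity to learn basic processes that used to be central to the auditor, which can make it harder to understand the overall picture of a company. Customer relationships may also be affected, as virtual meetings have become a more common part of everyday work. / Introduktion Alla branscher påverkas av digitaliseringen, därmed också revisionsbranschen och revisorns yrkesroll. Revisionen funktion är att skapa förtroende för näringslivet och det är därför av vikt att förstå hur digitaliseringen av området påverkas. Hur behöver yrkesrollen förändras och utbildningen utvecklas för att följa med i utvecklingen. Vilka är de positiva och eventuellt negativa konsekvenser av digitaliseringen. Detta har under en längre tid diskuterats och många tror att digitaliseringen kommer att ha en stor påverkan varför ytterligare studier behövs göras inom området. Det var på grund av denna brist som frågeställningen uppkom och antogs i detta arbete. Syfte Syftet med denna studie är att få en djupare förståelse och utforska digitaliseringens påverkan på revisorns yrkesroll Metod Studien använder sig av en kvalitativ forskningsdesign som har utgått från en abduktiv ansats. informationsinsamlingen i empirin har samlats in genom semi-strukturerade intervjuer. Slutsats Studien visar att revisorsyrket påverkas i hög grad av digitaliseringen. 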
Mest centralt är graden av effektivisering, där manuellt arbete har ersatts av automatiserade processer. En ytterligare förändring är att tillgängligheten av större mängder data ger underlag för djupare analyser vilket ger en bättre helhetsbild. Det skapar utrymme för mer värdeskapande arbete mot företaget men även att upptäcka oegentligheter. De nya verktygen kräver en ökad kunskap och möjlighet att ta till sig den nya teknologin, och en ny roll har i och med det skapats, vilket är IT-revisorn. Digitalisering och automatisering frigör tid vilket gör att revisorerna kan utöka erbjudandet med rådgivningstjänster. Digitaliseringen medför även nackdelar; genom automatisering försvinner möjligheten att lära sig grundläggande processer som tidigare varit centralt för revisorn, därmed kan det försvåra förståelsen av helhetsbilden hos företagen. Dessutom kan kundrelationerna påverkas i och med digitaliseringens utveckling då virtuella möten blivit en vanligare del i vardagen. Auditor Audit Audit process Big data future of audit technology risk auditor risk och digitalization. Revisorn revision revisionsprocess big data future of audit technology risk auditor risk och digitalization. Business Administration Företagsekonomi
345	En studie om Big data och personlig integritet : Vad vet studenter om lagring av deras personliga uppgifter? / A study on Big data and personal privacy : What do students' know about storing their personal information? Demirsoy, Delil, Holm, Erik January 2020 (has links) Denna studie handlar om studenters kännedom om de personliga uppgifter som lagras av institutioner inom högre utbildning, och om det finns skillnader mellan kön gällande kännedomen och hanteringen av dessa uppgifter. Då det i samband med den expanderande lagringen av data och användningen av den genom Big data inom organisationen, visat sig ha påverkan på den personliga integriteten. Tidigare forskning indikerar på att det finns en brist i kännedomen och hanteringen hos människor om vad som lagras av organisationer. Tidigare forskning har även indikerat på att det finns skillnader mellan kön i hanteringen och kännedomen om personliga uppgifter som lagras av organisationer. Denna studien avgränsar sig till studenternas kännedom om vad institutioner inom högre utbildning lagrar om dem och skillnader mellan kön angående dessa uppgifter. För denna studie har en forskningsmetod i form av elektroniska enkäter använts, där studenter fått redogöra för deras kännedom och tankar av institutioner inom högre utbildnings lagring och hantering av personliga uppgifter. Syftet var att undersöka vilken kännedom dessa har om de personliga uppgifter som lagras av organisationer och vilken insikt de har om hur de kan användas. Totalt har 151 deltagit i enkätundersökningen där 126 uppgett att de varit studenter. Metoden som använts för studien är en kvantitativ ansats med kvalitativa inslag, där den kvalitativa delen avser de frågor som besvarats i fri text. Vidare är den kvantitativa delen för de frågor som har analyserats genom statistik och siffror. Frågorna från den använda forskningsmetoden i denna studie i form av elektroniska enkäter har bearbetats och slutligen presenterats. 
Under teoriavsnittet redogörs de begrepp som använts i studien, samt en mer ingående redogörelse för integritetens betydelse vid lagring av data. Vidare analyseras resultatet från studien utifrån Petronios CPM-teori och dess fem principer med hänsyn till den personliga integriteten. Arbetet konkluderades med en slutsats om att studenter har en vag form av kännedom gällande de uppgifter som lagras av institutioner inom högre utbildning. Det visade sig att studenter inte upplever att deras kännedom är tillräcklig. Till följd av att de inte upplever att de får tillräcklig med information av institutioner inom högre utbildning. Resultatet från studien visade att studenter hade kännedom om begrepp kopplade till lagring av personliga uppgifter. Det visade sig att det finns skillnader mellan kön i hanteringen av personliga uppgifter, men till följd av avsaknaden av bortfallsanalysen var dessa fynd svårt att verifiera fullt ut. Resultatet indikerade på att ett flertal studenter inte kände sig trygga när institutioner inom högre utbildning samlade information om dem. På grund av att de inte hade kännedom om vad som lagrades. Dock visade resultatet att de flesta tycker att skolan borde ge tydligare information om de personliga uppgifterna som lagras. Vilket i sin tur gjorde att flera kände att de förlorade kontroll över den personliga integriteten. Detta gällande de åsikter om hur tydliga de anser att institutioner inom högre utbildning är vid informering av personliga uppgifter som de lagrar. / This research study examines students' knowledge of personal data stored by institutions of higher education as well as, whether there are differences between the genders regarding the knowledge and the management of this stored data. This is connected to the expanding storage of data and the use of it through Big data within the organisations where it was shown to have impact on the personal integrity. 
Previous studies report a knowledge gap within society regarding what information is stored. In addition, research has shown differences between genders in knowledge and control of personal data. Therefore, this study focuses on students' knowledge of what higher education institutions store about them, and on whether there are differences between genders. The study applies a quantitative research method in the form of electronic questionnaires for data collection. The questionnaires, handed out to students, contained questions about students' knowledge of and views on their institution's storage and management of their personal data. A total of 151 people participated, of whom 126 stated that they were students. Moreover, the study includes some elements of qualitative research, as some of the questions could be answered in free text. The qualitative and quantitative results were then analyzed against Petronio's CPM theory and its five principles regarding personal integrity. The results showed that students have only a vague knowledge of the data stored by institutions of higher education. The research also indicated that there are differences between the sexes in the handling of personal data. However, the lack of a dropout analysis makes these findings difficult to verify fully. The results also showed that several students did not feel secure when institutions of higher education stored personal data about them. This is because they feel that their knowledge of what is being stored is insufficient, which in turn led them to feel a lack of control over their own personal integrity. 
Thus, results showed that most people think that the educational institutions should provide more specific information about the personal data that they store about them. Organization student’s knowledge Big data privacy data storage privacy principles personal information Organisation studenters kännedom Big data integritet datalagring integritets principer personlig information Information Systems
346	Analyzing Small Businesses' Adoption of Big Data Security Analytics Mathias, Henry 01 January 2019 (has links) Despite the increased cost of data breaches due to advanced, persistent threats from malicious sources, the adoption of big data security analytics among U.S. small businesses has been slow. Anchored in a diffusion of innovation theory, the purpose of this correlational study was to examine ways to increase the adoption of big data security analytics among small businesses in the United States by examining the relationship between small business leaders' perceptions of big data security analytics and their adoption. The research questions were developed to determine how to increase the adoption of big data security analytics, which can be measured as a function of the user's perceived attributes of innovation represented by the independent variables: relative advantage, compatibility, complexity, observability, and trialability. The study included a cross-sectional survey distributed online to a convenience sample of 165 small businesses. Pearson correlations and multiple linear regression were used to statistically understand relationships between variables. There were no significant positive correlations between relative advantage, compatibility, and the dependent variable adoption; however, there were significant negative correlations between complexity, trialability, and the adoption. There was also a significant positive correlation between observability and the adoption. The implications for positive social change include an increase in knowledge, skill sets, and jobs for employees and increased confidentiality, integrity, and availability of systems and data for small businesses. Social benefits include improved decision making for small businesses and increased secure transactions between systems by detecting and eliminating advanced, persistent threats. 
adoption of security analytics big data analytics big data security analytics security small business small business adoption of technology Computer Sciences Databases and Information Systems Library and Information Science
347	Interopérabilité des systèmes distribués produisant des flux de données sémantiques au profit de l'aide à la prise de décision / Interoperability of distributed systems producing semantic data stream for decision-making Belghaouti, Fethi 26 January 2017 (has links) Internet est une source infinie de données émanant de sources telles que les réseaux sociaux ou les capteurs (domotique, ville intelligente, véhicule autonome, etc.). Ces données hétérogènes et de plus en plus volumineuses, peuvent être gérées grâce au web sémantique, qui propose de les homogénéiser et de les lier et de raisonner dessus, et aux systèmes de gestion de flux de données, qui abordent essentiellement les problèmes liés au volume, à la volatilité et à l’interrogation continue. L’alliance de ces deux disciplines a vu l’essor des systèmes de gestion de flux de données sémantiques RSP (RDF Stream Processing systems). L’objectif de cette thèse est de permettre à ces systèmes, via de nouvelles approches et algorithmes à faible coût, de rester opérationnels, voire plus performants, même en cas de gros volumes de données en entrée et/ou de ressources système limitées. Pour atteindre cet objectif, notre thèse s’articule principalement autour de la problématique du : "Traitement de flux de données sémantiques dans un contexte de systèmes informatiques à ressources limitées". Elle adresse les questions de recherche suivantes : (i) Comment représenter un flux de données sémantiques ? Et (ii) Comment traiter les flux de données sémantiques entrants, lorsque leurs débits et/ou volumes dépassent les capacités du système cible ? Nous proposons comme première contribution une analyse des données circulant dans les flux de données sémantiques pour considérer non pas une succession de triplets indépendants mais plutôt une succession de graphes en étoiles, préservant ainsi les liens entre les triplets. 
En utilisant cette approche, nous avons amélioré significativement la qualité des réponses de quelques algorithmes d’échantillonnage bien connus dans la littérature pour le délestage des flux. L’analyse de la requête continue permet d’optimiser cette solution en repèrant les données non pertinentes pour être délestées les premières. Dans la deuxième contribution, nous proposons un algorithme de détection de motifs fréquents de graphes RDF dans les flux de données RDF, appelé FreGraPaD (Frequent RDF Graph Patterns Detection). C’est un algorithme en une passe, orienté mémoire et peu coûteux. Il utilise deux structures de données principales un vecteur de bits pour construire et identifier le motif de graphe RDF assurant une optimisation de l’espace mémoire et une table de hachage pour le stockage de ces derniers. La troisième contribution de notre thèse consiste en une solution déterministe de réduction de charge des systèmes RSP appelée POL (Pattern Oriented Load-shedding for RDF Stream Processing systems). Elle utilise des opérateurs booléens très peu coûteux, qu’elle applique aux deux motifs binaires construits de la donnée et de la requête continue pour déterminer et éjecter celle qui est non-pertinente. Elle garantit un rappel de 100%, réduit la charge du système et améliore son temps de réponse. Enfin, notre quatrième contribution est un outil de compression en ligne de flux RDF, appelé Patorc (Pattern Oriented Compression for RSP systems). Il se base sur les motifs fréquents présents dans les flux qu’il factorise. C’est une solution de compression sans perte de données dont l’interrogation sans décompression est très envisageable. Les solutions apportées par cette thèse permettent l’extension des systèmes RSP existants en leur permettant le passage à l’échelle dans un contexte de Bigdata. 
Elles leur permettent ainsi de manipuler un ou plusieurs flux arrivant à différentes vitesses, sans perdre de leur qualité de réponse et tout en garantissant leur disponibilité au-delà même de leurs limites physiques. Les résultats des expérimentations menées montrent que l’extension des systèmes existants par nos solutions améliore leurs performances. Elles illustrent la diminution considérable de leur temps de réponse, l’augmentation de leur seuil de débit de traitement en entrée tout en optimisant l’utilisation de leurs ressources systèmes / Internet is an infinite source of data coming from sources such as social networks or sensors (home automation, smart city, autonomous vehicle, etc.). These heterogeneous and increasingly large data can be managed through semantic web technologies, which propose to homogenize and link these data and to reason over them, and through data stream management systems, which mainly address the problems related to volume, volatility and continuous querying. The alliance of these two disciplines has given rise to semantic data stream management systems, also called RSP (RDF Stream Processing) systems. The objective of this thesis is to allow these systems, via new approaches and low-cost algorithms, to remain operational, or even become more efficient, for large input data volumes and/or limited system resources. To reach this goal, our thesis focuses mainly on the issue of "Processing semantic data streams in a context of computer systems with limited resources". It addresses the following research questions: (i) How to represent a semantic data stream? and (ii) How to process incoming semantic data streams when their rates and/or volumes exceed the capabilities of the target system? As a first contribution, we propose an analysis of the data in semantic data streams that considers a succession of star graphs instead of a succession of independent triples, thus preserving the links between the triples. 
By using this approach, we significantly improved the quality of the responses of some well-known sampling algorithms for load-shedding. The analysis of the continuous query allows this solution to be optimized by selecting the irrelevant data to be shed first. In the second contribution, we propose an algorithm for detecting frequent RDF graph patterns in semantic data streams, called FreGraPaD (Frequent RDF Graph Patterns Detection). It is a one-pass, memory-oriented and low-cost algorithm. It uses two main data structures: a bit vector to build and identify the RDF graph pattern, which optimizes memory usage, and a hash table for storing the patterns. The third contribution of our thesis is a deterministic load-shedding solution for RSP systems, called POL (Pattern Oriented Load-shedding for RDF Stream Processing systems). It applies very low-cost boolean operators to the binary patterns built from the data and from the continuous query in order to determine which data are irrelevant and eject them upstream of the system. It guarantees a recall of 100%, reduces the system load and improves response time. Finally, in the fourth contribution, we propose Patorc (Pattern Oriented Compression for RSP systems), an online compression tool for RDF streams. It is based on the frequent patterns present in RDF data streams, which it factorizes. It is a lossless compression solution that could plausibly be queried without decompression. This thesis provides solutions that extend existing RSP systems and enable them to scale in a big data context. These solutions thus allow RSP systems to handle one or more semantic data streams arriving at different speeds without losing response quality, while ensuring their availability even beyond their physical limitations. 
The conducted experiments, supported by the obtained results show that the extension of existing systems with the new solutions improves their performance. They illustrate the considerable decrease in their engine’s response time, increasing their processing rate threshold while optimizing the use of their system resources Flux de données sémantiques Donnée liées Big data SPARQL continu Détection de motifs fréquents Compression Echantillonnage Semantic data streams Linked Data Big data Continuous SPARQL Frequent patterns detection Compression Sampling
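The bit-vector idea behind the pattern detection and load-shedding contributions of entry 347 can be sketched as follows. This is an illustrative reading of the abstract, not the FreGraPaD or POL algorithms themselves. It assumes that triples are grouped into star graphs by subject within one window, that a pattern is the set of predicates of a star graph, and that the continuous query is a single star that needs all of its predicates. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def predicate_bits(predicates, index):
    """Encode predicates as an integer bit vector; `index` grows on demand."""
    bits = 0
    for predicate in predicates:
        bits |= 1 << index.setdefault(predicate, len(index))
    return bits


def star_graphs(triples):
    """Group (subject, predicate, object) triples into one star per subject."""
    stars = defaultdict(list)
    for subject, predicate, obj in triples:
        stars[subject].append((predicate, obj))
    return stars


def shed(triples, query_predicates):
    """Return subjects whose star graph can match the query, plus pattern counts.

    Assumption: a star graph is relevant only if it carries every predicate
    the query asks for, which one AND and one comparison can check.
    """
    index = {}
    query = predicate_bits(query_predicates, index)
    counts = Counter()  # hash table: bit pattern -> number of occurrences
    kept = []
    for subject, edges in star_graphs(triples).items():
        pattern = predicate_bits((p for p, _ in edges), index)
        counts[pattern] += 1
        if pattern & query == query:
            kept.append(subject)
    return kept, counts
```

Under these assumptions, only star graphs that cannot satisfy the query are dropped, so nothing relevant to the query is lost. The counter gives the frequency of each pattern as a by-product of the same pass.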
348	Optimisation de requêtes spatiales et serveur de données distribué - Application à la gestion de masses de données en astronomie / Spatial Query Optimization and Distributed Data Server - Application in the Management of Big Astronomical Surveys Brahem, Mariem 31 January 2019 (has links) Les masses de données scientifiques générées par les moyens d'observation modernes, dont l’observation spatiale, soulèvent des problèmes de performances récurrents, et ce malgré les avancées des systèmes distribués de gestion de données. Ceci est souvent lié à la complexité des systèmes et des paramètres qui impactent les performances et la difficulté d’adapter les méthodes d’accès au flot de données et de traitement. Cette thèse propose de nouvelles techniques d'optimisations logiques et physiques pour optimiser les plans d'exécution des requêtes astronomiques en utilisant des règles d'optimisation. Ces méthodes sont intégrées dans ASTROIDE, un système distribué pour le traitement de données astronomiques à grande échelle. ASTROIDE allie la scalabilité et l’efficacité en combinant les avantages du traitement distribué en utilisant Spark avec la pertinence d’un optimiseur de requêtes astronomiques. Il permet l'accès aux données à l'aide du langage de requêtes ADQL, couramment utilisé. Il implémente des algorithmes de requêtes astronomiques (cone search, kNN search, cross-match, et kNN join) en exploitant l'organisation physique des données proposée. En effet, ASTROIDE propose une méthode de partitionnement des données permettant un traitement efficace de ces requêtes grâce à l'équilibrage de la répartition des données et à l'élimination des partitions non pertinentes. Ce partitionnement utilise une technique d’indexation adaptée aux données astronomiques, afin de réduire le temps de traitement des requêtes. / The big scientific data generated by modern observation telescopes raise recurring performance problems, in spite of advances in distributed data management systems.
The main reasons are the complexity of the systems and the difficulty of adapting the access methods to the data. This thesis proposes new physical and logical optimization techniques that improve the execution plans of astronomical queries using transformation rules. These methods are integrated in ASTROIDE, a distributed system for large-scale astronomical data processing. ASTROIDE achieves scalability and efficiency by combining the benefits of distributed processing using Spark with the relevance of an astronomical query optimizer. It supports data access through the commonly used query language ADQL. It implements astronomical query algorithms (cone search, kNN search, cross-match, and kNN join) tailored to the proposed physical data organization. Indeed, ASTROIDE offers a data partitioning technique that allows efficient processing of these queries by ensuring load balancing and eliminating irrelevant partitions. This partitioning uses an indexing technique adapted to astronomical data, in order to reduce query processing time. Bases de données astronomiques Big Data Optimisation de requêtes Systèmes distribués Partitionnement Spark Astronomical Databases Big Data Query optimization Distributed systems Data partitioning Spark 005.74
349	Datadriven affärsanalys : en studie om värdeskapande mekanismer / Data-driven business analysis : a study about value creating mechanisms Adamsson, Anton, Jönsson, Julius January 2021 (has links) Affärsanalys är en ökande trend som många organisationer idag använder på grund av potentialen att fastställa värdefulla insikter, ökad lönsamhet och förbättrad operativ effektivitet. Något som visat sig vara problematiskt då det önskade resultatet inte alltid är en självklarhet. Syftet med studien är att undersöka hur modeföretag kan använda datadriven affärsanalys för att generera positiva insikter genom värdeskapande mekanismer. Utifrån semistrukturerade intervjuer med anställda på ett modeföretag har vi, med utgångspunkt i tidigare forskning, kartlagt hur datadriven affärsanalys brukas för att skapa värde genom att applicera en processmodell på verksamheten. Empirin resulterade i tre värdefulla insikter (1) Det studerade företaget använder affärsanalys för ökad lönsamhet (2) Företagets data tillgångar är tillräckliga för att utvinna värdefulla insikter (3) Vidare såg vi att företaget arbetar med influencers vilket är en ny affärsanalys-funktion som inte definierats i tidigare forskning. / Business analysis is an increasingly popular trend that many organisations use because of its potential to deliver valuable insights, increased profitability and improved operational efficiency. This has proved rather problematic, however, as the desired results are rarely a certainty. The purpose of the study is to examine how fashion retailers can use business analytics to generate positive insights through value-creating mechanisms by applying a process model. Based on semi-structured interviews with employees of a fashion company, and taking previous research as a starting point, we have mapped how business analysis can be used to obtain value.
The empirical study resulted in three valuable insights: (1) The examined organisation uses business analysis to increase profitability. (2) The data assets of the organisation are sufficient to acquire valuable insights. (3) Furthermore, we discovered that the organisation works with influencers, which can be categorised as a business analysis capability not defined in previous research. Big data analytics data analytics big data data-driven business analysis process model Affärsanalys dataanalys stordata datadriven affärsanalys processmodell Computer and Information Sciences Data- och informationsvetenskap
350	Capitalising on Big Data from Space : How Novel Data Utilisation Can Drive Business Model Innovation / Kapitalisera på stora datamängder från rymden : Hur nya sätt att utnyttja data leder till innovation av affärsmodeller Bremström, Maria, Stipic, Susanne January 2019 (has links) Business model innovation has in recent years become more important for firms looking to gain competitive advantage on dynamic markets. Additionally, incorporating data into a firm’s business model has been shown to lead to improved performance. This development has led to interest in the connection between data utilisation and business model innovation. This thesis provides an in-depth case study of a Swedish space firm active within the satellite industry. The firm operates within an increasingly dynamic market, and ongoing disruptions in the form of new market entrants and rapid technological advancements have led to a search for new business opportunities. As a result, novel ways of utilising the increased amounts of data from space are of significant importance. While the firm is still realising profits utilising its incumbent business model, it must simultaneously explore new business opportunities to avoid extinction. The findings show that novel data utilisation, in the form of data processing, leads to business model innovation. Furthermore, the degree of business model transformation is dependent on how many of the business model's underlying elements are affected by data utilisation. The study also concludes that a lack of trial-and-error learning impedes radical innovation efforts and hinders the development of ambidextrous capabilities within the firm. Lastly, the study finds a novel connection between the introduction of large-scale projects and improved ambidextrous capabilities. / Innovation av affärsmodeller har under senare år blivit alltmer viktigt för företag som vill uppnå ökad konkurrenskraft på dynamiska marknader.
Vidare har det visat sig att företag som använder data för att förändra sin affärsmodell når bättre resultat än sina konkurrenter. Detta har lett till ett intresse för kopplingen mellan datautnyttjande och innovation av affärsmodeller. Detta examensarbete består av en fallstudie av ett svenskt rymdföretag, som har del av sin verksamhet inom satellitbranschen. Företaget verkar på en alltmer dynamisk marknad, och pågående störningar i form av nya marknadsaktörer och tekniska framsteg har lett till att företaget nu måste söka efter nya affärsmöjligheter. Som ett resultat av detta blir nya sätt att använda de ökade mängderna data från rymden av stor betydelse. Fastän företaget fortfarande framgångsrikt nyttjar sin befintliga affärsmodell, måste företaget samtidigt undersöka nya affärsmöjligheter för att undvika att hamna efter marknadsutvecklingen. Studiens resultat visar att nya sätt att använda data, i form av databehandling, leder till innovation av företagets affärsmodell. Dessutom beror graden av innovation på hur många av affärsmodellens underliggande byggstenar som påverkas av införandet av data. Studien drar vidare slutsatsen att en brist på lärande genom ’trial-and-error’ inom företaget hindrar radikala innovationsinsatser och leder till begränsade förutsättningar för att hantera organisatorisk ambidexteritet. Slutligen finner studien att storskaliga innovationsprojekt kan förbättra förutsättningarna för organisatorisk ambidexteritet. Business Model Innovation Data-Driven Business Model Innovation Organisational Ambidexterity Satellite Data Big Data Affärsmodellsutveckling innovation datadriven affärsutveckling organisatorisk ambidexteritet satellitdata big data Engineering and Technology Teknik och teknologier
