311 |
Big Data em conteúdo espontâneo não-estruturado da internet como estratégia organizacional de orientação para o mercado / Big data in spontaneous unstructured internet content as an organizational strategy of market orientation. Corrêa Junior, Dirceu Silva Mello, 25 April 2018.
Big data is a social reality with a growing impact on business. However, a survey of executives at large US corporations identified a low capacity to effectively exploit this competitive-intelligence opportunity in their companies. To deepen the understanding of this context from the perspective of Market Orientation, this dissertation presents an exploratory analysis of the current capacity of large companies operating nationally to absorb value from big data, focusing on a specific type of content: unstructured data. The companies studied were found to be at a peculiar moment for the modern management of Market Orientation, a kind of evolutionary and transitional process in understanding and exploiting this deluge of data. This moment of adaptation is further reinforced by a trend toward using more spontaneous consumer data. The study first presents five dimensions of this peculiar moment, systematically addressing issues related to internal organization; suppliers and investment profiles; internal adaptations; and other strategic findings. It then details the current path toward an effective understanding of big data, based on the practices identified in this business context.
|
312 |
Modélisation NoSQL des entrepôts de données multidimensionnelles massives / Modeling Multidimensional Data Warehouses into NoSQL. El Malki, Mohammed, 08 December 2016.
Decision-support systems play a central role in companies and large organizations, enabling analyses dedicated to decision making. With the advent of big data, the volume of data to analyze reaches critical sizes, challenging conventional data-warehousing approaches, whose current solutions are mainly based on R-OLAP databases. With the emergence of major Web platforms such as Google, Facebook, Twitter and Amazon, many solutions for processing big data have been developed under the name "Not Only SQL" (NoSQL). These new approaches are an interesting avenue for building multidimensional data warehouses capable of handling large volumes of data, and questioning the R-OLAP approach requires revisiting the principles of multidimensional data-warehouse modeling. This manuscript proposes implementation processes for multidimensional data warehouses with NoSQL models, defining four processes for each of two NoSQL models: a column-oriented model and a document-oriented model. Moreover, the NoSQL context complicates the efficient computation of the pre-aggregates (the aggregate lattice) that are typically set up in the R-OLAP context; the implementation processes are therefore extended to cover the construction of the lattice in both retained models. Because it is difficult to choose a single NoSQL implementation that efficiently supports all applicable workloads, two translation processes are proposed: the first covers intra-model processes, i.e., rules for converting one implementation into another implementation of the same NoSQL logical model, while the second defines rules for transforming an implementation of one logical model into an implementation of another logical model.
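To make the two retained logical models concrete, the sketch below shows one hypothetical way a single multidimensional fact (a sale described by date, product and store dimensions) could be implanted both as a nested document and as column-family rows, together with a naive pre-aggregate for one node of the lattice. The schema, field names and aggregation are illustrative assumptions, not the processes defined in the thesis.

```python
from collections import defaultdict

# Hypothetical fact rows: (date, product, store, amount) -- illustrative only.
facts = [
    ("2016-01-10", "laptop", "Paris", 1200.0),
    ("2016-01-10", "mouse",  "Paris",   25.0),
    ("2016-02-02", "laptop", "Lyon",  1150.0),
]

# Document-oriented implantation: one nested document per fact.
documents = [
    {"date": {"day": d}, "product": {"name": p}, "store": {"city": s}, "measures": {"amount": a}}
    for d, p, s, a in facts
]

# Column-oriented implantation: one row key per fact, attributes grouped by column family.
column_rows = {
    f"fact#{i}": {
        "date":     {"day": d},
        "product":  {"name": p},
        "store":    {"city": s},
        "measures": {"amount": str(a)},  # wide-column stores typically keep values as bytes/strings
    }
    for i, (d, p, s, a) in enumerate(facts)
}

# One node of the aggregate lattice: total amount by product, all other dimensions rolled up.
pre_aggregate = defaultdict(float)
for _, p, _, a in facts:
    pre_aggregate[p] += a

print(documents[0])
print(column_rows["fact#0"])
print(dict(pre_aggregate))  # {'laptop': 2350.0, 'mouse': 25.0}
```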
|
313 |
Modelagem de sistemas de informação para a mineração de processos: características e propriedades das linguagens / Information systems modeling for process mining: characteristics and properties of languages. Teixeira Junior, Gilmar, 03 May 2017.
Storing information in large data repositories (big data) creates opportunities for organizations to use process-mining techniques to extract knowledge about the performance and actual flow of their business processes. One of the fundamental elements for achieving this goal is the relationship between process modeling languages, the recording of process events (logs) and process-mining algorithms. In this work, three languages commonly used to model business processes (BPMN, Petri nets and YAWL) were compared with respect to their usefulness for process mining, in particular for process discovery. The models were based on typical workflow patterns, and five scenarios were simulated for each language using three process-discovery algorithms (Alpha, Heuristic Miner and ILP Miner). The results indicate that the choice of language used for modeling and for recording business processes influences the quality of the results obtained by process-discovery algorithms. The work also presents suggestions for the development of process modeling languages and process-mining algorithms.
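As an illustration of what a discovery algorithm such as Alpha consumes, the sketch below derives the directly-follows and footprint relations from a toy event log. The traces are invented for illustration, and the code covers only the first step of the Alpha algorithm, not the full discovery pipeline evaluated in the thesis.

```python
# Hypothetical event log: each trace is an ordered list of activity labels.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "e", "d"],
]

# Directly-follows relation: x > y iff y immediately follows x in some trace.
directly_follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}

activities = sorted({act for trace in log for act in trace})

def footprint(x, y):
    """Alpha-algorithm footprint: ->, <-, || (parallel) or # (no relation)."""
    xy, yx = (x, y) in directly_follows, (y, x) in directly_follows
    if xy and not yx:
        return "->"
    if yx and not xy:
        return "<-"
    if xy and yx:
        return "||"
    return "#"

# Print the footprint matrix; the Alpha algorithm builds places from these relations.
print("   " + " ".join(f"{a:>2}" for a in activities))
for x in activities:
    print(f"{x:>2} " + " ".join(f"{footprint(x, y):>2}" for y in activities))
```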
|
314 |
Uma nova arquitetura para Internet das Coisas com análise e reconhecimento de padrões e processamento com Big Data / A novel Internet of Things architecture with pattern recognition and big data processing. Alberto Messias da Costa Souza, 16 October 2015.
The Internet of Things (IoT) is a new communication paradigm that extends the virtual world (the Internet) to interface and interact with objects in the physical world. It will comprise a large number of heterogeneous interconnected devices generating a huge volume of data, and one of the key challenges for its development is storing and processing that volume within acceptable time limits. This research addresses the challenge by introducing pattern recognition and analysis services into the lower layers of the IoT reference model, reducing the processing required at the higher layers. The research analyzes IoT reference models and middleware platforms for developing applications in this context. The implemented architecture extends the LinkSmart middleware with a pattern recognition module that provides algorithms for value estimation, outlier detection and clustering of raw data coming from data sources. The new module is integrated with the Hadoop big data platform and uses the algorithm implementations of the Mahout framework. The work also highlights the importance of cross-layer communication integrated into this architecture. The experiments used real databases from the SmartSantander project to validate the new IoT architecture, its pattern recognition and analysis services, and the cross-layer communication.
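The kind of per-sensor processing such a module could push toward the lower layers can be as simple as a z-score outlier test over recent readings. The sketch below is a minimal illustration with invented readings and thresholds, not the Mahout-based implementation described in the thesis.

```python
import math

def zscore_outliers(readings, threshold=3.0):
    """Flag readings whose z-score exceeds the threshold (assumes roughly Gaussian noise)."""
    n = len(readings)
    mean = sum(readings) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in readings) / n) or 1.0
    return [(i, x) for i, x in enumerate(readings) if abs(x - mean) / std > threshold]

# Hypothetical temperature readings from one city sensor (values invented).
temperatures = [21.4, 21.6, 21.5, 21.7, 35.2, 21.5, 21.6, 21.4, 21.8, 21.5]
print(zscore_outliers(temperatures, threshold=2.0))  # [(4, 35.2)]
```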
|
315 |
Computer vision for continuous plankton monitoring / Visão computacional para o monitoramento contínuo de plâncton. Matuszewski, Damian Janusz, 04 April 2014.
Plankton microorganisms constitute the base of the marine food web and play a great role in the global drawdown of atmospheric carbon dioxide. Moreover, being very sensitive to environmental changes, they allow such changes to be noticed (and potentially counteracted) faster than by any other means. As such, they not only influence the fishing industry but are also frequently used to analyze changes in exploited coastal areas and the influence of these interferences on the local environment and climate. Consequently, there is a strong need for highly efficient systems allowing long-term, large-volume observation of plankton communities, which would provide a better understanding of the role of plankton in the global climate and help maintain the fragile environmental equilibrium. The sensors typically adopted produce huge amounts of data that must be processed efficiently, without intensive manual work by specialists. A new system for general-purpose particle analysis in large volumes is presented. It has been designed and optimized for the continuous plankton monitoring problem; however, it can easily be applied as a versatile tool for analyzing moving fluids, or in any other application in which the targets to be detected and identified move in a unidirectional flux. The proposed system is composed of three stages: data acquisition, target detection and target identification. Dedicated optical hardware records images of small particles immersed in the water flux. Target detection is performed with a visual-rhythm-based method, which greatly accelerates processing and allows a higher volume throughput; the method detects, counts and measures organisms present in the water flux passing in front of the camera. Moreover, the developed software saves cropped plankton images, which not only greatly reduces the required storage space but also provides the input for automatic identification. To ensure maximal performance (up to 720 MB/s), the algorithm was implemented in CUDA for GPGPU. The method was tested on a large dataset and compared with an alternative frame-by-frame approach. The obtained plankton images were used to build a classifier that automatically identifies organisms in plankton analysis experiments; for this purpose, dedicated feature-extraction software was developed. Various subsets of 55 shape characteristics were tested with different off-the-shelf learning models. The best accuracy, approximately 92%, was obtained with Support Vector Machines, a result comparable to the average manual identification performance of experts. This work was developed under the joint supervision of Professor Rubens Lopes (IO-USP).
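The core idea of visual-rhythm detection, reducing each frame to a single sampled line so the whole video becomes one 2-D image in which passing particles leave traces, can be sketched in a few lines. The frame sizes, sampling column and threshold below are illustrative assumptions, not the CUDA implementation evaluated in the thesis.

```python
import numpy as np

def visual_rhythm(frames, column):
    """Stack one pixel column per frame into a 2-D image (rows = time, cols = image height)."""
    return np.stack([frame[:, column] for frame in frames])

def detect_traces(rhythm, threshold=60):
    """Return (frame index, row) coordinates where a bright particle crosses the sampled column."""
    return np.argwhere(rhythm > threshold)

# Synthetic grayscale frames (120 frames of 64x64) with one bright particle drifting through.
rng = np.random.default_rng(0)
frames = rng.integers(0, 20, size=(120, 64, 64)).astype(np.uint8)
for t in range(40, 50):            # the particle crosses the sampled column around frames 40-49
    frames[t, 30:33, 32] = 200

rhythm = visual_rhythm(frames, column=32)
hits = detect_traces(rhythm)
print(rhythm.shape)                # (120, 64): one row per frame
print(hits[:3])                    # first detections, e.g. [[40 30] [40 31] [40 32]]
```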
|
316 |
Visualizing media with interactive multiplex networks / Cartographier les médias avec des réseaux multiplexes interactifs. Ren, Haolin, 14 March 2019.
Nowadays, information follows complex paths: its propagation involves online editors, 24-hour news channels and social media, whose entangled paths act on both content and perception. This thesis studies the adaptation of classical graph measures to multiplex graphs, builds visualizations from several graphical representations of the networks, and combines them (synchronized multi-view visualizations, hybrid representations, etc.). Emphasis is placed on interaction modes that handle the multiplex (multilayer) nature of the networks, and these representations and interactive manipulations also rely on indicators specific to multiplex networks. The work is based on two main datasets: a 12-year archive of the daily Japanese public broadcast NHK News 7, from 2001 to 2013, and a listing of the participants in French TV and radio shows between 2010 and 2015. Two Web-based visualization systems were developed for multiplex network analysis, called "Visual Cloud" and "Laputa". In Visual Cloud, a notion of similarity between concepts and groups of concepts, called co-occurrence possibility (CP), is formally defined, and a hierarchical classification algorithm based on it is proposed. The layers of the multiplex document network are aggregated, and the resulting hierarchy is embedded in an interactive word cloud, with traditional word-cloud layout algorithms improved so as to preserve the constraints imposed by the concept hierarchy. The Laputa system targets the complex analysis of dense, multidimensional temporal networks. To do so, it associates a graph with a segmentation: segmentations by community, by attribute or by time slice form views of the graph, and Sankey diagrams, augmented with a semantic zoom, link these views to the global whole by revealing the evolution of communities. The thesis thus addresses three of the most interesting aspects (the 3Vs) of big data applied to multimedia archives: Volume, since the archives reach orders of magnitude normally impractical for visualization and link exploration; Velocity, because of the inherently temporal nature of the data; and Variety, a corollary of the richness of multimedia data and of everything one may wish to investigate in it. Each of these challenges was answered through multiplex network analysis: these structures are at the heart of the work, whether more discreetly in the criteria used to filter edges with the Simmelian backbone algorithm, in the superposition of time slices, or much more directly in the combination of visual and textual semantic indices from which the hierarchies driving the visualization are extracted.
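A minimal sketch of the kind of layer aggregation Visual Cloud performs is shown below: agglomerative clustering driven by a pairwise co-occurrence similarity. The similarity values and the use of single linkage are illustrative assumptions; the thesis defines its own co-occurrence possibility (CP) measure and hierarchy construction.

```python
# Hypothetical co-occurrence similarity between four concept layers (symmetric, values invented).
sim = {
    frozenset({"politics", "economy"}): 0.8,
    frozenset({"politics", "sports"}): 0.1,
    frozenset({"politics", "weather"}): 0.2,
    frozenset({"economy", "sports"}): 0.15,
    frozenset({"economy", "weather"}): 0.25,
    frozenset({"sports", "weather"}): 0.05,
}

def similarity(a, b):
    """Single-linkage similarity between two clusters: best pairwise concept similarity."""
    return max(sim[frozenset({x, y})] for x in a for y in b)

clusters = [frozenset({c}) for c in ("politics", "economy", "sports", "weather")]
merges = []
while len(clusters) > 1:
    # Greedily merge the most similar pair of clusters.
    i, j = max(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda p: similarity(clusters[p[0]], clusters[p[1]]),
    )
    merges.append((set(clusters[i]), set(clusters[j]), similarity(clusters[i], clusters[j])))
    merged = clusters[i] | clusters[j]
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

for left, right, s in merges:
    print(f"merge {left} + {right} at similarity {s}")
```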
|
317 |
Scalable System-Wide Traffic Flow Predictions Using Graph Partitioning and Recurrent Neural Networks. Reginbald Ivarsson, Jón, January 2018.
Traffic flow predictions are an important part of an intelligent transportation system, as the ability to forecast traffic conditions accurately allows for proactive rather than reactive traffic control. Providing accurate real-time traffic predictions is challenging because of the nonlinear and stochastic features of traffic flow, and the increasingly widespread deployment of traffic sensors in a growing transportation system produces an ever greater volume of traffic flow data, raising problems of fast, reliable and scalable prediction. The thesis explores the feasibility of increasing the scalability of real-time traffic predictions by partitioning the transportation system into smaller subsections. Data collected by Trafikverket from traffic sensors in Stockholm and Gothenburg is used to construct a traffic-sensor graph of the transportation system, and three graph partitioning algorithms are designed to divide this graph according to vehicle travel time. The resulting partitions are then used to train multi-layered long short-term memory (LSTM) recurrent neural networks for traffic density prediction. Four types of models are produced and evaluated on root mean squared error, training time and prediction time: a whole-transportation-system model, partitioned transportation models, single-sensor models, and overlapping-partition models. The results show that partitioning a transportation system is a viable way to produce traffic prediction models, as the average prediction accuracy per traffic sensor is comparable across the different model types. This approach tackles the scalability issues caused by the increased deployment of traffic sensors: each prediction model is responsible for fewer sensors, resulting in less complex models with less input data. A more decentralized and effective solution can also be achieved, since the models can be distributed to the edge of the transportation system, i.e., near the physical location of the traffic sensors, reducing the prediction and response time of the models.
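A minimal sketch of one partitioned traffic-density predictor is given below, assuming Keras and a sliding-window formulation in which the last 12 readings from the sensors of one partition predict the next reading. The layer sizes, window length and sensor count are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np
import tensorflow as tf

WINDOW, N_SENSORS = 12, 8          # 12 past time steps for the 8 sensors of one partition (assumed)

# Synthetic training data standing in for per-partition density readings (values invented).
rng = np.random.default_rng(0)
series = rng.random((1000, N_SENSORS)).astype("float32")
X = np.stack([series[i:i + WINDOW] for i in range(len(series) - WINDOW)])
y = series[WINDOW:]                # predict the next density vector for the partition

# Two stacked LSTM layers followed by a dense output, one value per sensor in the partition.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, N_SENSORS)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(N_SENSORS),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

print(model.predict(X[:1], verbose=0).shape)   # (1, 8): next density estimate per sensor
```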
|
318 |
Optimisation d'infrastructures de cloud computing sur des green datacenters / Infrastructure optimization of cloud computing on green data centers. Safieddine, Ibrahim, 29 October 2015.
Next-generation green datacenters are designed for optimized consumption and an improved quality of the service level (SLA). In recent years, however, the datacenter market has grown rapidly and the concentration of computing power has kept increasing, driving up power and cooling requirements. A datacenter consists of computing resources, cooling systems and power distribution. Many research studies have focused on reducing datacenter consumption to improve the PUE while guaranteeing the same level of service: some aim at dynamically sizing resources according to load in order to reduce the number of powered-on servers, while others seek to optimize the cooling system, which accounts for an important share of total consumption. In this thesis, in order to reduce the PUE, we study an autonomous system for global cooling optimization based on external data sources, such as the outside temperature and weather forecasts, coupled with a global IT load prediction module that absorbs activity peaks, so as to optimize the active resources at a lower cost while preserving the quality of service. To guarantee a better SLA, we propose a distributed architecture that detects complex operating anomalies in real time by analyzing large data volumes coming from the thousands of sensors deployed in the datacenter. Detecting abnormal behavior early allows faster reaction to threats that may impact the quality of service, with autonomous control loops that automate administration. We evaluate the performance of our contributions on data collected from an operational datacenter hosting real applications.
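To make the optimization target concrete, the sketch below computes the PUE from facility and IT power readings and applies a naive free-cooling rule driven by the outside temperature. The thresholds, setpoints and power figures are invented for illustration and are not the control policy developed in the thesis.

```python
def pue(total_facility_kw, it_kw):
    """Power Usage Effectiveness: total facility power divided by IT power (1.0 is ideal)."""
    return total_facility_kw / it_kw

def cooling_mode(outside_temp_c, supply_setpoint_c=24.0, margin_c=4.0):
    """Naive rule: use outside air when it is comfortably below the supply setpoint."""
    if outside_temp_c <= supply_setpoint_c - margin_c:
        return "free-cooling"        # air-side economizer, chillers off
    if outside_temp_c <= supply_setpoint_c:
        return "mixed"               # partial mechanical cooling
    return "mechanical"              # chillers fully on

# Hypothetical hourly readings: (outside temperature, IT load kW, total facility kW).
readings = [(8.0, 400.0, 520.0), (19.0, 420.0, 580.0), (31.0, 430.0, 690.0)]

for temp, it_kw, total_kw in readings:
    print(f"{temp:>5.1f} C  mode={cooling_mode(temp):<12}  PUE={pue(total_kw, it_kw):.2f}")
```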
|
319 |
Sakernas Internet: En studie om vehicular fog computing påverkan i trafiken / Internet of Things: A study on the impact of vehicular fog computing on traffic. Ahlcrona, Felix, January 2018.
Future vehicles will be very different from today's, and much of the change will come from the IoT: the world will be highly connected, and sensors will be able to produce data most of us did not even know existed. More data also means more problems. Enormous amounts of data will be generated and distributed by future IoT devices, and this data needs to be analyzed and stored efficiently using big data principles. Fog computing is a development of cloud technology proposed as a solution to many of the problems IoT suffers from. Are traditional storage and analysis tools sufficient for the huge volume of data that will be produced, or are new technologies needed to support this development? This study tries to answer the question: "What problems and opportunities does the development of fog computing in passenger cars bring for consumers?" The question is answered through a systematic literature study, whose objective is to identify and interpret previous literature and research; the material was analyzed using open coding to sort and categorize the data. The results show that technologies such as IoT, big data and fog computing are deeply intertwined. Future vehicles will carry many IoT devices producing huge amounts of data, and fog computing will be an effective, low-latency way to handle the data from these devices. The opportunities lie in new applications and systems that help improve traffic safety, the environment, and information about the car's condition. Several risks and problems need to be resolved before a full-scale version can be used, such as data authentication, user integrity, and deciding on the most efficient mobility model.
|
320 |
Digitaliseringens påverkan på revision / Digitalization's impact on auditing. Persson, Christian, January 2018.
The current business environment demands financial information that is relevant and reliable, to ease decision-making by managers, investors and employees. Auditing has acted as a controlling body to ensure credible information. The audit industry is one of many industries that are constantly changing due to digitalization, which is considered one of society's strongest global forces of change. The aim of the study is to create an increased understanding of the impact of digitalization on auditing; this is done by answering how the audit process changes due to digitalization and what skills are necessary for auditors in the digital environment. A qualitative research strategy is applied, in which ten semi-structured interviews were conducted with both system developers and auditors. The theoretical framework and the empirical material are structured around the audit process, digitalization and competence needs, and the analysis is based on the respondents' answers and relevant theory. The study implies that the industry is positively influenced by digitalization, with efficiency as one of the main benefits. The audit process is shifting from statistical sampling to data analysis of companies' entire data volumes; manual operations are eliminated, giving auditors more time for consulting. The consultant role requires more qualified knowledge, and the study therefore also demonstrates a knowledge gap between universities and audit firms. Digitalization has created demand for more qualified staff, which leads to fewer newly graduated people being employed. The study also shows that IT knowledge is one of the key competencies in the future audit industry.
|