91 |
The impact of Big Data on companies and a lack of skills as the origin of the challenges they are facing: An investigation aimed to understand the origin of the challenges companies are facing with Big Data
Ishac, Patrick; Dussoulier, Hannah. January 2018
The 21st century saw the rise of the internet and, with it, the digitalization of our world. Today, many companies rely on technology to run their businesses, and Big Data is one of the latest phenomena to arise from technological evolution. As the amount of data, ranging from business intelligence to personal information, keeps increasing, Big Data has become a major source of competitive advantage for companies that are able to implement it efficiently. However, as with every new technology, challenges and issues arise. Moreover, the learning curve is steep, and companies need to adapt quickly in order to keep pace with innovation and develop the skill set of their employees so as to remain competitive in their industries. This paper investigates how Big Data is impacting companies, the main challenges they face in implementing it, and whether these challenges originate from a lack of skills in the current workforce. A qualitative study was conducted, interviewing nine respondents over eight interviews of 54 minutes on average. Three main ideas emerged from the interviews. The first is the impact of Big Data on companies, mainly its benefits, challenges and regulations, as well as the cohabitation of human beings and technology. The second and third are the optimal profile of a decision-maker and the ideal profile of an employee in companies working with Big Data. Both profiles are composed of characteristics, skills and experience. The decision-maker is defined in this paper as a key actor in the success or failure of a company and as having great influence on the profile of the employee. The decision-maker's skills (strategic, basic, analytical, communication and decision-making) were developed, and their correlation was demonstrated. Ultimately, the lack of skills in companies today, which numerous scholars regard as one challenge among others, was shown to be the origin of many of the challenges companies are facing, mainly through bad decision-making and lack of communication. Finally, the authors outline steps for a successful implementation of Big Data in companies, as well as future trends, such as regulations and accelerating technological evolution, that people and businesses alike should follow carefully and actively.
|
92 |
Utilização de big data analytics nos sistemas de medição de desempenho: estudos de caso / Use of big data analytics in performance measurement systems: case studies
Mello, Raquel Gama Soares de. 12 February 2015
Funding agency: Financiadora de Estudos e Projetos. / Big data refers to large volumes of data of different types that arrive rapidly from different sources, are trustworthy (veracity) and can add value to a business. Many companies are now looking for ways to extract useful information from this data, which can be done by applying analytical techniques. Applying these techniques to big data is called big data analytics. It can influence how managers make decisions and run the business, and this in turn affects the use of performance measurement systems (PMSs), which consist of a multidimensional set of performance measures that support decision making and business planning. Performance measurement systems and big data analytics can therefore be used together to support decision making and the implementation of actions. There is evidence in the literature that big data analytics can be used in performance measurement systems. In this context, the study investigates how companies apply big data analytics when using performance measurement systems. To that end, a systematic literature review was first carried out to identify existing studies on the relationship between big data analytics and performance measurement systems. An exploratory multiple case study was then conducted. The empirical findings showed that big data analytics supports the decision-making process, making it more efficient and effective. The results showed that big data analytics helps a PMS identify, through analysis, how past actions can influence future performance. These analyses are descriptive and predictive, and they were applied in the sales process. The case studies showed that big data analytics contributes mainly to the uses of PMSs related to planning and to influencing behavior. It is therefore possible to conclude that big data analytics contributes to performance measurement systems.
|
93 |
Marknadsföringsmixens fortsatta betydelse, med hänsyn till digitaliseringen: En systematisk litteraturstudie / The continued importance of the marketing mix in light of digitalization: A systematic literature study
Lager, Joel; Hermansson, Pontus. January 2018
Purpose: The purpose of this study is to discuss how the marketing mix has been able to retain its importance over time in the field of marketing, given the changes that digitalization has brought. Method: The study was conducted as a systematic literature study. It is based on more than 50 scientific articles relevant to the purpose, collected through academic databases. Result: The study shows that the marketing mix remains a current model in marketing thanks to its pedagogical simplicity and its ability to adapt to prevailing conditions. The four Ps still stand for Product, Price, Place and Promotion, but the change lies in what is included in the ever-growing and changing subcategories. What opponents of the marketing mix regard as its weakness, that the criteria on which the categories rest have never been specified, also appears to be the model's strength. Without the requested specification, the four Ps can be adapted to the user and to prevailing conditions, which has allowed the model to retain its importance despite digitalization and the new conditions.
|
94 |
Big Data och Hadoop: Nästa generation av lagring / Big Data and Hadoop: The next generation of storage
Lindberg, Johan. January 2017
The goal of this report and study is to determine, at a theoretical level, the possibilities for Försäkringskassan IT to change the platform used to store the data needed in its daily activities. Försäkringskassan collects immense amounts of data every day, including personal information, program code, payments and customer service tickets. Today, everything is stored in large relational databases, which leads to problems with scalability and performance. The new platform studied in this report is built on a storage technology named Hadoop. Hadoop is designed to store and process data in a distributed fashion across clusters of commodity server hardware. The platform promises near-linear scalability, the ability to store all data with high fault tolerance, and the capacity to handle massive amounts of data. The study combines theoretical studies with a proof of concept. The theoretical studies focus on the background of Hadoop, its structure and what to expect in the future. The platform used at Försäkringskassan today is specified and compared with the new platform. The proof of concept is conducted in a test environment at Försäkringskassan running a Hadoop platform from Hortonworks. Its purpose is to show how data is stored and that unstructured data can be stored. The study finds no theoretical problems, so a move to the new platform should be possible. It does, however, shift the handling of data from before storage (ingestion) to after it (retrieval). This is because today's platform relies on relational databases, which require data to be well structured before it can be stored. Hadoop, by contrast, stores all data regardless of structure but requires more work and knowledge to retrieve the data and work with it.
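The shift from structuring data before storage to structuring it at retrieval is often called schema-on-read. The following is a minimal sketch of that idea in plain Python, not Hadoop code and not the thesis's proof of concept; `raw_store`, `write_raw`, `read_payments` and the record layout are hypothetical names chosen for illustration.

```python
import json

# Hypothetical raw store: records are appended as-is, with no schema check.
raw_store = []

def write_raw(record_text):
    """Write path: accept any text, structured or not."""
    raw_store.append(record_text)

def read_payments():
    """Read path: impose structure only now, skipping records that do not fit."""
    payments = []
    for text in raw_store:
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            continue  # unstructured line, not JSON
        if isinstance(record, dict) and record.get("type") == "payment":
            payments.append(float(record["amount"]))
    return payments

write_raw('{"type": "payment", "amount": "120.50"}')
write_raw("free-text customer service note")
write_raw('{"type": "ticket", "id": 7}')
write_raw('{"type": "payment", "amount": 80}')
```

Every record is stored, but all the parsing effort sits in `read_payments`, which mirrors the extra retrieval work the abstract attributes to Hadoop.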
|
95 |
Big Data, capacitações dinâmicas e valor para o negócio / Big data, dynamic capabilities and business value
Michel Lens Seller. 17 May 2018
The combination of social media, mobility and cloud computing technologies has dramatically increased the volume, variety and velocity of data available to firms. Many companies have begun to see this phenomenon, known as Big Data, as a source of business value. The literature identifies several mechanisms by which Big Data is transformed into business value. The first is agility, understood here as the ability to sense and rapidly respond to changes and opportunities in the competitive environment. Another is the use of Big Data as an enabler of dynamic capabilities that result in operational improvements, through the deepening (exploitation) of a specific operational capability. Finally, Big Data can enable dynamic capabilities that result in innovation (exploration of new capabilities) and in the launch of new products and services. Within this context, the study investigates how Big Data is used by companies operating in different competitive contexts and with different levels of IT capability. A further objective is to understand how companies have adapted their business processes to incorporate the large volume of data available to them. Through case studies of large companies in different industries and with widely varying use of Big Data, the study shows Big Data acting as an enabler of dynamic capabilities that improve operational capabilities, diversify the business and drive innovation. It also identifies a trend toward coupling machine learning with Big Data solutions when the objective is operational agility. IT capability also proves to be a determinant of the number and complexity of the competitive actions that companies launch using Big Data. Finally, it is possible to anticipate that, thanks to the simplification brought by cloud technologies, IT resources will increasingly be freed to work closely with the business (for example, on Big Data initiatives), strengthening the company's dynamic capabilities and generating competitive advantage.
|
96 |
Adoption of Big Data And Cloud Computing Technologies for Large Scale Mobile Traffic Analysis / L’adoption des technologies Big Data et Cloud Computing dans le cadre de l’analyse des données de trafic mobile
Ribot, Stephane. 23 September 2016
A new economic paradigm is emerging as enterprises generate and manage increasing amounts of data and look to technologies such as cloud computing and Big Data to improve data-driven decision making and, ultimately, performance. Mobile service providers are an example of firms looking to monetize the mobile data they collect. This thesis explores the determinants of cloud computing adoption and of Big Data adoption at the user level, and addresses the following research question: to what extent do cloud computing and Big Data technologies contribute to the tasks carried out by data scientists? Following a hypothetico-deductive approach informed by classical adoption theories, the hypotheses and the conceptual model draw on Goodhue's task-technology fit (TTF) model. The proposed factors include Big Data and cloud computing, the task, the technology, the individual, TTF, utilization and realized impacts. The thesis tests seven hypotheses that specifically address weaknesses of previous models. We employ a quantitative research methodology, operationalized through a cross-sectional survey so that temporal consistency could be maintained for all variables; 169 researchers contributing to mobile data analysis were surveyed. The results, analyzed using partial least squares (PLS) structural equation modeling (SEM), support the TTF model and show positive relationships between individual, technology and task factors and TTF for mobile data analysis. Our research makes two contributions: the development of a new TTF construct, the task-Big Data/cloud computing technology fit model, and the testing of that construct in a model that overcomes the rigidity of the original TTF model by addressing technology through five subconstructs related to technology platform (Big Data) and technology infrastructure (cloud computing intention to use). These findings give mobile service providers direction for implementing cloud-based Big Data tools in order to enable data-driven decision making and monetize the output of mobile data traffic analysis.
|
97 |
Managing consistency for big data applications: tradeoffs and self-adaptiveness / Gérer la cohérence pour les applications big data : compromis et auto-adaptabilité
Chihoub, Houssem Eddine. 10 December 2013
In the era of Big Data, data-intensive applications handle extremely large volumes of data while requiring fast processing times. Many such applications run in the cloud in order to benefit from cloud elasticity, easy on-demand deployment and cost-efficient pay-as-you-go usage. In this context, replication is an essential feature in the cloud for dealing with Big Data challenges. Replication enables high availability through multiple replicas, fast access to local replicas, fault tolerance and disaster recovery. However, replication introduces the major issue of keeping data consistent across copies. Consistency management is critical for Big Data systems. Strong consistency models seriously limit system scalability and performance because of the synchronization they require. In contrast, weak and eventual consistency models reduce the performance overhead and enable high levels of availability. However, under certain scenarios these models may tolerate too much temporary inconsistency. In this Ph.D. thesis, we address these consistency tradeoffs in large-scale Big Data systems and applications. We first focus on consistency management at the storage system level. We propose an automated self-adaptive model (named Harmony) that scales the consistency level, and the number of replicas involved in operations, up or down at runtime in order to provide the highest possible performance while preserving the application's consistency requirements. In addition, we present a thorough study of the impact of consistency management on the monetary cost of running in the cloud, and we use this study to propose a cost-efficient consistency tuning approach (named Bismar). In a third direction, we study the impact of consistency management on energy consumption within the data center. Based on our findings, we investigate adaptive configurations of the storage system cluster that target energy savings. To complement our system-side study, we then turn to the application level. Applications differ, and so do their consistency requirements, which cannot be understood at the storage system level. We therefore propose a way of modeling application behavior that captures the consistency requirements of an application. Based on this model, we propose an online prediction approach (named Chameleon) that adapts to application-specific needs and provides customized consistency.
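The tradeoff between strong and eventual consistency can be illustrated with the standard quorum rule for replicated stores: with N replicas, a read contacting R replicas is guaranteed to see the latest acknowledged write to W replicas when R + W > N. The sketch below is only an illustration under stated assumptions; it is not Harmony's actual algorithm, which the abstract does not describe. The function names, the stale-read estimate and the tolerance threshold are all hypothetical.

```python
def overlaps(n, r, w):
    """Quorum rule: read and write replica sets must intersect."""
    return r + w > n

def choose_read_replicas(n, w, estimated_stale_rate, tolerated_stale_rate):
    """Hypothetical policy: read from one replica while the estimated
    stale-read rate is acceptable, otherwise use the smallest read
    quorum that overlaps every write quorum."""
    if estimated_stale_rate <= tolerated_stale_rate:
        return 1          # eventual consistency: fastest reads
    return n - w + 1      # smallest r with r + w > n

# Three replicas, writes acknowledged by one replica.
relaxed = choose_read_replicas(3, 1, estimated_stale_rate=0.02, tolerated_stale_rate=0.05)
strict = choose_read_replicas(3, 1, estimated_stale_rate=0.20, tolerated_stale_rate=0.05)
```

A real self-adaptive system would presumably move between these extremes gradually and would need a runtime estimate of staleness; here that estimate is simply passed in as a number.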
|
98 |
Uma análise comparativa de ambientes para Big Data: Apache Spark e HPAT / A comparative analysis for Big Data environments: Apache Spark and HPAT
Rafael Aquino de Carvalho. 16 April 2018
This work compares the performance and stability of two Big Data processing frameworks: Apache Spark and the High Performance Analytics Toolkit (HPAT). The comparison used two applications: the sum of the elements of a one-dimensional vector and the K-means clustering algorithm. The experiments were run in distributed and shared-memory environments with different numbers and configurations of virtual machines. From the results we conclude that HPAT performs better than Apache Spark in our case studies. We also analyze both frameworks in the presence of failures.
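For readers unfamiliar with the two benchmark workloads, the sketch below shows them in plain single-machine Python. It uses neither the Spark nor the HPAT API and is not the code benchmarked in the thesis; the function names, the choice of 2-D points and the fixed initial centroids are assumptions made for illustration. K-means is implemented here as Lloyd's algorithm.

```python
def vector_sum(values):
    """Sum the elements of a one-dimensional vector."""
    total = 0.0
    for v in values:
        total += v
    return total

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids

points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers = kmeans(points, [(0, 0), (10, 10)])
```

Both workloads are dominated by a pass over the data followed by a reduction, which is presumably why they are convenient for comparing distributed frameworks.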
|
99 |
Housing and discrimination in economics: an empirical approach using Big Data and natural experiments / Logement et discrimination en économie : une approche empirique mêlant expérience naturelle et Big Data
Eyméoud, Jean-Benoît. 24 October 2018
The first chapter documents a key parameter for understanding the housing market: the elasticity of housing supply in French urban areas. We show that this elasticity can be understood in two ways, by considering the intensive and the extensive supply of housing. Using a large amount of newly collected data and an original estimation strategy, this chapter estimates and decomposes the two elasticities. The second chapter is devoted to the possibilities offered by Big Data for studying the French rental housing market. By using online data from December 2015 to June 2017 and comparing them with traditional administrative data, we show that the Internet provides data that make it possible to track local real estate markets accurately. The third chapter deals with discrimination against women in politics. It exploits a natural experiment, the French departmental elections of 2015, in which, for the first time in the history of French elections, candidates had to run in mandatory mixed-gender pairs. Using the fact that the order in which candidates appeared on the ballot was determined alphabetically, and showing that parties do not appear to have used this rule strategically, we show, first, that the position of women on the ballot is random and, second, that right-wing pairs in which the female candidate's name comes first on the ballot receive on average 1.5 percentage points fewer votes.
|
100 |
Real-time probabilistic reasoning system using Lambda architecture
Anikwue, Arinze. January 2019
Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2019 / The proliferation of data from sources such as social media and sensor devices has become overwhelming for traditional data storage and analysis technologies. This has prompted a radical improvement in data management techniques, tools and technologies to meet the increasing demand for effective collection, storage and curation of large data sets. Most of these technologies are open source.
Big data is usually described as very large datasets. However, a major feature of big data is its velocity. Data flows in as a continuous stream and must be acted on in real time to yield meaningful, relevant value. Although there has been an explosion of technologies for handling big data, they usually target either large historic datasets or real-time big data, but not both. Hence the need for a unified framework that handles high-volume datasets and real-time big data together. This resulted in the development of models such as the Lambda architecture.
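The Lambda architecture is commonly described as three layers: a batch layer that recomputes views from the full historic dataset, a speed layer that updates a view incrementally as new events arrive, and a serving layer that merges the two at query time. The sketch below illustrates that split with an event counter in plain Python; it is not the system built in the thesis, and `batch_view`, `SpeedLayer`, `query` and the event layout are hypothetical.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute counts from the full historic dataset."""
    return Counter(event["sensor"] for event in master_dataset)

class SpeedLayer:
    """Speed layer: keep an incremental view of events not yet batched."""

    def __init__(self):
        self.realtime_view = Counter()

    def ingest(self, event):
        self.realtime_view[event["sensor"]] += 1

def query(sensor, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[sensor] + speed.realtime_view[sensor]

historic = [{"sensor": "a"}, {"sensor": "a"}, {"sensor": "b"}]
batch = batch_view(historic)
speed = SpeedLayer()
speed.ingest({"sensor": "a"})  # arrives after the last batch run
```

In a full deployment the real-time view would be reset once a new batch run has absorbed its events; that bookkeeping is omitted here.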
Effective decision-making requires processing historic data as well as real-time data. Some decisions involve complex processes that depend on the likelihood of events. Probabilistic systems were designed to handle this uncertainty. They use probabilistic models, such as hidden Markov models, together with inference algorithms to process data and produce probabilistic scores. However, developing these models requires extensive knowledge of statistics and machine learning, which makes modeling real-life circumstances an uphill task. A new research area called probabilistic programming has been introduced to alleviate this bottleneck.
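As an example of the kind of inference such systems perform, the forward algorithm computes the probability of an observation sequence under a hidden Markov model. The sketch below is a minimal pure-Python version; the two-state weather model and all of its probabilities are made-up illustration values, not taken from the thesis.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the HMM using the forward algorithm."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical toy model.
states = ["rainy", "sunny"]
start_p = {"rainy": 0.6, "sunny": 0.4}
trans_p = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
emit_p = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

likelihood = forward(["walk", "shop"], states, start_p, trans_p, emit_p)
```

Writing even this small model by hand requires specifying every probability table and the inference routine, which is the burden probabilistic programming aims to reduce.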
This research proposes combining modern open-source big data technologies with probabilistic programming and the Lambda architecture on readily available hardware to develop a highly fault-tolerant and scalable tool that processes both historic and real-time big data in real time within a single solution. This system will give decision makers the capacity to make better-informed decisions, especially in the face of uncertainty.
The outcome of this research will be a technology product, built and assessed using experimental evaluation methods. This research will use the Design Science Research (DSR) methodology, as it provides guidelines for the effective and rigorous construction and evaluation of an artefact. Probabilistic programming in the big data domain is still in its infancy; however, the developed artefact demonstrated the considerable potential of probabilistic programming combined with the Lambda architecture for processing big data.
|