181 |
Advanced Analytics in Retail Banking in the Czech Republic / Prediktívna analytika v retailovom bankovníctve v Českej republike
Búza, Ján. January 2014
Advanced analytics and big data give a more complete picture of customers' preferences and demands. Through this deeper understanding, organizations of all types are finding new ways to engage with existing and potential customers. Research shows that companies using big data and advanced analytics in their operations have productivity and profitability rates 5 to 6 percent higher than their peers. At the same time, it is almost impossible to find a banking institution in the Czech Republic that exploits the potential of data analytics to its full extent. This thesis therefore explores opportunities for banks that are applicable in the local context, taking into account technological and financial limitations as well as the market situation. The author will conduct interviews with bank managers and management consultants familiar with the topic in order to evaluate theoretical concepts and international best practices from the standpoint of the Czech market environment, to assess the capability of local banks to exploit them, and to identify the main obstacles that stand in the way. On that basis, a general framework will be proposed for bank managers who would like to use advanced analytics.
|
182 |
Implicações do fenômeno big data na análise para inteligência estratégica [Implications of the big data phenomenon for analysis in strategic intelligence]
Nesello, Priscila. 10 April 2014
A large amount of data is produced daily by business and financial operations, social media, mobile devices, sensors, and other equipment embedded in the physical world. This phenomenon gave rise to the term big data, and its effects can be perceived by companies, science, and governments. However, it is strategic intelligence, not information itself, that helps managers extract value from large volumes of data. This requires transforming the information dispersed in the environment into structured knowledge that is useful for decision-making in organizations. The process is complex: despite the tools and techniques available, the intelligence professional must know how to deal with the cognitive complexity inherent in the analysis process. In this context, the objective of this work was to examine how the big data phenomenon affects the analysis process in strategic intelligence activity. The research addressed how the big data phenomenon is perceived by the interviewees in their analytical activities in strategic intelligence and proposed an analysis of its implications. To this end, a qualitative exploratory study was conducted. Brazilian professionals residing in the states of Rio Grande do Sul, Rio de Janeiro, Distrito Federal, and São Paulo were interviewed. The interviewees were selected through agents with activity, knowledge, and connections in the fields of strategic intelligence and/or big data. The interview guide was structured around the dimensions of the big data phenomenon and their effects on analytical activities in the strategic intelligence process. The data were analyzed using content analysis. The results indicate that the volume of big data contributes to the understanding of collection methods but impairs mastery of the subject matter.
Other findings show that, for some interviewees, big data is already part of professional practice, both in carrying out more elaborate analyses and in developing specific projects. For others, however, big data is not yet a reality, and they do not perceive a need to use large volumes of data in their analyses. This also points to a paradox between how knowledge about big data is produced and how it is used in the professional field of strategic intelligence: on the one hand, most of the work on big data comes from the professional field of productive organizations rather than from academia; on the other, intelligence professionals do not yet perceive the value of the phenomenon for their professional practice.
|
183 |
Automated feature synthesis on big data using cloud computing resources
Saker, Vanessa. January 2020
The data analytics process has many time-consuming steps. Combining data that sits in a relational database warehouse into a single relation, while aggregating important information in a meaningful way and preserving relationships across relations, is complex and time-consuming. This step is exceptionally important because many machine learning algorithms require a single file as input (e.g. supervised and unsupervised learning, feature representation and feature learning). An analyst is required to manually combine relations while generating new, more impactful information points from the data during the feature synthesis phase of the feature engineering process that precedes machine learning. Furthermore, the entire process is complicated by Big Data factors such as processing power and distributed data storage. There is an open-source package, Featuretools, that uses an innovative algorithm called Deep Feature Synthesis to accelerate the feature engineering step. However, when working with Big Data, it has two major limitations. The first is the curse of modularity: Featuretools stores data in memory to process it, so large data requires a processing unit with a large memory. Second, the package depends on data stored in a Pandas DataFrame, which makes using Featuretools with Big Data tools such as Apache Spark a challenge. This dissertation examines the viability and effectiveness of using Featuretools for feature synthesis with Big Data on the cloud computing platform AWS. Exploring the impact of generated features is a critical first step in solving any data analytics problem. If this can be automated in a distributed Big Data environment with a reasonable investment of time and funds, data analytics exercises will benefit considerably. In this dissertation, a framework for automated feature synthesis with Big Data is proposed and an experiment is conducted to examine its viability.
Using this framework, an infrastructure was built on AWS to support the process of feature synthesis, making use of S3 storage buckets, Elastic Compute Cloud (EC2) services, and an Elastic MapReduce cluster. A dataset of 95 million customers, 34 thousand fraud cases, and 5.5 million transactions across three different relations was then loaded into the distributed relational database on the platform. The infrastructure was used to show how the dataset could be prepared to represent a business problem, and Featuretools was used to generate a single feature matrix suitable for inclusion in a machine learning pipeline. The results show that the approach was viable. The feature matrix contained 75 features generated from 12 input variables, and the process was time-efficient, with a total end-to-end run time of 3.5 hours and a cost of approximately R 814 (approximately $52). The framework can be applied to a different set of data and allows analysts to experiment on a small section of the data until a final feature set is decided, and then to scale the feature matrix easily to the full dataset. This ability to automate feature synthesis, iterate, and scale up will save time in the analytics process while providing a richer feature set for better machine learning results.
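To make the idea of feature synthesis concrete, the following is a minimal sketch in plain Python, not the Featuretools API and not the dissertation's implementation. It assumes two hypothetical toy relations (customers and transactions) and a hypothetical helper, synthesize_features, that aggregates the child relation into one row per parent, which is the kind of single feature matrix the abstract describes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy relations: a parent table and a child table linked by customer_id.
customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "business"},
]
transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 500.0},
]

# Aggregation primitives applied to the child column for each parent row.
PRIMITIVES = {"sum": sum, "mean": mean, "max": max}


def synthesize_features(parents, children, key, value):
    """Return one row per parent with aggregated child features appended."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[value])

    matrix = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        features = dict(parent)
        features[f"count({value})"] = len(values)
        for name, func in PRIMITIVES.items():
            # Parents without child rows get None instead of a failing aggregate.
            features[f"{name}({value})"] = func(values) if values else None
        matrix.append(features)
    return matrix


feature_matrix = synthesize_features(customers, transactions, "customer_id", "amount")
for row in feature_matrix:
    print(row)
```

A real pipeline would stack such aggregations across several relations and depths; this sketch covers only a single parent-child step.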
|
184 |
Impact of Big Data Analytics in Industry 4.0
Oikonomidi, Sofia. January 2020
Big data in Industry 4.0 is a major subject of current research and of interest to organizations motivated to invest in such projects. Big data refers to the large quantity of data collected from various sources that could be analyzed to provide valuable insights and patterns. In Industry 4.0 the production of data is massive and thus provides the basis for analysis and the extraction of important information. This study aims to describe the impact of big data analytics in Industry 4.0 environments using the SWOT dimensions framework, with the intention of providing both a positive and a negative perspective on the subject. Considering that these implementations are an innovative trend and that awareness of the subject is limited, it is valuable to summarize and explore the findings identified in the published literature and to review them through interviews with data scientists. The intention is to increase knowledge of the subject and inform organizations about potential expectations and challenges. The effects are represented in a SWOT analysis based on findings collected from 22 selected articles, which were afterwards discussed with professionals. The systematic literature review started with the creation of a plan and a specifically defined step-by-step approach based on existing scientific papers. The relevant literature was selected using specified inclusion and exclusion criteria and its relevance to the research questions. Following this, the interview questionnaire was built on the findings in order to gather empirical data on the subject. The results revealed that the insights developed through big data support management in effective decision-making by reducing the ambiguity of actions.
Meanwhile, optimization of production, reduction of expenditure, and customer satisfaction follow as the top categories mentioned in the selected articles for the strengths dimension. Among the opportunities, the interoperability of equipment, real-time information acquisition and exchange, and the self-awareness of systems are reflected in the majority of the papers. By contrast, threats and weaknesses are addressed in fewer studies. Infrastructure limitations and security and privacy issues are discussed substantially; organizational changes and human resources matters are also raised, but infrequently. The data scientists agreed with the findings and mentioned that decision-making, process effectiveness, and customer relationships are their major expectations and objectives, while the limited experience and knowledge of personnel is their main concern. In general, the gaps in the existing literature lie in the challenges that arise for big data projects in Industry 4.0. Consequently, further research is recommended in order to raise awareness among interested parties and ensure project success.
|
185 |
Análisis de redes sociales en usuarios peruanos acerca del tratamiento para Covid-19 utilizado herramienta de Big data: El caso del Dióxido de Cloro [Social media analysis of Peruvian users regarding Covid-19 treatment using a Big Data tool: the case of chlorine dioxide]
Aguirre, Aranza; De la Cruz, Betsy; Gonzales Cobeñas, Joe; Macedo Lozano, Sasha Darlene. 14 October 2020
Throughout the pandemic, multiple methods have been proposed on social media that supposedly aimed to reduce the impact of COVID-19 on people, the consumption of chlorine dioxide being one of them despite having no scientific evidence to support it. In this context, the present study will be carried out in order to perform a social media analysis of Peruvian users regarding the use of chlorine dioxide as a COVID-19 treatment. The aim is for this analysis to serve public health surveillance in the future.
Publicly available information (Big Data) will be used and examined through Google Trends and Social Searcher to analyze search trends; in addition, sentiment analysis will be performed on social networks (Facebook, Twitter, Instagram, YouTube, Tumblr, Reddit, Flickr, Dailymotion, and Vimeo).
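The abstract does not say which sentiment analysis method is used. As a hedged illustration only, the sketch below shows the simplest lexicon-based variant in Python; the word lists and example posts are invented for demonstration and are not taken from the study.

```python
import re

# Hypothetical, tiny English lexicon for illustration only.
POSITIVE = {"safe", "effective", "cure", "works", "recommend"}
NEGATIVE = {"toxic", "dangerous", "harmful", "fake", "poison"}


def sentiment_score(text):
    """Return (positive - negative) / matched words, in [-1, 1]; 0.0 if nothing matches."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example posts, not data from the study.
posts = [
    "Chlorine dioxide is toxic and dangerous",
    "My neighbor says it works and is safe",
    "Where can I read about this treatment?",
]
labels = [label(p) for p in posts]
print(labels)
```

Production tools typically rely on trained models and language-specific lexicons (here, Spanish); the sketch only conveys the basic idea of scoring text polarity.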
|
186 |
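Implementierung und Evaluierung einer Verarbeitung von Datenströmen im Big Data Umfeld am Beispiel von Apache Flink [Implementation and evaluation of data stream processing in the Big Data environment using Apache Flink as an example]
Oelschlegel, Jan. 17 May 2021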
The processing of data streams is increasingly coming into focus in the construction of modern Big Data infrastructures. The industry partner of this master's thesis, integrationfactory GmbH & Co. KG, wants to expand its Big Data activities so that, as a consultancy, it can support customers in these areas as well. From the outset the focus was placed on Apache Flink, an up-and-coming stream processing framework. The goal of this thesis is to implement several typical use cases of the company with Flink and then evaluate them. To this end, the central problem is first stated and the objectives are derived from it. For better understanding, important basic terms and concepts are then introduced. A separate chapter is devoted to the framework in order to give the reader a comprehensive yet compact insight into Flink. Various sources were consulted, and direct contact was also established with active developers of the framework. This made it possible to clarify points that had initially remained unclear because information was missing from the primary sources, and to incorporate them into the chapter. The main part of the thesis implements the defined use cases. The DataStream API and FlinkSQL are used, and their selection is justified. The programmed jobs are executed in the company's own Big Data lab, a virtualized environment for testing technologies. As the central problem of this master's thesis, both interfaces are evaluated for their suitability for the use cases. Based on the knowledge from the fundamentals chapters and the experience gained in developing the jobs, evaluation criteria are established using the Analytic Hierarchy Process. Finally, the result is evaluated and put into context.
1. Introduction
1.1. Motivation
1.2. Problem Statement
1.3. Objectives
2. Fundamentals
2.1. Definitions of Terms
2.1.1. Big Data
2.1.2. Bounded vs. Unbounded Streams
2.1.3. Stream vs. Table
2.2. Stateful Stream Processing
2.2.1. History
2.2.2. Requirements
2.2.3. Pattern Types
2.2.4. How Stateful Stream Processing Works
3. Apache Flink
3.1. History
3.2. Architecture
3.3. Time-Dependent Processing
3.4. Data Types and Serialization
3.5. State Management
3.6. Checkpoints and Recovery
3.7. Programming Interfaces
3.7.1. DataStream API
3.7.2. FlinkSQL & Table API
3.7.3. Integration with Hive
3.8. Deployment and Operation
4. Implementation
4.1. Development Environment
4.2. Server Environment
4.3. Configuration of Flink
4.4. Source Data
4.5. Use Cases
4.6. Implementation in Flink Jobs
4.6.1. DataStream API
4.6.2. FlinkSQL
4.7. Review of the Results
5. Evaluation
5.1. Analytic Hierarchy Process
5.1.1. Procedure and Methodology
5.1.2. Phase 1: Problem Statement
5.1.3. Phase 2: Structure of the Criteria
5.1.4. Phase 3: Construction of the Comparison Matrices
5.1.5. Phase 4: Evaluation of the Alternatives
5.2. Analysis of the AHP
6. Conclusion and Outlook
6.1. Conclusion
6.2. Outlook
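The evaluation chapter relies on the Analytic Hierarchy Process (AHP). The sketch below illustrates the core calculation in Python under stated assumptions: the criteria names and pairwise judgments are hypothetical, not the thesis's, and the weights are approximated with the row geometric mean method rather than an exact eigenvector computation.

```python
import math

# Hypothetical criteria and pairwise judgments on Saaty's 1-9 scale:
# entry [i][j] states how much more important criterion i is than criterion j.
criteria = ["performance", "ease of development", "maintainability"]
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]


def ahp_weights(matrix):
    """Approximate the priority vector with the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Saaty's CR = CI / RI; values below roughly 0.1 are usually accepted."""
    n = len(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}[n]  # Saaty's tabulated RI values
    return ci / random_index


weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
print(f"consistency ratio: {cr:.3f}")
```

In a full AHP, the same calculation is repeated for the alternatives (here, presumably the DataStream API and FlinkSQL) under each criterion, and the results are combined using the criteria weights.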
|
187 |
The Evolution of Big Data and Its Business Applications
Halwani, Marwah Ahmed. 05 1900
The arrival of the Big Data era has become a major topic of discussion in many sectors because of the promise of big data utilization and its impact on decision-making. It is an interdisciplinary issue that has captured the attention of scholars and created new research opportunities in information science, business, health care, and many other fields. The problem is that Big Data is not well defined, so there is confusion in IT about which jobs and skill sets are required in the big data area. The problem stems from the newness of the Big Data profession. Because many aspects of the area are unknown, organizations do not yet possess the IT, human, and business resources necessary to cope with and benefit from big data. These organizations include health care, enterprise, logistics, universities, weather forecasting, oil companies, e-business, recruiting agencies, etc., and are challenged to deal with high-volume, high-variety, and high-velocity big data to facilitate better decision-making. This research proposes a new way to look at Big Data and Big Data analysis. It contributes to the theoretical and methodological foundations of Big Data and addresses an increasing demand for more powerful Big Data analysis from the academic researcher's perspective. Essay 1 provides a strategic overview of the untapped potential of social media Big Data in the business world and describes its challenges and opportunities for aspiring business organizations. It also aims to offer fresh recommendations on how companies can exploit social media data analysis to make better business decisions: decisions that embrace the relevant social qualities of their customers and their related ecosystem. The goal of this research is to provide insights for businesses to make better, more informed decisions based on effective social media data analysis.
Essay 2 provides a better understanding of the influence of social media during the 2016 American presidential election and develops a model to examine individuals' attitudes toward participating in social media (SM) discussions that might influence their decision in choosing between the two presidential election candidates, Donald Trump and Hillary Clinton. The goal of this research is to provide a theoretical foundation that supports the influence of social media on individuals' decisions. Essay 3 defines the major job descriptions for careers in the new Big Data profession. It describes the Big Data professional profile as reflected by the demand side and explains the differences and commonalities between company-posted job requirements for data analytics, business analytics, and data scientist jobs. The main aim of this work is to clarify the skill requirements for Big Data professionals, for the joint benefit of the job market where they will be employed and of academia, where such professionals will be prepared in data science programs, and so to aid the entire process of preparing and recruiting for Big Data positions.
|
188 |
Aplikace pro Big Data / Application for Big Data
Blaho, Matúš. January 2018
This work describes and analyzes the Big Data concept and its processing and use in decision support. The proposed processing is based on the MapReduce concept designed for Big Data processing. The theoretical part of the work is largely about the Hadoop system, which implements this concept; understanding it is key to properly designing applications that run within it. The work also contains designs for specific Big Data processing applications. The implementation part of the thesis describes Hadoop system management, the implementation of the MapReduce applications, and their testing on data sets.
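The MapReduce concept mentioned above can be sketched in a few lines of plain Python. This is an in-memory illustration of the map, shuffle, and reduce phases using the classic word-count example; it does not use Hadoop, and the function names and input documents are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)


# Invented input records for illustration.
documents = ["big data needs big clusters", "data drives decisions"]
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In Hadoop the mappers and reducers run in parallel on different nodes and the shuffle moves data across the network; the single-process version here only shows the data flow.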
|
189 |
Návrh řešení pro efektivní analýzu bezpečnostních dat / Design of a Solution for Effective Analysis of Security Data
Podlesný, Šimon. January 2021
The goal of this thesis is to design an architecture capable of processing big data, with a focus on data leaks. For this purpose, multiple data storage systems were described and compared. The proposed architecture can load, process, store, and access data for analytic purposes while taking into account the authentication and authorisation of users and the principles of modern agile infrastructure.
|
190 |
Analyses, Mitigation and Applications of Secure Hash Algorithms
Al-Odat, Zeyad Abdel-Hameed. January 2020
Cryptographic hash functions are among the most widely used cryptographic primitives; their purpose is to ensure the integrity of a system or of data. Hash functions are also used in conjunction with digital signatures to provide authentication and non-repudiation services. The Secure Hash Algorithms have been developed over time by the National Institute of Standards and Technology (NIST) for security, optimal performance, and robustness. The best-known hash standards are SHA-1, SHA-2, and SHA-3.
A secure hash algorithm is considered weak once its security requirements have been broken. The main attacks that threaten the secure hash standards are collision and length extension attacks. The collision attack works by finding two different messages that lead to the same hash. The length extension attack extends the message payload to produce a valid hash digest. Both attacks have already broken some hash standards that follow the Merkle-Damgård construction. This dissertation proposes methodologies to improve and strengthen weak hash standards against collision and length extension attacks. We propose collision-detection approaches that help to detect a collision attack before it takes place. In addition, a proper replacement, supported by a proper construction, is proposed. The collision detection methodology helps to protect weak primitives from any possible collision attack using two approaches. The first approach employs a near-collision detection mechanism that was proposed by Marc Stevens. The second approach is our own proposal. Moreover, this dissertation proposes a model that protects the secure hash functions from collision and length extension attacks. The model employs the sponge structure to construct a hash function, and the resulting function is strong against collision and length extension attacks. Furthermore, to keep the general structure of the Merkle-Damgård functions, we propose a model that replaces the SHA-1 and SHA-2 hash standards while using the Merkle-Damgård construction. This model employs the compression function of SHA-1, the function manipulators of SHA-2, and the 10*1 padding method. For big data in the cloud, this dissertation presents several schemes to ensure data security and authenticity. The schemes include secure storage, anonymous privacy preservation, and auditing of big data in the cloud.
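The 10*1 padding rule mentioned above is the multi-rate padding defined for SHA-3 in FIPS 202: append a 1 bit, then the minimum number of 0 bits, then a final 1 bit, so that the padded length is a multiple of the block size (rate). The sketch below shows only this rule on lists of bits; the function names are hypothetical, and how the dissertation integrates the padding into its Merkle-Damgård model is not reproduced here.

```python
def pad10star1(rate, message_bits):
    """Return the padding 1 || 0^j || 1 that makes the total length a multiple of rate."""
    j = (-message_bits - 2) % rate
    return [1] + [0] * j + [1]


def pad_message(bits, rate):
    """Append pad10*1 padding to a list of message bits."""
    return bits + pad10star1(rate, len(bits))


# Toy rate of 8 bits for readability; real SHA-3 rates are far larger (e.g. 1088 bits).
for length in (5, 6, 7):
    padded = pad_message([0] * length, 8)
    print(length, len(padded), padded[length:])
```

Because the padding always contains at least two bits and ends in 1, a message whose length is one bit short of a block boundary spills into an additional block.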
|