611 |
Nové výzvy teorie dohledu / New challenges of the surveillance theory
Lacinová, Miroslava January 2013 (has links)
The main aim of this diploma thesis, titled "New Challenges of the Surveillance Theory", is to describe surveillance theory in today's social network society using information theory. Accordingly, I verify the theory of surveillance in two case studies. The first case study examines the impact of the content of Facebook profiles on hiring decisions. The second case study analyzes an ordinary day of a particular person in the context of surveillance. Both case studies demonstrate surveillance at different surveillance sites.
|
612 |
Marketing Research in the 21st Century: Opportunities and Challenges
Hair, Joe F., Harrison, Dana E., Risher, Jeffrey J. 01 October 2018 (has links)
The role of marketing is evolving rapidly, and the design and analysis methods used by marketing researchers are changing with it. These changes stem from transformations in management skills, technological innovations, and continuously evolving customer behavior. But perhaps the most substantial driver of these changes is the emergence of big data and the analytical methods used to examine and understand the data. To remain relevant, marketing research must stay as dynamic as the markets themselves and adapt to the following: data will continue increasing exponentially; data quality will improve; analytics will be more powerful, easier to use, and more widely used; management and customer decisions will increasingly be knowledge-based; privacy issues and challenges will be both a problem and an opportunity as organizations develop their analytics skills; data analytics will become firmly established as a competitive advantage, both in the marketing research industry and in academia; and for the foreseeable future, the demand for highly trained data scientists will exceed the supply.
|
613 |
Combining Big Data And Traditional Business Intelligence – A Framework For A Hybrid Data-Driven Decision Support System
Dotye, Lungisa January 2021 (has links)
Since the emergence of big data, traditional business intelligence systems have been unable to meet most of the information demands of data-driven organisations. Big data analytics is now perceived as the solution to the information-processing and decision-making challenges that most data-driven organisations face. Yet irrespective of the promised benefits of big data, organisations find it difficult to prove and realise the value of the investment required to develop and maintain big data analytics. The reality of big data is more complex than many organisations perceive. Most organisations have failed to implement big data analytics successfully, and some that have implemented these systems are struggling to attain the promised value of big data. Organisations have realised that it is impractical to migrate an entire traditional business intelligence (BI) system into big data analytics, and that the two types of systems need to be integrated.
Therefore, the purpose of this study was to investigate a framework for creating a hybrid data-driven decision support system that combines components from traditional business intelligence and big data analytics systems. The study employed an interpretive qualitative research methodology to investigate research participants' understanding of concepts related to big data, data-driven organisations, business intelligence, and other data analytics notions. Semi-structured interviews were held to collect research data, and thematic data analysis was used to interpret the participants' feedback in light of their background knowledge and experience.
The application of the organisational information processing theory (OIPT) and the fit viability model (FVM) guided the interpretation of the study outcomes and the development of the proposed framework. The findings suggest that data-driven organisations collect data from different sources and process these data to transform them into information, which serves as the basis of all their business decisions. The roles of executive and senior management in adopting a data-driven decision-making culture are key to the organisation's success. BI and big data analytics are tools and software systems used to assist a data-driven organisation in transforming data into information and knowledge.
The challenges that organisations experience when trying to integrate BI and big data analytics guided the development of the framework for creating a hybrid data-driven decision support system. The framework comprises the following elements: business motivation, information requirements, supporting mechanisms, data attributes, supporting processes, and the hybrid data-driven decision support system architecture. The proposed framework assists data-driven organisations in assessing the components of both business intelligence and big data analytics systems and making a case-by-case decision on which components satisfy the specific data requirements of the organisation. The study thus contributes to the existing literature on integrating business intelligence and big data analytics systems. / Dissertation (MIT (Information Systems))--University of Pretoria, 2021. / Informatics / MIT (Information Systems) / Unrestricted
|
614 |
Personlig integritet och det digitala biblioteket i en tid av Big Data / Privacy and the Digital Library in an Era of Big Data
Hamdan, Kristin January 2022 (has links)
This bachelor thesis investigates how librarians at university libraries experience privacy and user data in the digital library, and relates their views to an era of Big Data. Protecting library users' privacy is part of the librarian profession and is established in the ethical codes published by the International Federation of Library Associations and Institutions (IFLA). Privacy issues have also been of interest to library and information science over the years, and several studies have investigated library users' expectations of privacy. Their results show that library users expect the library to protect their privacy and trust the library as an institution to do so. Earlier studies, as well as my results, show that this is a major challenge in an era of Big Data, when the digital library depends on third-party suppliers and on the exchange of data between libraries, suppliers, and library users. The thesis takes a qualitative approach: semi-structured interviews were held with five librarians. To analyse the results, Mai's (2019) thoughts on personal information in an era of Big Data, and the models of privacy and information presented by Mai (2016, 2019) and Agre (1994), were used. The results show that the librarians' views on privacy and user data correspond with the perspectives reflected in the Panopticon Model and the Capture Model, that is, a traditional view of privacy and information that treats personal information as a certain type of information that can be controlled. According to this view, violating a user's privacy means failing to fully protect that personal information. According to Mai (2016, 2019) this is not a satisfactory view of privacy in an era of Big Data; he therefore suggests the Datafication Model, in which privacy is less about the information itself and more about the situations in which the information is used. This view of privacy and information could not be seen in the results of the study.
|
615 |
A performance study for autoscaling big data analytics containerized applications : Scalability of Apache Spark on Kubernetes
Vennu, Vinay Kumar, Yepuru, Sai Ram January 2022 (has links)
Container technologies are rapidly changing how distributed applications are executed and managed on cloud computing resources. As containers can be deployed on a large scale, there is a tremendous need for container orchestration tools like Kubernetes that are highly automatic in deployment, scaling, and management. In recent times, container technologies such as Docker have seen rising adoption in internal usage, commercial offerings, and application fields ranging from High-Performance Computing to geo-distributed (Edge or IoT) applications. Big data analytics is another field with a trend toward running applications (e.g., Apache Spark) as containers for elastic workloads and multi-tenant service models, leveraging container orchestration tools like Kubernetes. Despite abundant research on the performance impact of containerizing big data applications, specific aspects such as scalability and resource management remain, to the best of our knowledge, largely unexplored, leaving a research gap. This research studies the performance impact of autoscaling a big data analytics application on Kubernetes using autoscaling mechanisms such as the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA). The state-of-the-art autoscaling mechanisms available for scaling containerized applications on Kubernetes, and the big data benchmarking tools available for generating workloads on frameworks like Spark, are identified through a literature review. Apache Spark is selected as a representative big data application due to its ecosystem and industry-wide adoption by enterprises. In particular, a series of experiments are conducted by adjusting resource parameters (such as CPU requests and limits) and autoscaling mechanisms to measure run-time metrics like execution time and CPU utilization. Our experimental results show that while Spark achieves better execution times when configured to scale with VPA, it also exhibits overhead in CPU utilization. In contrast, autoscaling big data applications with HPA adds overhead in terms of both execution time and CPU utilization. The research from this thesis can be used by researchers and other cloud practitioners running big data applications to evaluate autoscaling mechanisms and derive better performance and resource utilization.
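For illustration only (this sketch is not from the thesis itself), attaching a Horizontal Pod Autoscaler to a Spark executor Deployment can be expressed with the official Kubernetes Python client; the deployment name "spark-executor", the namespace, and the thresholds below are assumptions:

    # Sketch: create an HPA for a hypothetical "spark-executor" Deployment.
    # Assumes a reachable cluster and a valid kubeconfig; all names are illustrative.
    from kubernetes import client, config

    config.load_kube_config()  # use config.load_incluster_config() inside a pod

    hpa = client.V1HorizontalPodAutoscaler(
        metadata=client.V1ObjectMeta(name="spark-executor-hpa"),
        spec=client.V1HorizontalPodAutoscalerSpec(
            scale_target_ref=client.V1CrossVersionObjectReference(
                api_version="apps/v1", kind="Deployment", name="spark-executor"
            ),
            min_replicas=1,
            max_replicas=10,
            target_cpu_utilization_percentage=70,  # scale out above 70% average CPU
        ),
    )

    client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
        namespace="default", body=hpa
    )

(Spark-on-Kubernetes executors are often created directly by the driver or managed by an operator rather than by a plain Deployment; the Deployment target here is a simplification.)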
|
616 |
HPCC based Platform for COPD Readmission Risk Analysis with implementation of Dimensionality reduction and balancing techniques
Unknown Date (has links)
Hospital readmission rates are considered an important indicator of quality of care because readmissions may be a consequence of actions of commission or omission during the patient's initial hospitalization, or of a poorly managed transition of the patient back into the community. The negative impact on patients' quality of life and the huge burden on the healthcare system have made reducing hospital readmissions a central goal of healthcare delivery and payment reform efforts.
In this study, we propose a framework for how readmission analysis and other healthcare models could be deployed in the real world, together with a machine-learning-based solution that uses patients' discharge summaries as the dataset for training and testing the model. Current systems do not address a very important aspect of the readmission problem: handling Big Data, which is essential for solutions deployed in the field for real-world use. We used the HPCC compute platform, which provides a distributed parallel programming environment to create, run, and manage applications involving large amounts of data. We also propose feature engineering and data balancing techniques that have been shown to greatly enhance machine learning model performance; this was achieved by reducing the dimensionality of the data and fixing the imbalance in the dataset.
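As a generic illustration of the kind of dimensionality reduction and class balancing described above (the thesis's own HPCC-based pipeline is not reproduced here), a minimal sketch using scikit-learn and imbalanced-learn on synthetic placeholder data:

    # Sketch: PCA for dimensionality reduction, SMOTE for class balancing.
    # X and y stand in for extracted discharge-summary features and labels.
    import numpy as np
    from sklearn.decomposition import PCA
    from imblearn.over_sampling import SMOTE

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 100))          # 500 patients, 100 features (synthetic)
    y = (rng.random(500) < 0.1).astype(int)  # ~10% readmitted: imbalanced labels

    X_reduced = PCA(n_components=20).fit_transform(X)               # keep 20 components
    X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_reduced, y)

    print(X_bal.shape, np.bincount(y_bal))   # minority class oversampled to parity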
The system presented in this study provides real-world machine-learning-based predictive modeling for reducing readmissions, which could be templatized for other diseases. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2020. / FAU Electronic Theses and Dissertations Collection
|
617 |
Propuesta de transformación digital alineada al plan estratégico de una entidad de control gubernamental del Estado Peruano luego de ser evaluados los procesos de TI utilizando COBIT 5 PAM / Digital transformation proposal aligned with the strategic plan of a government control entity of the Peruvian State after evaluating its IT processes using COBIT 5 PAM
Ormeño Salazar, Carlos Alberto, Valdez Cordova, Hjalmar Neryght Yoght 06 November 2019 (has links)
This thesis presents a digital transformation proposal aligned with the institutional strategic plan of a government control entity of the Peruvian State, after evaluating the entity's IT processes using COBIT 5 PAM.
The first chapter contains the definition of the project, detailing the entity under study, its vision, mission, strategic objectives, and process diagram, with a brief explanation of each.
The second chapter shows compliance with the ABET Outcomes.
The third chapter presents the theoretical framework used for the thesis.
The fourth chapter covers the development of the project, which is divided into two sections. The first concerns IT Corporate Governance, where we used COBIT 5 PAM as the evaluation instrument: the entity's IT processes are detailed and contrasted with the COBIT 5 PAM processes, the current and expected levels of each process are evaluated, the gap is identified, and an improvement plan is proposed. The second section addresses Digital Transformation, detailing the technologies to be used, the process to be transformed, and how the proposed technological tools support it; the technical and economic feasibility of the proposal is also determined.
The fifth chapter details the results of the thesis project, and finally the project management tools and annexes are presented. / Tesis
|
618 |
Vår digitala uppfattning: ett paradoxalt mönster - En kvalitativ studie om medvetenhet kring digitala fotspår / Our digital perception: a paradoxical pattern - A qualitative study on awareness of digital footprints
Sunnemark, Alma, Sylvander, Rebecka January 2018 (has links)
A digital footprint is the trail of personal data individuals leave behind when they use the internet. The purpose of this study was to gain an understanding of individuals' awareness of these digital footprints and, thereby, of the collection of personal data. A secondary purpose was to compare whether there is a difference between two selected generations. The approach was based on qualitative interviews in order to build a deep understanding of how individuals perceive digital footprints. An inductive approach was applied, with theories developed from the empirical material so as to give the respondents' voices the greatest weight. The study shows that individuals are aware that their personal data is collected when they use the internet and digital services. Consequently, the study confirms that there are concerns about privacy on the internet, but despite these concerns individuals will continue to use the internet and digital services because the convenience outweighs them. Finally, the study concludes that the main difference between individuals lies in generational belonging: the older generation shows greater awareness of the consequences than the younger generation does, and this perception of digital footprints has proven to be the basis of their sense of vulnerability and integrity on the internet.
|
619 |
Automatizované zhromažďovanie a štrukturalizácia dát z webových zdrojov / Automated collection and structuring of data from web sources
Zahradník, Roman January 2018 (has links)
This diploma thesis deals with the creation of a solution for continuous data acquisition from web sources. The application automatically navigates web pages, extracts data using dedicated selectors, and subsequently standardizes the extracted data for further processing and data mining.
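As a minimal illustration of selector-based extraction (the thesis's actual implementation is not shown; the URL and CSS selectors below are placeholders), a sketch in Python with requests and BeautifulSoup:

    # Sketch: fetch a page and extract records with dedicated CSS selectors,
    # then normalize them into a uniform structure for later data mining.
    import requests
    from bs4 import BeautifulSoup

    resp = requests.get("https://example.com/listing", timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    records = []
    for item in soup.select("div.item"):      # placeholder selector for one record
        title = item.select_one("h2")
        price = item.select_one("span.price")
        if title and price:                   # skip items missing expected fields
            records.append({
                "title": title.get_text(strip=True),
                "price": price.get_text(strip=True),
            })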
|
620 |
Méthodes de sondage pour les données massives / Sampling methods for big data
Rebecq, Antoine 15 February 2019 (has links)
This thesis presents three parts tied to survey sampling theory. The first part presents two original sampling results that have had practical applications in surveys conducted at Insee (the French official statistics institute). The first chapter presents a theorem proving the existence of a stratified sampling design that is an optimal compromise between the dispersion of the sampling weights and the allocation yielding optimal precision for a specific variable of interest. Survey data are commonly used to estimate many totals or models involving variables that were not included in the survey design. The expected precision for these variables is therefore poor, but a low dispersion of the weights limits the risk that an estimate depending on one of them has very poor precision. The second chapter deals with the reweighting factors in calibration estimators. We propose an efficient algorithm that computes the weight factors closest to 1 such that a solution to the calibration problem exists. This limits the risk of influential units appearing, particularly for estimation on domains. The statistical properties of the resulting estimators are studied through simulations on real data. The second part studies the asymptotic properties of estimators computed from survey data, which are difficult to establish in general. We present an original method that establishes weak convergence to a Gaussian process for the Horvitz-Thompson empirical process indexed by classes of functions, for many sampling algorithms used in practice. In the third and last part, we focus on sampling methods for populations that can be described as networks. These have practical applications when the graphs are so large that storing them and running algorithms on them is computationally costly. We detail sampling algorithms that allow the estimation of statistics of interest for the network. Two applications, one to Twitter data and the other to simulated data, conclude this part.
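For context, the Horvitz-Thompson estimator referenced above is the standard design-based estimator of a population total, weighting each sampled value by the inverse of its inclusion probability:

    \hat{Y}_{HT} = \sum_{k \in S} \frac{y_k}{\pi_k}

where S is the sample, y_k the value of the study variable for unit k, and \pi_k its first-order inclusion probability; the estimator is design-unbiased for the total Y = \sum_{k \in U} y_k whenever \pi_k > 0 for every unit k of the population U.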
|