681 |
Processing Big Data in Main Memory and on GPU
Fathi Salmi, Meisam 08 June 2016
No description available.
|
682 |
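Welche Einflussfaktoren eignen sich für die Typisierung von Radfahrer*innen mittels GPS-Daten? Ein Ansatz zur Kalibrierung von Self-Selected-Samples / Which factors are suitable for classifying cyclists by type using GPS data? An approach to calibrating self-selected samples
Lißner, Sven 07 March 2022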
Targeted planning for cycling requires analysis data, which are scarcely available in many municipalities. GPS data from cyclists can close this data gap. Existing data sets and research approaches have so far failed to demonstrate that they are representative of the population of cyclists in the respective study area. This is also often raised as a weakness of previous work.
To investigate the representativeness of GPS data sets for Germany, this thesis analyzes the cycling behavior of cyclists in the Dresden area. The analysis is based on a GPS data set collected in the "RadVerS" research project from 200 participants, which contains 5,300 individual trips in the study area of the city of Dresden and gives insight into their cycling behavior. The collected data were processed using various methods; for example, trips made with other transport modes were removed and journeys were split into individual trips. The trip data were then enriched with data from the transport network of the study area and analyzed statistically. The influence of individual riding-behavior parameters was evaluated both descriptively and by means of a generalized linear model.
The results show that the following attributes influence cycling behavior and should therefore substantially shape the discussion about the representativeness of GPS data for cycling planning. Differences from the procedure used in household surveys become apparent:
- Age: It must be ensured that the sample adequately includes very young (< 30 years) and older (> 65 years) users in particular. The age groups between 30 and 65, by contrast, can be combined and are generally sufficiently represented.
- Gender: The female participants included in smartphone-based samples cycle at lower speeds than male participants. They also accelerate less strongly, and the distances they cover are shorter than those of male participants.
- Trip purpose: The commuting trips contained in smartphone-based samples tend to be longer than shopping and leisure trips.
- The speeds ridden on commuting trips are also higher than for the other trip purposes.
The parameters listed above have only a small and non-significant influence on cyclists' use of infrastructure. Only small differences in the choice of infrastructure were found between genders, age groups, and cyclist types.
In addition, after data processing, the trip-length distribution and the daily profile of bicycle trips are essentially congruent with the results of household surveys such as Mobilität in Städten (SrV).
1. Introduction
1.1 Background
1.2 Research task
1.3 Approach
1.4 Challenges and limitations of the chosen methodology
2. Fundamentals
2.1 Cycling in Germany
2.2 Key figures of cycling
2.3 Planning data and analysis methods in cycling
2.4 Crowdsourcing as a new approach in transport planning
2.5 Big data
2.6 GPS as a survey tool
2.7 Summary
3. State of research
3.1 Literature search methodology: definition of keywords, research databases
3.2 Crowdsourcing and GPS data use in transport research
3.3 Quantitative (GPS) studies on cycling behavior
3.4 Summary
4. Methodology
4.1 Preparation of data collection
4.2 Data collection
4.3 Data transmission
4.4 Data protection
4.5 Parameter selection
4.6 Data processing
4.7 Calculation of key figures for trip statistics
4.8 Summary
5. Results
5.1 General key figures
5.2 Descriptive statistics
5.3 Inferential statistics
5.4 Results of the comparison sample
5.5 Summary of the main results
6. Discussion
6.1 Summary and interpretation of the central results
6.2 Strengths and weaknesses of the thesis
6.3 Limitations of the methodology
6.4 Summary
7. Conclusion and outlook
|
683 |
Designing the Militarization 2.0 research tool
Ehrenberg, Nils January 2014
Research is a time-consuming endeavor that requires appropriate tools to manage often vast amounts of information. Militarization 2.0 is a research project that explores militarization in social media. The aim of this project is to design a user interface that supports researchers in qualitative studies involving large amounts of data. The project follows the design process of the first version of the Militarization 2.0 research database interface. The design process involves user studies, interviews, and the design and testing of paper and digital prototypes. The results include the interface prototype as well as reflections on which tools proved useful in the design process.
|
684 |
Achieving Data Privacy and Security in Cloud
Huang, Xueli January 2016
Growing concerns about the privacy of data stored in public clouds have restrained the widespread adoption of cloud computing. The traditional way to protect data privacy is to encrypt data before they are sent to the public cloud, but this approach always introduces heavy computation, especially for image and video data, which are far larger than text data. Another way is to take advantage of a hybrid cloud by separating sensitive from non-sensitive data and storing them in a trusted private cloud and an untrusted public cloud, respectively. But if this method is adopted directly, all images and videos containing sensitive data have to be stored in the private cloud, which makes the method meaningless. Moreover, the emergence of the Software-Defined Networking (SDN) paradigm, which decouples the control logic from the closed and proprietary implementations of traditional network devices, enables researchers and practitioners to design innovative network functions and protocols in a much easier, more flexible, and more powerful way. The data plane asks the control plane to update flow rules when it receives network packets that it does not know how to handle, and the control plane then dynamically deploys and configures flow rules according to the data plane's requests, so that the whole network can be managed and controlled efficiently. However, this reactive control model can be exploited by attackers to launch Distributed Denial-of-Service (DDoS) attacks by sending a large number of new requests from the data plane to the control plane. For image data, we divide the image into pieces of equal size to speed up the encryption process, and propose two methods to cut the relationship between the edges.
One is to add random noise to each piece; the other is to design a one-to-one mapping function for each piece that maps each pixel value to a distinct other value, which cuts off the relationship between pixels as well as between edges. The mapping function takes a random parameter as input so that each piece can randomly choose a different mapping. Finally, we shuffle the pieces using another random parameter, which makes the problem of recovering the shuffled image NP-complete. For video data, we propose two different methods for intra frames (I-frames) and inter frames (P-frames), based on their different characteristics. For I-frames, we propose a hybrid selective video encryption scheme for H.264/AVC based on the Advanced Encryption Standard (AES) and the video data themselves. For each P-slice of a P-frame, we extract only a small part into the private cloud, based on the characteristics of the intra prediction mode, which efficiently prevents the P-frame from being decoded. For clouds running SDN, we propose a framework to protect the controller from DDoS attacks. We first periodically predict the number of new requests for each switch based on its previous information; new requests are sent to the controller if the predicted total number of new requests is below the threshold. Otherwise, the requests are directed to the security gateway to check whether there is an attack among them. Requests that cause a dramatic decrease in entropy are filtered out by our algorithm, and rules for these requests are generated and sent to the controller. The controller then sends the rules to each switch so that flows matching the rules are directed to a honeypot. / Computer and Information Science
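The image steps described in this abstract (equal-size pieces, a per-piece one-to-one pixel mapping, then a keyed shuffle of the pieces) can be illustrated with a small sketch. This is not the thesis implementation: the function names, the string key, and the use of Python's `random` module as the source of the "random parameters" are assumptions made for illustration, and `random` is not a cryptographically secure generator. Pixel values are assumed to be integers in 0..255 and the image dimensions divisible by the tile size.

```python
import random

def make_pixel_map(seed):
    """Return a seeded random bijection on 0..255 and its inverse (hypothetical helper)."""
    rng = random.Random(seed)
    forward = list(range(256))
    rng.shuffle(forward)
    inverse = [0] * 256
    for src, dst in enumerate(forward):
        inverse[dst] = src
    return forward, inverse

def split_tiles(image, tile):
    """Split a 2-D list of pixels into square tiles, in row-major order."""
    h, w = len(image), len(image[0])
    return [[row[x:x + tile] for row in image[y:y + tile]]
            for y in range(0, h, tile) for x in range(0, w, tile)]

def join_tiles(tiles, tile, width):
    """Inverse of split_tiles: reassemble row-major tiles into a 2-D list."""
    per_row = width // tile
    image = []
    for start in range(0, len(tiles), per_row):
        band = tiles[start:start + per_row]
        for r in range(tile):
            image.append([p for t in band for p in t[r]])
    return image

def tile_order(count, key):
    """Keyed permutation of tile positions (the second 'random parameter')."""
    order = list(range(count))
    random.Random(f"{key}:order").shuffle(order)
    return order

def scramble(image, tile, key):
    """Remap pixels inside each tile with its own bijection, then shuffle the tiles."""
    tiles = split_tiles(image, tile)
    mapped = []
    for i, t in enumerate(tiles):
        fwd, _ = make_pixel_map(f"{key}:{i}")
        mapped.append([[fwd[p] for p in row] for row in t])
    order = tile_order(len(mapped), key)
    return join_tiles([mapped[j] for j in order], tile, len(image[0]))

def unscramble(image, tile, key):
    """Undo the tile shuffle, then invert each tile's pixel mapping."""
    shuffled = split_tiles(image, tile)
    order = tile_order(len(shuffled), key)
    mapped = [None] * len(shuffled)
    for pos, j in enumerate(order):
        mapped[j] = shuffled[pos]
    tiles = []
    for i, t in enumerate(mapped):
        _, inv = make_pixel_map(f"{key}:{i}")
        tiles.append([[inv[p] for p in row] for row in t])
    return join_tiles(tiles, tile, len(image[0]))
```

Because every tile is processed independently, the per-tile work could be run in parallel, which is in the spirit of the speed-up the abstract attributes to dividing the image into pieces.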
|
685 |
Big Data Algorithms for Visualization and Supervised Learning
Djuric, Nemanja January 2013
Explosive growth in data size, data complexity, and data rates, triggered by the emergence of high-throughput technologies such as remote sensing, crowd-sourcing, social networks, and computational advertising, has in recent years led to an increasing availability of data sets of unprecedented scale, with billions of high-dimensional data examples stored on hundreds of terabytes of memory. To make use of this large-scale data and extract useful knowledge, researchers in the machine learning and data mining communities face numerous challenges, since the data mining and machine learning tools designed for standard desktop computers cannot address these problems due to memory and time constraints. As a result, there is an evident need to develop novel, scalable algorithms for big data. In this thesis we address these important problems and propose both supervised and unsupervised tools for handling large-scale data. First, we consider an unsupervised approach to big data analysis and explore a scalable, efficient visualization method that allows fast knowledge extraction. Next, we consider the supervised learning setting and propose algorithms for fast training of accurate classification models on large data sets, capable of learning state-of-the-art classifiers on data sets with millions of examples and features within minutes. Data visualization has been used for hundreds of years in scientific research, as it allows humans to gain better insight into the complex data they are studying. Despite its long history, there is a clear need for further development of visualization methods for large-scale, high-dimensional data, where commonly used visualization tools are either too simplistic to give deeper insight into the data properties, or too cumbersome or computationally costly. We present a novel method for data ordering and visualization.
By combining efficient clustering using the k-means algorithm with near-optimal ordering of the found clusters using a state-of-the-art TSP solver, we obtain an efficient algorithm that outperforms existing, computationally intensive methods. In addition, we present a visualization method for smaller-scale problems based on object matching. The experiments show that the methods allow fast detection of hidden patterns, even by users without expertise in data mining and machine learning. Supervised learning is another important task, often intractable in many modern applications due to time and memory constraints, given the prohibitively large scale of the data sets. To address this issue, we first consider the Multi-hyperplane Machine (MM) classification model and propose the online Adaptive MM algorithm, which represents a trade-off between linear and kernel Support Vector Machines (SVMs), as it trains MMs in linear time on limited memory while achieving competitive accuracies on large-scale non-linear problems. Moreover, we present a C++ toolbox for developing scalable classification models, which provides an Application Programming Interface (API) for training large-scale classifiers, as well as highly optimized implementations of several state-of-the-art SVM approximators. Lastly, we consider parallelization and distributed learning approaches to large-scale supervised learning, and propose AROW-MapReduce, a distributed learning algorithm for confidence-weighted models using the MapReduce framework. Experimental evaluation of the proposed methods shows state-of-the-art performance on a number of synthetic and real-world data sets, further paving the way for efficient and effective knowledge extraction from big data problems. / Computer and Information Science
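The ordering idea in this abstract (cluster the examples with k-means, then arrange the clusters along a short tour) can be sketched as follows. This is only an illustration under stated assumptions: the thesis uses a state-of-the-art TSP solver, whereas this sketch swaps in a simple nearest-neighbour path heuristic, initializes k-means with the first k points, and uses hypothetical function names.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def greedy_order(centroids):
    """Nearest-neighbour path through the centroids, starting at centroid 0.

    Stand-in for a real TSP solver; it gives a short path, not an optimal one.
    """
    remaining = list(range(1, len(centroids)))
    order = [0]
    while remaining:
        last = centroids[order[-1]]
        nxt = min(remaining, key=lambda c: math.dist(last, centroids[c]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def order_rows(points, k):
    """Return point indices arranged so that neighbouring clusters sit together."""
    centroids, labels = kmeans(points, k)
    rank = {c: i for i, c in enumerate(greedy_order(centroids))}
    return sorted(range(len(points)), key=lambda i: rank[labels[i]])
```

Reordering the rows of a data matrix by the returned indices places similar examples next to each other, which is what makes block patterns visible when the matrix is plotted.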
|
686 |
Foundations for a Network Model of Destination Value Creation
Stienmetz, Jason Lee January 2016
Previous research has demonstrated that a network model of destination value creation (i.e. the Destination Value System model) based on the flows of travelers within a destination can be used to estimate and predict individual attractions’ marginal contributions to total visitor expenditures. While development to date of the Destination Value System (DVS) has focused on the value created from dyadic relationships within the destination network, previous research supports the proposition that system-level network structures significantly influence the total value created within a destination. This study, therefore, builds upon previous DVS research in order to determine the relationships between system-level network structures and total value creation within a destination. To answer this question, an econometric analysis of panel data covering 43 Florida destinations over the period from 2007 to 2015 was conducted. The panel data was created utilizing volunteered geographic information (VGI) obtained from 4.6 million photographs shared on Flickr. Results of the econometric analysis indicate that both seasonal effects and DVS network structures have statistically significant relationships with total tourism-related sales within a destination. Specifically, network density, network out-degree centralization, and network global clustering coefficient are found to have negative and statistically significant effects on destination value creation, while network in-degree centralization, network betweenness centralization, and network subcommunity count are found to have positive and statistically significant effects. Quarterly seasonality is also found to have dynamic and statistically significant effects on total tourism-related sales within a destination.
Based on the network structures of destinations and total tourism related sales within destinations, this study also uses k-means cluster analysis to classify tourism destinations into a taxonomy of six different system types (Exploration, Involvement, Development I, Development II, Consolidation, and Stars). This taxonomy of DVS types is found to correspond to Butler’s (1980) conceptualization of the destination life cycle, and additional data visualization and exploration based on the DVS taxonomy finds distinct characteristics in destination structure, dynamics, evolution, and performance that may be useful for benchmarking. Additionally, this study assesses the quality of VGI data for tourism related research by comparing DVS network structures based on Flickr data and visitor intercept survey data. Support for the use of VGI data is found, provided that thousands of observations are available for analysis. When fewer observations are available, aggregation techniques are recommended in order to improve the quality of overall destination network system quantification. This research makes important contributions to both the academic literature and the practical management of destinations by demonstrating that DVS network structures significantly influence the economic value created within the destination, and thus suggests that a strategic network management approach is needed for the governance of competitive destinations. As a result, this study provides a strong foundation for the DVS model and future research in the areas of destination resiliency, “smarter” destination management, and tourism experience design. / Tourism and Sport
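Two of the system-level measures named in this abstract, network density and degree centralization, can be computed from a list of directed flows between attractions. The sketch below is an assumption about how such measures might be calculated, not the study's code: it treats the network as a simple directed graph (no loops, no repeated edges) and uses Freeman's degree centralization normalized by (n - 1)^2, the value reached when every other node points at a single hub. The function name and the node labels in the usage example are hypothetical.

```python
from collections import Counter

def network_metrics(nodes, edges):
    """Density and Freeman in/out-degree centralization of a simple directed graph."""
    n = len(nodes)
    # Drop self-loops and duplicate edges so the graph is simple.
    edge_set = {(u, v) for u, v in edges if u != v}
    in_deg = Counter(v for _, v in edge_set)
    out_deg = Counter(u for u, _ in edge_set)

    def centralization(deg):
        # Sum of gaps to the most central node, divided by the maximum
        # possible sum for a directed graph on n nodes.
        top = max(deg[x] for x in nodes)
        return sum(top - deg[x] for x in nodes) / ((n - 1) ** 2)

    return {
        "density": len(edge_set) / (n * (n - 1)),
        "in_degree_centralization": centralization(in_deg),
        "out_degree_centralization": centralization(out_deg),
    }

# Usage: three attractions that all send visitors to one hub attraction.
metrics = network_metrics(
    ["hub", "a", "b", "c"],
    [("a", "hub"), ("b", "hub"), ("c", "hub")],
)
```

In this star-shaped example the in-degree centralization is at its maximum while density is low, which shows that the two measures capture different aspects of destination structure.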
|
687 |
Modelling Cities as a collection of TeraSystems - Computational challenges in Multi-Agent Approach
Kiran, Mariam 03 June 2015
Yes / Agent-based modeling techniques are ideal for modeling massive complex systems such as insect colonies, biological cellular systems, and even cities. However, these models are themselves extremely complex to code, test, simulate, and analyze. This paper discusses the challenges in using agent-based models to model complete cities as a complex system. We argue that cities are actually a collection of various complex models which are themselves massive multiple systems, each of millions of agents, working together to form one system consisting of on the order of a billion agents of different types, such as people, communities, and technologies interacting together. Because of the agent numbers and complexity challenges, present-day hardware architectures are unable to cope with the simulation and processing of these models. To address these issues, this paper proposes a Tera (to denote the order of millions)-modeling framework, which utilizes current technologies of cloud computing and big data processing for modeling a city by allowing infinite resources and complex interactions. This paper also makes the case for bringing together research communities for interdisciplinary research to build a complete, reliable model of a city.
|
688 |
Tracking time evolving data streams for short-term traffic forecasting
Abdullatif, Amr R.A., Masulli, F., Rovetta, S. 20 January 2020
Yes / Data streams have arisen as a relevant topic during the last few years as an efficient method for extracting knowledge from big data. In the robust layered ensemble model (RLEM) proposed in this paper for short-term traffic flow forecasting, incoming traffic flow data of all connected road links are organized in chunks corresponding to an optimal time lag. The RLEM model is composed of two layers. In the first layer, we cluster the chunks by using the Graded Possibilistic c-Means method. The second layer is made up of an ensemble of forecasters, each of them trained for short-term traffic flow forecasting on the chunks belonging to a specific cluster. In the operational phase, when a new chunk of traffic flow data is presented as input to the RLEM, its memberships to all clusters are evaluated, and if it is not recognized as an outlier, the outputs of all forecasters are combined in an ensemble, obtaining in this way a forecast of traffic flow for a short-term time horizon. The proposed RLEM model is evaluated on a synthetic data set, on a traffic flow data simulator, and on two real-world traffic flow data sets. The model gives an accurate forecast of the traffic flow rates with outlier detection and shows good adaptation to non-stationary traffic regimes. Given its characteristics of outlier detection, accuracy, and robustness, RLEM can be fruitfully integrated in traffic flow management systems.
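The operational phase described in this abstract (evaluate a new chunk's memberships to all clusters, reject it if it looks like an outlier, otherwise combine the forecasters' outputs) can be sketched in a few lines. This is a simplified stand-in, not the RLEM itself: the exponential membership function, the `beta` width, the outlier threshold, and the membership-weighted average are all assumptions chosen for illustration, and the Graded Possibilistic c-Means method has its own formulation that is not reproduced here.

```python
import math

def memberships(chunk, centroids, beta=1.0):
    """Possibilistic-style memberships: each lies in (0, 1] and they need not sum to 1."""
    return [math.exp(-math.dist(chunk, c) ** 2 / beta) for c in centroids]

def forecast(chunk, centroids, forecasters, beta=1.0, outlier_threshold=0.05):
    """Combine per-cluster forecasters, or return None for an outlier chunk.

    `forecasters` is a list of callables, one per cluster, each mapping a
    chunk to a predicted flow value (hypothetical interface).
    """
    u = memberships(chunk, centroids, beta)
    # A chunk that belongs to no cluster to any meaningful degree is rejected.
    if max(u) < outlier_threshold:
        return None
    total = sum(u)
    return sum(w * f(chunk) for w, f in zip(u, forecasters)) / total
```

Because the memberships are not forced to sum to one, a chunk far from every cluster gets uniformly low values, which is what makes the outlier test possible.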
|
689 |
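Influencia de big data y economía circular en el desempeño operacional de la cadena de suministro del sector manufactura peruano / Influence of big data and the circular economy on the operational performance of the supply chain in the Peruvian manufacturing sector
Olivera Flores, Hugo, Martinez Toledo, Ilich Daniel, Obispo Oscco, Juan Carlos 14 November 2023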
Industrial operations in both manufacturing and services have been affected by digitization and by the eco-sustainable financial approach that global care demands. The collection and analysis of big data (structured and unstructured) is being adopted in different industries, where infrastructure and qualified personnel must be well matched to support data-driven decision-making and thus make companies sustainable. These are no longer only good practices of retail, banking, and insurance companies; they are already a trend in the manufacturing sector, where the only limitation is investment in infrastructure.
This research seeks to understand the link between the circular-economy supply chain, big data, and the operational performance of the supply chain in the Peruvian manufacturing sector. The research had a quantitative approach, an observational design, and a relational-explanatory scope.
To test the proposed framework, a confirmatory factor analysis (CFA) approach was taken using data collected through a 48-question survey of 77 companies. The results are significant and corroborate the hypotheses on the correlation of big data and the circular economy with operational performance at a 95% confidence level. The Alpha (reliability) and Omega (composite reliability) values are greater than 0.7. Likewise, all constructs showed variance-extracted values greater than 0.5. The analysis of the structural model (SEM) showed a good fit, since the ratio of chi-square to degrees of freedom was 1.474, below 3. Notably, to the extent that this model is implemented in manufacturing, it will improve the flexibility to change manufacturing volume and delivery time by generating knowledge and supporting the decision-making process. This implies that managers must pay sufficient attention to IT infrastructure and to its integration with other systems in an innovative, real-time, and more efficient way, allowing all areas to interact for better data analysis.
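The reliability threshold quoted in this abstract (Alpha greater than 0.7) refers to Cronbach's alpha, which can be computed directly from survey responses. The sketch below is a generic illustration of the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores); the function name and the toy responses are made up for the example and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a table of responses.

    `responses` holds one row per respondent and one column per survey item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # column-wise view of the table
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)

# Toy example: four respondents answering two related items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 4], [4, 3]])
acceptable = alpha > 0.7   # the threshold used in the abstract
```

The same acceptance logic applies to the fit criterion in the abstract: the reported chi-square to degrees-of-freedom ratio of 1.474 is simply compared against the cut-off of 3.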
|
690 |
Från Vision till Verklighet: Implementering av Artificiell Intelligens: En kvalitativ studie om teknologisk acceptans och beredskap inom organisationer / From Vision to Reality: Implementation of Artificial Intelligence: A qualitative study on technological acceptance and readiness within organizations
Eldelöv, Melvin, Fredrikson, Agnes, Fredriksson, Kim January 2024
Research question: What are the main factors that affect whether or not an organization is ready to implement AI technologies? Purpose: The study aims to facilitate change work concerning acceptance of and readiness for AI technologies within organizations. Method: The study was conducted using a qualitative interview methodology in which semi-structured interviews were held with eight AI experts with different experience and competencies. The study was based on the established frameworks ISTAM and AI-readiness. Conclusion: The study indicates that the theoretical models (AI-readiness and ISTAM) are highly relevant; it found support for all factors within the models. Above all, transparency, trust, and data are indicated to be highly important for readiness for and acceptance of intelligent systems.
|