221

Big Data - Stort intresse, nya möjligheter

Hellström, Hampus, Ohm, Oscar January 2014 (has links)
Today's information society consists of people, businesses and machines that together generate and store large amounts of data every day. The management and processing of these large data sets has come to be known collectively as Big Data. Among other things, the data that is produced, gathered and stored increases companies' ability to practise knowledge-based business development. With traditional methods of data collection and analysis, such development has relied on resource-intensive market research and surveys, often carried out by specialized research companies. As analysis of society's existing data sets becomes increasingly valuable, these research companies have a great opportunity to mine value from society's huge amounts of data. The study is designed as an exploratory case study that investigates how Swedish research companies work with Big Data and identifies some of the challenges they face in applying Big Data analysis in their business. The results show that the participating research companies use Big Data as a tool to streamline existing processes and, to some extent, to complement traditional surveys. Although they see possibilities in the technology, they take a passive approach to developing new business processes supported by Big Data analysis, and the lack of competence prevailing in the market is identified as a challenge. The results also cover an ethical aspect that research companies must take into consideration; it is especially problematic when data that can be linked to an individual is processed and analysed in real time.
222

High performance shared state schedulers

Kouzoupis, Antonios January 2016 (has links)
Large organizations and research institutes store huge volumes of data nowadays. To gain any valuable insights, distributed processing frameworks running over a cluster of computers are needed. Apache Hadoop is the prominent framework for distributed storage and data processing. At SICS Swedish ICT we are building Hops, a new distribution of Apache Hadoop that relies on a distributed, highly available MySQL Cluster NDB to improve performance. Hops-YARN is the resource management framework of Hops; it introduces distributed resource management, load-balancing the tracking of resources in a cluster. In Hops-YARN we make heavy use of the back-end database, storing all the ResourceManager metadata and incoming RPCs to provide high fault tolerance and very short recovery time, and the NDB Event API is used so that the ResourceManager can communicate with the distributed ResourceTrackers. This project aims at optimizing the mechanisms used for persisting metadata in NDB, both in terms of transactional commit time and in terms of pre-processing the metadata, while guaranteeing that under no condition the in-memory RM state diverges from the state stored in NDB. With these goals in mind, several solutions were examined that improved the performance of the system, making Hops-YARN comparable to Apache YARN with the extra benefits of high fault tolerance and short recovery time. The solutions proposed in this thesis enhance the pure commit time of a transaction to the MySQL Cluster as well as the pre-processing and parallelism of our Transaction Manager. The results indicate that the performance of Hops increased dramatically, utilizing more resources on a cluster with thousands of machines. Increasing cluster utilization by a few percentage points can save organizations a large amount of money.
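The core idea in the abstract, aggregating incoming state changes so that persisting them pays the transaction commit cost once per batch instead of once per update, can be shown with a minimal sketch. This is not Hops code: sqlite3 stands in for MySQL Cluster NDB and the node_state table is invented purely for illustration.

```python
# Illustrative only: sqlite3 as a stand-in for NDB, with a made-up schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_state (node_id INTEGER PRIMARY KEY, capability INTEGER)")
conn.executemany("INSERT INTO node_state VALUES (?, ?)", [(i, 0) for i in range(1000)])
conn.commit()

updates = [(i, i % 64) for i in range(1000)]   # pretend per-RPC state updates

def commit_per_update(conn, updates):
    """Baseline: one transaction per incoming update."""
    for node_id, capability in updates:
        conn.execute(
            "UPDATE node_state SET capability = ? WHERE node_id = ?",
            (capability, node_id),
        )
        conn.commit()                          # commit cost paid for every update

def commit_batched(conn, updates):
    """Aggregate the updates and pay the commit cost once."""
    conn.executemany(
        "UPDATE node_state SET capability = ? WHERE node_id = ?",
        [(c, n) for n, c in updates],
    )
    conn.commit()                              # single commit for the whole batch

commit_per_update(conn, updates)
commit_batched(conn, updates)
```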
223

Big-Data Driven Optimization Methods with Applications to LTL Freight Routing

Tamvada, Srinivas January 2020 (has links)
We propose solution strategies for hard Mixed Integer Programming (MIP) problems, with a focus on distributed parallel MIP optimization. Although our proposals are inspired by the Less-than-truckload (LTL) freight routing problem, they are more generally applicable to hard MIPs from other domains. We start by developing an Integer Programming model for the LTL freight routing problem, and present a novel heuristic for solving the model in a reasonable amount of time on large LTL networks. Next, we identify some adaptations to MIP branching strategies that are useful for achieving improved scaling upon distribution when the LTL routing problem (or other hard MIPs) is solved using parallel MIP optimization. Recognizing that our model represents a pseudo-Boolean optimization (PBO) problem, we leverage solution techniques used by PBO solvers to develop a CPLEX-based look-ahead solver for LTL routing and other PBO problems. Our focus once again is on achieving improved scaling upon distribution. We also analyze a technique for implementing subtree parallelism during distributed MIP optimization. We believe that our proposals represent a significant step towards solving big-data driven optimization problems (such as the LTL routing problem) more efficiently. / Thesis / Doctor of Philosophy (PhD) / Less-than-truckload (LTL) freight transportation is a vital part of Canada's economy, with revenues running into billions of dollars and a cascading impact on many other industries. LTL operators often have to deal with large volumes of shipments, unexpected changes in traffic conditions, and uncertainty in demand patterns. In an industry that already has low profit margins, it is therefore vitally important to make good routing decisions without expending a lot of time. The optimization of such LTL freight networks often results in complex big-data driven optimization problems. In addition to the challenge of finding optimal solutions for these problems, analysts often have to deal with the complexities of big-data driven inputs. In this thesis we develop several solution strategies for solving the LTL freight routing problem, including an exact model, novel heuristics, and techniques for solving the problem efficiently on a cluster of computers. Although the techniques we develop are inspired by LTL routing, they are more generally applicable to solving big-data driven optimization problems from other domains. Experiments conducted over the years in consultation with industry experts indicate that our proposals can significantly improve solution quality and reduce time to solution. Furthermore, our proposals open up interesting avenues for future research.
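To make the kind of model concrete, here is a toy arc-flow sketch of an LTL-style routing IP. It is not the thesis formulation: the three-node network, costs, trailer capacity and demand are invented, and PuLP with its bundled CBC solver stands in for the CPLEX-based setup described above.

```python
# A minimal arc-flow sketch of an LTL-style routing MIP (illustrative data).
import pulp

arcs = {("A", "B"): 100, ("B", "C"): 80, ("A", "C"): 220}   # cost per trailer on each arc
capacity = 10_000                                            # lbs of freight per trailer
demand = 14_000                                              # lbs to move from A to C
nodes = {"A": demand, "B": 0, "C": -demand}                  # net supply at each node

prob = pulp.LpProblem("ltl_routing_sketch", pulp.LpMinimize)
trailers = pulp.LpVariable.dicts("trailers", arcs, lowBound=0, cat="Integer")
flow = pulp.LpVariable.dicts("flow", arcs, lowBound=0)

# Objective: pay for every trailer dispatched on every arc.
prob += pulp.lpSum(cost * trailers[a] for a, cost in arcs.items())

# Flow conservation at each node.
for n, supply in nodes.items():
    prob += (
        pulp.lpSum(flow[a] for a in arcs if a[0] == n)
        - pulp.lpSum(flow[a] for a in arcs if a[1] == n)
        == supply
    )

# Freight routed on an arc must fit in the trailers dispatched on it.
for a in arcs:
    prob += flow[a] <= capacity * trailers[a]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({a: trailers[a].value() for a in arcs}, pulp.value(prob.objective))
```

With these made-up numbers the solver routes the freight through the intermediate terminal B, which is the kind of consolidation decision an LTL routing model has to make.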
224

A drug repurposing study based on clinical big data for the protective role of vitamin D in olanzapine-induced dyslipidemia / 臨床ビッグデータに基づくオランザピン誘発脂質異常症に対するビタミンDの予防作用の解明

ZHOU, ZIJIAN 23 March 2023 (has links)
Kyoto University / New-system doctoral course / Doctor of Pharmaceutical Sciences / 甲第24551号 / 薬科博第168号 / 新制||薬科||18(附属図書館) / Kyoto University Graduate School of Pharmaceutical Sciences, Department of Pharmaceutical Sciences / (Chief examiner) Professor 金子 周司, Professor 竹島 浩, Professor 上杉 志成 / Meets the requirements of Article 4, Paragraph 1 of the Degree Regulations / Doctor of Pharmaceutical Sciences / Kyoto University / DFAM
225

Aggregated sensor payload submission model for token-based access control in the Web of Things

Amir, Mohammad, Pillai, Prashant, Hu, Yim Fun 26 October 2015 (has links)
The Web of Things (WoT) can be considered a merger of the newly emerging paradigms of the Internet of Things (IoT) and cloud computing. Rapidly varying, highly volatile and heterogeneous data traffic is a characteristic of the WoT. Hence, the capture, processing, storage and exchange of huge volumes of data are a key requirement in this environment. The crucial resources in the WoT are the sensing devices and the sensing data. Consequently, access control mechanisms employed in this highly dynamic and demanding environment need to be enhanced so as to reduce the end-to-end latency for capturing and exchanging data pertaining to these underlying resources. While there are many previous studies comparing the advantages and disadvantages of access control mechanisms at the algorithm level, very few of them provide any detailed comparison of the performance of these mechanisms when used for different data handling procedures in the context of data capture, processing and storage. This study builds on previous work on token-based access control mechanisms and presents a comparison of two different approaches used for handling sensing devices and data in the WoT. It is shown that the aggregated data submission approach is around 700% more efficient than the serial payload submission procedure in reducing the round-trip response time.
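The difference between the two submission procedures being compared can be sketched as follows. The gateway URL, token and payload shape are hypothetical and not taken from the paper; the point is simply that the aggregated approach pays the authenticated round-trip cost once per batch rather than once per reading.

```python
# Sketch of serial vs aggregated submission to a token-protected WoT gateway.
# Endpoint, token and payload fields are invented for illustration.
import requests

GATEWAY = "https://wot-gateway.example.com/sensors"   # hypothetical endpoint
TOKEN = "example-access-token"                        # obtained out of band
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

readings = [{"sensor_id": i, "temperature_c": 20.0 + i} for i in range(50)]

def submit_serial(readings):
    """Baseline: one authenticated round trip per reading."""
    for r in readings:
        requests.post(GATEWAY, json=r, headers=HEADERS, timeout=5)

def submit_aggregated(readings):
    """One authenticated round trip carrying the whole batch."""
    requests.post(GATEWAY, json={"readings": readings}, headers=HEADERS, timeout=5)
```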
226

Towards a framework for engineering big data: An automotive systems perspective

Byrne, Thomas J., Campean, Felician, Neagu, Daniel 05 1900 (has links)
Demand for more sophisticated models to meet big data expectations requires significant data repository obligations, operating concurrently in higher-level applications. Current models provide only disjointed modelling paradigms. The proposed framework addresses the need for higher-level abstraction, using low-level logic in the form of axioms from which higher-level functionality is logically derived. The framework facilitates the definition and usage of subjective structures across the cyber-physical system domain, and is intended to converge the range of heterogeneous data-driven objects.
227

The security of big data in fog-enabled IoT applications including blockchain: a survey

Tariq, N., Asim, M., Al-Obeidat, F., Farooqi, M.Z., Baker, T., Hammoudeh, M., Ghafir, Ibrahim 24 January 2020 (has links)
The proliferation of interconnected devices in critical industries, such as healthcare and the power grid, is changing the perception of what constitutes critical infrastructure. The rising interconnectedness of new critical industries is driven by the growing demand for seamless access to information as the world becomes more mobile and connected and as the Internet of Things (IoT) grows. Critical industries are essential to the foundation of today's society, and interruption of service in any of these sectors can reverberate through other sectors and even around the globe. In today's hyper-connected world, critical infrastructure is more vulnerable than ever to cyber threats, whether from state-sponsored actors, criminal groups or individuals. As the number of interconnected devices increases, the number of potential access points for hackers to disrupt critical infrastructure grows. This new attack surface emerges from fundamental changes in the technology systems underlying organizations' critical infrastructure. This paper aims to improve understanding of the challenges in securing a future digital infrastructure while it is still evolving. After introducing the infrastructure generating big data, the functionality-based fog architecture is defined. In addition, a comprehensive review of security requirements in fog-enabled IoT systems is presented. Then, an in-depth analysis of the fog computing security challenges and the big data privacy and trust concerns in relation to fog-enabled IoT is given. We also discuss blockchain as a key enabler to address many security-related issues in IoT and consider closely the complementary interrelationships between blockchain and fog computing. In this context, this work formalizes the task of securing big data and its scope, provides a taxonomy to categorize threats to fog-based IoT systems, presents a comprehensive comparison of state-of-the-art contributions in the field according to their security services, and recommends promising research directions for future investigations.
228

Sample Size Determination for Subsampling in the Analysis of Big Data, Multiplicative Models for Confidence Intervals and Free-Knot Changepoint Models

Sheng Zhang (18468615) 11 June 2024 (has links)
We studied the relationship between subsample size and the accuracy of the resulting estimation in a big data setup. We also proposed a novel approach to the construction of confidence intervals based on improved concentration inequalities. Lastly, we studied irregular change-point models using free-knot splines.
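A rough feel for the subsample-size question comes from a small Monte Carlo sketch (illustrative only, not the thesis methodology): the error of a subsample-based estimate of a population mean shrinks roughly like one over the square root of the subsample size.

```python
# Toy experiment: how subsample size affects the accuracy of a mean estimate.
import numpy as np

rng = np.random.default_rng(0)
population = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)  # the "big" data set
true_mean = population.mean()

for r in [100, 1_000, 10_000, 100_000]:
    errors = [
        abs(rng.choice(population, size=r, replace=False).mean() - true_mean)
        for _ in range(30)
    ]
    print(f"subsample size {r:>7}: mean abs error = {np.mean(errors):.4f}")
```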
229

Efficient computer experiment designs for Gaussian process surrogates

Cole, David Austin 28 June 2021 (has links)
Due to advancements in supercomputing and algorithms for finite element analysis, today's computer simulation models often contain complex calculations that can result in a wealth of knowledge. Gaussian processes (GPs) are highly desirable models for computer experiments for their predictive accuracy and uncertainty quantification. This dissertation addresses GP modeling when data abounds as well as GP adaptive design when simulator expense severely limits the amount of collected data. For data-rich problems, I introduce a localized sparse covariance GP that preserves the flexibility and predictive accuracy of a GP's predictive surface while saving computational time. This locally induced Gaussian process (LIGP) incorporates latent design points, inducing points, with a local Gaussian process built from a subset of the data. Various methods are introduced for the design of the inducing points. LIGP is then extended to adapt to stochastic data with replicates, estimating noise while relying upon the unique design locations for computation. I also address the goal of identifying a contour when data collection resources are limited through entropy-based adaptive design. Unlike existing methods, the entropy-based contour locator (ECL) adaptive design promotes exploration in the design space, performing well in higher dimensions and when the contour corresponds to a high/low quantile. ECL adaptive design can join with importance sampling for the purpose of reducing uncertainty in reliability estimation. / Doctor of Philosophy / Due to advancements in supercomputing and physics-based algorithms, today's computer simulation models often contain complex calculations that can produce larger amounts of data than through physical experiments. Computer experiments conducted with simulation models are sought-after ways to gather knowledge about physical problems but come with design and modeling challenges. In this dissertation, I address both data size extremes - building prediction models with large data sets and designing computer experiments when scarce resources limit the amount of data. For the former, I introduce a strategy of constructing a series of models including small subsets of observed data along with a set of unobserved data locations (inducing points). This methodology also contains the ability to perform calculations with only unique data locations when replicates exist in the data. The locally induced model produces accurate predictions while saving computing time. Various methods are introduced to decide the locations of these inducing points. The focus then shifts to designing an experiment for the purpose of accurate prediction around a particular output quantity of interest (contour). An experimental design approach is detailed that selects new sample locations one-at-a-time through a function to maximize the amount of information gain in the contour region for the overall model. This work is combined with an existing method to estimate the true volume of the contour.
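A minimal sketch of the "local" idea, predicting at a point from only its nearest design points rather than the full data set, is given below. It is a plain nearest-neighbour GP in NumPy, not the LIGP implementation, and the kernel, lengthscale and toy data are assumptions for illustration.

```python
# Local GP prediction sketch: condition only on the k nearest design points.
import numpy as np

def rbf(X1, X2, lengthscale=0.2):
    """Squared-exponential kernel between two sets of points."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def local_gp_predict(X, y, x_star, k=50, noise=1e-6):
    """Posterior mean and variance at x_star using its k nearest neighbours."""
    idx = np.argsort(((X - x_star) ** 2).sum(1))[:k]        # local subset of the design
    Xl, yl = X[idx], y[idx]
    K = rbf(Xl, Xl) + noise * np.eye(k)
    k_star = rbf(Xl, x_star[None, :])                       # (k, 1) cross-covariances
    alpha = np.linalg.solve(K, yl)
    mean = (k_star.T @ alpha).item()
    var = (rbf(x_star[None, :], x_star[None, :]) - k_star.T @ np.linalg.solve(K, k_star)).item()
    return mean, var

# Toy data: noisy observations of a 1-D test function.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(5_000, 1))
y = np.sin(8 * X[:, 0]) + 0.05 * rng.standard_normal(5_000)
print(local_gp_predict(X, y, np.array([0.3])))
```

Conditioning on a local subset keeps the linear solve at size k instead of the full data size, which is the computational saving the abstract refers to.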
230

SensAnalysis: A Big Data Platform for Vibration-Sensor Data Analysis

Kumar, Abhinav 26 June 2019 (has links)
The Goodwin Hall building on the Virginia Tech campus is the most instrumented building for vibration monitoring. It houses 225 hard-wired accelerometers which record vibrations arising due to internal as well as external activities. The recorded vibration data can be used to develop real-time applications for monitoring the health of the building or detecting human activity in the building. However, the lack of infrastructure to handle the massive scale of the data, and the steep learning curve of the tools required to store and process the data, are major deterrents for the researchers to perform their experiments. Additionally, researchers want to explore the data to determine the type of experiments they can perform. This work tries to solve these problems by providing a system to store and process the data using existing big data technologies. The system simplifies the process of big data analysis by supporting code re-usability and multiple programming languages. The effectiveness of the system was demonstrated by four case studies. Additionally, three visualizations were developed to help researchers in the initial data exploration. / Master of Science / The Goodwin Hall building on the Virginia Tech campus is an example of a ‘smart building.’ It uses sensors to record the response of the building to various internal and external activities. The recorded data can be used by algorithms to facilitate understanding of the properties of the building or to detect human activity. Accordingly, researchers in the Virginia Tech Smart Infrastructure Lab (VTSIL) run experiments using a part of the complete data. Ideally, they want to run their experiments continuously as new data is collected. However, the massive scale of the data makes it difficult to process new data as soon as it arrives, and to make it available immediately to the researchers. The technologies that can handle data at this scale have a steep learning curve. Starting to use them requires much time and effort. This project involved building a system to handle these challenges so that researchers can focus on their core area of research. The system provides visualizations depicting various properties of the data to help researchers explore that data before running an experiment. The effectiveness of this work was demonstrated using four case studies. These case studies used the actual experiments conducted by VTSIL researchers in the past. The first three case studies help in understanding the properties of the building whereas the final case study deals with detecting and locating human footsteps, on one of the floors, in real-time.
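As a flavour of the per-channel processing such a platform might serve, the sketch below estimates the dominant vibration frequency of one accelerometer channel with an FFT. The sampling rate and the synthetic signal are assumptions; this is not code from SensAnalysis.

```python
# Dominant-frequency estimate for a single (synthetic) accelerometer channel.
import numpy as np

fs = 1024                        # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)     # ten seconds of samples
# Synthetic channel: a 7.3 Hz structural mode plus broadband noise.
signal = np.sin(2 * np.pi * 7.3 * t) + 0.3 * np.random.default_rng(2).standard_normal(t.size)

spectrum = np.abs(np.fft.rfft(signal * np.hanning(t.size)))   # windowed magnitude spectrum
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
print(f"dominant frequency = {freqs[spectrum.argmax()]:.2f} Hz")
```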
