1

Implementing a Data Acquisition System for the Training of Cloud Coverage Neural Networks

Montgomery, Weston C 01 June 2021 (has links) (PDF)
Cal Poly is home to a solar farm designed to generate a nominal 4.5 MW of electricity. The Gold Tree Solar Farm (GTSF) is currently the largest photovoltaic array in the California State University (CSU) system, and it was projected to produce approximately 11 GWh per year. Such projections come from power generation models developed to predict the output of large solar fields. However, near-term forecasting of power generation from variable sources such as wind and solar still leaves considerable room for improvement. The two primary factors affecting solar power generation are shading and the angle of the sun. The angle of the sun relative to GTSF's panels can be calculated analytically using geometry. Shading due to cloud coverage, on the other hand, is very difficult to map, which is why artificial neural networks (NNs) hold substantial promise for accurate near-term cloud coverage forecasting. Much of the necessary training data (e.g., wind speeds, temperature, humidity) can be acquired from online sources, but the most important dataset must be captured at GTSF: sky images showing the exact location of the clouds over the solar field. Therefore, a new image-capturing data acquisition (DAQ) system has been implemented to gather the training data needed to forecast cloud coverage 15-30 minutes into the future.
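The analytic solar-geometry calculation the abstract alludes to can be sketched as follows. The declination, hour-angle, and elevation formulas are the standard textbook ones; the site coordinates are illustrative assumptions for San Luis Obispo, not values taken from the thesis.

```python
import math
from datetime import datetime, timezone

def solar_elevation(lat_deg, lon_deg, when_utc):
    """Approximate solar elevation angle (degrees) for a site and UTC time.

    Standard approximations, accurate to within about a degree -- enough
    to illustrate the analytic half of the forecasting problem.
    """
    day = when_utc.timetuple().tm_yday
    # Solar declination (Cooper's approximation), in degrees.
    decl = 23.45 * math.sin(math.radians(360.0 * (284 + day) / 365.0))
    # Hour angle: 15 degrees per hour away from solar noon (longitude-corrected).
    solar_time = when_utc.hour + when_utc.minute / 60.0 + lon_deg / 15.0
    hour_angle = 15.0 * (solar_time - 12.0)
    lat, d, h = map(math.radians, (lat_deg, decl, hour_angle))
    elevation = math.asin(math.sin(lat) * math.sin(d) +
                          math.cos(lat) * math.cos(d) * math.cos(h))
    return math.degrees(elevation)

# Coordinates are an assumption for illustration, not from the thesis.
print(solar_elevation(35.3, -120.7, datetime(2021, 6, 1, 20, 0, tzinfo=timezone.utc)))
```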
2

Domain Specific Language (DSL) visualisation for Big Data Pipelines

Mitrovic, Vlado January 2024 (has links)
With the growth of big data technologies, it has become challenging to design and manage complex data workflows, especially for non-technical users. Yet to understand and process these data well, we must rely on domain experts who are often unfamiliar with the tools available on the market. This thesis identifies the needs for, and describes the implementation of, an easy-to-use tool for defining and visualising data processing workflows, abstracting away the technical requirements imposed by other solutions. The research methodology includes the definition of customer requirements, architecture design, prototype development, and user testing. The iterative approach used in this project ensures continuous improvement based on user feedback. The final solution is then assessed using KPI metrics such as usability, integration, performance, and support.
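The kind of pipeline DSL the abstract describes might look like the sketch below: a small declarative definition parsed into a graph that a visualiser could render as boxes and arrows. The `Step`/`Pipeline` classes and field names are hypothetical illustrations, not the thesis's actual design.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    kind: str                      # e.g. "source", "transform", "sink"
    inputs: list = field(default_factory=list)

@dataclass
class Pipeline:
    steps: dict = field(default_factory=dict)

    def add(self, step):
        self.steps[step.name] = step

    def edges(self):
        # (upstream, downstream) pairs a visualiser could draw as arrows.
        return [(src, s.name) for s in self.steps.values() for src in s.inputs]

# A declarative workflow definition in the spirit of such a DSL.
pipe = Pipeline()
pipe.add(Step("read_events", kind="source"))
pipe.add(Step("clean", kind="transform", inputs=["read_events"]))
pipe.add(Step("aggregate", kind="transform", inputs=["clean"]))
pipe.add(Step("to_warehouse", kind="sink", inputs=["aggregate"]))
print(pipe.edges())  # [('read_events', 'clean'), ('clean', 'aggregate'), ...]
```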
3

DataOps : Towards Understanding and Defining Data Analytics Approach

Mainali, Kiran January 2020 (has links)
Data collection and analysis approaches have changed drastically in the past few years. The reasons are improved data availability and continuously changing analysis requirements. Data have always existed, but data management is vital nowadays because of the rapid generation and availability of data in many formats. Big data has opened the possibility of dealing with potentially unbounded amounts of data in numerous formats in a short time. Data analytics is becoming complex because of data characteristics, sophisticated tools and technologies, changing business needs, varied interests among stakeholders, and the lack of a standardized process. DataOps is an emerging approach advocated by data practitioners to address the challenges in data analytics projects. Data analytics projects differ from software engineering in many respects. DevOps has proven to be an efficient and practical approach for delivering projects in the software industry; DataOps, however, is still in its infancy and is only beginning to be recognized as an independent and essential discipline in data analytics. In this thesis, we examine DataOps as a methodology for implementing data pipelines by conducting a systematic search of research papers. As a result, we define DataOps, outlining its ambiguities and challenges, and explore how DataOps covers the different stages of the data lifecycle. We created comparison matrices of different tools and technologies, categorizing them into functional groups to demonstrate their usage in data lifecycle management. Following DataOps implementation guidelines, we implemented a data pipeline using Apache Airflow as the workflow orchestrator inside Docker and compared it with a simple manual execution of a data analytics project. In the evaluation, the DataOps pipeline provided automation of task execution, orchestration of the execution environment, testing and monitoring, and communication and collaboration, and it reduced the end-to-end product delivery cycle time as well as the pipeline execution time.
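A minimal sketch of what an Airflow pipeline of this kind could look like: three ordered stages that the orchestrator schedules, retries, and monitors instead of a human operator. The task names and shell commands are illustrative assumptions, not the thesis's actual DAG.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

# Illustrative DAG, not the thesis's actual pipeline.
with DAG(
    dag_id="analytics_pipeline",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(task_id="ingest", bash_command="python ingest.py")
    transform = BashOperator(task_id="transform", bash_command="python transform.py")
    validate = BashOperator(task_id="validate", bash_command="python validate.py")

    # Execution order is enforced by the orchestrator, not by a human.
    ingest >> transform >> validate
```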
4

Implementing an Interactive Simulation Data Pipeline for Space Weather Visualization

Berg, Matthias, Grangien, Jonathan January 2018 (has links)
This thesis details work carried out by two students working as contractors at the Community Coordinated Modelling Center at Goddard Space Flight Center of the National Aeronautics and Space Administration. The thesis is made possible by, and aims to contribute to, the OpenSpace project. The first track of the work is the handling and assembly of new data for a visualization of coronal mass ejections in OpenSpace. The new data allow coronal mass ejections to be observed at their origin by the surface of the Sun, whereas previous data visualized them only from 30 solar radii outwards. Previously implemented visualization techniques are combined to visualize volume data and fieldlines, which, together with a synoptic magnetogram of the Sun, gives a multi-layered visualization. The second track is an experimental implementation of a generalized, less user-involved process for getting new data into OpenSpace, with a priority on volume data, as that was the area of prior experience. The results show a space weather model visualization and how such a model can be adapted to fit within the parameters of the OpenSpace project. Additionally, the results show how a GUI connected to a series of background events can form a data pipeline that makes complicated space weather models more easily available.
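The "GUI connected to a series of background events" could be sketched as below: a UI callback that enqueues fetch, convert, and load stages which then run off the UI thread. Every name here is hypothetical, and this is a Python sketch of the idea only; OpenSpace itself is implemented differently (C++ with Lua scripting).

```python
import queue
import threading

tasks = queue.Queue()

def worker():
    # Background thread: runs pipeline stages so the GUI stays responsive.
    while True:
        stage, payload = tasks.get()
        stage(payload)
        tasks.task_done()

def fetch(run_id):    print(f"downloading model run {run_id}...")
def convert(run_id):  print(f"converting {run_id} to a volume format...")
def load(run_id):     print(f"loading {run_id} into the visualization...")

def on_import_clicked(run_id):
    # GUI callback: queue the whole pipeline for one model run.
    for stage in (fetch, convert, load):
        tasks.put((stage, run_id))

threading.Thread(target=worker, daemon=True).start()
on_import_clicked("WSA-ENLIL-2018-05-01")  # hypothetical run identifier
tasks.join()
```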
5

The Application of LoRaWAN as an Internet of Things Tool to Promote Data Collection in Agriculture

Adam B Schreck (15315892) 27 April 2023 (has links)
Information about the conditions of specific fields and assets is critical for farm managers making operational decisions. Location, rainfall, wind speed, soil moisture, and temperature are examples of metrics that influence the ability to perform certain tasks. Monitoring these conditions in real time and storing historical data can be done with Internet of Things (IoT) devices such as sensors. The abilities of this technology have previously been communicated, yet few farmers have adopted these connected devices into their work. A lack of reliable internet connection, the high annual cost of current on-market systems, and a lack of technical awareness have all contributed to this disconnect. One technology that can better meet farmers' demands is LoRaWAN, because of its long range, low power, and low cost. To assist farmers in implementing this technology on their farms, the goal was to build a LoRaWAN network with several sensors measuring metrics such as weather data, distribute these systems locally, and provide context for the operation of IoT networks. By leveraging readily available commercial hardware and open-source software, two examples of standalone networks were created, with sensor data stored locally and without a dependence on internet connectivity. The first use case was a kit consisting of a gateway and a small PC mounted to a tripod with 6 individual sensors, costing close to $2200 in total. An additional design was prepared for a micro-computer-based version using a Raspberry Pi, which improved on the original design: lower hardware cost and complexity, software with more open-source community support, and cataloged steps to increase approachability. Given outside factors, the PC architecture was chosen for mass distribution. Over one year, several identical units were produced and given to farms, extension educators, and vocational agricultural programs. From this series of deployments, all units survived the growing season without damage from the elements; general considerations about the chosen type of sensors and their potential drawbacks were noted; the practical observed average range for packet acceptance was 3 miles; and battery life among sensors remained usable after one year. The Pi-based architecture was implemented in an individual use case with instructions to assist participation from any experience level. Ultimately, this work has introduced individuals to the possibilities of creating and managing their own network and what can be learned from a reasonably simple, self-managed data pipeline.
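One flavor of the self-managed data path: LoRaWAN sensors send compact binary payloads that the local server must decode before storage. The byte layout and field scaling below are hypothetical, for an imagined weather sensor, and are not taken from the thesis's hardware.

```python
import struct

def decode_weather_payload(payload: bytes) -> dict:
    """Decode a hypothetical 6-byte uplink: temperature, humidity, battery.

    Layout (illustrative, not from the thesis):
      bytes 0-1  temperature, signed, 0.01 degC per LSB
      bytes 2-3  relative humidity, unsigned, 0.01 % per LSB
      bytes 4-5  battery voltage, unsigned, millivolts
    """
    temp_raw, hum_raw, batt_mv = struct.unpack(">hHH", payload)
    return {
        "temperature_c": temp_raw / 100.0,
        "humidity_pct": hum_raw / 100.0,
        "battery_v": batt_mv / 1000.0,
    }

# Example uplink as it might arrive from the gateway's local broker.
print(decode_weather_payload(bytes.fromhex("09C4138804A6")))
# {'temperature_c': 25.0, 'humidity_pct': 50.0, 'battery_v': 1.19}
```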
6

A deep learning based anomaly detection pipeline for battery fleets

Khongbantabam, Nabakumar Singh January 2021 (has links)
This thesis proposes a deep learning anomaly detection pipeline for detecting possible anomalies during the operation of a fleet of batteries, and presents its development and evaluation. The pipeline employs sensors connected to each battery in the fleet to remotely collect real-time measurements of operating characteristics such as voltage, current, and temperature. The deep learning based time-series anomaly detection model was developed using a Variational Autoencoder (VAE) architecture with either Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) networks as the encoder and decoder (LSTMVAE and GRUVAE). Both variants were evaluated against three well-known conventional anomaly detection algorithms: Isolation Nearest Neighbour (iNNE), Isolation Forest (iForest), and kth Nearest Neighbour (k-NN). All five models were trained on two variations of the training dataset (a full-year dataset and a partial recent dataset), producing a total of 10 model variants. The models were trained unsupervised, and the results were evaluated on a test dataset containing a few known anomaly days from the past operation of the customer's battery fleet. The results showed that k-NN and GRUVAE performed close to each other and outperformed the rest of the models by a notable margin. LSTMVAE and iForest performed moderately, while iNNE and the iForest variant trained on the full dataset performed the worst. A general observation is that limiting the training dataset to only a recent period produces better results nearly consistently across all models.
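The LSTMVAE variant can be sketched in PyTorch roughly as follows: an LSTM encoder summarizes a sensor window into the mean and log-variance of a latent code, an LSTM decoder reconstructs the window, and a high reconstruction error flags an anomalous window. Layer sizes and the scoring rule are illustrative assumptions, not the thesis's exact configuration.

```python
import torch
import torch.nn as nn

class LSTMVAE(nn.Module):
    def __init__(self, n_features=3, hidden=64, latent=8):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.from_latent = nn.Linear(latent, hidden)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                       # x: (batch, time, features)
        _, (h, _) = self.encoder(x)             # final hidden state summarizes the window
        h = h.squeeze(0)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        # Repeat the latent summary at every time step and decode.
        dec_in = self.from_latent(z).unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(dec_in)
        return self.out(dec_out), mu, logvar

def anomaly_score(model, x):
    # Mean squared reconstruction error per window; high values suggest anomalies.
    recon, _, _ = model(x)
    return ((recon - x) ** 2).mean(dim=(1, 2))

model = LSTMVAE()
window = torch.randn(4, 60, 3)   # 4 windows of 60 timesteps x (V, I, T)
print(anomaly_score(model, window))
```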
