Global ETD Search

131	A multi-sensor approach for land cover classification and monitoring of tidal flats in the German Wadden Sea Jung, Richard 07 April 2016 Sand and mud traversed by tidal inlets and channels that split into fine branches, salt marshes at the coast, the tide, harsh weather conditions and a high diversity of fauna and flora characterize the Wadden Sea ecosystem. No other landscape on Earth changes in such a dynamic manner. Therefore, land cover classification and monitoring of vulnerable ecosystems is one of the most important tasks in remote sensing and has drawn much attention in recent years. The Wadden Sea in the southeastern part of the North Sea is one such vulnerable ecosystem, which is highly dynamic and diverse. The tidal flats of the Wadden Sea are the zone of interaction between marine and terrestrial environments and are at risk due to climate change, pollution and anthropogenic pressure. For this reason, the European Union has implemented various directives that formulate objectives such as achieving or maintaining a good environmental status or a favourable conservation status within a given time. In this context, permanent observation is needed to assess the ecological condition. Such observation also allows changes to be tracked or even foreseen, so that an appropriate response is possible. It is therefore important to distinguish between short-term changes, which are related to the dynamic nature of the ecosystem, and long-term changes, which are the result of extraneous influences. Accessibility from both sea and land is very poor, which makes monitoring and mapping of tidal flat environments from in situ measurements very difficult and cost-intensive. For the monitoring of large areas, time-saving applications are needed. In this context, remote sensing offers great possibilities, because it provides large spatial coverage and non-intrusive measurements of the Earth’s surface. 
Previous studies have focused on the use of electro-optical and radar sensors for remote sensing of tidal flats, whereas microwave systems using synthetic aperture radar (SAR) can be a complementary tool for tidal flat observation, especially because of their high spatial resolution and all-weather imaging capability. Nevertheless, the repetitive tidal cycle and dynamic sedimentary processes make integrated observation of tidal flats from multi-source datasets essential for mapping and monitoring. The main challenge for remote sensing of tidal flats is to isolate the sediment, vegetation or shellfish bed features in the spectral signature or backscatter intensity from interference by water, the atmosphere, fauna and flora. In addition, optically active materials, such as plankton, suspended matter and dissolved organics, affect the scattering and absorption of radiation. Tidal flats are spatially complex and temporally highly variable; mapping tidal land cover therefore requires satellite or airborne imagers with high spatial and temporal resolution and, in some cases, hyperspectral data. In this research, a hierarchical knowledge-based decision tree applied to multi-sensor remote sensing data is introduced, and the results are evaluated visually and numerically and subsequently analysed. The multi-sensor approach combines electro-optical data from RapidEye, SAR data from TerraSAR-X and airborne LiDAR data in a decision tree. Moreover, spectrometric and ground truth data are incorporated into the analysis. The aim is to develop an automatic or semi-automatic procedure for estimating the distribution of vegetation, shellfish beds and sediments south of the barrier island Norderney. The multi-sensor approach starts with a semi-automatic pre-processing procedure for the electro-optical data of RapidEye, LiDAR data, spectrometric data and ground truth data. 
The decision tree classification is based on a set of hierarchically structured algorithms that use object and texture features. In each decision, one satellite dataset is applied to estimate a specific class. This helps to overcome the drawbacks that arise from the combined use of all remote sensing datasets for one class, as was shown by comparing the decision tree results with a popular state-of-the-art supervised classification approach (random forest). Following the classification, a discrimination analysis of various sediment spectra, measured with a hyperspectral sensor, was carried out. In this context, the spectral features of the tidal sediments were analysed and a feature selection method was developed to estimate suitable wavelengths for discrimination with very high accuracy. The developed feature selection method ‘JMDFS’ (Jeffries-Matusita distance feature selection) is a filter-based supervised band elimination technique based on the local Euclidean distance and the Jeffries-Matusita distance. An iterative process successively eliminates wavelengths and calculates a separability measure at the end of each iteration. Once specified thresholds are reached, the process stops and the remaining wavelengths are used in the further analysis. The results have been compared with a standard feature selection method (ReliefF). The JMDFS method obtains similar results and runs 216 times faster. Both approaches are quantitatively and qualitatively evaluated using reference data and standard methodologies for comparison. The results show that the proposed approaches are able to estimate the land cover of the tidal flats and to discriminate the tidal sediments with moderate to very high accuracy. The accuracies of each land cover class vary according to the dataset used. 
Furthermore, it is shown that specific reflection features can be identified that help in discriminating tidal sediments and which should be used in further applications in tidal flats. Multi-Sensor approach Wadden Sea Monitoring concept Change analysis Pixel-based classification Feature selection Object-based classification Preprocessing Digital image analysis digital image processing Atmospheric correction 74.41 - Luftaufnahmen, Photogrammetrie ddc:500
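The abstract of entry 131 names the Jeffries-Matusita (JM) distance as the separability measure behind JMDFS but does not give the procedure. As a rough illustration only, and not the thesis's JMDFS algorithm, the sketch below computes the JM distance per spectral band for two classes, assuming each class is Gaussian in each band, and ranks the bands by separability. The function names and the per-band (mean, variance) statistics are hypothetical.

```python
import math


def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussian classes."""
    mean_term = 0.25 * (mean_a - mean_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def jeffries_matusita_1d(mean_a, var_a, mean_b, var_b):
    """JM distance in [0, 2]: 0 means identical classes, values near 2 mean well separated."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(mean_a, var_a, mean_b, var_b)))


def rank_bands(stats_a, stats_b):
    """Return (band indices sorted from most to least separable, per-band JM scores).

    stats_a and stats_b are lists of (mean, variance) pairs, one pair per band.
    """
    scores = [
        jeffries_matusita_1d(ma, va, mb, vb)
        for (ma, va), (mb, vb) in zip(stats_a, stats_b)
    ]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical reflectance statistics for two sediment classes in two bands.
class_a = [(0.10, 0.0004), (0.10, 0.0004)]
class_b = [(0.11, 0.0004), (0.30, 0.0004)]
order, scores = rank_bands(class_a, class_b)
```

A band elimination scheme of the kind the abstract describes could drop the lowest-ranked bands iteratively and re-evaluate separability after each step; the thresholds and the role of the local Euclidean distance are not specified in the abstract and are not modelled here.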
132	Detekce poznávací značky v obraze / Image-Based Licence Plate Recognition Vacek, Michal January 2009 The first part of this thesis surveys known methods of licence plate detection: preprocessing-based methods, AdaBoost-based methods and extremal region detection methods. It then describes and implements the author's own approach, which uses local detectors to build a visual vocabulary that is used for plate recognition. All measurements are summarized at the end.
133	Posouzení korespondence zájmových bodů v obraze / Similarity Measure of Points of Interest in Image Křehlík, Jan January 2008 This thesis experimentally verifies whether the machine learning algorithms AdaBoost and WaldBoost can be used to build a classifier that finds the point in a second image that matches a given point in the first image. It also describes the detection of points of interest in an image as the first step in finding correspondences, and then describes several descriptors of points of interest. Corresponding points could be useful for 3D modelling of the captured scene.
134	Rozšíření funkcionality systému pro dolování z dat na platformě NetBeans / Functionality Extension of Data Mining System on NetBeans Platform Šebek, Michal January 2009 Databases grow continually as new data are added. A process called Knowledge Discovery in Databases has been defined for analysing these data, and new complex systems have been developed to support it. The development of one such system is described in this thesis. The main goal is to analyse the current state of implementation of this system, which is based on the Java NetBeans Platform and the Oracle database system, and to extend it with data preprocessing algorithms and source data analysis. The implementation of the data preprocessing components and the changes to the kernel of the system are described in detail.
135	Categorization of Swedish e-mails using Supervised Machine Learning / Kategorisering av svenska e-postmeddelanden med användning av övervakad maskininlärning Mann, Anna, Höft, Olivia January 2021 Society today is becoming more digitalized, and e-mail is a common means of communication. Currently, the company Auranest has a filtering method for categorizing e-mails, but the method is a few years old. The filter identifies e-mails that are valuable to jobseekers, namely those in which employers may make contact. The company wants to know whether the categorization can be performed with a different method and improved. The degree project aims to investigate whether the categorization can be performed with higher accuracy using machine learning. Three supervised machine learning algorithms, Naïve Bayes, Support Vector Machine (SVM), and Decision Tree, have been examined, and the algorithm with the best results has been compared with Auranest's existing filter. Accuracy, Precision, Recall, and F1 score have been used to determine which machine learning algorithm achieved the best results and how it compares with Auranest's filter. The results showed that the supervised machine learning algorithm SVM achieved the best results in all metrics. The comparison between Auranest's existing filter and SVM showed that SVM performed better in all calculated metrics, with an accuracy of 99.5% for SVM and 93.03% for Auranest’s filter. The comparative results showed that accuracy was the only metric for which the two obtained similar results. For the other metrics, there was a noticeable difference. / Dagens samhälle blir alltmer digitaliserat och ett vanligt kommunikationssätt är att skicka e-postmeddelanden. I dagsläget har företaget Auranest ett filter för att kategorisera e-postmeddelanden men filtret är några år gammalt. Användningsområdet för filtret är att sortera ut värdefulla e-postmeddelanden för arbetssökande, där kontakt kan ske från arbetsgivare. 
Företaget vill veta ifall kategoriseringen kan göras med en annan metod samt förbättras. Målet med examensarbetet är att undersöka ifall filtreringen kan göras med högre träffsäkerhet med hjälp av maskininlärning. Tre övervakade maskininlärningsalgoritmer, Naïve Bayes, Support Vector Machine (SVM) och Decision Tree, har granskats och algoritmen med de högsta resultaten har jämförts med Auranests befintliga filter. Träffsäkerhet, precision, känslighet och F1-poäng har använts för att avgöra vilken maskininlärningsalgoritm som gav högst resultat sinsemellan samt i jämförelse med Auranests filter. Resultatet påvisade att den övervakade maskininlärningsmetoden SVM åstadkom de främsta resultaten i samtliga mätvärden. Jämförelsen mellan Auranests befintliga filter och SVM visade att SVM presterade bättre i alla kalkylerade mätvärden, där träffsäkerheten visade 99,5% för SVM och 93,03% för Auranests filter. De jämförande resultaten visade att träffsäkerheten var den enda faktorn som gav liknande resultat. För de övriga mätvärdena var det en märkbar skillnad. Classification categorization e-mails preprocessing TF-IDF machine learning supervised learning Naïve Bayes Support Vector Machine Decision Tree Klassificering kategorisering e-postmeddelanden förbehandling av data TF-IDF maskininlärning övervakad inlärning Naïve Bayes Support Vector Machine Decision Tree Computer Sciences Datavetenskap (datalogi)
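Entry 135 compares classifiers on accuracy, precision, recall and F1 score and notes that accuracy was the one metric on which the two filters were close. A minimal sketch of how these four metrics follow from confusion-matrix counts is given below. The counts are invented for illustration and are not taken from the thesis; the function name is likewise hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented counts: few positives among many negatives, so accuracy stays high
# even though one in five positive predictions is wrong.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

With imbalanced classes such as these, accuracy is dominated by the many true negatives, which is one plausible reason why two filters can look similar on accuracy and differ clearly on precision, recall and F1.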
136	Knowledge Discovery and Data Mining Using Demographic and Clinical Data to Diagnose Heart Disease. / Knowledge Discovery och Data mining med hjälp av demografiska och kliniska data för att diagnostisera hjärtsjukdomar. Fernandez Sanchez, Javier January 2018 Cardiovascular disease (CVD) is the leading cause of morbidity, mortality, premature death and reduced quality of life for the citizens of the EU. It has been reported that CVD represents a major economic load on health care systems in terms of hospitalizations, rehabilitation services, physician visits and medication. Data Mining techniques applied to clinical data have become an interesting tool to prevent, diagnose or treat CVD. In this thesis, Knowledge Discovery and Data Mining (KDD) was employed to analyse clinical and demographic data, which could be used to diagnose coronary artery disease (CAD). The exploratory data analysis (EDA) showed that female patients at an elderly age with a higher level of cholesterol, maximum achieved heart rate and ST-depression are more prone to be diagnosed with heart disease. Furthermore, patients with atypical angina are more likely to be at an elderly age with a slightly higher level of cholesterol and maximum achieved heart rate than asymptomatic chest pain patients. Moreover, patients with exercise induced angina had lower values of maximum achieved heart rate than those who do not experience it. We could verify that patients who experience exercise induced angina and asymptomatic chest pain are more likely to be diagnosed with heart disease. In addition, Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Decision Tree, Bagging and Boosting methods were evaluated using a stratified 10-fold cross-validation approach. The learning models provided an average F-score of 78-83% and a mean AUC of 85-88%. 
Among all the models, the highest score is given by Radial Basis Function Kernel Support Vector Machines (RBF-SVM), achieving 82.5% ± 4.7% of F-score and an AUC of 87.6% ± 5.8%. Our research confirmed that data mining techniques can support physicians in their interpretations of heart disease diagnosis in addition to clinical and demographic characteristics of patients. machine learning data science artificial intelligence data mining cardiovascular disease CVD exploratory analysis EDA clinical data support vector machines preprocessing decision trees logistic regression KNN adaboost xgboost random forest health healthcare Medical Engineering Medicinteknik
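Entry 136 evaluates its models with stratified 10-fold cross-validation. The sketch below shows one simple way such folds can be built so that every fold keeps roughly the class proportions of the full dataset. It is a standard-library illustration under that assumption, not the code used in the thesis; the function name and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs with class-balanced test folds."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin into the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy labels: 60 negative and 40 positive cases.
toy_labels = [0] * 60 + [1] * 40
splits = list(stratified_k_fold(toy_labels, k=10))
```

Each model would then be trained on `train_idx` and scored on `test_idx` for every split, and the per-fold scores averaged, which is how figures of the form mean ± deviation are typically obtained.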
137	Evaluating the robustness of DistilBERT to data shift in toxicity detection / Evaluering av DistilBERTs robusthet till dataskifte i en kontext av identifiering av kränkande språk Larsen, Caroline January 2022 With the rise of social media, cyberbullying and the online spread of hate have become serious problems with devastating consequences. Mentimeter is an interactive presentation tool that enables the audience to participate by typing their own answers to questions asked by the presenter. As the Mentimeter product is commonly used in schools, there is a need for a strong toxicity detection program that filters out offensive and profane language. This thesis focuses on text pre-processing and robustness to data shift within the problem domain of toxicity detection for English text. First, it is investigated whether lemmatization, spelling correction, and removal of stop words are suitable pre-processing strategies for toxicity detection. The pre-trained DistilBERT model was fine-tuned using an English Twitter dataset that had been pre-processed using a number of different techniques. The results indicate that none of the above-mentioned strategies has a positive impact on model performance. Second, modern methods are applied to train a toxicity detection model adjusted to anonymous Mentimeter user text data. For this purpose, a balanced Mentimeter dataset with 3654 datapoints was created and annotated by the thesis author. The best-performing model of the pre-processing experiment was iteratively fine-tuned and evaluated with an increasing amount of Mentimeter data. Based on the results, it is concluded that state-of-the-art performance can be achieved even when using relatively few datapoints for fine-tuning. Specifically, when using around 500 to 2500 training datapoints, F1-scores between 0.90 and 0.94 were obtained on a Mentimeter test set. 
These results show that it is possible to create a customized toxicity detection program, with high performance, using just a small dataset. / I och med sociala mediers stora framtåg har allvarliga problem såsom nätmobbning och spridning av hat online blivit allt mer vanliga. Mentimeter är ett interaktivt presentationsverktyg som gör det möjligt för presentations-publiken att svara på frågor genom att formulera egna fritextsvar. Eftersom Mentimeter ofta används i skolor så finns det ett behov av ett välfungerande program som identifierar och filtrerar ut kränkande text och svordomar. Den här uppsatsen fokuserar på ämnena textbehandling och robusthet gentemot dataskifte i en kontext av identifiering av kränkande språk för engelsk text. Först undersöks det huruvida lemmatisering, stavningskorrigering, samt avlägsnande av stoppord är lämpliga textbehandlingstekniker i kontexten av identifiering av kränkande språk. Den förtränade DistilBERT-modellen används genom att finjustera dess parameterar med ett engelskt Twitter-dataset som har textbehandlats med ett antal olika tekniker. Resultaten indikerar att ingen av de nämnda strategierna har en positiv inverkan på modellens prestanda. Därefter användes moderna metoder för att träna en modell som kan identifiera kränkande text anpassad efter anonym data från Mentimeter. Ett balancerat Mentimeter-dataset med 3654 datapunkter skapades och annoterades av uppsatsförfattaren. Därefter finjusterades och evaluerades den bäst presterande modellen från textbehandlingsexperimentet iterativt med en ökande mängd Mentimeter-data. Baserat på resultaten drogs slutsatsen att toppmodern prestanda kan åstadkommas genom att använda relativt få datapunkter för träning. Nämligen, när ungefär 500 − 2500 träningsdatapunkter används, så uppnåddes F1-värden mellan 0.90 och 0.94 på ett test-set av Mentimeter-datasetet. 
Resultaten visar att det är möjligt att skapa en högpresterande modell som identifierar kränkande text, genom att använda ett litet dataset. Machine learning Natural Language Processing DistilBERT Toxicity Detection Profanity Detection Hate Speech Identification Text preprocessing Maskininlärning naturligtspråkbehandling DistilBERT identifiering av kränkande språk identifiering av svordomar textbehandling Computer and Information Sciences Data- och informationsvetenskap
138	Composting as a Method for On-Farm Mass Mortality Disposal Doklovic, Paige Elizabeth 27 October 2022 No description available. Agriculture Animal Sciences Mortality composting grinding mass mortality compost carbon amendment mortality foam depopulation swine poultry disposal carcass ground mortality particle size alternative carbon emergency management catastrophic composting mechanically ground preprocessing
139	Object detection for autonomous trash and litter collection / Objektdetektering för autonom skräpupplockning Edström, Simon January 2022 Trash and litter discarded on the street is a large environmental issue in Sweden and across the globe. In Swedish cities alone it is estimated that 1.8 billion articles of trash are thrown on the street each year, constituting around 3 kilotons of waste. One avenue to combat this societal and environmental problem is to use robotics and AI. A robot could learn to detect trash in the wild and collect it in order to clean the environment. A key component of such a robot would be its computer vision system, which allows it to detect litter and trash. Such systems are not trivially designed or implemented and have only recently reached high enough performance to work in industrial contexts. This master thesis focuses on creating and analysing such an algorithm by gathering data for use in a machine learning model, developing an object detection pipeline and evaluating the performance of that pipeline when its components are varied. Specifically, methods using hyperparameter optimisation, pseudolabeling and the preprocessing methods tiling and illumination normalisation were implemented and analysed. This thesis shows that it is possible to create an object detection algorithm with high performance using currently available state-of-the-art methods. Within the analysed context, hyperparameter optimisation did not significantly improve performance, and pseudolabeling could only briefly be analysed but showed promising results. Tiling greatly increased mean average precision (mAP) for the detection of small objects, such as cigarette butts, but decreased the mAP for large objects, and illumination normalisation improved mAP for images that were brightly lit. 
Both preprocessing methods reduced the frames per second that a full detector could run at, whilst pseudolabeling and hyperparameter optimisation greatly increased training times. / Skräp som slängs på marken har en stor miljöpåverkan i Sverige och runtom i världen. Enbart i Svenska städer uppskattas det att 1,8 miljarder bitar skräp slängs på gatan varje år, bestående av cirka 3 kiloton avfall. Ett sätt att lösa detta samhälleliga och miljömässiga problem är att använda robotik och AI. En robot skulle kunna lära sig att detektera skräp i utomhusmiljöer och samla in den för att på så sätt rengöra våra städer och vår natur. En nyckelkomponent av en sådan robot skulle vara dess system för datorseende som tillåter den att se och hitta skräp. Sådana system är inte triviala att designa eller implementera och har bara nyligen påvisat tillräckligt hög prestanda för att kunna användas i kommersiella sammanhang. Detta masterexamensarbete fokuserar på att skapa och analysera en sådan algoritm genom att insamla data för att använda i en maskininlärningsmodell, utveckla en objektdetekterings pipeline och utvärdera prestandan när dess komponenter modifieras. Specifikt analyseras metoderna pseudomarkering, hyperparameter optimering samt förprocesseringsmetoderna kakling och ljusintensitetsnormalisering. Examensarbetet visar att det är möjligt att skapa en objektdetekteringsalgoritm med hög prestanda med hjälp av den senaste tekniken på området. Inom det undersökta sammanhanget gav hyperparameter optimering inte någon större förbättring av prestandan och pseudomarkering kunde enbart ytligt analyseras men uppvisade preliminärt lovande resultat. Kakling förbättrade resultatet för detektering av små objekt, som cigarettfimpar, men minskade prestandan för större objekt och ljusintensitetsnormalisering förbättrade prestandan för bilder som var starkt belysta. 
Båda förprocesseringsmetoderna minskade bildhastigheten som en detektor skulle kunna köra i och psuedomarkering samt hyperparameter optimering ökade träningstiden kraftigt. Object detection Trash detection Machine learning Pipeline Artifical neural networks Deeplearning Dataset Preprocessing Augmentation Psuedolabel Tiling Objektdetektering Skräpigenkänning Maskininlärning Pipeline Artificiella neurala nätverk Djupinlärning Dataset Förprocessering Augmentation Psuedomarkering Kakling Computer and Information Sciences Data- och informationsvetenskap
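Entry 139 reports that tiling improved detection of small objects at the cost of frame rate. The abstract does not describe how the tiles were cut, so the sketch below is only one common way to do it: slide a fixed-size window with some overlap and clamp the last window to the image edge. The tile size, overlap and function names are assumptions made for illustration.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of overlapping windows that cover [0, length) along one axis."""
    assert 0 <= overlap < tile, "overlap must be smaller than the tile size"
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # clamp the last window to the image edge
    return origins


def tile_boxes(width, height, tile=640, overlap=64):
    """Return (x0, y0, x1, y1) boxes, row by row, covering the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


# A 1920x1080 frame is cut into 4 columns and 2 rows of 640x640 tiles.
boxes = tile_boxes(1920, 1080, tile=640, overlap=64)
```

Each tile would be passed to the detector separately and the detections shifted back by the tile's `(x0, y0)` offset. Running the detector once per tile instead of once per frame is consistent with the lower frames per second noted in the abstract.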
140	Unsupervised anomaly detection for structured data - Finding similarities between retail products Fockstedt, Jonas, Krcic, Ema January 2021 Data is one of the most important contributing factors in modern business operations. Bad data can therefore lead to tremendous losses, both financially and in customer experience. This thesis seeks to find anomalies in real-world, complex, structured data, anomalies that cause an international enterprise to miss out on income and risk losing customers. Using graph theory and similarity analysis, the findings suggest that certain countries contribute to the discrepancies more than others. This is believed to be an effect of countries customizing their products to match the market’s needs. This thesis only scratches the surface of the analysis of the data, and there are therefore many opportunities for future work. relational data similarity analysis data analysis SQL NetworkX graph theory anomaly detection unsupervised retail products real-world data AWS amazon web services similarity learning data statistics data preprocessing similarity analysis algorithm data validation Computer Sciences Datavetenskap (datalogi)
