151

Sentimental Analysis of Cyberbullying Tweets with SVM Technique

Thanikonda, Hrushikesh, Koneti, Kavya Sree January 2023 (has links)
Background: Cyberbullying involves the use of digital technologies to harass, humiliate, or threaten individuals or groups. This form of bullying can occur on platforms such as social media, messaging apps, gaming platforms, and mobile phones. With the outbreak of COVID-19, there was a drastic increase in the use of social media, and this upsurge was accompanied by a rise in cyberbullying, making it a pressing issue that needs to be addressed. Sentiment analysis involves identifying and categorizing emotions and opinions expressed in text data using natural language processing and machine learning techniques. SVM is a machine learning algorithm that has been widely used for sentiment analysis due to its accuracy and efficiency. Objectives: The main objective of this study is to use SVM for sentiment analysis of cyberbullying tweets and to evaluate its performance. The study aims to determine the feasibility of using SVM for sentiment analysis and to assess its accuracy in detecting cyberbullying. Methods: A quantitative research method is used in this thesis, and the data are analyzed using statistical analysis. The dataset, obtained from Kaggle, consists of cyberbullying tweets. The collected data are preprocessed and used to train and test an SVM model. The model is evaluated on the test set using accuracy, precision, recall, and F1 score to determine the performance of the SVM model developed to detect cyberbullying. Results: The results show that SVM is a suitable technique for sentiment analysis of cyberbullying tweets. The model had an accuracy of 82.3% in detecting cyberbullying, with a precision of 0.82, recall of 0.82, and F1-score of 0.83. Conclusions: The study demonstrates the feasibility of using SVM for sentiment analysis of cyberbullying tweets. The high accuracy of the SVM model suggests that it can be used to build automated systems for detecting cyberbullying. The findings highlight the importance of developing tools to detect and address cyberbullying online. The use of sentiment analysis and SVM has the potential to make a significant contribution to the fight against cyberbullying.
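As a rough illustration of the pipeline this abstract describes (TF-IDF text features, an SVM classifier, and accuracy/precision/recall/F1 evaluation), a minimal scikit-learn sketch is given below. The file name and column names are assumptions for illustration, not the thesis's actual setup, and LinearSVC merely stands in for whichever SVM variant the authors used.

```python
# Minimal sketch, assuming a CSV export of the Kaggle cyberbullying-tweets dataset.
# Column names "tweet_text" and "cyberbullying_type" are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

df = pd.read_csv("cyberbullying_tweets.csv")  # hypothetical file name

X_train, X_test, y_train, y_test = train_test_split(
    df["tweet_text"], df["cyberbullying_type"], test_size=0.2, random_state=42)

model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),  # simple preprocessing + features
    LinearSVC())                                            # linear-kernel SVM classifier

model.fit(X_train, y_train)
# precision, recall and F1 per class, plus overall accuracy
print(classification_report(y_test, model.predict(X_test)))
```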
152

Robustness of Image Classification Using CNNs in Adverse Conditions

Ingelstam, Theo, Skåntorp, Johanna January 2022 (has links)
The usage of convolutional neural networks (CNNs) has revolutionized the field of computer vision. Though the algorithms used in image recognition have improved significantly in the past decade, they are still limited by the availability of training data. This paper aims to gain a better understanding of how limitations in the training data might affect the performance of the system. A robustness study was conducted using three different image datasets: CNN models were pre-trained on the ImageNet or CIFAR-10 datasets and then trained on the MAdWeather dataset, whose main characteristic is that it contains images with differing levels of obscurity in front of the objects. The MAdWeather dataset is used to test how accurately a model can identify images that differ from its training dataset. The study shows that CNN performance under one condition does not translate well to other conditions. / Image classification by computers has been revolutionized by the introduction of CNNs. Although the algorithms have improved considerably, they remain limited by the availability of data. The purpose of this project is to gain a better understanding of how limitations in the training data can affect the performance of a model. A study is conducted to determine how robust a model is to changes in the conditions under which the images are taken. The study uses three different datasets: ImageNet and CIFAR-10 for pre-training the models, and MAdWeather for further training. MAdWeather was specifically created with images in which the objects are obscured to varying degrees. The MAdWeather dataset is further used to determine how well a model can classify images taken under conditions that deviate from the training data. The study shows that a CNN's performance under one condition cannot be generalized to other conditions. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm
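To make the pre-train/fine-tune setup concrete, here is a minimal PyTorch sketch of adapting an ImageNet-pretrained CNN to a new, smaller dataset. The directory name "madweather/train", the image size, and the single training pass are assumptions for illustration (and a recent torchvision is assumed); the thesis's actual architectures, datasets and training schedule may differ.

```python
# Sketch, assuming an ImageFolder-style directory of MAdWeather-like training images.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("madweather/train", transform=tfm)  # hypothetical path
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")                      # ImageNet pre-training
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))    # new classifier head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:          # one epoch only, for brevity
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```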
153

Enhancing Object Detection in Infrared Videos through Temporal and Spatial Information

Jinke, Shi January 2023 (has links)
Object detection is a prominent area of research within computer vision. While object detection based on infrared videos holds great practical significance, the majority of mainstream methods are primarily designed for visible-spectrum datasets. This thesis investigates enhancing object detection accuracy on infrared datasets by leveraging temporal and spatial information. The Memory Enhanced Global-Local Aggregation (MEGA) framework is chosen as a baseline due to its capability to incorporate both forms of information. Based on initial visualization results from the infrared dataset CAMEL, the noisy characteristics of the infrared data are explored further. Through comprehensive experiments, the impact of temporal and spatial information is examined, revealing that spatial information has a detrimental effect, while temporal information can be used to improve model performance. Moreover, an innovative Dual Frame Average Aggregation (DFAA) framework is introduced to address challenges related to object overlapping and appearance changes. This framework processes two global frames in parallel and in an organized manner, showing an improvement over the original configuration. / Object detection is a prominent research area within computer vision. Although object detection based on infrared videos is of great practical importance, the majority of common methods are primarily designed for visible-spectrum datasets. This thesis investigates improving object detection accuracy on infrared datasets by exploiting temporal and spatial information. The Memory Enhanced Global-Local Aggregation (MEGA) framework is chosen as the baseline because of its ability to incorporate both forms of information. Based on the initial visualization results from the infrared dataset CAMEL, the noisy characteristics of the infrared data are explored further. Through extensive experiments, the effect of temporal and spatial information is examined, revealing that spatial information has a detrimental effect, while temporal information can be used to improve model performance. In addition, an innovative Dual Frame Average Aggregation (DFAA) framework is introduced to handle challenges related to object overlap and appearance changes. This framework processes two global frames in parallel and in an organized manner, showing an improvement over the original configuration.
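The abstract only states that DFAA processes two global frames in parallel and aggregates them; the sketch below is a loose interpretation of that idea (shared projection of two global frames, averaged, then fused with the local frame), not the thesis's actual DFAA module. All tensor shapes are placeholders.

```python
# Illustrative sketch only: averaging features from two "global" frames before fusing
# them with the current (local) frame. Not the thesis's actual implementation.
import torch
import torch.nn as nn

class DualFrameAverage(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)  # shared projection

    def forward(self, global_a, global_b, local):
        # process both global frames in parallel with shared weights, then average
        fused_global = 0.5 * (self.proj(global_a) + self.proj(global_b))
        return local + fused_global        # enhance the local-frame features

feats = [torch.randn(1, 256, 32, 32) for _ in range(3)]  # two global frames + one local frame
out = DualFrameAverage(256)(*feats)
print(out.shape)  # torch.Size([1, 256, 32, 32])
```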
154

Computational Intelligence and Data Mining Techniques Using the Fire Data Set

Storer, Jeremy J. 04 May 2016 (has links)
No description available.
155

Machine Learning-based Biometric Identification

Israelsson, Hanna, Wrife, Andreas January 2021 (has links)
With the rapid development of computers and models for machine learning, image recognition has, in recent years, become widespread in various areas. In this report, image recognition is discussed in relation to biometric identification using fingerprint images. The aim is to investigate how well a biometric identification model can be trained with an extended dataset, which resulted from rotating and shifting the images in the original dataset consisting of very few images. Furthermore, it is investigated how the accuracy of this single-stage model differs from the accuracy of a model with two-stage identification. We chose Random Forest (RF) as the machine learning model and Scikit default values for the hyperparameters. We further included five-fold cross-validation in the training process. The performance of the trained machine learning model is evaluated with testing accuracy and confusion matrices. It was shown that the method for extending the dataset was successful. A greater number of images gave greater accuracy in the predictions. Two-stage identification gave approximately the same accuracy as the single-stage method, but both methods would need to be tested on datasets with images from a greater number of individuals before any final conclusions can be drawn. / Thanks to the rapid development of computers and machine learning models, image recognition has in recent years become widespread in society. In this report, image recognition is treated in relation to biometric identification in the form of fingerprint reading. The goal is to investigate how well a model for biometric identification can be trained and tested on a dataset that originally contains very few images, if the dataset is first expanded by creating multiple copies of the original images and then rotating and shifting them in different directions. Furthermore, it is investigated how the accuracy of this single-stage model differs from identification in two stages. We chose Random Forest (RF) as the machine learning model and Scikit's default settings for the hyperparameters. Five-fold cross-validation was also included in the training process. The performance of the trained machine learning model was assessed using test accuracy and confusion matrices. It was shown that the method for expanding the dataset was successful. A larger number of images gave greater accuracy in the predictions. Two-stage identification gave approximately the same accuracy as single-stage identification, but the methods would need to be tested on datasets with images from a larger number of individuals before any final conclusions can be drawn. / Bachelor's degree project in electrical engineering 2021, KTH, Stockholm
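A minimal sketch of the described approach follows: extend a small image dataset by rotating and shifting copies of the originals, then train a Random Forest with scikit-learn defaults and five-fold cross-validation. The arrays, augmentation ranges and image size are illustrative assumptions, not the report's actual data.

```python
# Sketch, assuming small grayscale fingerprint images as 2D arrays.
import numpy as np
from scipy.ndimage import rotate, shift
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def augment(images, labels):
    aug_x, aug_y = [], []
    for img, lab in zip(images, labels):
        for angle in (-10, 0, 10):          # small rotations (degrees)
            for dx in (-2, 0, 2):           # small horizontal shifts (pixels)
                a = shift(rotate(img, angle, reshape=False), (0, dx))
                aug_x.append(a.ravel())     # flatten image to a feature vector
                aug_y.append(lab)
    return np.array(aug_x), np.array(aug_y)

images = np.random.rand(20, 64, 64)         # placeholder "fingerprint" images
labels = np.repeat(np.arange(10), 2)        # 10 individuals, 2 images each (assumed)

X, y = augment(images, labels)
clf = RandomForestClassifier(random_state=0)            # scikit-learn default hyperparameters
print(cross_val_score(clf, X, y, cv=5).mean())          # five-fold cross-validation accuracy
```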
156

Detekce dopravních značek a semaforů / Detection of Traffic Signs and Lights

Oškera, Jan January 2020 (has links)
The thesis focuses on modern methods for detecting traffic signs and traffic lights, both directly in traffic and retrospectively (back analysis). The main subject is convolutional neural networks (CNNs); the solution uses YOLO-type convolutional neural networks. The main goal of this thesis is to achieve the greatest possible optimization of model speed and accuracy. Suitable datasets are examined, and a number of datasets, composed of both real and synthetic data, are used for training and testing. For training and testing, the data were preprocessed using the Yolo mark tool. The training of the model was carried out at a computing centre belonging to the virtual organization MetaCentrum VO. To quantitatively evaluate detector quality, a program was created that shows its performance statistically and graphically using ROC curves and the COCO evaluation protocol. In this thesis I created a model that achieved an average success rate of up to 81%. The thesis also shows the best choice of threshold across model versions, input sizes and IoU values. An extension for mobile phones using TensorFlow Lite and Flutter has also been created.
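Since threshold and IoU choices are central to the evaluation described above, a small helper computing intersection over union for axis-aligned boxes is sketched below. It is purely illustrative and not taken from the thesis.

```python
# IoU (intersection over union) for axis-aligned boxes given as (x_min, y_min, x_max, y_max).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.142...
```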
157

Characterization of the structure, stratigraphy and CO2 storage potential of the Swedish sector of the Baltic and Hanö Bay basins using seismic reflection methods

Sopher, Daniel January 2016 (has links)
An extensive multi-channel seismic dataset acquired between 1970 and 1990 by Oljeprospektering AB (OPAB) has recently been made available by the Geological Survey of Sweden (SGU). This thesis summarizes four papers, which utilize this largely unpublished dataset to improve our understanding of the geology and CO2 storage capacity of the Baltic and Hanö Bay basins in southern Sweden. A range of new processing workflows were developed, which typically provide an improvement in the final stacked seismic image when compared to the result obtained with the original processing. A method was developed to convert scanned images of seismic sections into SEGY files, which allows large amounts of the OPAB dataset to be imported and interpreted using modern software. A new method for joint imaging of multiples and primaries was developed, which is shown to improve the signal-to-noise ratio for some of the seismic lines within the OPAB dataset. For the first time, five interpreted regional seismic profiles detailing the entire sedimentary sequence within these basins are presented. Depth structure maps detailing the Outer Hanö Bay area and the deeper parts of the Baltic Basin were also generated. Although the overall structure and stratigraphy of the basins inferred from the reprocessed OPAB dataset are consistent with previous studies, some new observations have been made, which improve the understanding of the tectonic history of these basins and provide insight into how the depositional environments have changed through time. The effective CO2 storage potential within structural and stratigraphic traps is assessed for the Cambrian Viklau, När and Faludden sandstone reservoirs. A probabilistic methodology is utilized, which allows a robust assessment of the storage capacity as well as the associated uncertainty. The most favourable storage option in the Swedish sector of the Baltic Basin is assessed to be the Faludden stratigraphic trap, which is estimated to have a mid-case (P50) storage capacity of 3390 Mt in the deeper part of the basin, where CO2 can be stored in a supercritical phase.
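A probabilistic capacity assessment of the kind described (reporting P50 and other percentiles) can be illustrated with a simple Monte Carlo sketch using the commonly used volumetric relation mass ≈ area × thickness × porosity × CO2 density × efficiency factor. All distributions and numbers below are placeholder assumptions, not the thesis's inputs or results.

```python
# Illustrative Monte Carlo sketch of a probabilistic CO2 storage-capacity estimate.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
area = rng.normal(2.0e9, 2.0e8, n)        # reservoir area [m^2], assumed
thickness = rng.normal(50.0, 5.0, n)      # net reservoir thickness [m], assumed
porosity = rng.normal(0.25, 0.03, n)      # porosity [-], assumed
rho_co2 = rng.normal(650.0, 50.0, n)      # supercritical CO2 density [kg/m^3], assumed
efficiency = rng.uniform(0.02, 0.06, n)   # storage efficiency factor [-], assumed

mass_mt = area * thickness * porosity * rho_co2 * efficiency / 1e9  # kg -> Mt
p10, p50, p90 = np.percentile(mass_mt, [10, 50, 90])
print(f"P10={p10:.0f} Mt, P50={p50:.0f} Mt, P90={p90:.0f} Mt")
```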
158

Development of artificial intelligence-based in-silico toxicity models : data quality analysis and model performance enhancement through data generation

Malazizi, Ladan January 2008 (has links)
Toxic compounds, such as pesticides, are routinely tested against a range of aquatic, avian and mammalian species as part of the registration process. The need to reduce dependence on animal testing has led to increasing interest in alternative methods such as in silico modelling. QSAR (Quantitative Structure Activity Relationship)-based models are already in use for predicting physicochemical properties, environmental fate, eco-toxicological effects, and specific biological endpoints for a wide range of chemicals. Data plays an important role in modelling QSARs and in result analysis for toxicity testing processes. This research addresses a number of issues in predictive toxicology. One issue is the problem of data quality. Although a large amount of toxicity data is available from online sources, this data may contain unreliable samples and may be of low quality. Its presentation may also be inconsistent across different sources, which makes access, interpretation and comparison of the information difficult. To address this issue we started with detailed investigation and experimental work on DEMETRA data. The DEMETRA datasets were produced by the EC-funded project DEMETRA. Based on the investigation, the experiments and the results obtained, the author identified a number of data quality criteria in order to provide a solution for data evaluation in the toxicology domain. An algorithm has also been proposed to assess data quality before modelling. Another issue considered in the thesis was missing values in toxicology datasets. The Least Squares Method for a paired dataset and Serial Correlation for a single-version dataset provided solutions to the problem in two different situations. A procedural algorithm using these two methods has been proposed in order to overcome the problem of missing values. A further issue addressed in this thesis was the modelling of multi-class datasets with severely imbalanced class distributions. Imbalanced data affect the performance of classifiers during the classification process. We have shown that as long as we understand how class members are constructed in the dimensional space of each cluster, we can reform the distribution and provide more domain knowledge for the classifier.
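In the spirit of the least-squares approach to missing values in a paired dataset mentioned above, the sketch below fits a linear relation between two paired endpoints on complete rows and uses it to fill a gap in one of them. The data and the assumption of a linear relation are made up for illustration; the thesis's actual procedure may differ.

```python
# Rough sketch of least-squares imputation for a paired dataset (toy, assumed-linear data).
import numpy as np

paired = np.array([
    [1.2, 2.5],
    [2.0, 4.1],
    [3.1, 6.3],
    [4.0, np.nan],   # missing value to impute
    [5.2, 10.4],
])

complete = ~np.isnan(paired).any(axis=1)
# ordinary least-squares fit of column 1 against column 0 on complete rows
slope, intercept = np.polyfit(paired[complete, 0], paired[complete, 1], deg=1)

missing = np.isnan(paired[:, 1])
paired[missing, 1] = slope * paired[missing, 0] + intercept
print(paired)
```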
159

Non-parametric workspace modelling for mobile robots using push broom lasers

Smith, Michael January 2011 (has links)
This thesis is about the intelligent compression of large 3D point cloud datasets. The non-parametric method that we describe simultaneously generates a continuous representation of the workspace surfaces from discrete laser samples and decimates the dataset, retaining only locally salient samples. Our framework attains decimation factors in excess of two orders of magnitude without significant degradation in fidelity. The work presented here has a specific focus on gathering and processing laser measurements taken from a moving platform in outdoor workspaces. We introduce a somewhat unusual parameterisation of the problem and look to Gaussian Processes as the fundamental machinery in our processing pipeline. Our system compresses laser data in a fashion that is naturally sympathetic to the underlying structure and complexity of the workspace. In geometrically complex areas, compression is lower than that in geometrically bland areas. We focus on this property in detail and it leads us well beyond a simple application of non-parametric techniques. Indeed, towards the end of the thesis we develop a non-stationary GP framework whereby our regression model adapts to the local workspace complexity. Throughout we construct our algorithms so that they may be efficiently implemented. In addition, we present a detailed analysis of the proposed system and investigate model parameters, metric errors and data compression rates. Finally, we note that this work is predicated on a substantial amount of robotics engineering which has allowed us to produce a high quality, peer reviewed, dataset - the first of its kind.
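To give a flavour of the Gaussian process machinery the abstract refers to, here is a minimal regression sketch: fit a continuous model to sparse, noisy 1D "laser" samples and query it on a denser grid. A stationary RBF kernel is used; the thesis itself goes further, to non-stationary kernels adapted to local workspace complexity. The data are synthetic placeholders.

```python
# Minimal GP regression sketch with a stationary kernel (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 40)).reshape(-1, 1)        # sparse range samples
y = np.sin(x).ravel() + rng.normal(0, 0.1, x.shape[0])    # noisy surface profile

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(0.01))
gp.fit(x, y)

x_query = np.linspace(0, 10, 200).reshape(-1, 1)
mean, std = gp.predict(x_query, return_std=True)          # continuous reconstruction + uncertainty
print(mean.shape, std.max())
```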
160

Webový vyhledávací systém / Web Search Engine

Tamáš, Miroslav January 2014 (has links)
The academic full-text search engine Egothor has recently become the starting point for several theses focused on search. Until now, no solution was available that provided a robust set of web content processing tools. This master's thesis focuses on the design and implementation of a distributed search system working primarily with internet sources. We analyze first-generation components for web content processing and summarize their primary features. We use those features to propose an architecture for a distributed web search engine, focusing mainly on the data fetching, processing and indexing phases. We also describe the final implementation of the system and propose a few ideas for future extensions.
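The fetch, process and index phases named above can be reduced, for illustration, to an in-memory inverted index; the toy sketch below only shows the indexing idea, whereas the described Egothor-based system is distributed and far more involved.

```python
# Toy inverted-index sketch of the process-and-index phases (illustrative only).
import re
from collections import defaultdict

documents = {
    "doc1": "Distributed web search engines fetch, process and index pages.",
    "doc2": "An inverted index maps each term to the documents containing it.",
}

index = defaultdict(set)
for doc_id, text in documents.items():
    for term in re.findall(r"[a-z]+", text.lower()):   # crude tokenization
        index[term].add(doc_id)

def search(query):
    terms = re.findall(r"[a-z]+", query.lower())
    hits = [index[t] for t in terms if t in index]
    return set.intersection(*hits) if hits else set()

print(search("inverted index"))   # {'doc2'}
```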
