Global ETD Search

1	Neural Networks for Predictive Maintenance on Highly Imbalanced Industrial Data Montilla Tabares, Oscar January 2023 Preventive maintenance plays a vital role in optimizing industrial operations. However, detecting equipment needing such maintenance using available data can be particularly challenging due to the class imbalance prevalent in real-world applications. The datasets gathered from equipment sensors primarily consist of records from well-functioning machines, making it difficult to identify those on the brink of failure, which is the main focus of preventive maintenance efforts. In this study, we employ neural network algorithms to address class imbalance and cost sensitivity issues in industrial scenarios for preventive maintenance. Our investigation centers on the "APS Failure in the Scania Trucks Data Set," a binary classification problem exhibiting significant class imbalance and cost sensitivity issues, a common occurrence across various fields. Inspired by image detection techniques, we apply the Focal Loss function to traditional neural networks, combined with techniques like Cost-Sensitive Learning and Threshold Calculation, to enhance classification accuracy. Our study's novelty is adapting image detection techniques to tackle the class imbalance problem within a binary classification task. Our proposed method demonstrates improvements in addressing the given optimization problem when confronted with these issues, matching or surpassing existing machine learning and deep learning techniques while maintaining computational efficiency. Our results indicate that class imbalance can be addressed without relying on conventional sampling techniques, which typically either increase computational cost (oversampling) or discard critical information (undersampling).
In conclusion, our proposed method presents a promising approach for addressing class imbalance and cost sensitivity issues in industrial datasets heavily affected by these phenomena. It contributes to developing preventive maintenance solutions capable of enhancing the efficiency and productivity of industrial operations by detecting machines in need of attention: this discovery process we term predictive maintenance. The artifact produced in this study showcases the utilization of Focal Loss, Cost-Sensitive Learning, and Threshold Calculation to create reliable and effective predictive maintenance solutions for real-world applications. This thesis establishes a method that contributes to the body of knowledge in binary classification within machine learning, specifically addressing the challenges mentioned above. Our research findings have broader implications beyond industrial classification tasks, extending to other fields, such as medical or cybersecurity classification problems. The artifact (code) is at: https://shorturl.at/lsNSY
Class Imbalance Cost Sensitivity Cost-Sensitive Learning Focal Loss Binary Classification Machine Learning Deep Learning Computer Sciences Datavetenskap (datalogi)
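Illustrative note on entry 1: the abstract names Focal Loss and Threshold Calculation without giving formulas. The sketch below shows one common way these two pieces are written for a binary classifier. It is not the thesis's code: the function names, the gamma and alpha defaults, and the cost values in the usage lines are all assumptions chosen for illustration.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class; y: true label (0 or 1).
    gamma and alpha are illustrative defaults, not values from the thesis.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)             # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing factor
    # (1 - p_t) ** gamma shrinks the loss of examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def best_threshold(probs, labels, cost_fp, cost_fn):
    """Return (threshold, cost) minimizing total misclassification cost.

    A sample is predicted positive when its probability is >= threshold.
    """
    candidates = sorted(set(probs)) + [float("inf")]  # inf means "predict all negative"
    best_t, best_cost = None, float("inf")
    for t in candidates:
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

# Hypothetical validation scores and costs (a missed failure costs far more
# than a false alarm); these numbers are made up for the example.
scores = [0.1, 0.4, 0.35, 0.8]
truth = [0, 0, 1, 1]
threshold, total_cost = best_threshold(scores, truth, cost_fp=10, cost_fn=500)
```

With gamma set to 0 and alpha to 0.5, the loss reduces to half the ordinary cross-entropy, which is a quick way to check an implementation against a known baseline.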
2	Enhancing Hurricane Damage Assessment from Satellite Images Using Deep Learning Berezina, Polina January 2020 No description available.
Geography Geographic Information Science Remote Sensing Computer Science damage assessment remote sensing deep learning Hurricane Michael CNN U-NET focal loss
3	Classification of Transcribed Voice Recordings : Determining the Claim Type of Recordings Submitted by Swedish Insurance Clients / Klassificering av Transkriberade Röstinspelningar Piehl, Carl January 2021 In this thesis, we investigate the problem of building a text classifier for transcribed voice recordings submitted by insurance clients. We compare different models in the context of two tasks. The first is a binary classification problem, where the models are tasked with determining whether a transcript belongs to a particular type or not. The second is a multiclass problem, where the models have to choose between several types when labelling transcripts, resulting in a data set with a highly imbalanced class distribution. We evaluate four different models: pretrained BERT and three LSTMs with different word embeddings. The word embeddings used are ELMo, word2vec, and a baseline with a randomly initialized embedding layer. In the binary task, we are more concerned with false positives than false negatives. Thus, we also use weighted cross entropy loss to achieve high precision for the positive class, while sacrificing recall. In the multiclass task, we use focal loss and weighted cross entropy loss to reduce bias toward majority classes. We find that BERT outperforms the other models and that the baseline model performs worst across both tasks. The difference in performance is greatest in the multiclass task on classes with fewer samples. This demonstrates the benefit of using large language models in data-constrained scenarios. In the binary task, we find that weighted cross entropy loss provides a simple, yet effective, framework for conditioning the model to favor certain types of errors. In the multiclass task, both focal loss and weighted cross entropy loss are shown to reduce bias toward majority classes.
However, we also find that BERT fine tuned with regular cross entropy loss does not show bias toward majority classes, having high recall across all classes. / I examensarbetet undersöks klassificering av transkriberade röstinspelningar från försäkringskunder. Flera modeller jämförs på två uppgifter. Den första är binär klassificering, där modellerna ska särskilja på inspelningar som tillhör en specifik klass av ärende från resterande inspelningar. I det andra inkluderas flera olika klasser som modellerna ska välja mellan när inspelningar klassificeras, vilket leder till en ojämn klassfördelning. Fyra modeller jämförs: förtränad BERT och tre LSTM-nätverk med olika varianter av förtränade inbäddningar. De inbäddningar som används är ELMo, word2vec och en basmodell som har inbäddningar som inte förtränats. I det binära klassificeringsproblemet ligger fokus på att minimera antalet falskt positiva klassificeringar, därför används viktad korsentropi. Utöver detta används även fokal förlustfunktion när flera klasser inkluderas, för att minska partiskhet mot majoritetsklasser. Resultaten indikerar att BERT är en starkare modell än de andra modellerna i båda uppgifterna. Skillnaden mellan modellerna är tydligast när flera klasser används, speciellt på de klasser som är underrepresenterade. Detta visar på fördelen av att använda stora, förtränade, modeller när mängden data är begränsad. I det binära klassificeringsproblemet ser vi även att en viktad förlustfunktion ger ett enkelt men effektivt sätt att reglera vilken typ av fel modellen ska vara partisk mot. När flera klasser inkluderas ser vi att viktad korsentropi, samt fokal förlustfunktion, kan bidra till att minska partiskhet mot överrepresenterade klasser. Detta var dock inte fallet för BERT, som visade bra resultat på minoritetsklasser även utan att modifiera förlustfunktionen. 
Text Classification Word embeddings BERT LSTM Cost-sensitive learning Focal loss Textklassificering Ordinbäddningar BERT LSTM Kostnadskänslig inlärning Fokal förlustfunktion Computer and Information Sciences Data- och informationsvetenskap
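Illustrative note on entry 3: the abstract relies on weighted cross entropy loss to reduce bias toward majority classes but does not say how the weights were chosen. The sketch below assumes inverse-frequency weights, a common heuristic; the function names and the weighting scheme are assumptions and are not taken from the thesis.

```python
import math

def inverse_frequency_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -w[y] * log(p[y]) over a batch.

    probs: one list of class probabilities per sample.
    labels: the true class index of each sample.
    """
    total, norm = 0.0, 0.0
    for p, y in zip(probs, labels):
        total += -weights[y] * math.log(max(p[y], 1e-12))  # guard against log(0)
        norm += weights[y]
    return total / norm

# Hypothetical batch: three samples of class 0, one of class 1.
batch_labels = [0, 0, 0, 1]
class_weights = inverse_frequency_weights(batch_labels, num_classes=2)
batch_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
loss = weighted_cross_entropy(batch_probs, batch_labels, class_weights)
```

For the binary task described in the abstract, where false positives matter most, the same function could be used with a hand-picked larger weight on the negative class instead of frequency-derived weights.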
4	Segmentace obrazu nevyvážených dat pomocí umělé inteligence / Image segmentation of unbalanced data using artificial intelligence Polách, Michal January 2019 This thesis focuses on the problem of segmenting unbalanced datasets using artificial intelligence. Numerous existing methods for dealing with unbalanced datasets are examined, and some of them are then applied to a real problem that consists of segmenting a dataset with a class ratio of more than 6000:1.
5	Skin lesion detection using deep learning Rajit Chandra (12495442) 03 May 2022 Skin lesions can be deadly if not detected early, and early detection can save many lives. Artificial intelligence and machine learning are helping healthcare in many ways, including the diagnosis of skin lesions. Computer-aided diagnosis helps clinicians detect cancer. This study was conducted to classify seven classes of skin lesion using convolutional neural networks. Two pretrained models, DenseNet and Inception-v3, were employed, and accuracy, precision, recall, F1 score, and ROC-AUC were calculated for every class prediction. Moreover, gradient class activation maps were used to help clinicians determine which regions of an image influence the model to make a certain decision. These visualizations are used for explainability of the model. Experiments showed that DenseNet performed better than Inception-v3. It was also noted that gradient class activation maps highlighted different regions when predicting the same class. The main contribution was to introduce visualizations into a lesion classification model that help clinicians understand the model's decisions, which enhances the reliability of the model. Different optimizers were also employed with both models to compare accuracies.
Computer vision Image processing Pattern recognition Data mining and knowledge discovery Skin Cancer Diagnosis Convolutional Neural Networks Imaging DenseNet InceptionNet-V 3 pretrained model focal loss Image Processing Pattern Recognition and Data Mining Computer Vision
6	Cost-Aware Machine Learning and Deep Learning for Extremely Imbalanced Data Ahmed, Jishan 11 August 2023 No description available.
Computer Science Statistics Machine learning Cost-sensitive learning Resampling techniques Failure prediction Class imbalance Deep learning Focal loss LSTM BLSTM 1D CNN Survival analysis Permutation importance Feature selection SHAP PySpark
