  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Deep Learning Application for Traffic Sign Recognition

Kondamari, Pramod Sai, Itha, Anudeep January 2021
Background: Traffic Sign Recognition (TSR) is particularly useful for novice drivers and self-driving cars. Driver Assistance Systems (DAS) involve automatic traffic sign recognition, and efficient classification of traffic signs is required in DAS and unmanned vehicles for safe navigation. Convolutional Neural Networks (CNNs) are known for producing promising results in image classification, which inspired us to employ this technique in our thesis. Computer vision is the process of understanding images and retrieving data from them. OpenCV is an open-source computer vision library with Python bindings, used here to detect traffic sign images in real time. Objectives: This study describes an experiment to build a CNN model that can classify traffic signs effectively in real time using OpenCV. The model is built with low computational cost. The study also includes an experiment in which various combinations of parameters are tuned to improve the model's performance. Methods: The experimentation method involves building a CNN model based on a modified LeNet architecture with four convolutional layers, two max-pooling layers and two dense layers. The model is trained and tested on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. Parameter tuning with different combinations of learning rate and epochs is done to improve the model's performance. This model is then used to classify images presented to the camera in real time. Results: Graphs depicting the accuracy and loss of the model before and after parameter tuning are presented. An experiment is done to classify traffic sign images presented to the camera using the CNN model, and the high probability scores achieved during this process are presented. Conclusions: The results show that the proposed model achieved 95% accuracy with an optimum number of epochs, i.e., 30, and the default optimum learning rate, i.e., 0.001. High probabilities, i.e., above 75%, were achieved when the model was tested using new real-time data.
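
The parameter tuning described in this abstract (trying combinations of learning rate and epochs) can be sketched as a plain grid search. This is only an illustration: the candidate values and the `toy_evaluate` function are hypothetical stand-ins for "train the CNN and return its validation accuracy", and nothing here reproduces the thesis's actual training code.

```python
from itertools import product


def grid_search(evaluate, learning_rates, epoch_counts):
    """Try every (learning rate, epochs) pair and keep the best-scoring one."""
    best = None
    for lr, epochs in product(learning_rates, epoch_counts):
        score = evaluate(lr, epochs)
        if best is None or score > best[0]:
            best = (score, lr, epochs)
    return best


def toy_evaluate(lr, epochs):
    # Hypothetical stand-in for training and validating a model.
    # It is constructed to peak at lr=0.001 and epochs=30 so the
    # search has something to find; it is not measured data.
    return 1.0 - abs(lr - 0.001) * 100 - abs(epochs - 30) / 100


best_score, best_lr, best_epochs = grid_search(
    toy_evaluate, [0.01, 0.001, 0.0001], [10, 20, 30, 40]
)
print(best_lr, best_epochs)
```

In a real experiment `evaluate` would wrap a full training run, so the grid is usually kept small because every cell costs one training job.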
2

Adversarial Framework with Temperature as a Regularizer for Semantic Segmentation

Kim, Chanho 14 January 2022
Semantic segmentation processes RGB scenes and classifies pixels collectively as objects. Recent deep learning methods have shown promising results in both the accuracy and the speed of semantic segmentation. However, because these approaches are data-centric, deep learning models inevitably tend to overfit the data used in training. Numerous regularization methods have been proposed to counter overfitting, such as data augmentation, additional loss terms such as Euclidean or least-squares terms, and structure-related methods that add or modify layers in a network, like Dropout and DropConnect. Among those methods, penalizing a model via an additional loss or a weight constraint does not require additional memory. With this in mind, our work aims to improve a given segmentation model through temperatures and a lightweight discriminator. Temperatures generate different versions of the probability maps by dividing the inputs of the softmax calculation. On top of the probability maps produced with temperatures, we attach a simple discriminator after the segmentation network to set up a competition between ground-truth feature maps and modified feature maps. We pass the additional loss calculated from those probability maps into the principal network. Our contribution consists of two parts. First, we use the adversarial loss as the regularization loss in the segmentation networks and validate that it can substitute for the L2 regularization loss with better validation results. Second, we apply temperatures to the segmentation probability maps to provide different information without using additional convolutional layers. The experiments indicate that spiking the temperature in the generator while keeping the original probability map in the discriminator improves the model in terms of pixel accuracy and mean Intersection-over-Union (mIoU).
Our framework shows that the segmentation model can be improved with a small increase in training time and the number of parameters.
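
The role of temperature "through the division in softmax calculations" can be illustrated with a small sketch. It assumes the standard temperature-scaled softmax, where each logit is divided by a temperature T before normalisation; the example logits are made up, and how the thesis schedules or "spikes" T is not reproduced here.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / T. T < 1 sharpens, T > 1 flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical per-class scores for one pixel
sharp = softmax_with_temperature(logits, 0.5)
plain = softmax_with_temperature(logits, 1.0)
smooth = softmax_with_temperature(logits, 5.0)
print(sharp, plain, smooth)
```

The same logits therefore yield several probability maps at no extra parameter cost, which is what lets temperature act as a source of "different information" without additional convolutional layers.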
3

WELD PENETRATION IDENTIFICATION BASED ON CONVOLUTIONAL NEURAL NETWORK

Li, Chao 01 January 2019
Determining weld joint penetration is a key factor in welding process control. It directly affects the mechanical properties of the weld joint, such as fatigue, and it also demands considerable human intelligence, in the form of either complex modeling or rich welding experience. Identifying weld penetration status has therefore become an obstacle for intelligent welding systems. In this dissertation, an innovative method is proposed to detect weld joint penetration status using machine-learning algorithms. A GTAW welding system is first built. A dot-structured laser pattern is projected onto the weld pool surface during welding, and the reflected laser pattern, which contains all the information about the penetration status, is captured. An experienced welder can determine weld penetration status from the reflected laser pattern alone. However, it is difficult to characterize the images and extract the key information used to determine penetration status. To overcome the challenges of finding the right features and accurately processing images to extract them using conventional machine vision algorithms, we propose using a convolutional neural network (CNN) to automatically extract key features and determine penetration status. Data-label pairs are needed to train a CNN. Therefore, an image acquisition system is designed to collect the reflected laser pattern and the image of the work-piece backside. Data augmentation is performed to enlarge the training data, resulting in 270,000 training, 45,000 validation and 45,000 test samples. A six-layer CNN is designed and trained using a revised mini-batch gradient descent optimizer. The final test accuracy is 90.7%, and a voting mechanism based on three consecutive images further improves the prediction accuracy.
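
The voting mechanism over three consecutive images can be sketched as a sliding-window majority vote on per-image predictions. The abstract does not give the exact rule, so the majority criterion, the tie handling and the label names below are assumptions for illustration.

```python
from collections import Counter


def vote_over_window(predictions, window=3):
    """Replace each per-image label with the majority label of the last `window` images."""
    smoothed = []
    for i in range(len(predictions)):
        recent = predictions[max(0, i - window + 1): i + 1]
        # most_common(1) returns the most frequent label; on a tie,
        # the label that appeared first in the window wins.
        label, _count = Counter(recent).most_common(1)[0]
        smoothed.append(label)
    return smoothed


# Hypothetical per-image CNN outputs with one isolated misclassification.
raw = ["full", "full", "partial", "full", "full"]
print(vote_over_window(raw))
```

Because penetration status changes slowly relative to the frame rate, a single outlier frame is outvoted by its neighbours, which is one plausible reason such a vote would raise accuracy.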
4

Tyre sound classification with machine learning

Jabali, Aghyad, Mohammedbrhan, Husein Abdelkadir January 2021
Having enough data about the usage of tyre types on the road can lead to a better understanding of the consequences of studded tyres for the environment. This paper focuses on training and testing a machine learning model that can be further integrated into a larger system for automating the data collection process. Different machine learning algorithms, namely CNN, SVM, and Random Forest, were compared in this experiment. The method used in this paper is empirical. First, sound data for studded and non-studded tyres was collected from three different locations in the city of Gävle, Sweden. A total of 760 Mel spectrograms from both classes was generated to train and test a well-known CNN model (AlexNet) in MATLAB. Sound features for both classes were extracted using JAudio to train and test models that use SVM and Random Forest classifiers in Weka. Unnecessary features were removed one by one from the list of features to improve the performance of the classifiers. The results show that CNN achieved an accuracy of 84%, SVM has the best performance both with and without removing some audio features (i.e., 94% and 92%, respectively), while Random Forest has 89% accuracy. The test data comprises 51% of the studded class and 49% of the non-studded class, and the SVM model achieved more than 94%. Therefore, it can be considered an acceptable result that can be used in practice.
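
Removing unnecessary features "one by one" resembles greedy backward elimination, sketched below. The abstract does not state the selection criterion used in Weka, so the rule here (drop a feature whenever doing so raises the score) is an assumption, and `toy_score` and the feature names are hypothetical placeholders for a real cross-validated classifier accuracy.

```python
def backward_elimination(features, score):
    """Greedily drop one feature at a time for as long as the score improves."""
    current = list(features)
    best = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for feature in current:
            candidate = [f for f in current if f != feature]
            candidate_score = score(candidate)
            if candidate_score > best:
                best, current, improved = candidate_score, candidate, True
                break  # restart the scan from the reduced feature list
    return current, best


USEFUL = {"mfcc", "zero_crossing_rate"}  # hypothetical informative features


def toy_score(selected):
    # Stand-in for classifier accuracy: rewards useful features and
    # penalises each uninformative one. Not measured data.
    useful = sum(1 for f in selected if f in USEFUL)
    useless = len(selected) - useful
    return 10 * useful - useless


kept, final_score = backward_elimination(
    ["mfcc", "zero_crossing_rate", "noise_a", "noise_b"], toy_score
)
print(kept, final_score)
```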
5

Evaluating Robustness of a CNN Architecture introduced to the Adversarial Attacks

Ishak, Shaik, Jyothsna Chowdary, Anantaneni January 2021
Background: Previous research shows that state-of-the-art deep neural networks have accomplished impressive results on many image classification tasks. However, adversarial attacks can easily fool these deep neural networks by adding a little noise to the input images. This vulnerability causes significant concern when deploying deep neural network-based systems in real-world security-sensitive situations. Therefore, research on adversarial attacks and on architectures exposed to adversarial examples has drawn considerable attention. Here, we use the image classification technique called Convolutional Neural Networks (CNN), which is known for producing favorable results in image classification. Objectives: This thesis reviews the types of adversarial attacks and CNN architectures in the present scientific literature. An experiment is conducted to build a CNN architecture that classifies the handwritten digits in the MNIST dataset, and adversarial attacks are applied to the images to evaluate the accuracy fluctuations in categorizing them. This study also includes an experiment using the defensive distillation technique to improve the architecture's performance under adversarial attacks. Methods: This thesis includes two methods. The systematic literature review method involved finding the best performing CNN architectures and the best performing adversarial attack techniques. The experimentation method consists of building a CNN model based on a modified LeNet architecture with two convolutional layers, one max-pooling layer, and two dropouts. The model is trained and tested with the MNIST dataset. The adversarial attacks FGSM, I-FGSM and MI-FGSM are then applied to the input images to evaluate the model's performance. Later, this model is modified slightly by the defensive distillation technique and tested against adversarial attacks to evaluate the architecture's performance.
Results: An experiment is conducted to evaluate the robustness of the CNN architecture in classifying the handwritten digits. The graphs show the accuracy before and after applying adversarial attacks to the test dataset. The defensive distillation mechanism is applied to resist adversarial attacks and achieve a robust architecture. Conclusions: The results showed that FGSM, I-FGSM and MI-FGSM attacks reduce the test accuracy from 95% to around 35%. These three attacks on the proposed network successfully reduced ~70% of the test accuracy in all three cases for maximum epsilon 0.3. With the defensive distillation mechanism, the test accuracy reduces from 90% to 88% for max epsilon 0.3. The proposed defensive distillation process is successful in defending against the adversarial attacks.
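
The FGSM attack named in this abstract perturbs an input by epsilon in the direction of the sign of the loss gradient with respect to that input. The sketch below applies that rule to a tiny logistic-regression classifier instead of the thesis's CNN, so the gradient can be written by hand; the weights, input and the clipping to [0, 1] are illustrative assumptions.

```python
import math


def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, epsilon):
    """One FGSM step: x_adv = clip(x + epsilon * sign(dLoss/dx))."""
    p = predict(x, w, b)
    # For cross-entropy loss with a sigmoid output, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    adv = [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
    return [min(1.0, max(0.0, a)) for a in adv]  # keep values in pixel range


w, b = [2.0, -3.0, 1.0], 0.0  # hypothetical trained weights
x, y = [0.6, 0.2, 0.5], 1     # hypothetical input with true label 1
x_adv = fgsm(x, y, w, b, epsilon=0.3)
print(predict(x, w, b), predict(x_adv, w, b))
```

I-FGSM repeats this step several times with a smaller step size, and MI-FGSM additionally accumulates a momentum term over the gradients; both are omitted here for brevity.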
6

Estimation of Water Depth from Multispectral Drone Imagery : A suitability assessment of CNN models for bathymetry retrieval in shallow water areas / Uppskattning av vattendjup från multispektrala drönarbilder : En lämplighetsbedömning av CNN-modeller för att hämta batymetri i grunda vattenområden.

Shen, Qianyao January 2022 (has links)
Aedes aegypti and Aedes albopictus are the main vector species for dengue and Zika, two arboviruses that affect a substantial fraction of the global population. These mosquitoes breed in very slow-moving or standing pools of water, so detecting and managing these potential breeding habitats is a crucial step in preventing the spread of these diseases. Using high-resolution images collected by unmanned aerial vehicles (UAV) and their multispectral mapping data, this paper investigates a bathymetry retrieval model for shallow water areas to help improve habitat detection accuracy. While previous studies have found some success with shallow water bathymetry inversion on satellite imagery, accurate centimeter-level water depth regression from high-resolution drone multispectral imagery remains a challenge. Unlike previous retrieval methods, which generally rely on retrieval factor extraction and linear regression, this thesis introduces CNN methods that account for the nonlinear relationship between image pixel reflectance values and water depth. To examine the potential of CNNs to retrieve shallow water depths from multispectral images captured by a drone, this thesis conducts a variety of case studies to specify a proper CNN architecture and to compare its performance across different datasets, band combinations and depth ranges, and against other general bathymetry retrieval algorithms. In summary, the CNN-based model achieves the best regression accuracy, with an overall root mean square error lower than 0.5, in comparison with another machine learning algorithm, random forest, and two semi-empirical methods, the linear and ratio models, suggesting the practical significance of this thesis.
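
The regression accuracy above is reported as root mean square error (RMSE). A minimal sketch of that metric follows; the depth values are invented for illustration and carry no unit, since the abstract does not state one.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed depths."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted vs. surveyed depths at three pixels.
predicted_depths = [0.5, 1.0, 1.5]
surveyed_depths = [0.5, 1.5, 1.0]
error = rmse(predicted_depths, surveyed_depths)
print(error)
```

Because the errors are squared before averaging, a few badly mispredicted pixels raise RMSE more than many small errors do, which matters when comparing models over different depth ranges.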
7

Automated Gravel Road Condition Assessment : A Case Study of Assessing Loose Gravel using Audio Data

Saeed, Nausheen January 2021 (has links)
Gravel roads connect sparse populations and provide highways for agriculture and the transport of forest goods. Gravel roads are an economical choice where traffic volume is low. In Sweden, 21% of all public roads are state-owned gravel roads, covering over 20,200 km. In addition, there are some 74,000 km of gravel roads and 210,000 km of forest roads that are owned by the private sector. The Swedish Transport Administration (Trafikverket) rates the condition of gravel roads according to the severity of irregularities (e.g. corrugations and potholes), dust, loose gravel, and gravel cross-sections. This assessment is carried out during the summertime when roads are free of snow. One of the essential parameters for gravel road assessment is loose gravel. Loose gravel can cause a tire to slip, leading to a loss of driver control.  Assessment of gravel roads is carried out subjectively by taking images of road sections and adding some textual notes. A cost-effective, intelligent, and objective method for road assessment is lacking. Expensive methods, such as laser profiler trucks, are available and can offer road profiling with high accuracy. These methods are not applied to gravel roads, however, because of the need to maintain cost-efficiency.  In this thesis, we explored the idea that, in addition to machine vision, we could also use machine hearing to classify the condition of gravel roads in relation to loose gravel. Several suitable classical supervised learning and convolutional neural networks (CNN) were tested. When people drive on gravel roads, they can make sense of the road condition by listening to the gravel hitting the bottom of the car. The more we hear gravel hitting the bottom of the car, the more we can sense that there is a lot of loose gravel and, therefore, the road might be in a bad condition. Based on this idea, we hypothesized that machines could also undertake such a classification when trained with labeled sound data. 
Machines can identify gravel and non-gravel sounds. In this thesis, we used traditional machine learning algorithms, such as support vector machines (SVM), decision trees, and ensemble classification methods. We also explored CNN for classifying spectrograms of audio sounds and images in gravel roads. Both supervised learning and CNN were used, and results were compared for this study. In classical algorithms, when compared with other classifiers, ensemble bagged tree (EBT)-based classifiers performed best for classifying gravel and non-gravel sounds. EBT performance is also useful in reducing the misclassification of non-gravel sounds. The use of CNN also showed a 97.91% accuracy rate. Using CNN makes the classification process more intuitive because the network architecture takes responsibility for selecting the relevant training features. Furthermore, the classification results can be visualized on road maps, which can help road monitoring agencies assess road conditions and schedule maintenance activities for a particular road. / Due to unforeseen circumstances the seminar was postponed from May 7 to 28, as duly stated in the new posting page.
8

Natural Language Processing using Deep Learning in Social Media

Giménez Fayos, María Teresa 02 September 2021
In recent years, Deep Learning (DL) has revolutionised the potential of automatic systems that handle Natural Language Processing (NLP) tasks. We have witnessed a tremendous advance in the performance of these systems.
Nowadays, we find such systems embedded ubiquitously, determining the intent of the text we write, the sentiment of our tweets or our political views, to cite some examples. In this thesis, we propose several NLP models for addressing tasks that deal with social media text. Concretely, this work focuses mainly on the Sentiment Analysis and Personality Recognition tasks. Sentiment Analysis, one of the leading problems in NLP, consists of determining the polarity of a text; it is a well-known task for which a vast number of resources and models have been proposed. In contrast, Personality Recognition is a breakthrough task that aims to determine users' personality from their writing style; it is more of a niche task, with fewer resources designed ad hoc, but with great potential. Although the principal focus of this work was the development of Deep Learning models, we have also proposed models based on linguistic resources and classical Machine Learning models. Moreover, in this more straightforward setup, we have explored the nuances of different language devices, such as the impact of emotions on the correct classification of the sentiment expressed in a text. Afterwards, DL models were developed, particularly Convolutional Neural Networks (CNNs), to address the previously described tasks. In the case of Personality Recognition, we explored both approaches, which allowed us to compare the models under the same circumstances. Notably, NLP has evolved dramatically in recent years through the development of public evaluation campaigns, where multiple research teams compare the performance of their approaches under the same conditions. Most of the models presented here were either assessed in an evaluation task or used the setup of one. Recognising the importance of this effort, we curated and developed an evaluation campaign for classifying political tweets.
In addition, as we advanced in the development of this work, we decided to study in depth CNNs applied to NLP tasks. Two lines of work were explored in this regard. Firstly, we proposed a semantic-based padding method for CNNs, which addresses how to represent text more appropriately for solving NLP tasks. Secondly, a theoretical framework was introduced for tackling one of the most frequent criticisms of Deep Learning: its lack of interpretability. This framework seeks to visualise what lexical patterns, if any, the CNN is learning in order to classify a sentence. In summary, the main achievements presented in this thesis are: - The organisation of an evaluation campaign for Topic Classification of texts gathered from social media. - The proposal of several Machine Learning models tackling the Sentiment Analysis task on social media, together with a study of the impact of linguistic devices such as figurative language on the task. - The development of a model for inferring the personality of a developer from the source code that they have written. - The study of Personality Recognition tasks on social media following two different approaches, models based on machine learning algorithms with handcrafted features and models based on CNNs, and a comparison of both approaches. - The introduction of new semantic-based paddings for optimising how the text is represented in CNNs. - The definition of a theoretical framework to provide interpretable information about what CNNs learn internally. / Giménez Fayos, MT. (2021). Natural Language Processing using Deep Learning in Social Media [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/172164 / TESIS
9

Investigation of hierarchical deep neural network structure for facial expression recognition

Motembe, Dodi 01 1900
Facial expression recognition (FER) is still a challenging problem, and machines struggle to comprehend effectively the dynamic shifts in facial expressions of human emotions. The existing systems that have proven effective consist of deeper network structures that need powerful and expensive hardware. The deeper the network is, the longer the training and the testing take. Many systems use expensive GPUs to make the process faster. To remedy these challenges while maintaining the main goal of improving recognition accuracy, we create a generic hierarchical structure with variable settings. This generic structure has a hierarchy of three convolutional blocks, two dropout blocks and one fully connected block. From this generic structure we derived four different network structures to be investigated according to their performance. From each network structure case, we again derived six network structures in relation to the variable parameters. The variable parameters under analysis are the size of the filters of the convolutional maps and of the max-pooling, as well as the number of convolutional maps. In total, we have 24 network structures to investigate, six per case. The results achieved after many repeated experiments showed that case 1a emerged as the top performer of the case 1 group, and that case 2a, case 3c and case 4c outperformed the others in their respective groups. The comparison of the winners of the four groups indicates that case 2a is the optimal structure with optimal parameters; the case 2a network structure outperformed the other group winners. The criteria considered when choosing the best network structure were the minimum accuracy, average accuracy and maximum accuracy after 15 repeated training runs and analysis of the results. All 24 proposed network structures were tested using two of the most used FER datasets, CK+ and JAFFE.
After repeated simulations, the results demonstrate that our inexpensive optimal network architecture achieved 98.11% accuracy on the CK+ dataset. We also tested our optimal network architecture on the JAFFE dataset, where the experimental results show 84.38%, using just a standard CPU and simpler procedures. We also compared the four group winners with the performance of other existing FER models reported recently in two studies. These FER models used the same two datasets, CK+ and JAFFE. Three of our four group winners (case 1a, case 2a and case 4c) recorded only 1.22% less than the accuracy of the top performing model on the CK+ dataset, and two of our network structures, case 2a and case 3c, came in third, beating other models on the JAFFE dataset. / Electrical and Mining Engineering
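
The selection procedure in this abstract (four cases with six variants each, compared by minimum, average and maximum accuracy over repeated training runs) can be sketched as follows. The abstract does not say how the three statistics are combined, so ranking by mean accuracy with the minimum as tie-breaker is an assumption, and the accuracy values are invented toy numbers, not the thesis's results.

```python
from statistics import mean

# 4 cases x 6 variants (a-f) gives the 24 network structures.
structures = [f"case {case}{variant}" for case in range(1, 5) for variant in "abcdef"]


def summarise(accuracies):
    """Minimum, mean and maximum accuracy over repeated training runs."""
    return {"min": min(accuracies), "mean": mean(accuracies), "max": max(accuracies)}


def pick_best(runs_by_structure):
    # Assumed ranking rule: highest mean accuracy, ties broken by highest minimum.
    summaries = {name: summarise(acc) for name, acc in runs_by_structure.items()}
    best = max(summaries, key=lambda n: (summaries[n]["mean"], summaries[n]["min"]))
    return best, summaries


# Toy accuracies for two structures over three runs (the thesis used 15 runs).
toy_runs = {
    "case 1a": [0.90, 0.92, 0.94],
    "case 2a": [0.95, 0.96, 0.97],
}
best_name, summaries = pick_best(toy_runs)
print(len(structures), best_name)
```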
10

Image-classification for Brain Tumor using Pre-trained Convolutional Neural Network : Bildklassificering för hjärntumör med hjälp av förtränat konvolutionellt neuralt nätverk

Osman, Ahmad, Alsabbagh, Bushra January 2023 (has links)
Brain tumor is a disease characterized by uncontrolled growth of abnormal cells in the brain. The brain is responsible for regulating the functions of all other organs; hence, any atypical growth of cells in the brain can have severe implications for its functions. Global mortality from brain cancer in 2020 was estimated at 251,329 deaths. Early detection of brain cancer is therefore critical for prompt treatment and for improving patients' quality of life as well as survival rates. Manual medical image classification in diagnosing diseases has been shown to be extremely time-consuming and labor-intensive. Convolutional Neural Networks (CNNs) have proven to be a leading algorithm in image classification, outperforming humans. This paper compares five CNN architectures, namely VGG-16, VGG-19, AlexNet, EfficientNetB7, and ResNet-50, in terms of performance and accuracy using transfer learning. In addition, the authors discuss the economic impact of CNN, as an AI approach, on the healthcare sector. The models' performance is demonstrated using loss and accuracy curves as well as the confusion matrix. In the conducted experiment, VGG-19 achieved the best performance with 97% accuracy, while EfficientNetB7 achieved the worst performance with 93% accuracy.
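
The accuracy figures above are read off a confusion matrix. The sketch below shows how overall accuracy and per-class recall follow from such a matrix; the 2x2 counts are hypothetical and the row/column convention (rows are true classes, columns are predicted classes) is an assumption.

```python
def accuracy(confusion):
    """Share of samples on the diagonal (correctly classified)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion):
    """For each true class (row), the fraction predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(confusion)]


# Hypothetical counts: rows = true class, columns = predicted class.
toy_confusion = [
    [45, 5],   # true "tumor": 45 correct, 5 missed
    [3, 47],   # true "no tumor": 3 false alarms, 47 correct
]
overall = accuracy(toy_confusion)
recalls = per_class_recall(toy_confusion)
print(overall, recalls)
```

In a diagnostic setting the per-class recall is often more informative than overall accuracy, because a missed tumor and a false alarm carry very different costs.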
