1

3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions

Gu, Dongfeng January 2017 (has links)
In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks used for action recognition are not nearly as deep as the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network extends Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected with each other in a feed-forward fashion. In each layer, the feature maps of all preceding layers are concatenated along the last dimension and used as inputs to all subsequent layers. We propose two versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of training parameters early on, so that a deeper network can be reached. We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results demonstrate that our method performs better than the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset.
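The dense connectivity described in this abstract can be illustrated with a short sketch. This is a minimal PyTorch example with hypothetical layer sizes and a toy input clip, not the thesis implementation (the abstract does not name a framework); note that in PyTorch the concatenation runs along the channel dimension rather than the literal last dimension.

```python
# Minimal sketch of a 3D dense block and the "lite" pooling-after-stem idea (illustrative only).
import torch
import torch.nn as nn

class DenseLayer3D(nn.Module):
    """BN -> ReLU -> 3x3x3 conv, producing `growth_rate` new feature maps."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm3d(in_channels)
        self.conv = nn.Conv3d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        return self.conv(torch.relu(self.bn(x)))

class DenseBlock3D(nn.Module):
    """Each layer receives the concatenation of all preceding feature maps."""
    def __init__(self, num_layers, in_channels, growth_rate):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer3D(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Concatenate along the channel dimension and feed to the next layer.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

# Example: a clip of 8 frames, 3 channels, 32x32 spatial resolution (hypothetical shapes).
clip = torch.randn(1, 3, 8, 32, 32)
stem = nn.Conv3d(3, 16, kernel_size=3, padding=1)   # first 3D convolution
pool = nn.MaxPool3d(kernel_size=2)                  # extra 3D pooling, as in the "lite" variant
block = DenseBlock3D(num_layers=4, in_channels=16, growth_rate=12)
print(block(pool(stem(clip))).shape)                # torch.Size([1, 64, 4, 16, 16])
```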
2

Monocular Depth Prediction in Deep Neural Networks

Tang, Guanqian January 2019 (has links)
With the development of artificial neural networks (ANNs), they have been introduced into more and more computer vision tasks. Convolutional neural networks (CNNs) are widely used in object detection, object tracking, and semantic segmentation, achieving great performance improvements over traditional algorithms. As a classical topic in computer vision, applying deep CNNs to depth recovery from monocular images is popular, since single-view images are more common than stereo image pairs and video. However, due to the lack of motion and geometry information, monocular depth estimation is much more difficult. This thesis investigates depth prediction from single images by exploiting state-of-the-art deep CNN models. Two neural networks are studied: the first uses the idea of a global and a local network, and the other adopts a deeper fully convolutional network using a pre-trained backbone CNN (ResNet or DenseNet). We compare the performance of the two networks, and the results show that the deeper convolutional neural network with the pre-trained backbone achieves better performance. The pre-trained model can significantly accelerate the training process. We also find that the amount of training data is essential for CNN-based monocular depth prediction. / The development of artificial neural networks (ANNs) has led to their use in numerous computer vision techniques to improve performance. Convolutional neural networks (CNNs) are often used in object detection, object tracking, and semantic segmentation, and perform better than earlier algorithms. Using CNNs for depth prediction from single images has become popular, because single images are more common than stereo images and video. Owing to the lack of motion and geometric information, it is much harder to determine depth in an image than in a video. The purpose of this master's thesis is to implement a new algorithm for depth prediction, specifically for single images, using CNN models. Two different neural networks were analysed: the first uses a local and a global network, and the second consists of an advanced convolutional neural network that uses a pre-trained backbone CNN (ResNet or DenseNet). Our analyses show that the advanced convolutional neural network with a pre-trained backbone CNN performs better and that the pre-trained backbone accelerated the training process considerably. We also found that the amount of training data was crucial for CNN-based monocular depth prediction.
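The second approach described above can be sketched roughly as follows: a fully convolutional network whose encoder is a pre-trained backbone and whose decoder upsamples back towards a dense depth map. This is a hedged PyTorch/torchvision illustration (ResNet-18 chosen arbitrarily; layer counts and shapes are assumptions, not the thesis architecture).

```python
# Illustrative encoder-decoder depth network with a pre-trained backbone (torchvision >= 0.13).
import torch
import torch.nn as nn
import torchvision

class DepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")  # pre-trained encoder
        # Keep everything up to the last residual stage (drops the avgpool and fc layers).
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        # Decoder: progressively upsample the 512-channel feature map towards the input size.
        self.decoder = nn.Sequential(
            nn.Conv2d(512, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(128, 1, 3, padding=1),  # one output channel: predicted depth
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DepthNet()
image = torch.randn(1, 3, 224, 224)   # a single RGB image
depth = model(image)                  # coarse depth map at 1/8 of the input resolution here
print(depth.shape)                    # torch.Size([1, 1, 28, 28])
```

Freezing or fine-tuning the pre-trained encoder is the usual design choice that gives the training speed-up the abstract mentions.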
3

Avoiding Catastrophic Forgetting in Continual Learning through Elastic Weight Consolidation

Evilevitch, Anton, Ingram, Robert January 2021 (has links)
Image classification is an area of computer science with many areas of application. One key issue with using Artificial Neural Networks (ANNs) for image classification is the phenomenon of Catastrophic Forgetting when training tasks sequentially (i.e. Continual Learning): the network quickly loses its performance on a given task after it has been trained on a new task. Elastic Weight Consolidation (EWC) has previously been proposed as a remedy to lessen the effects of this phenomenon through the use of a loss function which utilizes a Fisher Information Matrix. We want to explore and establish whether this still holds true for modern network architectures, and to what extent it can be applied using today's state-of-the-art networks. We focus on applying this approach to tasks within the same dataset. Our results indicate that the approach is feasible and does in fact lessen the effect of Catastrophic Forgetting. These results are achieved, however, at the cost of much longer execution times and time spent tuning the hyperparameters. / Image classification is a field of computer science with many areas of application. A key issue in the use of Artificial Neural Networks (ANNs) for image classification is the phenomenon of Catastrophic Forgetting. This occurs when a network is trained sequentially (i.e. Continual Learning): the network quickly loses performance on a given task after it has been trained on a new task. Elastic Weight Consolidation (EWC) has previously been proposed as a mitigation, through the application of a loss function that uses a Fisher Information Matrix. We want to explore and establish whether this still holds for modern network architectures, and to what extent it can be applied. We apply the method to tasks within one and the same dataset. Our results show that the method is feasible and has a mitigating effect on Catastrophic Forgetting. These results are, however, achieved at the cost of longer execution times and increased time spent choosing hyperparameters.
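The EWC loss the abstract refers to penalizes movement of parameters that were important for earlier tasks, with importance given by the diagonal of the Fisher Information Matrix. Below is a minimal PyTorch sketch of that penalty and a common diagonal-Fisher estimate; function names and the lambda value are illustrative assumptions, not the thesis code.

```python
# Hedged sketch of the EWC penalty: loss_B = task_loss + (lambda/2) * sum_i F_i * (theta_i - theta_i*)^2
import torch

def ewc_penalty(model, old_params, fisher_diag, lam=1000.0):
    """Quadratic anchor to the parameters learned on the previous task."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        penalty = penalty + (fisher_diag[name] * (param - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

def estimate_fisher(model, data_loader, loss_fn):
    """Diagonal (empirical) Fisher approximation: mean squared gradients over the old task's data."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for inputs, targets in data_loader:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

# Usage inside the training loop for the new task (task_loss computed on the new task's batch):
#   loss = task_loss + ewc_penalty(model, params_after_old_task, fisher_from_old_task)
#   loss.backward(); optimizer.step()
```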
4

Skip connection in a MLP network for Parkinson’s classification

Steinholtz, Tim January 2021 (has links)
In this thesis, two different architecture designs of a multi-layer perceptron (MLP) network have been implemented: one an ordinary MLP, and the other an MLP architecture with added DenseNet-inspired skip connections. The models were used and evaluated on the classification task of determining, from vocal features, whether subjects were diagnosed with Parkinson's disease. The models were trained on an openly available dataset for Parkinson's classification and evaluated on a hold-out set from this dataset and on two datasets recorded in a different sound-recording environment from the training data. The thesis sought answers to two questions: how insensitive models for Parkinson's classification are to the sound-recording environment, and how the proposed skip connections in an MLP model could help improve performance and generalization capacity. The results show that the sound environment affects the accuracy. Nevertheless, the thesis concludes that this could be overcome with more time, allowing good accuracy when models are exposed to data from a sound environment other than that of the training data. As for whether the skip connections improve accuracy and generalization, the thesis cannot draw any broad conclusions due to the data that were used. The models in general performed best with shallow networks, and it is with deeper networks that the skip connections are argued to help improve these attributes. However, when evaluating on the data from a different sound-recording environment than the training data, the skip connections had the best performance in two out of three tests. / In this thesis, two different architecture designs for an artificial multi-layer neural network have been implemented: one architecture that follows the convention of an ordinary MLP network, and a new architecture that introduces DenseNet-inspired skip connections into MLP networks. The models were used and evaluated for classification, with the goal of distinguishing subjects as healthy or diagnosed with Parkinson's disease based on vocal attributes. The models were trained on an openly available dataset for Parkinson's classification and evaluated on a held-out subset of that data, as well as on two datasets from a different sound-recording environment from the training data. The thesis sought answers to two questions: how insensitive models for Parkinson's classification are to the sound-recording environment, and how the proposed skip connections in an MLP model can help improve performance and generalization capacity. The results of the thesis show that the sound environment affects accuracy, but conclude that with more time one could probably overcome this and achieve good accuracy in new sound environments. As to whether the skip connections improve accuracy and generalization, the thesis is not in a position to draw any broad conclusions because of the data used. The models generally performed best with shallow networks, and it is in deeper networks that the skip connections are argued to improve these attributes. That said, looking only at the results on the data from a different sound-recording environment, the skip-connection architecture had better results in two of the three tests performed.
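The DenseNet-inspired skip connections in an MLP can be sketched as each hidden layer receiving the concatenation of the input and all previous hidden activations. The following PyTorch example is a hedged illustration only; the feature count (22 vocal features, typical of open Parkinson's voice datasets), layer sizes, and class count are assumptions rather than the thesis configuration.

```python
# Minimal MLP with dense (concatenative) skip connections for binary classification (illustrative).
import torch
import torch.nn as nn

class DenseSkipMLP(nn.Module):
    def __init__(self, in_features, hidden, num_layers, num_classes=2):
        super().__init__()
        self.hidden_layers = nn.ModuleList(
            nn.Linear(in_features + i * hidden, hidden) for i in range(num_layers)
        )
        self.out = nn.Linear(in_features + num_layers * hidden, num_classes)

    def forward(self, x):
        feats = [x]
        for layer in self.hidden_layers:
            # Each hidden layer sees the input plus every earlier hidden activation.
            feats.append(torch.relu(layer(torch.cat(feats, dim=1))))
        return self.out(torch.cat(feats, dim=1))

# Example: 22 vocal features per subject, two classes (Parkinson's / healthy) - assumed sizes.
model = DenseSkipMLP(in_features=22, hidden=32, num_layers=3)
print(model(torch.randn(4, 22)).shape)  # torch.Size([4, 2])
```

Dropping the concatenation (feeding each layer only the previous activation) recovers the ordinary MLP baseline the thesis compares against.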
5

Skin lesion detection using deep learning

Rajit Chandra (12495442) 03 May 2022 (has links)
Skin lesions can be deadly if not detected early; early detection can save many lives. Artificial intelligence and machine learning are helping healthcare in many ways, including the diagnosis of skin lesions, and computer-aided diagnosis helps clinicians detect cancer. This study was conducted to classify seven classes of skin lesion using very powerful convolutional neural networks. Two pre-trained models, DenseNet and Inception-v3, were employed to train the model, and accuracy, precision, recall, F1-score, and ROC-AUC were calculated for every class prediction. Moreover, gradient class activation maps were used to help clinicians determine which regions of the image influence the model to make a certain decision; these visualizations are used for explainability of the model. Experiments showed that DenseNet performed better than Inception-v3. It was also noted that gradient class activation maps highlighted different regions when predicting the same class. The main contribution was to introduce medically oriented visualizations into the lesion classification model that will help clinicians understand the model's decisions and enhance its reliability. In addition, different optimizers were employed with both models to compare the accuracies.
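The transfer-learning setup described here can be sketched as a pre-trained backbone with its classifier replaced by a seven-way head. The snippet below is a hedged PyTorch/torchvision illustration: DenseNet-121 and the Adam optimizer are arbitrary choices (the abstract does not state which DenseNet variant, optimizer, or framework was used).

```python
# Illustrative fine-tuning of a pre-trained DenseNet for 7 lesion classes (torchvision >= 0.13).
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.densenet121(weights="IMAGENET1K_V1")
model.classifier = nn.Linear(model.classifier.in_features, 7)  # seven lesion classes

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 RGB images.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 7, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```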
6

AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources

Priyank Kalgaonkar (10911822) 05 August 2021 (has links)
The research work presented within this thesis proposes a neoteric variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model compression technique, to prune (remove) redundant and insignificant elements that are either irrelevant or do not affect the performance of the network once removed. Cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, whose weighting factor is inversely proportional to the number of samples, have been incorporated into the design of CondenseNeXt's algorithm in order to relieve the harsh effects of pruning. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets — CIFAR-10, CIFAR-100, and ImageNet — by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in RTMaps Remote Studio's console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on the three benchmark datasets: CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error), and ImageNet (7.91% single-model, single-crop top-5 error), with up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, albeit at the cost of a 2.26% loss in accuracy. It can thus perform image classification on ARM-based computing platforms with outstanding efficiency, without requiring CUDA-enabled GPU support.
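The depthwise separable convolution that CondenseNeXt substitutes for CondenseNet's group convolutions factorizes a standard convolution into a per-channel (depthwise) spatial convolution followed by a 1x1 (pointwise) convolution that mixes channels, cutting parameters and FLOPs. The PyTorch sketch below is illustrative only; channel counts are assumptions, not the CondenseNeXt configuration.

```python
# Minimal depthwise separable convolution block (illustrative sizes).
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # groups=in_channels makes the first conv operate on each channel independently.
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)
conv = DepthwiseSeparableConv(32, 64)
print(conv(x).shape)  # torch.Size([1, 64, 56, 56])
# Weight count: 32*3*3 + 32*64 = 2,336 versus 32*64*3*3 = 18,432 for a standard 3x3 convolution.
```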
