1 |
Visual Saliency Analysis on Fashion Images Using Image Processing and Deep Learning Approaches
Neupane, Aashish 01 December 2020
ABSTRACT
AASHISH NEUPANE, for the Master of Science degree in BIOMEDICAL ENGINEERING, presented on July 35, 2020, at Southern Illinois University Carbondale. TITLE: VISUAL SALIENCY ANALYSIS ON FASHION IMAGES USING IMAGE PROCESSING AND DEEP LEARNING APPROACHES.
MAJOR PROFESSOR: Dr. Jun Qin
State-of-the-art computer vision technologies have been applied in fashion in multiple ways, and saliency modeling is one of those applications. In computer vision, a saliency map is a 2D topological map which indicates the probabilistic distribution of visual attention priorities. This study analyzes visual saliency on fashion images using multiple saliency models, evaluated by several evaluation metrics. A human subject study was conducted to collect people’s visual attention on 75 fashion images. Binary ground-truth fixation maps for these images were created from the experimentally collected visual attention data using a Gaussian blurring function. Saliency maps for these 75 fashion images were generated using multiple conventional saliency models as well as deep feature-based state-of-the-art models. DeepFeat was studied extensively, with 44 sets of saliency maps, exploiting the features extracted from GoogLeNet and ResNet50. Seven other saliency models were also used to predict saliency maps on these images. The results were compared over five evaluation metrics: AUC, CC, KL Divergence, NSS and SIM. The performance of all eight saliency models in predicting visual attention on fashion images was comparable to the benchmarked scores over all five metrics. Furthermore, the models perform consistently well over multiple evaluation metrics, indicating that saliency models could in fact be applied to effectively predict salient regions in random fashion advertisement images.
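Three of the five metrics named in the abstract above (NSS, CC and SIM) can be sketched in a few lines. This is a minimal illustration using their commonly cited definitions, not the thesis's evaluation code; the 2x2 maps are hypothetical toy data, and AUC and KL divergence are left out for brevity.

```python
import math


def _flatten(grid):
    """Turn a 2D list of numbers into a flat list."""
    return [value for row in grid for value in row]


def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels.

    Assumes the saliency map is not constant (non-zero standard deviation)
    and that at least one pixel is marked as fixated.
    """
    s = _flatten(saliency)
    f = _flatten(fixations)
    mean = sum(s) / len(s)
    std = math.sqrt(sum((v - mean) ** 2 for v in s) / len(s))
    hits = [(v - mean) / std for v, fixed in zip(s, f) if fixed]
    return sum(hits) / len(hits)


def cc(saliency, density):
    """Pearson linear correlation coefficient between two maps."""
    a = _flatten(saliency)
    b = _flatten(density)
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def sim(saliency, density):
    """Similarity: histogram intersection of the two maps, each scaled to sum to 1."""
    a = _flatten(saliency)
    b = _flatten(density)
    total_a = sum(a)
    total_b = sum(b)
    return sum(min(x / total_a, y / total_b) for x, y in zip(a, b))


# Hypothetical toy data: a predicted map and a binary fixation map.
predicted = [[0.1, 0.2], [0.3, 0.4]]
fixated = [[0, 0], [0, 1]]
print(nss(predicted, fixated), cc(predicted, predicted), sim(predicted, predicted))
```

NSS compares a prediction against discrete fixation locations, while CC and SIM compare it against a continuous (blurred) fixation density, which is one reason results are usually reported over several metrics at once.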
|
2 |
Acceleration of deep convolutional neural networks on multiprocessor system-on-chip
Reiche Myrgård, Martin January 2019
In this master thesis, some of the most promising existing frameworks and implementations of deep convolutional neural networks on multiprocessor system-on-chips (MPSoCs) are researched and evaluated. The starting point was a previous thesis, conducted in the spring of 2018, which evaluated possible deep learning models and frameworks for object detection on infra-red images. An existing deep convolutional neural network (DCNN) needs modifications before it fits on an MPSoC. Most DCNNs are trained on graphics processing units (GPUs) with a bit width of 32 bits. This is not optimal for a platform with hard memory constraints such as the MPSoC, so the bit width needs to be reduced. The optimal bit width depends on the network structure and on the requirements in terms of throughput and accuracy, although most of the currently available object detection networks degrade significantly when reduced below a width of 6 bits. After reducing the bit width, the network needs to be quantized and pruned for better memory usage. After quantization it can be implemented using one of many existing frameworks. This thesis focuses on Xilinx CHaiDNN and DNNWeaver V2, though it touches a little on revision, HLS4ML and DNNWeaver V1 as well. In conclusion, the implementation of two network models on Xilinx Zynq UltraScale+ ZCU102 using CHaiDNN was evaluated. Existing networks were converted and quantization was tested, though it was not fully working. The result was an implementation two to six times more power-efficient than GPU inference.
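The effect of reducing the bit width can be illustrated with a generic symmetric uniform quantizer. This is a minimal sketch under stated assumptions: it is not the scheme used by CHaiDNN or DNNWeaver, and the weight values are hypothetical.

```python
def quantize(weights, bits):
    """Map float weights to signed integer codes of the given bit width.

    Assumption: symmetric uniform quantization with a single scale per
    tensor, chosen so the largest magnitude maps to the largest code.
    """
    qmax = 2 ** (bits - 1) - 1
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error at a given bit width."""
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical weights; the error grows as the bit width shrinks.
example_weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
for width in (8, 6, 4):
    print(width, max_error(example_weights, width))
```

Each code fits in `bits` bits instead of 32, which is where the memory saving comes from; the reconstruction error is bounded by half the scale, so it roughly doubles with every bit removed.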
|
3 |
Deep learning role in scoliosis detection and treatment
Guanche, Luis 29 January 2024
Scoliosis is a common skeletal condition in which a curvature forms along the coronal plane of the spine. Although scoliosis has been long recognized, its pathophysiology and best mode of treatment are still debated. Currently, definitive diagnosis of scoliosis and its progression are performed through anterior-posterior (AP) radiographs by measuring the angle of coronal curvature, referred to as Cobb angle. Cobb angle measurements can be performed by Deep Learning algorithms and are currently being investigated as a possible diagnostic tool for clinicians. This thesis focuses on the role of Deep Learning in the diagnosis and treatment of Scoliosis and proposes a study design using the algorithms to continue to better understand and classify the disease.
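Once the two end-vertebra endplates have been located on the radiograph, the Cobb angle itself is a simple geometric quantity: the angle between the two endplate lines. The sketch below assumes the endplates are already available as pairs of (x, y) points; in a deep learning pipeline these would come from a landmark-detection model. The coordinates are hypothetical.

```python
import math


def cobb_angle(upper_endplate, lower_endplate):
    """Angle in degrees (0 to 90) between two endplate lines.

    Each endplate is a pair of (x, y) points. The order of the two points
    within a pair does not matter.
    """
    (x1, y1), (x2, y2) = upper_endplate
    (x3, y3), (x4, y4) = lower_endplate
    angle_upper = math.atan2(y2 - y1, x2 - x1)
    angle_lower = math.atan2(y4 - y3, x4 - x3)
    diff = abs(math.degrees(angle_upper - angle_lower)) % 180.0
    return min(diff, 180.0 - diff)


# Hypothetical landmark coordinates in image pixels.
upper = ((0.0, 0.0), (10.0, 2.0))
lower = ((0.0, 10.0), (10.0, 7.0))
print(round(cobb_angle(upper, lower), 2))
```

The hard part of automating the measurement is locating the endplates reliably, not the angle computation, which is why the learning component is usually framed as landmark or segmentation prediction.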
|
4 |
A novel application of deep learning with image cropping: a smart cities use case for flood monitoring
Mishra, Bhupesh K., Thakker, Dhaval, Mazumdar, S., Neagu, Daniel, Gheorghe, Marian, Simpson, Sydney 13 February 2020
Yes / Event monitoring is an essential application of Smart City platforms. Real-time monitoring of gully and drainage blockage is an important part of flood monitoring applications. Building viable IoT sensors for detecting blockage is a complex task due to the limitations of deploying such sensors in situ. Image classification with deep learning is a potential alternative solution. However, there are no image datasets of gullies and drainages. We faced these challenges while developing a flood monitoring application in a European Union-funded project. To address these issues, we propose a novel image classification approach based on deep learning with an IoT-enabled camera to monitor gullies and drainages. This approach uses deep learning to develop an effective image classification model that classifies blockage images into different class labels based on severity. To handle the complexity of video-based images, and the resulting poor classification accuracy of the model, we carried out experiments that remove image edges by applying image cropping. Cropping in our experiments aims to concentrate only on the regions of interest within images, hence leaving out some proportion of the image edges. An image dataset of crowd-sourced, publicly accessible images has been curated to train and test the proposed model. For validation, the accuracies of the model with and without image cropping were compared. The cropping-based image classification showed an improvement in classification accuracy. This paper outlines the lessons from our experimentation that have a wider impact on many similar use cases involving IoT-based cameras as part of smart city event monitoring platforms. / European Regional Development Fund Interreg project Smart Cities and Open Data REuse (SCORE).
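The cropping step described above, which discards a band around the image border so the classifier sees mostly the region of interest, can be sketched as follows. The abstract does not state what proportion of the edges was removed, so the `fraction` parameter and its value here are assumptions, and the image is a hypothetical grid of pixel values.

```python
def crop_edges(image, fraction):
    """Remove `fraction` of the height and width from each side of an image.

    `image` is a list of rows (each a list of pixel values). A fraction of
    0.2 drops 20% from the top, bottom, left and right, keeping the centre.
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    height = len(image)
    width = len(image[0])
    dy = int(height * fraction)
    dx = int(width * fraction)
    return [row[dx:width - dx] for row in image[dy:height - dy]]


# Hypothetical 10x10 image whose pixel value encodes its own position.
image = [[10 * y + x for x in range(10)] for y in range(10)]
cropped = crop_edges(image, 0.2)
print(len(cropped), len(cropped[0]))
```

Applying the same crop to both training and test images keeps the two distributions consistent, which matters when comparing accuracy with and without cropping.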
|
5 |
Using deep learning for IoT-enabled smart camera: a use case of flood monitoring
Mishra, Bhupesh K., Thakker, Dhaval, Mazumdar, S., Simpson, Sydney, Neagu, Daniel 15 July 2019
Yes / In recent years, deep learning has been increasingly used for several applications such as object analysis, feature extraction and image classification. This paper explores the use of deep learning in a flood monitoring application in the context of an EC-funded project, Smart Cities and Open Data REuse (SCORE). IoT sensors for detecting blocked gullies and drainages are notoriously hard to build, hence we propose a novel technique that uses deep learning to build an IoT-enabled smart camera to address this need. In our work, we apply deep learning to classify drain blockage images, developing an effective image classification model for different severities of blockage. Using this model, an image can be analysed and classified into a number of classes depending upon the context of the image. In building such a model, we explored the use of filtering in terms of segmentation as one of the approaches to increase the accuracy of classification by concentrating only on the area of interest within the image. Segmentation is applied in the data pre-processing stage of our application, before training. We used crowdsourced, publicly available images to train and test our model. Our model with segmentation showed an improvement in classification accuracy. / Research presented in this paper is funded by the European Commission Interreg project Smart Cities and Open Data REuse (SCORE).
|
6 |
Improved Deep Convolutional Neural Networks (DCNN) Approaches for Computer Vision and Bio-Medical Imaging
Alom, Md Zahangir January 2018
No description available.
|
7 |
Monocular Depth Estimation with Edge-Based Constraints using Active Learning Optimization
Saleh, Shadi 04 April 2024
Depth sensing is pivotal in robotics; however, monocular depth estimation encounters significant challenges. Existing algorithms relying on large-scale labeled data and large Deep Convolutional Neural Networks (DCNNs) hinder real-world applications. We propose two lightweight architectures that achieve commendable accuracy rates of 91.2% and 90.1%, simultaneously reducing the Root Mean Square Error (RMSE) of depth to 4.815 and 5.036. Our lightweight depth model operates at 29-44 FPS on the Jetson Nano GPU, showcasing efficient performance with minimal power consumption.
Moreover, we introduce a mask network designed to visualize and analyze the compact depth network, aiding in discerning informative samples for the active learning approach. This contributes to increased model accuracy and enhanced generalization capabilities.
Furthermore, we introduce an active learning framework designed to enhance model performance and accuracy by efficiently utilizing limited labeled training data. This novel framework outperforms previous studies, achieving commendable results while using only 18.3% of the KITTI Odometry dataset. This performance reflects a balance between computational efficiency and accuracy, tailored for low-cost devices while reducing training data requirements.
1. Introduction
2. Literature Review
3. AI Technologies for Edge Computing
4. Monocular Depth Estimation Methodology
5. Implementation
6. Result and Evaluation
7. Conclusion and Future Scope
Appendix
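The depth figures quoted in the abstract above (an RMSE and an accuracy percentage) correspond to metrics that can be sketched briefly. The abstract does not define its accuracy measure; the sketch below assumes the threshold accuracy commonly used in depth estimation, the share of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25. The depth values are hypothetical.

```python
import math


def rmse(predicted, ground_truth):
    """Root mean square error between predicted and true depths."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def threshold_accuracy(predicted, ground_truth, threshold=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) below the threshold.

    Assumption: this is the usual "delta < 1.25" depth accuracy; all depths
    must be strictly positive.
    """
    within = [
        max(p / g, g / p) < threshold
        for p, g in zip(predicted, ground_truth)
    ]
    return sum(within) / len(within)


# Hypothetical depths in metres for three pixels.
predicted_depth = [10.0, 20.0, 30.0]
true_depth = [10.0, 22.0, 20.0]
print(rmse(predicted_depth, true_depth))
print(threshold_accuracy(predicted_depth, true_depth))
```

RMSE is dominated by large absolute errors at far range, while the threshold accuracy is scale-relative, so the two are typically reported together.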
|
8 |
Evaluating Response Images From Protein Quantification
Engström, Mathias, Olby, Erik January 2020
Gyros Protein Technologies develops instruments for automated immunoassays. Fluorescent antibodies are added to samples and excited with a laser. This results in a 16-bit image where the intensity is correlated to concentration of bound antibody. Artefacts may appear on the images due to dust, fibers or other problems, which affect the quantification. This project seeks to automatically detect such artifacts by classifying the images as good or bad using Deep Convolutional Neural Networks (DCNNs). To augment the dataset a simulation approach is used and a simulation program is developed that generates images based on developed simulation models. Several classification models are tested as well as different techniques used for training. The highest performing classifier is a VGG16 DCNN, pre-trained on simulated images, which reaches 94.8% accuracy. There are many sub-classes in the bad class, and many of these are very underrepresented in both the training and test datasets. This means that not much can be said of the classification power of these sub-classes. The conclusion is therefore that until more of this rare data can be collected, focus should lie on classifying the other more common examples. Using the approaches from this project, we believe this could result in a high performing product.
|
9 |
Investigation of hierarchical deep neural network structure for facial expression recognition
Motembe, Dodi 01 1900
Facial expression recognition (FER) is still a challenging concept, and machines struggle to
comprehend effectively the dynamic shifts in facial expressions of human emotions. The
existing systems, which have proven to be effective, consist of deeper network structures that
need powerful and expensive hardware. The deeper the network is, the longer the training and
the testing. Many systems use expensive GPUs to make the process faster. To remedy the
above challenges while maintaining the main goal of improving the accuracy rate of the
recognition, we create a generic hierarchical structure with variable settings. This generic
structure has a hierarchy of three convolutional blocks, two dropout blocks and one fully
connected block. From this generic structure we derived four different network structures to
be investigated according to their performances. From each network structure case, we again
derived six network structures in relation to the variable parameters. The variable parameters
under analysis are the size of the filters of the convolutional maps and the max-pooling as
well as the number of convolutional maps. In total, we have 24 network structures to
investigate, six per case. Repeated simulations showed that case 1a was the top
performer in the case 1 group, and that case 2a, case 3c and case 4c outperformed the others in their
respective groups. Comparing the winners of the four groups indicates that case 2a is the
optimal structure with optimal parameters, outperforming the other
group winners. The best network structure was chosen by considering the
minimum, average and maximum accuracy over 15
repeated training runs and analysis of the results. All 24 proposed network structures were
tested using two of the most used FER datasets, the CK+ and the JAFFE. After repeated
simulations the results demonstrate that our inexpensive optimal network architecture
achieved 98.11 % accuracy on the CK+ dataset. On the JAFFE dataset, the same
architecture achieved 84.38 % accuracy, using just a
standard CPU and simpler procedures. We also compared the four group winners with the
performance of other existing FER models reported recently in two studies. These FER models used
the same two datasets, the CK+ and the JAFFE. Three of our four group winners (case 1a,
case 2a and case 4c) recorded only 1.22 % less than the accuracy of the top performer model
when using the CK+ dataset, and two of our network structures, case 2a and case 3c came in
third, beating other models when using the JAFFE dataset. / Electrical and Mining Engineering
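The selection procedure described above (four cases with six parameter variants each, compared by minimum, average and maximum accuracy over repeated runs) can be sketched as follows. The abstract does not say how the three statistics were weighed against each other, so ranking by average with the minimum as tie-breaker is an assumption, and the accuracy values are hypothetical placeholders, not results from the thesis.

```python
from statistics import mean


def candidate_names(cases=(1, 2, 3, 4), variants="abcdef"):
    """Enumerate the grid of network structures: 4 cases x 6 variants = 24."""
    return [f"case {case}{variant}" for case in cases for variant in variants]


def summarize(accuracies):
    """Minimum, average and maximum accuracy over repeated training runs."""
    return {
        "min": min(accuracies),
        "avg": mean(accuracies),
        "max": max(accuracies),
    }


def pick_best(results):
    """Return the candidate with the highest average accuracy.

    Assumption: ties on the average are broken by the minimum accuracy.
    """
    def score(name):
        stats = summarize(results[name])
        return (stats["avg"], stats["min"])

    return max(results, key=score)


# Hypothetical accuracies from three repeated runs of two candidates.
runs = {
    "structure X": [0.90, 0.92, 0.94],
    "structure Y": [0.91, 0.91, 0.95],
}
print(len(candidate_names()), pick_best(runs), summarize(runs["structure Y"]))
```

Reporting the minimum alongside the average guards against picking a structure that only performs well on a lucky initialisation.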
|