271 |
VISUAL SALIENCY ANALYSIS, PREDICTION, AND VISUALIZATION: A DEEP LEARNING PERSPECTIVE
Mahdi, Ali Majeed 01 August 2019
In recent years, great progress has been made in predicting human eye fixations. Several studies have employed deep learning to achieve high prediction accuracy. These studies rely on deep networks pre-trained for object classification. They either treat saliency prediction as a transfer-learning problem or use the weights of the pre-trained network as the initialization for learning a saliency model. Such pre-trained networks are used because the datasets of human fixations available for training a deep learning model are relatively small. Another, less prioritized problem is that the amount of computation such deep learning models require demands expensive hardware. In this dissertation, two approaches are proposed to tackle the abovementioned problems. The first approach, codenamed DeepFeat, incorporates the deep features of convolutional neural networks pre-trained for object and scene classification. It is the first approach that uses deep features without further learning. The performance of the DeepFeat model is extensively evaluated over a variety of datasets using a variety of implementations. The second approach is a deep learning saliency model, codenamed ClassNet. Two main differences separate ClassNet from other deep learning saliency models. First, ClassNet is the only deep learning saliency model that learns its weights from scratch. Second, ClassNet treats the prediction of human fixations as a classification problem, while other deep learning saliency models treat it as a regression problem or as a classification of a regression problem.
|
272 |
Learning 3D Shape Representations for Reconstruction and Modeling
Biao, Zhang 04 1900
Neural fields, also known as neural implicit representations, are powerful for modeling 3D shapes. They encode shapes as continuous functions mapping 3D coordinates to scalar values like the signed distance function (SDF) or occupancy probability.
Neural fields represent complex shapes using an MLP. The MLP takes spatial coordinates as input, applies nonlinear transformations to them, and approximates the continuous function of the neural field. During training, the MLP's weights are learned through backpropagation.
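As a minimal sketch of the idea described above, the following NumPy code evaluates a small, untrained MLP that maps 3D coordinates to one scalar per point and compares it with the analytic SDF of a sphere. The layer sizes, the ReLU nonlinearity, the sphere target, and all function names are illustrative assumptions, not details taken from the thesis; the backpropagation step that would fit the weights is omitted.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random (untrained) weights for a fully connected network; sizes are hypothetical."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, xyz):
    """Map an (N, 3) array of coordinates to N scalar field values."""
    h = xyz
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)   # ReLU nonlinearity (an assumed choice)
    W, b = params[-1]
    return (h @ W + b)[:, 0]             # one scalar per query point

def sphere_sdf(xyz, radius=0.5):
    """Analytic signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(xyz, axis=1) - radius

params = init_mlp([3, 64, 64, 1])
points = np.random.default_rng(1).uniform(-1.0, 1.0, size=(8, 3))
pred = mlp_forward(params, points)       # what the neural field currently outputs
target = sphere_sdf(points)              # what training would regress toward
loss = np.mean((pred - target) ** 2)     # the quantity backpropagation would minimize
```

After training, the shape's surface would be the zero level set of the learned function, i.e. the points where the predicted value is 0.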
This PhD thesis presents novel methods for shape representation learning and generation with neural fields.
The first part introduces an interpretable and high-quality reconstruction method for neural fields. A neural network predicts labeled points, improving surface visualization and interpretability. The method achieves accurate reconstruction even with rendered image input. A binary classifier, based on predicted labeled points, represents the shape's surface with precision.
The second part focuses on shape generation, a challenge in generative modeling. Complex data structures like oct-trees or BSP-trees are challenging to generate with neural networks. To address this, a two-step framework is proposed: an autoencoder compresses the neural field into a fixed-size latent space, followed by training generative models within that space. Incorporating sparsity into the shape autoencoding network reduces dimensionality while maintaining high-quality shape reconstruction. Autoregressive transformer models enable the generation of complex shapes with intricate details.
This research explores the potential of denoising diffusion models for 3D shape generation. The latent space efficiency is improved by further compression, leading to more efficient and effective generation of high-quality shapes. Remarkable shape reconstruction results are achieved, even without sparse structures. The approach combines the latest generative model advancements with novel techniques, advancing the field. It has the potential to revolutionize shape generation in gaming, manufacturing, and beyond.
In summary, this PhD thesis proposes novel methods for shape representation learning, generation, and reconstruction. It contributes to the field of shape analysis and generation by enhancing interpretability, improving reconstruction quality, and pushing the boundaries of efficient and effective 3D shape generation.
|
273 |
AI-augmented analysis onto the impact of the containment strategies and climate change to pandemic
Dong, Shihao January 2023
This thesis uses a multi-task long short-term memory (LSTM) model to investigate the correlation between containment strategies, climate change, and the number of COVID-19 transmissions and deaths. The study examines how accurately different factors predict the number of daily confirmed cases and daily deaths, in order to further explore the correlation between those factors and case numbers. The initial assessment results suggest that containment strategies, specifically vaccination policies, have a more significant impact on the accuracy of predicting daily confirmed cases and deaths from COVID-19 than climate factors such as the daily average 2-meter surface temperature. Additionally, the study reveals that interactions among certain impact factors have unpredictable effects on predictive accuracy. However, the lack of interpretability of deep learning models poses a significant challenge for real-world applications. This study provides valuable insights into the correlation between the number of daily confirmed cases, daily deaths, containment strategies, and climate change, and highlights areas for further research. It is important to note that while the study reveals a correlation, it does not imply causation, and further research is needed to understand the trends of the pandemic.
|
274 |
Fusion for Object Detection
Wei, Pan 10 August 2018
In a three-dimensional world, to perceive the objects around us, we wish not only to classify them but also to know where they are. The task of object detection combines both classification and localization. In addition to predicting the object category, we also predict where the object is from sensor data. Because it is not known ahead of time how many objects of interest are in the sensor data or where they are, the output size of object detection may change, which makes the object detection problem difficult. In this dissertation, I focus on the task of object detection and use fusion to improve detection accuracy and robustness. More specifically, I propose a method to calculate a measure of conflict. This method does not need external knowledge about the credibility of each source. Instead, it uses the information from the sources themselves to help assess the credibility of each source. I apply the proposed measure of conflict to fuse independent sources of tracking information from various stereo cameras. In addition, I propose a computational intelligence system for more accurate object detection in real time. The proposed system uses online image augmentation before the detection stage during testing and fuses the detection results afterwards. The fusion method is computationally intelligent, based on a dynamic analysis of agreement among inputs. Compared with other fusion operations such as average, median and non-maxima suppression, the proposed method produces more accurate results in real time. I also propose a multi-sensor fusion system, which incorporates the advantages and mitigates the disadvantages of each type of sensor (LiDAR and camera). Generally, a camera can provide more texture and color information, but it cannot work in low visibility. On the other hand, LiDAR can provide accurate point positions and work at night or in moderate fog or rain. 
The proposed system uses the advantages of both camera and LiDAR and mitigates their disadvantages. The results show that, compared with LiDAR or camera detection alone, the fused result can extend the detection range up to 40 meters with increased detection accuracy and robustness.
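For orientation, the baseline fusion operations the abstract compares against (average, median and non-maxima suppression) can be sketched as follows. This is only an illustration of those standard baselines under assumed conventions (boxes as `[x1, y1, x2, y2]` with positive area, an IoU threshold of 0.5); it is not the conflict-based fusion method proposed in the dissertation, and all function names are hypothetical.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union between one box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, thr=0.5):
    """Non-maxima suppression: keep the best box, drop boxes overlapping it above thr."""
    order = np.argsort(scores)[::-1]          # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        order = rest[iou(boxes[best], boxes[rest]) <= thr]
    return keep

def average_fusion(boxes):
    """Coordinate-wise mean of boxes assumed to describe the same object."""
    return boxes.mean(axis=0)

def median_fusion(boxes):
    """Coordinate-wise median of boxes assumed to describe the same object."""
    return np.median(boxes, axis=0)

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)                     # the second box overlaps the first (IoU ~ 0.68)
fused = average_fusion(boxes[:2])
```

Non-maxima suppression discards the lower-scoring duplicates outright, whereas averaging or taking the median blends them; none of the three weighs how strongly the inputs agree with one another.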
|
275 |
Mass Classification of Digital Mammograms Using Convolutional Neural Networks
Franklin, Elijah 04 May 2018
This thesis explores the current deep learning (DL) approaches to computer-aided diagnosis (CAD) of digital mammographic images and presents two novel designs, based on convolutional neural networks (CNNs), for overcoming current obstacles endemic to the field. The first method uses Bayesian statistics to perform decision-level fusion across multiple images of an individual. The second method uses a new data pre-processing scheme to artificially expand the limited available training data and reduce model overfitting.
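One common way to perform decision-level fusion with Bayes' rule is sketched below. It assumes that each image of the same individual yields a posterior probability of malignancy and that the images are conditionally independent given the true class (a naive Bayes assumption). The thesis does not state its exact formulation here, so the function name, the prior, and the independence assumption are all illustrative.

```python
import math

def fuse_posteriors(probs, prior=0.5):
    """Combine per-image posteriors P(malignant | image_i) into one posterior.

    Assumes conditional independence of the images given the class, so the
    evidence from each image adds up in log-odds space relative to the prior.
    Each probability must lie strictly between 0 and 1.
    """
    prior_logit = math.log(prior / (1.0 - prior))
    logit = prior_logit
    for p in probs:
        logit += math.log(p / (1.0 - p)) - prior_logit   # evidence contributed by one image
    return 1.0 / (1.0 + math.exp(-logit))

# Two views that each lean toward "malignant" reinforce each other.
fused = fuse_posteriors([0.8, 0.8])
```

With a prior of 0.5, two views at 0.8 each have odds of 4 to 1, so the fused odds are 16 to 1 and the fused probability is 16/17.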
|
276 |
Dose Prediction for Radiotherapy of Advanced Stage Lung Cancer
Singh, Rachna January 2020
A dose prediction model for treatment planning was generated using the U-Net architecture. The model was generated for advanced-stage cancer patients. The U-Net architecture was created with depth=6 and kernel=6. The architecture successfully reduced the input image size (192×192) to a 6×6 feature map, which helped to extract the low-level features. The dose prediction model was trained with depth=6, kernel=6, MSE loss, the Adam optimizer, 1000 epochs and a batch size of 4. The predicted dose was rescaled for gamma analysis to quantify the accuracy of the model. The renormalized predicted dose was quantified using gamma analysis with a 3 mm, 3% dose tolerance. A gamma map was generated to visualize the regions where the dose distributions failed. The gamma percentage values obtained on the training set were acceptable. The mean and standard deviation of the gamma pass percentage on the training dataset were 97.5% and 1.24%, respectively, indicating that the training process was successful and that the predicted dose was an almost perfect match to the true dose. However, the gamma pass percentage values obtained on the validation set showed that the predicted dose was not a good representation of the true dose. Nevertheless, on the validation dataset the model was able to predict the approximate highest-dose region. A gamma analysis with a 5 mm, 5% dose tolerance was performed to test the level of discrepancy between the predicted and true dose in the validation set. This increased the gamma pass percentage compared to the 3 mm, 3% analysis, to a mean gamma pass percentage of 26.2 ± 7.47%. Although the predicted dose was not of sufficient accuracy for clinical use, the technique studied in this work shows promise for further development. / Thesis / Master of Science (MSc)
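The gamma analysis mentioned above can be illustrated with a simplified one-dimensional, globally normalized version. The sketch assumes a brute-force search over all evaluated points, a dose tolerance expressed as a fraction of the maximum reference dose, and a uniform grid; the thesis works on 2D dose maps and may use a different normalization. The helper `bottleneck_size` reads depth=6 as five 2× downsampling steps, which is an assumption that happens to be consistent with the reported 192 to 6 reduction.

```python
import numpy as np

def gamma_pass_rate(ref, ev, spacing_mm, dta_mm=3.0, dose_tol=0.03):
    """Return (pass percentage, gamma per reference point) for 1D dose profiles.

    gamma_i = min over j of sqrt((distance_ij / dta)^2 + (dose difference_ij / tol)^2);
    a reference point passes when gamma_i <= 1. The reference must have a positive maximum.
    """
    ref = np.asarray(ref, dtype=float)
    ev = np.asarray(ev, dtype=float)
    x = np.arange(ref.size) * spacing_mm
    dist2 = ((x[:, None] - x[None, :]) / dta_mm) ** 2
    dose2 = ((ev[None, :] - ref[:, None]) / (dose_tol * ref.max())) ** 2
    gamma = np.sqrt((dist2 + dose2).min(axis=1))   # best match over evaluated points
    return 100.0 * np.mean(gamma <= 1.0), gamma

def bottleneck_size(input_size, n_downsamples):
    """Spatial size after repeated 2x downsampling (assumed pooling scheme)."""
    return input_size // (2 ** n_downsamples)

true_dose = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
pred_dose = 1.5 * true_dose                        # a deliberately poor prediction
rate_3, _ = gamma_pass_rate(true_dose, pred_dose, spacing_mm=1.0)
rate_5, _ = gamma_pass_rate(true_dose, pred_dose, spacing_mm=1.0, dta_mm=5.0, dose_tol=0.05)
```

Loosening the criteria from 3 mm, 3% to 5 mm, 5% can only keep or raise the pass rate, since both terms inside the square root shrink; this is the direction of change the abstract reports for the validation set.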
|
277 |
A Discrete Wavelet Transform GAN for NonHomogeneous Dehazing
Fu, Minghan January 2021
Hazy images are often subject to color distortion, blurring and other visible quality degradation. Some existing CNN-based methods have shown great performance in removing homogeneous haze, but they are not robust in the non-homogeneous case. The reason is twofold. Firstly, due to the complicated haze distribution, texture details are easily lost during the dehazing process. Secondly, since training pairs are hard to collect, training on limited data can easily lead to over-fitting. To tackle these two issues, we introduce a novel dehazing network using the 2D discrete wavelet transform, namely DW-GAN. Specifically, we propose a two-branch network to deal with the aforementioned problems. By utilizing the wavelet transform in the DWT branch, our proposed method can retain more high-frequency information in feature maps. To prevent over-fitting, an ImageNet pre-trained Res2Net is adopted in the knowledge adaptation branch. Owing to the robust feature representations of ImageNet pre-training, the generalization ability of our network is improved dramatically. Finally, a patch-based discriminator is used to reduce artifacts in the restored images. Extensive experimental results demonstrate that the proposed method outperforms the state of the art quantitatively and qualitatively. / Thesis / Master of Applied Science (MASc)
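To make the role of the 2D discrete wavelet transform concrete, here is a minimal single-level Haar DWT and its inverse in NumPy. The choice of the Haar wavelet is an assumption for illustration (the abstract does not name the wavelet), the sub-band names follow one common convention among several, and the input is assumed to have even height and width.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform: one low-frequency band and three detail bands."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]   # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0                # local average (low frequency)
    lh = (a - b + c - d) / 2.0                # differences between neighboring columns
    hl = (a + b - c - d) / 2.0                # differences between neighboring rows
    hh = (a - b - c + d) / 2.0                # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

image = np.arange(16.0).reshape(4, 4)
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

Because the transform is invertible, the three detail bands carry the high-frequency content (edges and texture) explicitly at half resolution rather than discarding it, which is the property the DWT branch relies on.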
|
278 |
Deep Learning for PET Imaging : From Denoising to Learned Primal-Dual Reconstruction / Djupinlärning i PET-avbildning : Från brusreducering till Learned Primal-Dual bildrekonstruktion
Guazzo, Alessandro January 2020
PET imaging is a key tool in the fight against cancer. One of the main issues of PET imaging is the high level of noise that characterizes the reconstructed image. During this project, we implemented several algorithms with the aim of improving the reconstruction of PET images by exploiting the power of neural networks. First, we developed a simple denoiser that improves the quality of an image that has already been reconstructed with a reconstruction algorithm such as Maximum Likelihood Expectation Maximization. Then we implemented two neural-network-based iterative reconstruction algorithms that reconstruct an image directly from the measured data rearranged into sinograms, thus removing the dependence of the result on the initial reconstruction needed by the denoiser. Finally, we used the most promising of the developed approaches to reconstruct images from data acquired with the KTH MTH microCT - miniPET.
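The Maximum Likelihood Expectation Maximization (MLEM) algorithm named above has a compact multiplicative update, sketched here for a tiny dense system. The system matrix `A`, the measurement vector `y`, and the function name are toy assumptions; a real PET reconstruction would use a projector operating on sinograms rather than an explicit matrix.

```python
import numpy as np

def mlem(A, y, n_iter=50, eps=1e-12):
    """MLEM for y ~ Poisson(A @ x) with a nonnegative system matrix A.

    Update: x <- x / (A^T 1) * A^T (y / (A x)).
    """
    sens = A.T @ np.ones(A.shape[0])          # sensitivity image, A^T 1
    x = np.ones(A.shape[1])                   # strictly positive starting image
    for _ in range(n_iter):
        proj = A @ x                          # forward projection of the current estimate
        ratio = y / np.maximum(proj, eps)     # measured over estimated counts
        x = x / np.maximum(sens, eps) * (A.T @ ratio)
    return x

# Toy problem: 3 detector bins, 2 image pixels, noiseless data.
A = np.array([[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]])
x_true = np.array([2.0, 3.0])
y = A @ x_true
x_hat = mlem(A, y, n_iter=50)
```

The update keeps the image nonnegative and matches the total projected counts to the total measured counts after every iteration. With noisy data, later iterations tend to amplify noise, which is one motivation for the denoising and learned reconstruction approaches described in the abstract.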
|
279 |
Malignant Melanoma Classification with Deep Learning / Klassificering av malignt melanom genom djupinlärning
Kisselgof, Jakob January 2019
Malignant melanoma is the deadliest form of skin cancer. If correctly diagnosed in time, the expected five-year survival rate can increase up to 97 %. Therefore, exploring various methods for early detection can contribute tools that can be used to improve detection of disease and, ultimately, to make sure that help is given in time. The purpose of this work was to investigate the performance and behavior of different convolutional neural network (CNN) architectures and to explore whether presegmenting clinical images would improve the prediction results of a binary classifier system. For the purposes of this paper, the two selected CNNs were Inception v3 and DenseNet201. The networks were pretrained on ImageNet, and transfer learning techniques such as feature extraction and fine-tuning were used to extract the features of the training set. Batch size was varied and five-fold cross-validation was applied during training to find the optimal number of epochs for training. Evaluation was done on the ISIC test set, the PH2 dataset and a combined set of images from Karolinska University Hospital and FirstDerm, where the latter was also cropped to evaluate presegmentation. The achieved results for the ISIC test set were AUCs of 0.66 for Inception v3 and 0.71 for DenseNet201. For the PH2 test set, the AUCs were 0.82 and 0.73. The results for the Karolinska and FirstDerm set were 0.49 and 0.42. Presegmenting the latter test set resulted in AUCs of 0.58 and 0.51. In conclusion, the quality of images could have a big impact on the classification performance. Batch size seems to affect the performance and could thus be an important hyperparameter to tune. Ultimately, the Inception v3 architecture seems to be less affected by different kinds of variability, which is why selecting this architecture for a real-world clinical image application could be more suitable. 
However, the networks performed much worse than state-of-the-art results in previous papers, and the conclusions are based on rather inconclusive results. Therefore, more research has to be done to verify the conclusions. / Malignant melanoma is the deadliest form of skin cancer. If a correct diagnosis is made early enough, the five-year survival rate can reach 97 %. Research into methods for earlier detection can therefore contribute tools that can in turn be used to detect disease and ultimately help ensure that help is provided in time. The aim of this work was to investigate the performance and behavior of different convolutional neural networks (CNNs) and to investigate whether presegmentation of clinical images could improve the results of a binary classification system. The selected convolutional neural network architectures were Inception v3 and DenseNet201. The networks were pretrained on ImageNet, and transfer-learning methods such as feature extraction and fine-tuning were used to extract features from the training set. Batch size was varied and five-fold cross-validation was used to find the optimal number of training epochs. The evaluation was done with images in test sets from ISIC, PH2, and Karolinska and FirstDerm. The images in the latter dataset were cropped to evaluate presegmentation of clinical images. The results achieved for the ISIC test set were AUC values of 0.66 for Inception v3 and 0.71 for DenseNet201. For PH2, the AUC values were 0.82 and 0.73, respectively. The results for the test set with images from Karolinska and FirstDerm were 0.40 and 0.42. Presegmentation of the latter test set gave AUC values of 0.58 and 0.51. In summary, image quality can have a large impact on a network's classification performance. Batch size also appears to affect the results and can therefore be an important hyperparameter to tune. 
Finally, Inception v3 appears to be less sensitive to different types of variability, which makes this architecture a more suitable choice if a real application is to be built for detection in, for example, clinical images. What should be emphasized in this work is that the results were much worse than the best reported in previous reports and that the conclusions are based on relatively unconvincing values. More research is therefore needed to support the conclusions.
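To help read the AUC values reported in this abstract, the following sketch computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are illustrative and are not data from the thesis.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0]             # 1 = melanoma, 0 = benign (toy example)
scores = [0.9, 0.4, 0.5, 0.1]     # hypothetical classifier outputs
auc = roc_auc(labels, scores)     # 3 of the 4 positive/negative pairs are ordered correctly
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking, so values between 0.4 and 0.5 indicate no better than random discrimination on that test set.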
|
280 |
A data-driven approach for the investigation of microstructural effects on the effective piezoelectric responses of additively manufactured triply periodic bi-continuous piezocomposite
Yang, Wenhua 10 December 2021
A two-scale model consisting of a ceramic grain scale and a composite scale is developed to systematically evaluate the effects of microstructure (e.g., residual pores, grain size, texture) and geometry on the piezoelectric responses of polarized triply periodic bi-continuous (TPC) piezocomposites. These TPC piezocomposites were fabricated by a recently developed additive manufacturing (AM) process named suspension-enclosing projection-stereolithography (SEPS) under different process conditions. In the model, the Fourier spectral iterative perturbation method (FSIPM) and the finite element method are adopted for the calculations at the grain and composite scales, respectively. On the grain scale, a DL approach based on a stacked generative adversarial network (StackGAN-v2) is proposed to reconstruct microstructures. The presented modeling approach can reconstruct high-fidelity microstructures of additively manufactured piezoceramics at different resolutions, which are statistically equivalent to the original microstructures, whether experimentally observed or numerically predicted. Design maps for the hydrostatic piezoelectric charge coefficient dh show that the proposed TPC piezocomposites can achieve optimal performance over wide ranges of micro-porosity and geometry parameter u. In addition, the geometry parameter u plays a dominant role in determining the magnitude of the hydrostatic voltage coefficient gh and the hydrostatic figure of merit (HFOM) of all the presented TPC piezocomposites in the vicinity of the starting point of three-dimensional (3D) interconnectivity. Within this range, these properties first increase with increasing micro-porosity volume fraction (VF) and start to decrease once they reach their peak values. 
The presented TPC piezocomposites exhibit superb hydrostatic properties. With the same 20% VF of ceramics and 2% VF of micro-porosity with respect to composites and ceramics, respectively, the face-centered cubic (FCC) TPC demonstrates a 327-fold enhancement of HFOM over that of the piezocomposite with three intersecting ceramic cuboids. The piezoelectric properties of FCC are superior to those of body-centered cubic (BCC) and simple cubic (SC). The calculated piezoelectric charge constants d33 and relative permittivity κ33 were then compared with the data measured from the products fabricated by SEPS under different process conditions. The calculation results at both the grain scale and the composite scale were found to agree well with experimental results.
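For readers unfamiliar with the hydrostatic quantities named in this abstract, their usual textbook definitions are given below. The abstract itself does not state them, so the notation (in particular the use of the vacuum permittivity \varepsilon_0 together with the relative permittivity \kappa_{33}) is an assumption and may differ from the thesis.

```latex
\begin{aligned}
d_h &= d_{33} + d_{31} + d_{32}
      \quad \bigl(= d_{33} + 2\,d_{31} \ \text{when } d_{31} = d_{32}\bigr), \\
g_h &= \frac{d_h}{\varepsilon_0\,\kappa_{33}}, \\
\mathrm{HFOM} &= d_h\, g_h = \frac{d_h^{2}}{\varepsilon_0\,\kappa_{33}} .
\end{aligned}
```

Under these definitions, HFOM grows with the square of d_h and falls as the permittivity rises, so a change that raises d_h while lowering \kappa_{33} improves both g_h and HFOM.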
|