1 |
Residual Capsule Network
Bhamidi, Sree Bala Shruthi 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Convolutional Neural Networks (CNNs) have brought substantial improvements to the field of Machine Learning, but they come with their own set of drawbacks. Capsule Networks address the limitations of CNNs and have shown great improvement by calculating the pose and transformation of the image. Deeper networks are more powerful than shallow networks but, at the same time, more difficult to train. Residual Networks ease training and have shown that good accuracy can be achieved at considerable depth. We present the Residual Capsule Network and the 3-Level Residual Capsule Network, a framework that combines the strengths of Residual Networks and Capsule Networks. The conventional convolutional layer in the Capsule Network is replaced by skip connections, as in Residual Networks, to decrease the complexity of the Baseline Capsule Network and the seven-ensemble Capsule Network. We trained our models on the MNIST and CIFAR-10 datasets and observed a significant decrease in the number of parameters compared to the Baseline models.
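The skip connection that this abstract borrows from Residual Networks can be illustrated with a minimal sketch. It is not the thesis's architecture: the single linear layer standing in for the residual branch, and the names `linear`, `relu`, and `residual_block`, are assumptions chosen for illustration. The block computes relu(F(v) + v), so the input is carried forward unchanged alongside whatever F learns.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_block(v, weights, bias):
    """Return relu(F(v) + v), where F is a single linear layer (an assumption).

    The `+ v` term is the identity skip connection: if F outputs zeros,
    the block simply passes a non-negative input through.
    """
    branch = linear(weights, bias, v)
    return relu([f + x for f, x in zip(branch, v)])


# With an all-zero branch, a non-negative input passes through unchanged.
zero_w = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]
passthrough = residual_block([1.0, 2.0], zero_w, zero_b)
```

Because the identity path is always present, the branch only has to learn a correction to its input, which is the usual explanation for why residual networks are easier to train at depth.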
|
2 |
Contextual Recurrent Level Set Networks and Recurrent Residual Networks for Semantic Labeling
Le, Ngan Thi Hoang 01 May 2018
Semantic labeling is becoming increasingly popular among researchers in computer vision and machine learning. Many applications, such as autonomous driving, tracking, indoor navigation, augmented reality systems, semantic searching, and medical imaging, are on the rise and require more accurate and efficient segmentation mechanisms. In recent years, deep learning approaches based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have emerged as the dominant paradigm for solving many problems in computer vision and machine learning. The main focus of this thesis is to investigate robust approaches that can tackle challenging semantic labeling tasks, including semantic instance segmentation and scene understanding. In the first approach, we convert the classic variational Level Set method into a learnable deep framework by proposing a novel definition of contour evolution named Recurrent Level Set (RLS). The proposed RLS employs Gated Recurrent Units to solve the energy minimization of a variational Level Set functional. The curve deformation process in RLS is formulated as a hidden state evolution procedure and is updated by minimizing an energy functional composed of fitting forces and contour length. We show that, by sharing the convolutional features in a fully end-to-end trainable framework, RLS can be extended to Contextual Recurrent Level Set (CRLS) Networks to address the problem of semantic segmentation in the wild. The experimental results show that our proposed RLS improves both computational time and segmentation accuracy over the classic variational Level Set-based methods, whereas the fully end-to-end system CRLS achieves competitive performance compared to state-of-the-art semantic segmentation approaches on the PASCAL VOC 2012 and MS COCO 2014 databases.
The second proposed approach, Contextual Recurrent Residual Networks (CRRN), inherits the merits of both sequence learning and residual learning in order to simultaneously model long-range contextual information and learn a powerful visual representation within a single deep network. Our proposed CRRN deep network consists of three parts corresponding to sequential input data, sequential output data, and hidden state, as in a recurrent network. Each unit in the hidden state is designed as a combination of two components: a context-based component via sequence learning and a visual-based component via residual learning. That is, each hidden unit in our proposed CRRN simultaneously (1) learns long-range contextual dependencies via a context-based component, in which the relationship between the current unit and the previous units is modeled as sequential information under an undirected cyclic graph (UCG), and (2) provides a powerful encoded visual representation via a residual component, which contains blocks of convolution and/or batch normalization layers equipped with an identity skip connection. Furthermore, unlike previous scene labeling approaches [1, 2, 3], our method not only exploits long-range context and visual representation but is also formulated as a fully end-to-end trainable system that effectively leads to the optimal model. In contrast to other existing deep learning networks, which are based on pretrained models, our fully end-to-end CRRN is trained completely from scratch. The experiments are conducted on four challenging scene labeling datasets, i.e., SiftFlow, CamVid, Stanford Background, and SUN, and compared against various state-of-the-art scene labeling methods.
|
3 |
Wide Activated Separate 3D Convolution for Video Super-Resolution
Yu, Xiafei 18 December 2019
Video super-resolution (VSR) aims to recover a realistic high-resolution (HR) frame from its corresponding center low-resolution (LR) frame and several neighbouring supporting frames. The neighbouring supporting LR frames can provide extra information to help recover the HR frame. However, these frames are not aligned with the center frame due to the motion of objects. With the rapid development of neural networks, many video super-resolution methods based on deep learning have recently been proposed. Most of these methods use motion estimation and compensation models as preprocessing to handle the spatio-temporal alignment problem. The accuracy of these motion estimation models is therefore critical for predicting the high-resolution frames. Inaccurate motion compensation leads to artifacts and blur, which in turn damage the recovery of high-resolution frames. We propose an effective wide activated separate three-dimensional (3D) Convolutional Neural Network (CNN) for video super-resolution to overcome the drawbacks of using motion compensation models. Separate 3D convolution factorizes the 3D convolution into convolutions in the spatial and temporal domains, which benefits the optimization of the spatial and temporal convolution components. Our method can therefore capture the temporal and spatial information of the input frames simultaneously without an additional motion estimation and compensation model. Moreover, the experimental results demonstrate the effectiveness of the proposed wide activated separate 3D CNN.
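The factorization of a 3D convolution into a spatial and a temporal convolution, as described above, can be made concrete by counting weights. The sketch below is an illustration only: the kernel shapes (1 x k x k followed by k x 1 x 1), the intermediate channel count `c_mid`, and the omission of bias terms are assumptions, not details taken from the thesis.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a 3D convolution with a kt x kh x kw kernel (no bias)."""
    return c_in * c_out * kt * kh * kw


def factorized_conv3d_params(c_in, c_mid, c_out, kt, kh, kw):
    """Weight count when the kernel is split into two convolutions.

    Spatial step:  1 x kh x kw kernel, c_in  -> c_mid channels.
    Temporal step: kt x 1 x 1 kernel,  c_mid -> c_out channels.
    """
    spatial = conv3d_params(c_in, c_mid, 1, kh, kw)
    temporal = conv3d_params(c_mid, c_out, kt, 1, 1)
    return spatial + temporal


# Hypothetical layer: 64 channels throughout, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
split = factorized_conv3d_params(64, 64, 64, 3, 3, 3)
```

For this hypothetical layer the full kernel needs 110592 weights and the factorized pair needs 49152, so the split also leaves room to widen the layers at a similar parameter budget.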
|
4 |
TOWARDS AN UNDERSTANDING OF RESIDUAL NETWORKS USING NEURAL TANGENT HIERARCHY
Yuqing Li (10223885) 06 May 2021
Deep learning has become an important toolkit for data science and artificial intelligence. In contrast to its practical success across a wide range of fields, the theoretical understanding of the principles behind that success remains controversial. Optimization, as an important component of theoretical machine learning, has attracted much attention. The optimization problems induced by deep learning are often non-convex and non-smooth, which makes locating the global optima challenging. However, in practice, global convergence of first-order methods like gradient descent can be guaranteed for deep neural networks. In particular, gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the problem. Another mysterious phenomenon is the compelling performance of the Deep Residual Network (ResNet). Not only does training ResNet require weaker conditions, but its residual connections also enable first-order methods to train neural networks with an order of magnitude more layers. The advantages arising from the use of residual connections remain to be discovered.
In this thesis, we demystify these two phenomena in turn. Firstly, we contribute to a further understanding of gradient descent. The core of our analysis is the neural tangent hierarchy (NTH), which captures the gradient descent dynamics of deep neural networks. A recent work introduced the Neural Tangent Kernel (NTK) and proved that the limiting NTK describes the asymptotic behavior of neural networks trained by gradient descent in the infinite-width limit. The NTH improves on the NTK in two ways: (i) it can directly study the time variation of the NTK for neural networks; (ii) it extends the result to non-asymptotic settings. Moreover, by applying the NTH to ResNet with a smooth and Lipschitz activation function, we reduce the requirement on the layer width m with respect to the number of training samples n from quartic to cubic, obtaining a state-of-the-art result. Secondly, we extend our scope of analysis to the structural properties of deep neural networks. By making fair and consistent comparisons between the fully-connected network and ResNet, we strongly suggest that the particular skip-connection architecture possessed by ResNet is the main reason for its triumph over the fully-connected network.
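For readers unfamiliar with the kernel mentioned in this abstract, the standard definition can be written out as follows. The squared loss, its 1/(2n) normalization, and the continuous-time (gradient flow) form are assumptions made for illustration; the thesis may use a different scaling or parametrization.

```latex
% Assumed setting: network output f(x;\theta), training data (x_i, y_i), i = 1..n
L(\theta) = \frac{1}{2n} \sum_{i=1}^{n} \bigl( f(x_i;\theta) - y_i \bigr)^2

% Gradient flow on the parameters
\frac{d\theta_t}{dt} = -\nabla_\theta L(\theta_t)

% Neural tangent kernel at time t
K_t(x, x') = \bigl\langle \nabla_\theta f(x;\theta_t),\, \nabla_\theta f(x';\theta_t) \bigr\rangle

% Resulting dynamics of the outputs on the training set (chain rule)
\frac{d}{dt} f(x_i;\theta_t) = -\frac{1}{n} \sum_{j=1}^{n} K_t(x_i, x_j)\, \bigl( f(x_j;\theta_t) - y_j \bigr)
```

In the infinite-width analysis the kernel K_t stays at its limiting value throughout training; the abstract's point (i) is that the NTH instead tracks how K_t changes with t.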
|
5 |
A comparative evaluation of 3D and spatio-temporal deep learning techniques for crime classification and prediction
Matereke, Tawanda Lloyd January 2021
Magister Scientiae - MSc / This research is a comparative evaluation of 3D and spatio-temporal deep learning methods for crime classification and prediction using the Chicago crime dataset, which has 7.29 million records collected from 2001 to 2020. In this study, crime classification experiments are carried out using two 3D deep learning algorithms, i.e., the 3D Convolutional Neural Network and the 3D Residual Network. The crime classification models are evaluated using accuracy, F1 score, Area Under the Receiver Operating Characteristic curve (AUROC), and Area Under the Precision-Recall Curve (AUCPR). The effect of spatial grid resolution on the performance of the classification models is also evaluated during training, validation, and testing.
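Two of the evaluation metrics named above, accuracy and F1 score, can be sketched in a few lines. This is a binary-label illustration only: the abstract does not say how classes were encoded or averaged, and the function names and toy labels below are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, not data from the thesis.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)
f1 = f1_binary(y_true, y_pred)
```

On the toy labels there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score equals them; accuracy is 4 of 6.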
|
6 |
LSTM Neural Networks for Detection and Assessment of Back Pain Risk in Manual Lifting
Thomas, Brennan January 2021
No description available.
|
7 |
Precipitation Nowcasting using Residual Networks
Vega Ezpeleta, Emilio January 2018
The aim of this paper is to investigate whether rainfall prediction (nowcasting) can be performed successfully using a deep learning approach. The inputs to the networks are different spatiotemporal variables, including forecasts from a numerical weather prediction (NWP) model. The results indicate that these networks have some predictive power and could be used in real applications. Another interesting empirical finding concerns transfer learning from an unrelated domain as an alternative to random initialization: using pretrained parameters resulted in better convergence and overall performance than initializing the parameters randomly.
|
8 |
Residual Capsule Network
Sree Bala Shrut Bhamidi (6990443) 13 August 2019
Convolutional Neural Networks (CNNs) have brought substantial improvements to the field of Machine Learning, but they come with their own set of drawbacks. Capsule Networks address the limitations of CNNs and have shown great improvement by calculating the pose and transformation of the image. Deeper networks are more powerful than shallow networks but, at the same time, more difficult to train. Residual Networks ease training and have shown that good accuracy can be achieved at considerable depth. We present the Residual Capsule Network and the 3-Level Residual Capsule Network, a framework that combines the strengths of Residual Networks and Capsule Networks. The conventional convolutional layer in the Capsule Network is replaced by skip connections, as in Residual Networks, to decrease the complexity of the Baseline Capsule Network and the seven-ensemble Capsule Network. We trained our models on the MNIST and CIFAR-10 datasets and observed a significant decrease in the number of parameters compared to the Baseline models.
|
9 |
Verifikace osob podle hlasu bez extrakce příznaků / Speaker Verification without Feature Extraction
Lukáč, Peter January 2021
Speaker verification is a field that is continually being modernized and improved, and that strives to meet the demands placed on it in application areas such as authorization systems, forensic analysis, etc. Improvements are driven by advances in deep learning, the creation of new training and test datasets, and various speaker verification competitions and workshops. In this thesis we examine models for speaker verification without feature extraction. Using raw audio as model input simplifies input processing, which lowers computational and memory requirements and reduces the number of hyperparameters needed to create features from recordings, hyperparameters that influence the results. Currently, models without feature extraction do not match the results of models with feature extraction. We experiment with modern techniques on baseline models and try to improve their accuracy. These experiments considerably improved the results of the baseline models, but we still did not reach the results of the improved model with feature extraction. The improvement is nevertheless sufficient for us to create a fusion with that model. Finally, we discuss the results achieved and propose improvements based on them.
|
10 |
[en] AN EVALUATION OF DEEP LEARNING TECHNIQUES FOR FOREST PARAMETERS ESTIMATION IN THE BRAZILIAN LEGAL AMAZON FROM MULTI-SOURCE REMOTE SENSING IMAGERY / [pt] AVALIAÇÃO DE MODELOS DE DEEP LEARNING PARA ESTIMAÇÃO DE PARÂMETROS DE FLORESTA NA AMAZÔNIA BRASILEIRA LEGAL A PARTIR DE IMAGENS DE SENSORIAMENTO REMOTO
PAOLA EDITH AYMA QUIRITA 25 March 2025
In recent years, estimating forest parameters such as tree height (CH) and aboveground biomass (AGB) has gained importance due to their essential role in understanding the global carbon cycle, mitigating climate change, and preventing biodiversity loss. Accurate inference of these parameters is crucial because they are key indicators of forest health and carbon storage capacity. The Brazilian Amazon, a vital tropical forest, plays a crucial role in absorbing as much carbon as is released through deforestation and degradation. Understanding and monitoring CH and AGB enable better management and conservation strategies and promote sustainable practices. Traditionally, these forest parameters have been estimated through ground-based methods, such as forest inventory plots, which involve physically measuring trees. While these methods are highly accurate, they are labor-intensive and often impractical for large-scale assessments due to the vast and inaccessible nature of forests. The application of Machine Learning (ML) and Deep Learning (DL) techniques, by contrast, offers significant advantages over traditional methods, providing rapid and scalable solutions for estimating forest parameters across extensive areas. Moreover, these techniques can integrate data from various sources, enhancing the robustness of the estimates. While many studies have used forest inventory plots, remote sensing (RS), and ML techniques, DL techniques remain underexplored in studies within the Brazilian Amazon. This study aims to evaluate DL techniques for estimating CH and AGB in dense tropical forests using various RS imagery, including Sentinel-1, ALOS-2/PALSAR-2, Sentinel-2, and GEDI. Three DL models were tested for CH estimation; the best model achieved an R² of 0.751, an MAE of 4.068 meters, and an RMSE of 5.737 meters. Furthermore, various ML techniques were evaluated for AGB estimation, resulting in an R² of 0.648, an MAE of 48.842 Mg·ha⁻¹, and an RMSE of 70.745 Mg·ha⁻¹.
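The three regression metrics reported above (R², MAE, and RMSE) follow standard definitions, which the sketch below spells out. The function names and the toy values are hypothetical and are not data from the thesis; the code only shows how such figures are typically computed from reference and predicted values.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target variable."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target variable."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy heights in meters, not measurements from the study.
reference = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = (r_squared(reference, predicted),
          mae(reference, predicted),
          rmse(reference, predicted))
```

MAE and RMSE share the units of the target (meters for CH, Mg·ha⁻¹ for AGB), whereas R² is unitless, which is why the abstract reports all three together.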
|