81 |
Strawberry Detection Under Various Harvestation Stages. Fitter, Yavisht, 01 March 2019.
This paper analyzes three techniques for detecting strawberries at various stages of their growth cycle. Histogram of Oriented Gradients (HOG), Local Binary Patterns (LBP), and Convolutional Neural Networks (CNN) were implemented on a limited custom-built dataset. The methodologies were compared in terms of accuracy and computational efficiency, where efficiency is defined by the image resolution required at test time, since smaller images are much quicker to process than larger ones. The CNN-based implementation obtained the best results, with 88% accuracy at the highest efficiency (600x800 resolution). LBP generated moderate results, with 74% detection accuracy at an inefficient resolution (5000x4000). Finally, HOG's results were inconclusive, as it performed poorly early on and generated too many misclassifications.
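As an illustration of the two handcrafted-feature baselines, the sketch below extracts HOG and uniform-LBP descriptors from grayscale patches with scikit-image. The parameter values and the linear SVM classifier are assumptions for demonstration, not the thesis's exact pipeline.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.svm import LinearSVC

def hog_features(gray_patch):
    """HOG descriptor for one grayscale patch (assumed parameter values)."""
    return hog(gray_patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def lbp_features(gray_patch, points=8, radius=1):
    """Histogram of uniform LBP codes for one grayscale patch."""
    codes = local_binary_pattern(gray_patch, points, radius, method="uniform")
    # Uniform LBP with P points yields P + 2 distinct code values.
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2), density=True)
    return hist

# Hypothetical usage: X_train is a list of labelled strawberry/background patches
# from the custom dataset described above, y_train the corresponding labels.
# clf = LinearSVC().fit([hog_features(p) for p in X_train], y_train)
```

A sliding-window detector would then apply the trained classifier to patches of the test image, which is why the required test-time resolution drives the computational-efficiency comparison above.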
|
82 |
Deformable 3D Brain MRI Registration with Deep Learning / Deformerbar 3D MRI-registrering med djupinlärning. Joos, Louis, January 2019.
Traditional deformable registration methods have achieved impressive performance but are computationally time-consuming, since they must optimize an objective function for each new pair of images. Recently, learning-based approaches have been proposed to enable fast registration by learning to estimate the spatial transformation parameters directly from the input images. Here we present a method for fast pairwise registration of 3D brain MR images. We model the deformation function with B-splines and learn the optimal control points using a U-Net-like CNN architecture. An inverse-consistency loss is used to encourage the deformation to be diffeomorphic. The proposed algorithm does not require supervised information such as segmentation labels, but such labels can be used to help the registration process. We also implemented several strategies to account for the multi-resolution nature of the problem. The method was evaluated on the MICCAI 2012 brain MRI dataset, in terms of both similarity and invertibility of the computed transformation.
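A minimal PyTorch sketch of the two building blocks described above: a coarse grid of learned control-point displacements is upsampled to a dense deformation field (trilinear upsampling is used here as a simple stand-in for cubic B-spline interpolation), and the moving volume is warped with it. The tensor shapes and voxel-unit convention are assumptions; the inverse-consistency loss would additionally penalize the composition of the forward and backward fields for deviating from the identity.

```python
import torch
import torch.nn.functional as F

def dense_field_from_control_points(cp_disp, out_shape):
    """Upsample control-point displacements (N, 3, d, h, w) to a dense field
    (N, 3, D, H, W). Trilinear upsampling stands in for cubic B-splines here."""
    return F.interpolate(cp_disp, size=out_shape, mode='trilinear', align_corners=True)

def warp(moving, disp):
    """Warp a moving volume (N, 1, D, H, W) with a displacement field given in
    voxel units, channel order (dz, dy, dx), shape (N, 3, D, H, W)."""
    N, _, D, H, W = moving.shape
    dz, dy, dx = torch.meshgrid(
        torch.arange(D), torch.arange(H), torch.arange(W), indexing='ij')
    ident = torch.stack((dz, dy, dx)).float().to(moving.device)   # (3, D, H, W)
    coords = ident.unsqueeze(0) + disp                            # sampling locations
    # Normalize to [-1, 1] and reorder to (x, y, z) as grid_sample expects.
    norm = torch.stack((
        2 * coords[:, 2] / (W - 1) - 1,
        2 * coords[:, 1] / (H - 1) - 1,
        2 * coords[:, 0] / (D - 1) - 1), dim=-1)                  # (N, D, H, W, 3)
    return F.grid_sample(moving, norm, align_corners=True)
```

In a CNN-based setup like the one described, the U-Net would output `cp_disp` for each image pair, and a similarity loss between `warp(moving, field)` and the fixed image would drive training.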
|
83 |
SQUEEZE AND EXCITE RESIDUAL CAPSULE NETWORK FOR EMBEDDED EDGE DEVICES. Sami Naqvi, 08 September 2022.
During recent years, the field of computer vision has evolved rapidly. Convolutional Neural Networks (CNNs) have become the default choice for implementing computer vision tasks. Their popularity rests on how successfully CNNs have performed well-known computer vision tasks such as image annotation and instance segmentation with promising outcomes. However, CNNs have their caveats and need further research to turn them into reliable machine learning algorithms. The disadvantages of CNNs become more evident when one considers how they break down an input image: they group blobs of pixels to identify objects. Such a technique leaves CNNs unable to decompose input images into sub-parts that could distinguish the orientation and transformation of objects and their parts. The functions in a CNN are competent at learning only the shift-invariant features of the object in an image. These limitations give researchers and developers a reason to further enhance algorithms for computer vision.

The opportunity to improve has been explored by several distinct approaches, each tackling a unique set of issues in the convolutional neural network's architecture. The Capsule Network (CapsNet) brings an innovative approach to resolving issues pertaining to affine transformations by sharing transformation matrices between the different levels of capsules, while the Residual Network (ResNet) introduced skip connections, which allow deeper networks to be more powerful and mitigate the vanishing gradient problem.

Motivated by the fusion of these advantageous ideas from CapsNet and ResNet with the Squeeze and Excite (SE) block from the Squeeze and Excite Network, this research work presents the SE-Residual Capsule Network (SE-RCN), an efficient neural network model. The proposed model replaces the traditional convolutional layers of CapsNet with skip connections and SE blocks to lower the complexity of the CapsNet. The performance of the model is demonstrated on the well-known MNIST and CIFAR-10 datasets, and a substantial reduction in the number of training parameters is observed in comparison to similar neural networks. The proposed SE-RCN requires 6.37 million parameters and reaches 99.71% accuracy on the MNIST dataset, and 10.55 million parameters with 83.86% accuracy on CIFAR-10.
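For reference, a minimal PyTorch sketch of the Squeeze and Excite block mentioned above; the reduction ratio of 16 is the value commonly used in the original SE paper and is an assumption here, not necessarily what SE-RCN uses.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-average-pool the feature map, pass the
    channel descriptor through a small bottleneck MLP, and rescale channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid())

    def forward(self, x):
        n, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))      # squeeze: (n, c) channel descriptor
        return x * w.view(n, c, 1, 1)        # excite: channel-wise rescaling
```

In a residual arrangement of the kind the abstract describes, the block's output would typically be added back to the layer input, e.g. `out = x + se(conv(x))`, before feeding the capsule layers.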
|
84 |
Secure and reliable deep learning in signal processing. Liu, Jinshan, 09 June 2021.
In conventional signal processing approaches, researchers need to manually extract features from raw data that can better describe the underlying problem. Such a process requires strong domain knowledge about the given problems. In contrast, deep learning-based signal processing algorithms can discover features and patterns that would not be apparent to humans, given a sufficient amount of training data. In the past decade, deep learning has proved to be efficient and effective at delivering high-quality results.
Deep learning has demonstrated its great advantages in image processing and text mining. One of the most promising applications of deep learning-based signal processing techniques is autonomous driving. Today, many companies are developing and testing autonomous vehicles, and high-level autonomous vehicles are expected to be commercialized in the near future. Deep learning has also demonstrated great potential in wireless communications applications, where researchers have addressed some of the most challenging problems, such as transmitter classification and modulation recognition.
Despite these advantages, a wide range of security and reliability issues exist when applying deep learning models to real-world applications. First, deep learning models may not generate reliable results for testing data if the training data size is insufficient. Since generating training data is time-consuming and resource-intensive, it is important to understand the relationship between model reliability and the size of the training data. Second, deep learning models can generate highly unreliable results if the testing data are significantly different from the training data, which we refer to as ``out-of-distribution (OOD)'' data. Failing to detect OOD testing data may expose serious security risks. Third, deep learning algorithms can be easily fooled when the input data are falsified. Such vulnerabilities may cause severe risks in safety-critical applications such as autonomous driving.
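As a generic illustration of feature-space OOD detection (not the FOOD model itself), the sketch below fits class-conditional Gaussians with a shared covariance to in-distribution feature vectors and scores a test sample by its minimum Mahalanobis distance to any class mean; the detection threshold would be chosen on held-out in-distribution data.

```python
import numpy as np

def fit_gaussians(features, labels):
    """Class-conditional means plus a shared, regularized covariance in feature space."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return means, np.linalg.inv(cov)

def ood_score(x_feat, means, cov_inv):
    """Minimum Mahalanobis distance to any class mean; a large score suggests OOD."""
    dists = [float((x_feat - m) @ cov_inv @ (x_feat - m)) for m in means.values()]
    return min(dists)
```

Here `features` would be penultimate-layer activations of a trained classifier on in-distribution data; the same activations at test time are scored, and samples above the chosen threshold are flagged before classification is trusted.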
In this dissertation, we focus on the security and reliability issues of deep learning models in the following three aspects. (1) We systematically study how model performance changes as more training data are provided in wireless communications applications. (2) We discuss how OOD data can impact the performance of deep learning-based classification models in wireless communications applications, and we propose FOOD (Feature representation for OOD detection), a unified model that can effectively detect OOD testing data while simultaneously performing classification of regular testing data. (3) We focus on the security issues of applying deep learning algorithms to autonomous driving. We discuss the impact of Perception Error Attacks (PEAs) on LIDAR and camera data and propose a countermeasure called LIFE (LIDAR and Image data Fusion for detecting perception Errors). / Doctor of Philosophy / Deep learning has given computers and mobile devices extraordinary power to solve challenging signal processing problems. For example, current deep learning technologies are able to improve the quality of machine translation significantly, recognize speech as accurately as human beings, and even outperform human beings in face recognition.
Although deep learning has demonstrated great advantages in signal processing, it can be insecure and unreliable if the model is not trained properly or is tested under adversarial scenarios. In this dissertation, we study the following three security and reliability issues in deep learning-based signal processing methods. First, we provide insights into how deep learning model reliability changes as the size of the training data increases. Since generating training data requires a tremendous amount of labor and financial resources, our research can help researchers and product developers balance the tradeoff between model performance and training data size. Second, we propose a novel model to detect abnormal testing data that are significantly different from the training data. In deep learning, there is no performance guarantee when the testing data differ significantly from the training data, and failing to detect such data may cause severe security risks. Finally, we design a system to detect sensor attacks targeting autonomous vehicles. Deep learning can be easily fooled when the input sensor data are falsified. Security and safety can be enhanced significantly if an autonomous driving system can identify falsified sensor data before making driving decisions.
|
85 |
Risk-Aware Planning by Extracting Uncertainty from Deep Learning-Based Perception. Toubeh, Maymoonah I., 07 December 2018.
The integration of deep learning models and classical techniques in robotics is constantly creating solutions to problems once thought out of reach. For most models that work in practice, issues arise from the gap between experimentation and reality, creating a need for strategies that assess the risk involved when different models are applied in real-world, safety-critical situations. This work proposes the use of Bayesian approximations of uncertainty from deep learning in a robot planner, showing that this produces more cautious actions in safety-critical scenarios. The case study investigated is motivated by a setup where an aerial robot acts as a "scout" for a ground robot when the area below is unknown or dangerous, with applications in space exploration, the military, or search-and-rescue. Images taken from the aerial view are used to provide a less obstructed map to guide the navigation of the robot on the ground. Experiments are conducted using deep learning-based semantic image segmentation, followed by a path planner based on the resulting cost map, to provide an empirical analysis of the proposed method. The method is analyzed to assess the impact of variations in the uncertainty extraction, as well as the absence of an uncertainty metric, on the overall system, using a defined factor that measures surprise to the planner. The analysis is performed on multiple datasets, showing a similar trend of lower surprise when uncertainty information is incorporated in the planning, given that threshold values of the hyperparameters in the uncertainty extraction have been met. / Master of Science / Deep learning (DL) refers to the use of large hierarchical structures, often called neural networks, to approximate semantic information from input data of various forms. DL has shown superior performance at many tasks, such as several forms of image understanding, often referred to as computer vision problems. Deep learning techniques are trained using large amounts of data to map input data to output interpretations. The method should then perform correct input-output mappings on new data, different from the data it was trained on.
Robots often carry various sensors from which it is possible to make interpretations about the environment. Inputs from a sensor can be high-dimensional, such as pixels given by a camera, and processing these inputs can be quite tedious and inefficient for a human interpreter. Deep learning has recently been adopted by roboticists as a means of automatically interpreting and representing sensor inputs, like images. The issue that arises with the traditional use of deep learning is twofold: it forces an interpretation of the inputs even when an interpretation is not applicable, and it does not provide a measure of certainty with its outputs. Many techniques have been developed to address this issue with deep learning. These techniques aim to produce a measure of uncertainty associated with DL outputs, such that even when an incorrect or inapplicable output is produced, it is accompanied by a high level of uncertainty.
To explore the efficacy and applicability of these uncertainty extraction techniques, this thesis looks at their use within part of a robot planning system. Specifically, the input to the robot planner is an overhead image taken by an unmanned aerial vehicle (UAV), and the output is a path from a set start position to a goal position to be taken by an unmanned ground vehicle (UGV) below. The image is passed through a deep learning portion of the system that performs semantic segmentation, mapping each pixel of the image to a meaningful class. Based on the segmentation, each pixel is given a cost proportionate to the perceived level of safety associated with that class. A cost map is thus formed over the entire image, from which traditional robotics techniques are used to plan a path from start to goal.
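A hedged PyTorch sketch of one common way to obtain the uncertainty described above, Monte Carlo dropout, and to fold it into the planner's cost map. The entropy-based score, the risk weight, and the class-cost lookup are illustrative assumptions rather than the thesis's exact formulation.

```python
import torch

def mc_dropout_segmentation(model, image, passes=20):
    """Monte Carlo dropout: keep dropout active at test time, average softmax
    outputs over several stochastic forward passes, and use the predictive
    entropy as a per-pixel uncertainty estimate."""
    model.train()  # keeps dropout stochastic (note: also affects batch-norm layers)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(image), dim=1) for _ in range(passes)])
    mean_probs = probs.mean(dim=0)                                   # (N, C, H, W)
    entropy = -(mean_probs * (mean_probs + 1e-12).log()).sum(dim=1)  # (N, H, W)
    return mean_probs, entropy

def risk_aware_cost_map(mean_probs, entropy, class_costs, risk_weight=2.0):
    """Cost per pixel = cost of the most likely class, inflated where uncertain.
    class_costs is a length-C tensor of per-class traversal costs (assumed)."""
    labels = mean_probs.argmax(dim=1)
    base_cost = class_costs[labels]
    return base_cost + risk_weight * entropy
```

A standard grid planner (e.g. A*) run on this cost map would then prefer detours around uncertain regions, which is the behavior compared against the risk-neutral baseline below.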
A comparison is performed between the risk-neutral case, which uses the conventional DL method, and the risk-aware case, which uses uncertainty information accompanying the modified DL technique. The overall effects on the robot system are assessed by observing a metric called the surprise factor, where a high surprise factor signifies a poor prediction of the actual cost associated with a path. The risk-neutral case is shown to have a higher surprise factor than the proposed risk-aware setup, both on average and in safety-critical case studies.
|
86 |
Deep learning based diatom-inspired metamaterial design. Shih, Ting-An, 16 January 2023.
Diatom algae, abundantly found in the ocean, have hierarchical micro- and nanopores that have inspired many metamaterial designs, including dielectric metasurfaces. The conventional approach in the metamaterial design process is to generate the corresponding optical spectrum using physics-based simulation software. Although this approach provides high accuracy, it is time-consuming and has other constraints. Given the design parameters and the structure of the material, the optical response can be obtained easily; however, this approach cannot handle the inverse problem as simply as the forward problem. In this study, a deep learning model capable of solving both the forward and the inverse problem of a diatom-inspired metamaterial design was developed and verified experimentally. This method serves as an alternative to the traditional metamaterial design process that greatly saves time and also offers functionality that simulation does not provide. To investigate the feasibility of this method, different input training datasets were examined, and several strategies were taken to improve the model performance. Despite success in some cases, further effort is still needed to apply the technique more broadly.
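A minimal sketch of one common way to couple a forward and an inverse model, the tandem scheme: the inverse network is trained through a frozen, pretrained forward network so that its proposed design reproduces the requested spectrum. The layer sizes, the four design parameters, and the 200-point spectrum are assumptions for illustration; the thesis's actual architecture may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(sizes):
    """Simple fully connected stack with ReLU between hidden layers."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

forward_net = mlp([4, 128, 128, 200])   # design parameters -> optical spectrum
inverse_net = mlp([200, 128, 128, 4])   # target spectrum -> design parameters

for p in forward_net.parameters():      # forward model assumed pretrained and frozen
    p.requires_grad_(False)

def tandem_loss(target_spectrum):
    """Train the inverse net so the design it proposes reproduces the target spectrum."""
    design = inverse_net(target_spectrum)
    predicted = forward_net(design)
    return F.mse_loss(predicted, target_spectrum)
```

The advantage of this arrangement is that the inverse network is never penalized for proposing a different design than the training example, only for failing to match the requested spectrum, which sidesteps the one-to-many ambiguity of the inverse problem.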
|
87 |
Convolutional neural networks using cardiac magnetic resonance for early diagnosis and risk stratification of cardiac amyloidosis. Cockrum, Joshua W., January 2022.
No description available.
|
88 |
Identifying streamflow changes in western North America from 1979 to 2021 using Deep Learning approaches. Tang, Weigang, 11 1900.
Streamflow in Western North America (WNA) has been experiencing pronounced changes in volume and timing over the past century, primarily driven by natural climate variability and human-induced climate change. This thesis advances on previous work by revealing the most recent streamflow changes in WNA using a comprehensive suite of classical hydrometric methods along with novel Deep Learning (DL) based approaches for change detection and classification. More than 500 natural streams across western Canada and the United States were included in the analysis. Trend analyses based on the Mann-Kendall test were conducted on a wide selection of classic hydrometric indicators representing varying aspects of streamflow over the 43 years from 1979 to 2021. A general geographical divide at approximately 46°N latitude indicates that total streamflow is increasing to the north while declining to the south. Declining late-summer flows (July–September) were also widespread across the WNA domain, coinciding with an overall reduction in precipitation. Some changing patterns are region-specific, including: 1) increased winter low flows at high latitudes; 2) earlier spring freshet in the Rocky Mountains; 3) increased autumn flows in the coastal Pacific Northwest; and 4) dramatic drying in the southwestern United States. In addition to classic hydrometrics, trend analysis was performed on Latent Features (LFs), which were extracted by a Variational AutoEncoder (VAE) from raw streamflow data and can be considered "machine-learned hydrometrics". Some LFs with direct hydrological implications were closely associated with classical hydrometric indicators such as flow quantity, seasonal distribution, timing and magnitude of freshet, and the snow-to-rain transition. The changing patterns of streamflow revealed by LFs agree directly with the hydrometric trends. By reconstructing hydrographs from selected LFs, the VAE also provides a mechanism to project changes in future streamflow patterns. Furthermore, a parametric t-SNE method based on DL technology was developed to visualize similarity among a large number of hydrographs on a 2-D map. This novel method allows fast grouping of hydrologically similar rivers based on their flow regime type and provides new opportunities for streamflow classification and regionalization. / Thesis / Doctor of Philosophy (PhD)
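For reference, a minimal NumPy/SciPy sketch of the Mann-Kendall trend test applied to an annual hydrometric series; this basic form omits the tie and serial-correlation corrections that a full hydrological analysis would typically include.

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(series):
    """Two-sided Mann-Kendall trend test (no tie correction): returns the
    S statistic, the normal-approximation Z score, and the p-value."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - norm.cdf(abs(z)))
    return s, z, p
```

Applied to each station's 43-year record of a given hydrometric indicator (or of a VAE latent feature), a positive Z with small p indicates an increasing trend and a negative Z a decreasing one.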
|
89 |
Yield Prediction Using Spatial and Temporal Deep Learning Algorithms and Data Fusion. Bisht, Bhavesh, 24 November 2023.
The world’s population is expected to grow to 9.6 billion by 2050. This rapid growth imposes a significant challenge on food security, making the development of efficient crop production a growing concern. Traditional methods of analyzing soil and crop yield rely on manual field surveys and the use of expensive instruments. This process is not only time-consuming but also requires a team of specialists, making this method of prediction expensive. Prediction of yield is an integral part of smart farming, as it enables farmers to make timely, informed decisions and maximize productivity while minimizing waste. Traditional statistical approaches fall short in yield prediction due to the multitude of diverse variables that influence crop production; additionally, the interactions between these variables are non-linear, which these methods fail to capture. Recent machine learning and data-driven models are better suited to handling the complexity and variability of crop yield prediction.
Maize, also known as corn, is a staple crop in many countries and is used in a variety of food products, including bread, cereal, and animal feed. In 2021-2022, total corn production was around 1.2 billion tonnes, surpassing that of wheat or rice and making it an essential element of food production. With the advent of remote sensing, unmanned aerial vehicles (UAVs) are widely used to capture high-quality field images, making it possible to capture minute details for better analysis of the crops. By combining spatial features, such as topography and soil type, with crop growth information, it is possible to develop a robust and accurate system for predicting crop yield. Convolutional Neural Networks (CNNs) are a type of deep neural network that has shown remarkable success in computer vision tasks, achieving state-of-the-art performance. Their ability to automatically extract features and patterns makes them highly effective in analyzing complex and high-dimensional datasets, such as drone imagery. In this research, we aim to build an effective crop yield predictor using data fusion and deep learning. We propose several deep CNN architectures that can accurately predict corn yield before the end of the harvesting season, which can aid farmers by providing valuable information about potential harvest outcomes and enabling informed decisions regarding resource allocation. UAVs equipped with RGB (Red Green Blue) and multi-spectral cameras were scheduled to capture high-resolution images over the entire 2021 growth period of three fields located in Ottawa, Ontario, where primarily corn was grown. The ground yield data were acquired at the time of harvesting using a yield monitoring device mounted on the harvester. Several data processing techniques were employed to remove erroneous measurements, the processed data were fed to different CNN architectures, and several analyses were done on the models to highlight the techniques and methods that lead to the most optimal performance. The final best-performing model was a 3-dimensional CNN that predicts yield from images of the early (June) and mid (July) growing stages, with a mean absolute percentage error of 15.18% and a root mean squared error of 17.63 bushels per acre. The model trained on data from Field 1 demonstrated an average correlation coefficient of 0.57 between the true and predicted yield values from Field 2 and Field 3. This research provides a direction for developing an end-to-end yield prediction model. Additionally, by leveraging the results from the experiments presented in this research, image acquisition and computation costs can be brought down.
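A minimal PyTorch sketch of a 3D CNN regressor of the kind described above, taking a spatio-temporal stack of co-registered UAV patches and producing one yield estimate per patch. The channel count, layer sizes, and pooling choices are assumptions for illustration, not the thesis's final architecture.

```python
import torch
import torch.nn as nn

class Yield3DCNN(nn.Module):
    """Input: (N, channels, timesteps, height, width) patch stacks; output: one
    yield estimate per patch (e.g. in bushels per acre)."""
    def __init__(self, in_channels=5):  # e.g. RGB + two multispectral bands (assumed)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),                 # pool spatially, keep timesteps
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))                 # global pooling over time and space
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))
```

Training would regress the output against per-patch yield-monitor measurements (e.g. with an MSE loss), using only the early and mid-season acquisitions so that the prediction is available before harvest.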
|
90 |
Algebraic Learning: Towards Interpretable Information Modeling. Yang, Tong, January 2021.
Thesis advisor: Jan Engelbrecht / Along with the proliferation of digital data collected using sensor technologies and a boost in computing power, Deep Learning (DL) based approaches have drawn enormous attention in the past decade due to their impressive performance in extracting complex relations from raw data and representing valuable information. At the same time, rooted in its notorious black-box nature, the appreciation of DL has been highly debated due to the lack of interpretability. On the one hand, DL only utilizes statistical features contained in raw data while ignoring human knowledge of the underlying system, which results in both data inefficiency and trust issues; on the other hand, a trained DL model does not provide researchers with any extra insight about the underlying system beyond its output, which, however, is the essence of most fields of science, e.g. physics and economics. The interpretability issue, in fact, has been naturally addressed in physics research. Conventional physics theories develop models of matter to describe experimentally observed phenomena. Tasks in DL, instead, can be considered as developing models of information to match collected datasets. Motivated by techniques and perspectives in conventional physics, this thesis addresses the issue of interpretability in general information modeling. This thesis endeavors to address the two drawbacks of DL approaches mentioned above. Firstly, instead of relying on an intuition-driven construction of model structures, a problem-oriented perspective is applied to incorporate knowledge into modeling practice, where interesting mathematical properties emerge naturally and cast constraints on modeling. Secondly, given a trained model, various methods can be applied to extract further insights about the underlying system, either from a simplified function approximation of the complex neural network model or by analyzing the model itself as an effective representation of the system. These two pathways are termed guided model design (GuiMoD) and secondary measurements, respectively, and together they present a comprehensive framework for investigating interpretability in modern Deep Learning practice. Remarkably, during the study of GuiMoD, a novel scheme emerges for modeling practice in statistical learning: Algebraic Learning (AgLr). Instead of being restricted to any specific model structure or dataset, AgLr starts from the idiosyncrasies of a learning task itself and studies the structure of a legitimate model class in general. This novel modeling scheme demonstrates the noteworthy value of abstract algebra for general artificial intelligence, which has been overlooked in recent progress, and could shed further light on interpretable information modeling by offering practical insights from a formal yet useful perspective. / Thesis (PhD) — Boston College, 2021. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Physics.
|