161
Source-Space Analyses in MEG/EEG and Applications to Explore Spatio-temporal Neural Dynamics in Human Vision
Yang, Ying, 01 February 2017
Human cognition involves dynamic neural activities in distributed brain areas. For studying such neural mechanisms, magnetoencephalography (MEG) and electroencephalography (EEG) are two important techniques, as they non-invasively detect neural activities with high temporal resolution. Recordings by MEG/EEG sensors can be approximated as a linear transformation of the neural activities in the brain space (i.e., the source space). However, we only have a limited number of sensors compared with the many possible locations in the brain space; it is therefore challenging to estimate the source neural activities from the sensor recordings, as we need to solve the underdetermined inverse problem of the linear transformation. Moreover, estimating source activities is typically an intermediate step, whereas the ultimate goal is to understand what information is coded and how information flows in the brain. This requires further statistical analysis of source activities. For example, to study what information is coded in different brain regions and temporal stages, we often regress neural activities on external covariates; to study dynamic interactions between brain regions, we often quantify the statistical dependence among the activities in those regions through “connectivity” analysis. Traditionally, these analyses are done in two steps: in Step 1, solve the linear problem under some regularization or prior assumptions (e.g., that each source location is independent); in Step 2, do the regression or connectivity analysis. However, biases induced by the regularization in Step 1 cannot be corrected in Step 2 and may therefore yield inaccurate regression or connectivity results. To tackle this issue, we present novel one-step methods for regression and connectivity analysis in the source space, where we explicitly model the dependence of source activities on the external covariates (in the regression analysis) or the cross-region dependence (in the connectivity analysis), jointly with the source-to-sensor linear transformation. In simulations, we observed better performance by our models than by commonly used two-step approaches when our model assumptions were reasonably satisfied. Besides the methodological contribution, we also applied our methods in a real MEG/EEG experiment studying the spatio-temporal neural dynamics in the visual cortex. The human visual cortex is hypothesized to have a hierarchical organization, where low-level regions extract low-level features such as local edges, and high-level regions extract semantic features such as object categories. However, details of the spatio-temporal dynamics are less well understood. Here, using both the two-step and our one-step regression models in the source space, we correlated neural responses to naturalistic scene images with the low-level and high-level features extracted from a well-trained convolutional neural network. We also studied the interaction between regions along the hierarchy using the two-step and our one-step connectivity models. The results from the two-step and one-step methods were generally consistent; however, the one-step methods demonstrated some intriguing advantages in the regression analysis and slightly different patterns in the connectivity analysis.
In the consistent results, we not only observed an early-to-late shift from low-level to high-level features, which supports feedforward information flow along the hierarchy, but also some novel evidence indicating non-feedforward information flow (e.g., top-down feedback). These results can help us better understand the neural computation in the visual cortex. Finally, we compared the empirical sensitivity of MEG and EEG in this experiment in detecting dependence between neural responses and visual features. Our results show that the less costly EEG achieved sensitivity comparable to MEG when the number of observations was about twice that in MEG. These results can help researchers choose empirically between MEG and EEG when planning experiments on limited budgets.
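[Editorial sketch] The two-step versus one-step contrast in this abstract can be written schematically as follows. All symbols here are introduced for illustration only, not taken from the thesis: y(t) are the sensor recordings, G the known source-to-sensor gain (lead-field) matrix, s(t) the source activities, and X(t) the external covariates.

```latex
% Schematic formulation (illustrative notation, not the thesis's own):
\begin{align*}
  y(t) &= G\,s(t) + \varepsilon(t)
    && \text{(forward model)}\\
  \hat{s}(t) &= \arg\min_{s}\,\lVert y(t) - G s\rVert_2^2
    + \lambda\,\lVert s\rVert_2^2
    && \text{(Step 1: regularized inverse)}\\
  \hat{s}(t) &= X(t)\,\beta + e(t)
    && \text{(Step 2: regression on covariates)}\\
  y(t) &= G\bigl(X(t)\,\beta + e(t)\bigr) + \varepsilon(t)
    && \text{(one-step joint model)}
\end{align*}
```

In the two-step route, the bias the lambda-regularization introduces into the estimate of s propagates into the Step 2 estimate of beta, whereas the one-step model estimates beta jointly with the source-to-sensor transformation.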
162
Deep Convolutional Neural Networks For Detecting Cellular Changes Due To Malignancy
Wieslander, Håkan; Forslid, Gustav, January 2017
Discovering cancer at an early stage is an effective way to increase the chance of survival. However, since most screening processes are performed manually, they are time-consuming and thus costly. One way of automating the screening process is to classify cells using Convolutional Neural Networks, which have been shown to achieve high accuracy on image classification tasks. This thesis investigates whether Convolutional Neural Networks can be used as a tool to detect cellular changes due to malignancy in the oral cavity and uterine cervix. Two datasets containing oral cells and two datasets containing cervical cells were used, with the cells divided into normal and abnormal for binary classification. Performance was evaluated for two network architectures, ResNet and VGG. For the oral datasets, accuracy varied between 78% and 82% correctly classified cells, depending on the dataset and network; for the cervical datasets, it varied between 84% and 86%. These results indicate a high potential for classifying abnormalities in oral and cervical cells. ResNet was shown to be the preferable network, with higher accuracy and a smaller standard deviation.
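[Editorial sketch] A minimal PyTorch illustration of this kind of binary cell classifier. The choice of ResNet-18, the learning rate, and the training-step helper are placeholder assumptions, not the thesis's actual configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Fine-tune a pre-trained ResNet for the two-class normal/abnormal task.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # normal vs. abnormal

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """One optimization step on a batch of cell images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```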
163
Learning Structured and Deep Representations for Traffic Scene Understanding
Yu, Zhiding, 01 December 2017
Recent advances in representation learning have led to an increasing variety of vision-based approaches in traffic scene understanding. This includes general vision problems such as object detection, depth estimation, edge/boundary/contour detection, semantic segmentation and scene classification, as well as application-driven problems such as pedestrian detection, vehicle detection, lane marker detection and road segmentation. In this thesis, we approach some of these problems by exploring structured and invariant representations from the visual input. Our research is mainly motivated by two facts: 1. Traffic scenes often contain highly structured layouts; exploring structured priors is expected to help considerably in improving scene understanding performance. 2. A major challenge of traffic scene understanding lies in the diverse and changing nature of the contents; it is therefore important to find robust visual representations that are invariant against such variability. We start from highway scenarios, where we are interested in detecting the hard road borders and estimating the drivable space ahead of such physical boundaries. To this end, we treat the task as a joint detection and tracking problem and formulate it with structured Hough voting (SHV): a conditional random field model that explores both intra-frame geometric and inter-frame temporal information to generate more accurate and stable predictions. Turning from highway scenes to urban scenes, we consider dense prediction problems such as category-aware semantic edge detection and semantic segmentation. Category-aware semantic edge detection is challenging, as the model is required to jointly localize object contours and classify each edge pixel into one or multiple predefined classes. We propose CASENet, a multi-label deep network with state-of-the-art edge detection performance. To address the label misalignment problem in edge learning, we also propose SEAL, a framework for simultaneous edge alignment and learning. Failure to generalize across domains has been a common bottleneck of semantic segmentation methods. In this thesis, we address the problem of adapting a segmentation model trained on a source domain to a different target domain without knowing the target domain labels, and propose a class-balanced self-training approach for such unsupervised domain adaptation. We adopt the "synthetic-to-real" setting, where a model is pre-trained on GTA-5 and adapted to real-world datasets such as Cityscapes and Nexar, as well as the "cross-city" setting, where a model is pre-trained on Cityscapes and adapted to unseen data from Rio, Tokyo, Rome and Taipei. Experiments show the superior performance of our method compared to state-of-the-art methods such as adversarial-training-based domain adaptation.
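[Editorial sketch] One way class-balanced self-training can look for segmentation, under assumptions of my own: pseudo-labels are kept only for the most confident fraction p of pixels per class, so dominant classes cannot crowd out rare ones. The function below is an illustration, not the thesis's exact algorithm.

```python
import numpy as np

def class_balanced_pseudo_labels(probs, num_classes, p=0.2, ignore_index=255):
    """probs: (C, H, W) softmax output of the source-trained model on a
    target-domain image. Returns a pseudo-label map where only the most
    confident p-fraction of pixels per class are kept."""
    conf = probs.max(axis=0)     # per-pixel confidence
    pred = probs.argmax(axis=0)  # per-pixel predicted class
    labels = np.full(pred.shape, ignore_index, dtype=np.int64)
    for c in range(num_classes):
        mask = pred == c
        if not mask.any():
            continue
        # class-wise threshold: keep the top-p most confident pixels of class c
        thresh = np.quantile(conf[mask], 1.0 - p)
        labels[mask & (conf >= thresh)] = c
    return labels
```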
164
Identifying illicit graphic in the online community using the neural network framework
Vega Ezpeleta, Emilio, January 2017
In this paper, two convolutional neural networks are trained to classify whether or not an image contains a swastika. The images are gathered from the gaming platform Steam and by scraping a web search engine. The architectures of the networks are kept moderate in size, and the models differ only in their final layer: the first model uses an averaging operation, while the second uses a conventional fully-connected layer. The results show that the performance of the two models is similar, with test error in the 6-9% range.
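[Editorial sketch] The difference between the two final layers can be sketched as follows. Channel counts, the feature-map size, and reading the "averaging operation" as global average pooling are assumptions for illustration.

```python
import torch.nn as nn

def make_head(kind, channels=64, num_classes=2, feat_hw=8):
    """Build one of two classifier heads over a (channels, feat_hw, feat_hw)
    feature map."""
    if kind == "avg":
        # head 1: 1x1 conv to one score map per class, then global averaging
        return nn.Sequential(
            nn.Conv2d(channels, num_classes, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
    # head 2: the conventional fully-connected layer
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(channels * feat_hw * feat_hw, num_classes),
    )
```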
165
Navigability Assessment for Autonomous Systems Using Deep Neural Networks
Wimby Schmidt, Ebba, January 2017
Automated navigability assessment based on image sensor data is an important concern in the design of autonomous robotic systems. The problem consists of finding a mapping from input data to the navigability status of different areas of the surrounding world, and machine learning techniques are often applied to it. This thesis investigates an approach to navigability assessment in the image plane, based on offline learning with deep convolutional neural networks applied to RGB and depth data collected using a robotic platform. Training outputs were generated by manually marking out instances of near collision in the sequences and tracing the location of the near-collision frame back through the previous frames. Several combinations of network inputs were evaluated, including grayscale gradient versions of the RGB frames, depth maps, image coordinate maps, and motion information in the form of a previous RGB frame or heading maps. Some improvement over simple depth thresholding was demonstrated, mainly in the handling of noise and missing pixels in the depth maps. The resulting networks appear to depend mostly on depth information: an attempt to train a network without the depth frames was unsuccessful, and a network trained on the depth frames alone performed similarly to networks trained with additional inputs. An attempt at training a network toward a more motion-dependent navigability concept, by including training frames captured as the robot was moving away from the obstacle and marking the corresponding outputs as obstacle-free, was also unsuccessful.
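[Editorial sketch] For reference, the simple depth-thresholding baseline the networks are compared against can be written in a few lines; the threshold value is a placeholder, and the handling of missing readings is an assumption. Zero-depth pixels are exactly the failure mode the learned networks reportedly handled better.

```python
import numpy as np

def depth_threshold_navigability(depth_map, threshold_m=1.0):
    """depth_map: (H, W) array of depths in meters, 0 where missing.
    Returns a boolean mask: True = navigable."""
    valid = depth_map > 0
    return valid & (depth_map >= threshold_m)
```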
166
PREDICTING ENERGETIC MATERIAL PROPERTIES AND INVESTIGATING THE EFFECT OF PORE MORPHOLOGY ON SHOCK SENSITIVITY VIA MACHINE LEARNING
Casey, Alex Donald, 28 July 2020
An improved understanding of energy localization ("hot spots") is needed to improve the safety and performance of explosives. In this work I establish a variety of experimental and computational methods to aid in the investigation of hot spots. In particular, focus is centered on the implicit relationship between hot spots and energetic material sensitivity. To begin, I propose a technique to visualize and quantify the properties of a dynamic hot spot within an energetic composite subjected to ultrasonic mechanical excitation. The composite is composed of an optically transparent binder and a countable number of HMX crystals. The evolving temperature field is measured by observing the luminescence of embedded phosphor particles and subsequently applying the intensity-ratio method. The spatial temperature precision is less than 2% of the measured absolute temperature in the temperature regime of interest (23-220 °C). The temperature field is mapped from within an HMX-binder composite under periodic mechanical excitation.

Following this experimental effort, I examine the statistics behind the most prevalent and widely used sensitivity test (at least within the energetic materials community) and suggest adaptations to generalize the approach to bimodal latent distributions. Bimodal latent distributions may occur when manufacturing processes are inconsistent or when competing initiation mechanisms are present.

Moving to simulation work, I investigate how the internal void structure of a solid explosive influences initiation behavior (specifically the criticality of isolated hot spots) in response to a shock insult. In the last decade, there has been a significant modeling and simulation effort to investigate the thermodynamic response of the shock-induced pore collapse process in energetic materials. However, the majority of these studies largely ignore the geometry of the pore and assume simplistic shapes, typically spheres. In this work, the influence of pore geometry on the sensitivity of shocked HMX is explored. A collection of pore geometries is retrieved from micrographs of pressed HMX samples via scanning electron microscopy. The shock-induced collapse of these geometries is simulated using CTH, and the response is reduced to a binary "critical"/"sub-critical" result. The simulation results are used to assign to each pore geometry a minimum threshold velocity required to exhibit a critical response. The pore geometries are subsequently encoded as numerical representations, and a functional mapping from pore shape to threshold velocity is developed using supervised machine-learned models. The resulting models demonstrate good predictive capability, and their relative performance is explored. The established models are exposed via a web application to further investigate which shape features most heavily influence sensitivity.

Finally, I develop a convolutional neural network capable of directly parsing the 3D electronic structure of a molecule, described by spatial point data for charge density and electrostatic potential represented as a 4D tensor. This method effectively bypasses the need to construct complex representations, or descriptors, of a molecule. This is beneficial because the accuracy of a machine-learned model depends on the input representation. Ideally, input descriptors encode the essential physics and chemistry that influence the target property. Thousands of molecular descriptors have been proposed, and proper feature selection requires considerable domain expertise or exhaustive and careful statistical down-selection. In contrast, deep learning networks are capable of learning rich data representations. This provides a compelling motivation to use deep learning networks to learn molecular structure-property relations from "raw" data. The convolutional neural network model is jointly trained on over 20,000 molecules that are potentially energetic materials (explosives) to predict dipole moment, total electronic energy, Chapman-Jouguet (C-J) detonation velocity, C-J pressure, C-J temperature, crystal density, HOMO-LUMO gap, and solid-phase heat of formation. To my knowledge, this demonstrates the first use of the complete 3D electronic structure for machine learning of molecular properties.
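[Editorial sketch] The pore-shape-to-sensitivity mapping stage might look like the following, where the shape encodings, the random-forest model, and all data below are illustrative placeholders rather than the thesis's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# X: one row per pore, columns = encoded shape features
# y: threshold velocity assigned from the CTH simulation sweep
rng = np.random.default_rng(0)
X = rng.random((200, 8))        # placeholder shape encodings
y = rng.random(200) * 1000.0    # placeholder threshold velocities (m/s)

model = RandomForestRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f}")
```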
167
Visual Perception, Prediction and Understanding with Relations
January 2020
Rapid development of computer vision applications such as image recognition and object detection has been enabled by emerging deep learning technologies. To improve accuracy further, deeper and wider neural networks with diverse architectures have been proposed for better feature extraction. Though the performance boost is impressive, only marginal improvement can be achieved, at significantly increased computational overhead. One solution is to compress the exploding-sized model by dropping less important weights or channels; this is effective and has been well explored. However, by utilizing the rich relational information in the data, one can also improve accuracy with reasonable overhead. This work makes progress toward efficient and accurate visual tasks, including detection, prediction and understanding, by using relations.
For object detection, a novel approach, Graph Assisted Reasoning (GAR), is proposed that utilizes a heterogeneous graph to model object-object and object-scene relations. GAR fuses the features from neighboring object nodes as well as scene nodes, producing better recognition than individual object nodes alone. Moreover, compared to previous approaches using Recurrent Neural Networks (RNNs), GAR's lightweight, low-coupling architecture further facilitates its integration into the object detection module.
For trajectory prediction, a novel approach, Diverse Attention RNN (DAT-RNN), is proposed to handle the diversity of trajectories and to model neighboring relations. DAT-RNN integrates both temporal and spatial relations to improve prediction under various circumstances.
Last but not least, this work presents a novel relation-implication-enhanced (RIE) approach that improves relation detection through relation direction and implication. With relation implication, the scene graph generation (SGG) model is exposed to more ground-truth information and thus mitigates the overfitting problem of biased datasets. Moreover, the enhancement with relation implication is compatible with various context-encoding schemes.
Comprehensive experiments on benchmark datasets demonstrate the efficacy of the proposed approaches.
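[Editorial sketch] The relation-fusion idea behind GAR can be illustrated with a small module that refines each object node's feature using neighboring object features and a scene feature. Dimensions, the additive aggregation, and the module name are assumptions; the thesis's heterogeneous-graph architecture is more elaborate.

```python
import torch
import torch.nn as nn

class RelationFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.obj_proj = nn.Linear(dim, dim)
        self.scene_proj = nn.Linear(dim, dim)

    def forward(self, obj_feats, adj, scene_feat):
        """obj_feats: (N, D) object node features; adj: (N, N) object-object
        relation weights; scene_feat: (D,) scene node feature."""
        neighbor = adj @ self.obj_proj(obj_feats)         # object-object relations
        scene = self.scene_proj(scene_feat).unsqueeze(0)  # object-scene relation
        return torch.relu(obj_feats + neighbor + scene)   # fused node features
```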
168
Abstractive Representation Modeling for Image Classification
Li, Xin, 05 October 2021
No description available.
169
Sensor Fusion for 3D Object Detection for Autonomous Vehicles
Massoud, Yahya, 14 October 2021
Thanks to major advancements in hardware and computational power, sensor technology, and artificial intelligence, the race for fully autonomous driving systems is heating up. With countless challenging conditions and driving scenarios, researchers are tackling the most difficult problems in driverless cars. One of the most critical components is the perception module, which enables an autonomous vehicle to "see" and "understand" its surrounding environment. Given that modern vehicles can have a large number of sensors and available data streams, this thesis presents a deep learning-based framework that leverages multimodal data, i.e. sensor fusion, to perform the task of 3D object detection and localization.

We provide an extensive review of the advancements of deep learning-based methods in computer vision, specifically in 2D and 3D object detection tasks, and survey the literature on both single-sensor and multi-sensor data fusion techniques. Furthermore, we present an in-depth explanation of our proposed approach, which performs sensor fusion on input streams from LiDAR and camera sensors and aims to simultaneously perform 2D, 3D, and Bird's Eye View detection. Our experiments highlight the importance of learnable data fusion mechanisms and multi-task learning, the impact of different CNN design decisions, speed-accuracy tradeoffs, and ways to deal with overfitting in multi-sensor data fusion frameworks.
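[Editorial sketch] One common form a learnable fusion mechanism can take: a per-channel gate that learns how much to trust camera versus LiDAR features. The architecture and dimensions are assumptions for illustration, not the thesis's actual fusion design.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, channels=128):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, cam_feat, lidar_feat):
        """cam_feat, lidar_feat: (B, C, H, W) feature maps projected to a
        common view. Returns the gated combination."""
        g = self.gate(torch.cat([cam_feat, lidar_feat], dim=1))
        return g * cam_feat + (1.0 - g) * lidar_feat
```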
170
Temporal Convolutional Networks in Lieu of Fuel Performance Codes: Conceptual Study Using a Cladding Oxidation Model
Nerlander, Viktor, January 2021
Fuel performance codes are used to demonstrate with confidence that nuclear fuel rods will sustain normal operation and transient events without being damaged. However, the execution time of a typical fuel rod simulation ranges from tens of seconds to minutes, which can be impractical in certain applications. In the scope of this work, at least two such applications are identified: code calibration and full-core evaluations. In both cases, possible improvements can be obtained by creating neural network surrogate models. For code calibration, a Deep Neural Network is enough, since calibration is performed on model constants. For full-core evaluations, however, a surrogate model must be able to predict a time-dependent target as a function of a time-dependent input. In this work, Temporal Convolutional Networks (TCNs) are investigated for the second application. In both applications, target data are generated with a Cladding Oxidation Model. The results of the study show that both models succeeded in their respective tasks with good performance metrics. However, further work is needed to increase the number of input and target variables the Deep Neural Network can handle, verify the flexibility of input data files for the TCN, try the TCN on a real code, and combine the two models to achieve a broader set of use cases.

Course: Fördjupande projektarbete i energisystem (In-depth Project Work in Energy Systems); course code: 1FA394
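[Editorial sketch] The core of a TCN for this kind of time-dependent surrogate task is the causal, dilated 1D convolution, so the prediction at time t depends only on inputs up to t. Channel sizes, dilations, and the single input/output channel are illustrative assumptions (the input could be, e.g., a cladding temperature history and the target an oxide-layer thickness history).

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad only => causal
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)

# Stack with growing dilation so the receptive field covers a long history.
tcn = nn.Sequential(
    CausalConv1d(1, 32, dilation=1), nn.ReLU(),
    CausalConv1d(32, 32, dilation=2), nn.ReLU(),
    CausalConv1d(32, 1, dilation=4),  # time-dependent target sequence
)
```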