  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Detection of Tornado Damage via Convolutional Neural Networks and Unmanned Aerial System Photogrammetry

Carani, Samuel James 21 October 2021
Disaster damage assessments are a critical component of response and recovery operations. In recent years, the field of remote sensing has seen innovations in automated damage assessments and UAS collection capabilities. However, little work has been done to explore the intersection of automated methods and UAS photogrammetry to detect tornado damage. UAS imagery, combined with Structure from Motion (SfM) output, can be used directly to train models to detect tornado damage. In this research, we develop a CNN that can classify tornado damage in forests using SfM-derived orthophotos and digital surface models (DSMs). The findings indicate that a CNN approach provides higher accuracy than random forest classification, and that DSM-based derivatives add predictive value over the use of the orthophoto mosaic alone. This method has the potential to fill a gap in tornado damage assessment, as tornadoes that occur in wooded areas are typically difficult to survey on the ground; an improved record of tornado damage in these areas will improve our understanding of tornado climatology.
/ Master of Science / Disaster damage assessments are a critical component of response and recovery operations. In recent years, the field of remote sensing has seen innovations in automated damage assessments and Unmanned Aerial System (UAS) collection capabilities. However, little work has been done to explore the intersection of automated methods and UAS imagery to detect tornado damage. UAS imagery, combined with 3D models, can be used directly to train machine learning models to automatically detect tornado damage. In this research, we develop a machine learning model that can classify tornado damage in forests using UAS imagery and 3D derivatives. The findings indicate that the machine learning approach provides higher accuracy than traditional techniques. In addition, the 3D derivatives add value over the use of the UAS imagery alone. This method has the potential to fill a gap in tornado damage assessment, as tornadoes that occur in wooded areas are typically difficult to survey on the ground; an improved record of tornado damage in these areas will improve our understanding of tornado climatology.
2

Convolutional Neural Nets for Crop Stress Diagnosis: A Holistic Approach in Addressing Existing Challenges

Wiegman, Christopher R. January 2021
No description available.
3

Deep Neural Network Pruning and Sensor Fusion in Practical 2D Detection

Mousa Pasandi, Morteza 19 May 2023
Convolutional Neural Networks (CNNs) have been extensively studied and applied to various computer vision problems, including object detection, semantic segmentation, and autonomous driving. CNNs extract complex features from input images or data to represent objects or patterns. Their highly complex architecture and the size of their learned weights, however, make them time- and resource-intensive. Measures like pruning and fusion, which aim to simplify the structure and lessen the load on the network's resources, should be considered to resolve this problem. In this thesis, we explore the effect of pruning on segmentation and object detection, as well as the benefits of using sensor fusion operators in 2D space to boost the performance of existing networks. Specifically, we focus on structured pruning, quantization, and simple and learnable fusion operators. We also study the scalability of different algorithms in terms of the number of parameters and floating-point operations used. First, we provide a general overview of CNNs and the history of pruning and fusion operations. Second, we explain the advantages of pruning and discuss the contrast between the unstructured and structured types. Third, we discuss the differences between simple fusion and learnable fusion. To evaluate our algorithms, we use several classification and object detection datasets such as CIFAR-10, KITTI, and Microsoft COCO. Applying our proposed methods to these datasets lets us assess the efficiency of the algorithms and observe the improvements in task-specific losses. In conclusion, our work focuses on analyzing the effect of pruning and fusion to simplify existing networks and improve their performance in terms of scalability, task-specific losses, and resource consumption. We also discuss various algorithms, as well as the datasets that serve as a basis for the evaluation of our proposed approaches.
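To illustrate the contrast between structured and unstructured pruning mentioned in the abstract above, here is a minimal sketch in plain Python. The abstract does not say which pruning criterion the thesis uses; ranking filters by L1 norm is an assumption chosen only because it is a common baseline, and the names `l1_norm` and `prune_filters` are hypothetical.

```python
from typing import List

# One 2D kernel per filter (a single input channel, for brevity).
Filter = List[List[float]]


def l1_norm(kernel: Filter) -> float:
    """Sum of the absolute values of all weights in one filter."""
    return sum(abs(w) for row in kernel for w in row)


def prune_filters(filters: List[Filter], keep_ratio: float) -> List[int]:
    """Return the sorted indices of the filters to keep.

    Structured pruning removes whole filters, so the layer stays dense
    and simply becomes narrower. Unstructured pruning would instead zero
    individual weights and leave the layer shape unchanged.
    Assumption: filters with the smallest L1 norm are removed first.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(
        range(len(filters)),
        key=lambda i: l1_norm(filters[i]),
        reverse=True,
    )
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    layer = [[[0.1, -0.1]], [[1.0, 2.0]], [[0.0, 0.0]], [[-3.0, 0.5]]]
    print(prune_filters(layer, keep_ratio=0.5))
```

In a real network, the filters dropped from one layer also remove the matching input channels of the next layer, which is where the savings in parameters and floating-point operations come from.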
4

Comparative Analysis of Convolutional Neural Network (CNN) Architectures for Content Restriction

Daher, Abdulhadi January 2024
The increase in social media usage has introduced significant challenges in managing the large amounts of data being shared, particularly images. With more than 63% of the global population using social media platforms, the need for effective content restriction has become critical. Manual moderation is no longer practical due to the large amount of content. This thesis addresses the critical issue of image restriction by evaluating the performance of advanced image classification models, specifically the VGG16 and Inception_v3 Convolutional Neural Networks (CNNs). To address this challenge, the study uses the CIFAR-10 dataset, which is widely known as a benchmark dataset in image classification research. The research involves implementing pre-trained models and conducting a comprehensive comparison using various performance metrics, including accuracy, precision, recall, F1 score, confusion matrix, ROC curve, and AUC. These metrics provide a comprehensive evaluation of each model's ability to classify images accurately. Furthermore, the study includes a fine-tuning phase after the initial comparison to further improve performance. This involves adjusting the parameters of the pre-trained model to better suit the specific characteristics of the CIFAR-10 dataset. Following the fine-tuning, another round of comparative analysis is conducted to assess the improvements and determine the most effective model. The results demonstrate that both VGG16 and Inception_v3 showed significant improvements in performance after fine-tuning, with notable increases in accuracy and other metrics. However, VGG16 showed better overall performance, making it the preferred model for this application. The primary objective of this research is to identify the most effective model for image classification, thereby establishing a foundational proof of concept for the application of CNNs in content restriction on social media platforms.
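Several of the metrics listed in the abstract above (accuracy, precision, recall, F1 score) can all be read off a confusion matrix. The sketch below shows one way to compute them for a multi-class problem such as CIFAR-10. Macro averaging (an unweighted mean over classes) is an assumption; the abstract does not say how the thesis aggregates per-class scores, and the function names are hypothetical.

```python
from typing import Dict, List


def confusion_matrix(y_true: List[int], y_pred: List[int], n_classes: int) -> List[List[int]]:
    """cm[t][p] counts the samples of true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
    print(cm)
    print(macro_metrics(cm))
```

ROC curves and AUC need the predicted class scores and not just the hard labels, so they are left out of this sketch.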
5

Dynamics of Two Neuron Cellular Neural Networks

Viñoles Serra, Mireia 18 January 2011
Cellular neural networks (CNNs) are a type of dynamical system in which elements called neurons interact through templates of parameters. The system is completely determined by the network inputs, the outputs, and the parameters or weights. In this dissertation we make an exhaustive study of this kind of network in the simplest case, a one-dimensional CNN with only two neurons. Despite its simplicity, the system can exhibit very rich dynamics, and understanding it in low dimension exposes the intrinsic difficulties and possible uses of CNNs and gives some of the keys to understanding the general case.
We first review the stability of the two-neuron CNN from two points of view. Using Lyapunov theory, we find the parameter range in which the network converges to a fixed point. Separately, because the dynamic behavior is determined by the piecewise linear function that defines the outputs, we study the linear system that arises in each region where that function is linear, and obtain a complete geometric description of stability in terms of the local positions of the equilibrium points. Two kinds of convergence result, toward a fixed point or toward a limit cycle, and the work is organized around these two classes.
One of the first problems we address is the dependence of the outputs on the initial conditions. Viewed as a function of the outputs, the Lyapunov function is a quadric; its position and geometry yield conditions on the parameters under which the final outputs do not depend on the initial conditions. Working in these regions, we study applications of CNNs in the parameter range where the system converges to a fixed point. First, we use CNNs to reproduce Bernoulli probability distributions, based on the geometry of the Lyapunov function. Second, we reproduce linear functions while working inside the unit square. The existence of the Lyapunov function also allows us to construct a map, called the convergence map, that depends on the CNN parameters and relates the CNN inputs to the final outputs. This map gives a recipe for designing templates that perform desired input-output associations, and leads to the template composition problem: how different templates can be applied in sequence to obtain a given input-output relation.
These results suggest looking for a functional relation between the external inputs and the final outputs. Because the set of final states is discrete, thanks to the piecewise linear function, this correspondence can be treated as a classification problem in which each class is defined by one of the possible final states, which in turn depend on the CNN parameters. We study which classification problems can be solved by a two-neuron CNN, relate them to the weight parameters, and again obtain a recipe for designing templates for each specific classification problem. The results let us tackle the realization of Boolean functions with CNNs and reveal some limits of CNNs when trying to reproduce the head of the universal Turing machine discovered by Marvin Minsky in 1962.
We then study the CNN when it converges to a limit cycle. Starting from a particular example taken from L. O. Chua's book, we first consider antisymmetric connections between cells and find the parameter range that guarantees that the final state of the network is a closed curve. The results can be generalized to CNNs with opposite-sign parameters, for which the stability study shows that limit cycles can exist. The period of these curves is computed in a particular case; it can be expressed as a function of the CNN parameters and can be used to generate clock signals.
Finally, we compare the dynamic behavior of the CNN under two output functions, the piecewise linear function and the hyperbolic tangent. The hyperbolic tangent is often used in the literature in place of the piecewise linear function because of its differentiability. Nevertheless, in some regions of the parameter space the two exhibit a different number of equilibrium points, so the convergence of the system differs. For theoretical results, therefore, the hyperbolic tangent should not be used in place of the piecewise linear function.
6

Design Space Exploration of DNNs for Autonomous Systems

Duggal, Jayan Kant 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Developing intelligent agents that can perceive and understand the rich visual world around us has been a long-standing goal in the field of AI. Recently, significant progress has been made by CNNs/DNNs, with remarkable advances in a wide range of applications such as ADAS, intelligent surveillance cameras, autonomous systems, drones, and robots. Design space exploration (DSE) of NNs and other techniques have made CNNs/DNNs memory- and computationally efficient. But the major design hurdles for deployment are limited resources such as computation, memory, energy efficiency, and power budget. DSE of small DNN architectures for ADAS produced better and more efficient architectures such as baseline SqueezeNet and SqueezeNext. These architectures are known for their small model size, good model speed, and model accuracy. In this thesis, two new DNN architectures are proposed. Before presenting the proposed architectures, DSE of DNNs is used to explore methods to improve DNNs/CNNs, including tuning different hyperparameters and experimenting with various optimizers and newly introduced methodologies. First, the High Performance SqueezeNext architecture improves on the performance of existing DNN architectures. The intuition behind this architecture is to replace convolution layers with a more sophisticated block module and to develop a compact and efficient architecture with competitive accuracy. Second, the Shallow SqueezeNext architecture is proposed, which achieves better model size results than baseline SqueezeNet and SqueezeNext. It shows that the architecture is compact, efficient, and flexible in terms of model size and accuracy. The state-of-the-art SqueezeNet and SqueezeNext baselines are used as the foundation to recreate and propose both DNN architectures in this study. Due to their very small model size, competitive model accuracy, and decent model testing speed, they are expected to perform well on ADAS systems. The proposed architectures are trained and tested from scratch on the CIFAR-10 [30] and CIFAR-100 [34] datasets. All training and testing results are visualized with live loss and accuracy graphs using livelossplot. Finally, both proposed DNN architectures are deployed on BlueBox 2.0 by NXP.
7

A STANDARD CELL LIBRARY USING CMOS TRANSCONDUCTANCE AMPLIFIERS FOR CELLULAR NEURAL NETWORKS

MAILAVARAM, MADHURI 03 April 2006
No description available.
8

Brain disease classification using multi-channel 3D convolutional neural networks

Christopoulos Charitos, Andreas January 2021
Functional magnetic resonance imaging (fMRI) technology has been used to investigate human brain functionality and to assist in brain disease diagnosis. While fMRI can be used to model both spatial and temporal brain functionality, analyzing fMRI images and discovering patterns for certain brain diseases is still a challenging task in medical imaging. Deep learning is increasingly used in the medical field in an effort to further improve disease diagnosis, owing to its effectiveness in discovering high-level features in images. Convolutional neural networks (CNNs) are a class of deep learning algorithm that has been used successfully in medical imaging to extract spatial hierarchical features. The application of CNNs to fMRI and the extraction of brain functional patterns is an open field for research. This project focuses on how fMRI can be used to improve Autism Spectrum Disorder (ASD) detection and diagnosis with 3D resting-state functional MRI (rs-fMRI) images. ASDs are a range of neurodevelopmental brain diseases that mostly affect social function. Symptoms include social and communication difficulties as well as restricted and repetitive behaviors. The symptoms appear in early childhood and tend to develop over time, so an early diagnosis is required. Finding a suitable model for distinguishing between ASD and healthy subjects is a challenging task and involves a lot of hyper-parameter tuning. In this project, a grid search approach is followed in the quest for the optimal CNN architecture. Additionally, regularization and augmentation techniques are implemented in an effort to further improve the model's performance.
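The grid search mentioned in the abstract above amounts to scoring every combination of candidate hyper-parameter values and keeping the best one. The sketch below shows that loop in plain Python. The hyper-parameter names, the candidate values, and `toy_score` are all hypothetical stand-ins; in the actual project the score would come from training and validating a 3D CNN, which the abstract does not detail.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Optional, Tuple

Config = Dict[str, Any]


def grid_search(
    grid: Dict[str, List[Any]],
    score_fn: Callable[[Config], float],
) -> Tuple[Optional[Config], float]:
    """Evaluate every combination in the grid and return the best one.

    Higher scores are better. Ties keep the first combination found.
    """
    names = list(grid)
    best_cfg: Optional[Config] = None
    best_score = float("-inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_score(cfg: Config) -> float:
    """Stand-in for validation accuracy (hypothetical, for illustration only)."""
    return (
        -abs(cfg["learning_rate"] - 0.001)
        - abs(cfg["n_filters"] - 32) / 1000
        - abs(cfg["dropout"] - 0.5)
    )


if __name__ == "__main__":
    grid = {
        "learning_rate": [0.01, 0.001, 0.0001],
        "n_filters": [16, 32],
        "dropout": [0.2, 0.5],
    }
    print(grid_search(grid, toy_score))
```

The number of combinations is the product of the list lengths (12 here), which is why grid search becomes expensive quickly when each evaluation means training a network.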
9

Object Tracking in Games Using Convolutional Neural Networks

Venkatesh, Anirudh 01 June 2018
Computer vision research has been growing rapidly over the last decade. Recent advancements in the field have been widely used in staple products across various industries. The automotive and medical industries have even pushed cars and equipment that use computer vision into production. However, there seems to be a lack of computer vision research in the game industry. With the advent of e-sports, competitive and casual gaming have reached new heights with regard to players, viewers, and content creators. This has opened avenues of research that did not previously exist. In this thesis, we explore the practicality of object detection as applied in games. We designed a custom convolutional neural network detection model, SmashNet. The model was improved through classification weights generated from pre-training on the Caltech101 dataset with an accuracy of 62.29%. It was then trained on 2296 annotated frames from the competitive 2.5-dimensional fighting game Super Smash Brothers Melee to track coordinate locations of 4 specific characters in real time. The detection model achieves 68.25% accuracy across all 4 characters. In addition, as a demonstration of a practical application, we designed KirbyBot, a black-box adaptive bot which performs basic commands reactively based only on the tracked locations of two characters. It also collects very simple data on player habits. KirbyBot runs at a rate of 6-10 fps. Object detection has several practical applications with regard to games, ranging from better AI design to collecting data on player habits or game characters for competitive purposes or improvement updates.
10

A Dataset of Vehicle and Pedestrian Trajectories from Normal Driving and Crash Events in One Year of Virginia Traffic Camera Data

Bareiss, Max G. 07 June 2023
Traffic cameras are those cameras operated with the purpose of observing traffic, often streaming video in real-time to traffic management centers. These camera video streams allow transportation authorities to respond to traffic events and maintain situational awareness. However, traffic cameras also have the potential to directly capture crashes and conflicts, providing enough information to perform reconstruction and gain insights regarding causation and remediation. Beyond crash events, traffic camera video also offers an opportunity to study normal driving. Normal driver behavior is important for traffic planners, vehicle designers, and in the form of numerical driver models is vital information for the development of automated vehicles. Traffic cameras installed by state departments of transportation have already been placed in locations relevant to their interests. A wide range of driver behavior can be studied from these locations by observing vehicles at all times and under all weather conditions. Current systems to analyze traffic camera video focus on detecting when traffic events occur, with very little information about the specifics of those events. Prior studies into traffic event detection or reconstruction used 1-7 cameras placed by the researchers and collected dozens of hours of video. Crashes and other interesting events are rare and cannot be sufficiently characterized by camera installations of that size. The objective of this dissertation was to explore the utility of traffic camera data for transportation research by modeling and characterizing crash and non-crash behavior in pedestrians and drivers using a captured dataset of traffic camera video from the Commonwealth of Virginia, named the VT-CAST (Virginia Traffic Cameras for Advanced Safety Technologies) 2020 dataset. A total of 6,779,726 hours of traffic camera video was captured from live internet streams from December 17, 2019 at 4:00PM to 11:59PM on December 31, 2020. 
Video was analyzed by a custom R-CNN convolutional neural network keypoint detector to identify the locations of vehicles on the ground. The OpenPifPaf model was used to identify the locations of pedestrians on the ground. The location, pan, tilt, zoom, and altitude of each traffic camera were reconstructed to develop a mapping between the locations of vehicles and pedestrians on-screen and their physical locations on the surface of the Earth. These physical detections were tracked across time to determine the trajectories on the surface of the Earth for each visible vehicle and pedestrian in a random sample of the captured video. Traffic camera video offers a unique opportunity to study, in depth, crashes that are not reported to police. Crashes in the traffic camera video were identified, analyzed, and compared to nationally representative datasets. Potential crashes were identified during the study interval by inspecting Virginia 511 traffic alerts for events which occurred near traffic cameras and impacted the flow of traffic. The video from these cameras was manually reviewed to determine whether a crash was visible. Pedestrian crashes, which did not significantly impact traffic, were identified from police accident reports (PARs) as a separate analysis. A total of 292 crashes were identified from traffic alerts, and six pedestrian crashes were identified from PARs. Road departure and rear-end crashes occurred in similar proportions to national databases, but intersection crashes were underrepresented and severe and rollover cases were overrepresented. Among these crashes, 32% of single-vehicle crashes and 50% of multi-vehicle crashes did not appear in the Virginia crash database. This finding shows promise for traffic cameras as a future data source for crash reconstruction, indicating traffic cameras are a capable tool to study unreported crashes.
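The mapping from on-screen detections to ground locations mentioned above can be illustrated with a simple pinhole-camera sketch. This is not the dissertation's method, which the abstract does not detail. It assumes a flat ground plane, a local east/north/up frame in metres, pan measured clockwise from north, tilt measured downward from horizontal, and zoom expressed as a focal length in pixels. The function name and all parameters are hypothetical.

```python
import math


def pixel_to_ground(u, v, cam_east, cam_north, cam_height,
                    pan_deg, tilt_deg, focal_px, cx, cy):
    """Project pixel (u, v) onto a flat ground plane (illustrative only).

    Assumptions (not from the source): pinhole camera without lens
    distortion, ground at height 0, pan clockwise from north, tilt
    positive downward, focal length in pixels standing in for zoom.
    Returns (east, north) in metres, or None if the ray misses the ground.
    """
    pan, tilt = math.radians(pan_deg), math.radians(tilt_deg)
    sp, cp = math.sin(pan), math.cos(pan)
    st, ct = math.sin(tilt), math.cos(tilt)

    # Camera axes expressed in east/north/up coordinates.
    forward = (sp * ct, cp * ct, -st)   # optical axis
    right = (cp, -sp, 0.0)              # image x direction
    down = (-st * sp, -st * cp, -ct)    # image y direction

    # Normalised image coordinates relative to the principal point.
    xc = (u - cx) / focal_px
    yc = (v - cy) / focal_px

    # Viewing ray in world coordinates.
    ray = tuple(xc * r + yc * d + f for r, d, f in zip(right, down, forward))
    if ray[2] >= 0:
        return None  # ray points at or above the horizon

    # Scale the ray so that it descends exactly cam_height to the ground.
    t = cam_height / -ray[2]
    return (cam_east + t * ray[0], cam_north + t * ray[1])


# A camera 10 m up, facing north and tilted 45 degrees down, sees the
# image centre at a ground point roughly 10 m to its north.
print(pixel_to_ground(960, 540, 0.0, 0.0, 10.0, 0.0, 45.0, 1000.0, 960, 540))
```

Tracking such ground points across frames would then give per-vehicle or per-pedestrian trajectories; a real system would also need to handle lens distortion and terrain that is not flat.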
The safe operation of autonomous vehicles requires perception systems which make accurate short-term predictions of driver and pedestrian behavior. While road user behavior can be observed by the autonomous vehicles themselves, traffic camera video offers another potential information source for algorithm development. As a fixed roadside data source, these cameras capture a very large number of traffic interactions at a single location. This allows for detailed analyses of important roadway configurations across a wide range of drivers. To evaluate the efficacy of this approach, a total of 58 intersections in the VT-CAST 2020 dataset were sampled for driver trajectories at intersection entry, yielding 58,180 intersection entry trajectories. K-means clustering was used to group these trajectories into a family of 45 trajectory clusters. Likely as a function of signal phase, distinct groups of accelerating, constant speed, and decelerating trajectories were present. Accelerating and decelerating trajectories each occurred more frequently than constant speed trajectories. The results indicate that roadside data may be useful for understanding broad trends in typical intersection approaches for application to automated vehicle systems or other investigations; however, data utility would be enhanced with detailed signal phase information. A similar analysis was conducted of the interactions between drivers and pedestrians. A total of 35 crosswalks were identified in the VT-CAST 2020 dataset with sufficient trajectory information, yielding 1,488 trajectories of drivers interacting with pedestrians. K-means clustering was used to group these trajectories into a family of 16 trajectory clusters. Distinct groups of accelerating, constant speed, and decelerating trajectories were present, including trajectory clusters which described vehicles slowing down around pedestrians. 
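As a rough illustration of how K-means can separate accelerating, constant speed, and decelerating trajectories, the sketch below clusters synthetic speed profiles. The abstract does not state which trajectory features were clustered, so the fixed-length speed vectors (assumed to be in m/s), the synthetic data, the farthest-point initialisation, and the helper names are all assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Seed the next centroid with the point farthest from all current ones.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid for every trajectory.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def describe(centroid, tol=1.0):
    """Name a cluster by the net speed change along its centroid."""
    change = centroid[-1] - centroid[0]
    if change > tol:
        return "accelerating"
    if change < -tol:
        return "decelerating"
    return "constant speed"


# Synthetic speed profiles, five samples per trajectory (illustrative only).
rng = random.Random(0)
templates = {
    "accelerating": [5, 7, 9, 11, 13],
    "constant speed": [10, 10, 10, 10, 10],
    "decelerating": [13, 11, 9, 7, 5],
}
trajectories, truth = [], []
for name, profile in templates.items():
    for _ in range(20):
        trajectories.append([s + rng.uniform(-0.5, 0.5) for s in profile])
        truth.append(name)

centroids, labels = kmeans(trajectories, k=3)
predicted = [describe(centroids[lab]) for lab in labels]
print(sorted(set(predicted)))
```

With real trajectories, the number of clusters and the feature representation would need to be chosen from the data; the 45 and 16 clusters reported here suggest a much finer partition than this three-way toy example.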
Constant speed trajectories occurred the most often, followed by accelerating trajectories and decelerating trajectories. As with the prior investigation, this finding suggests that roadside data may be used in the development of driver-pedestrian interaction models for automated vehicles and other use cases involving a combination of pedestrians and vehicles. Overall, this dissertation demonstrates the utility of standard traffic camera data for use in traffic safety research. As evidence, there are already three current studies (beyond this dissertation) using the video data and trajectories from the VT-CAST 2020 dataset. Potential future studies include analyzing the mobile phone use of pedestrians, analyzing mid-block pedestrian crossings, automatically performing roadway safety assessments, considering the behavior of drivers following congested driving, evaluating the effectiveness of work zone hazard countermeasures, and understanding roadway encroachments. / Doctor of Philosophy / Traffic cameras are those cameras operated with the purpose of observing traffic, often streaming video in real-time to traffic management centers. These video streams allow transportation authorities to maintain situational awareness and respond to traffic events. However, traffic cameras also have the potential to directly capture crashes, providing enough information to perform reconstruction and gain insights regarding causation and remediation. Beyond crash events, traffic camera video also offers an opportunity to study normal driving, which is vital information for the operation of automated vehicles. Traffic cameras installed by state departments of transportation have already been placed in thousands of locations around the country capturing traffic scenes relevant to their interests. A wide range of driver and pedestrian behavior can be studied from these locations by observing vehicles at all times and under all weather conditions. 
Current systems to analyze traffic camera video focus on detecting when traffic events occur, with very little information about the specifics of those events. Previous studies into traffic event detection or reconstruction used 1-7 cameras placed by the researchers and collected dozens of hours of video. Crashes and other interesting events are rare and cannot be sufficiently characterized by camera installations of that size. The objective of this dissertation was to explore the utility of traffic camera data for transportation research by modeling and characterizing crash and non-crash behavior in pedestrians and drivers using a dataset of statewide traffic camera video captured from the Commonwealth of Virginia. A total of 6,779,726 hours of traffic camera video from live internet streams was captured from December 17, 2019 at 4:00PM to 11:59PM on December 31, 2020. This captured video was processed by a trajectory analysis system which determined the path on the ground for each visible vehicle and pedestrian in a random sample of the captured video. Additionally, 298 crashes visible in the traffic camera video were analyzed, comparing them to nationally representative crash datasets. With anticipated uses in traffic modeling and automated vehicle development, two additional potential use cases of the dataset were explored: cases where a driver enters an intersection, and cases where a driver interacts with a pedestrian.
