Global ETD Search

211	Time Domain Multiply and Accumulate Engine for Convolutional Neural Networks Du, Kevin Tan January 2020 (has links) No description available. Electrical Engineering Time domain Multiply and Accumulate CNN Delay Line DTC TDC Gated Ring Oscillator Near Memory
212	Deep Learning-Based Bone Segmentation of the Metatarsophalangeal Joint : Using an Automatic and an Interactive Approach / Djupinlärningsbaserad bensegmentering av metatarsophalangealleden : Användning av ett automatiskt och ett interaktivt tillvägagångssätt Krogh, Hannah January 2023 (has links) The first Metatarsophalangeal (MTP) joint is essential for foot biomechanics and weight-bearing activities. Osteoarthritis in this joint can lead to pain, discomfort, and limited mobility. In order to treat this, Episurf Medical is working to produce individualized implants based on 3D segmentations of the joint. As manual segmentations are both time- and cost-consuming, and susceptible to human errors, automatic approaches are preferred. This thesis uses U-Net and DeepEdit as deep-learning based methods for segmentation of the MTP joint, with the latter being evaluated with and without user interactions. The dataset used in this study consisted of 38 CT images, where each model was trained on 30 images, and the remaining images were used as a test set. The final models were evaluated and compared with regards to the Dice Similarity Coefficient (DSC), precision, and recall. The U-Net model achieved DSC 0.944, precision 0.961, and recall 0.929. The automatic DeepEdit approach obtained DSC of 0.861, precision of 0.842, and recall of 0.891, while the interactive DeepEdit approach resulted in DSC of 0.918, precision of 0.912, and recall of 0.928. All pairwise comparisons in terms of precision and DSC showed significant differences (p<0.05), where U-Net had the highest performance, while the difference in recall was not found to be significant (p>0.05) for any comparison. The lower performances of DeepEdit compared to U-Net could be due to lower spatial resolution in the segmentations. Nevertheless, DeepEdit remains a promising method, and further investigations of unexplored areas could be addressed as future work. 
/ Den första Metatarsalphalangeal (MTP) leden är viktig för fotens biomekanik och viktbärande aktiviteter. Artros i denna led kan leda till smärta, obehag och begränsad rörlighet. För att behandla detta arbetar Episurf Medical med att producera individanpassade implantat baserat på 3D segmenteringar av leden. Då manuella segmenteringar både är tids- och kostnadskrävande, samt känsliga för mänskliga fel, föredras automatiska metoder. Denna avhandling använder U-Net och DeepEdit som djupinlärningsbaserade metoder för segmentering av MTP leden, varav det senare utvärderas med och utan användarinteraktion. Datasetet som användes i denna studie bestod av 38 CT bilder, där varje modell tränades på 30 bilder och de återstående användes som testdata. De slutliga modellerna utvärderades och jämfördes med avseende på Dice Similarity Coefficient (DSC), precision och recall. U-Net modellen uppnådde DSC 0.944, precision 0.961 och recall 0.929. Den automatiska DeepEdit metoden erhöll DSC 0.861, precision 0.842 och recall 0.891, medan den interaktiva DeepEdit metoden resulterade i DSC 0.918, precision 0.912 och recall 0.928. Alla parvisa jämförelser avseende precision och DSC visade signifikanta skillnader (p<0.05), där U-Net hade den högsta prestandan, medan skillnaden i recall inte visade sig vara signifikant (p>0.05) för någon jämförelse. Den lägre prestandan för DeepEdit jämfört med U-Net kan bero på lägre spatiell upplösning i segmenteringarna. Dock är DeepEdit fortfarande en lovande metod, och ytterligare undersökningar av outforskade områden kan tas upp som framtida arbete. Deep Learning CNN U-Net DeepEdit Bone Segmentation CT MTP Joint Medical Engineering Medicinteknik
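The three metrics reported in record 212 (DSC, precision, recall) can all be derived from true-positive, false-positive, and false-negative pixel counts. The following is a minimal sketch, assuming binary masks given as nested lists; the function name `segmentation_scores`, the toy masks, and the convention of scoring 1.0 when a denominator is empty are illustrative assumptions, not taken from the thesis.

```python
def segmentation_scores(pred, truth):
    """Return (dice, precision, recall) for two binary masks of equal size."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")

    # Pixel-level confusion counts for the foreground class.
    tp = sum(p and t for p, t in zip(flat_pred, flat_truth))
    fp = sum(p and not t for p, t in zip(flat_pred, flat_truth))
    fn = sum(t and not p for p, t in zip(flat_pred, flat_truth))

    # Assumed convention: an empty denominator scores 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, recall


# Toy 2x3 masks, purely illustrative.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
dice, precision, recall = segmentation_scores(predicted, reference)
print(f"DSC={dice:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Because DSC penalizes false positives and false negatives symmetrically, a model can score a lower DSC than another while having comparable recall, which is the pattern the abstract reports.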
213	IMBALANCED TIME SERIES FORECASTING AND NEURAL TIME SERIES CLASSIFICATION Chen, Xiaoqian 01 August 2023 (has links) (PDF) This dissertation will focus on the forecasting and classification of time series. Specifically, the forecasting problem will focus on imbalanced time series (ITS), which contain a mix of low-probability extreme observations and high-probability normal observations. Two approaches are proposed to improve the forecasting of ITS. In the first approach, proposed in chapter 2, an ITS will be modelled as a composition of normal and extreme observations, the input predictor variables and the associated forecast output will be combined into moving blocks, and the blocks will be categorized as extreme event (EE) or normal event (NE) blocks. Imbalance will be decreased by oversampling the minority EE blocks and undersampling the majority NE blocks using modifications of block bootstrapping and the synthetic minority oversampling technique (SMOTE). Convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) will be selected for forecast modelling. In the second approach, described in chapter 3, which focuses on improving the forecasting accuracy of LSTM models, a training strategy called Circular-Shift Circular Epoch Training (CSET) is proposed to preserve the natural ordering of observations in epochs during training without any attempt to balance the extreme and normal observations. The strategy will be universal because it could be applied to train LSTMs to forecast events in normal time series or in imbalanced time series in exactly the same manner. The CSET strategy will be formulated for both univariate and multivariate time series forecasting. The classification problem will focus on the classification of event-related potential (ERP) neural time series by exploiting information offered by the cone of influence (COI) of the continuous wavelet transform (CWT). 
The COI is a boundary that is superimposed on the wavelet scalogram to delineate the coefficients that are accurate from those that are inaccurate due to edge effects. The features derived from the inaccurate coefficients are, therefore, unreliable. It is hypothesized that the classifier performance would improve if unreliable features, which are outside the COI, are zeroed out, and the performance would improve even further if those features are cropped out completely. Two CNN multidomain models will be introduced to fuse the multichannel Z-scalograms and the V-scalograms. In the first multidomain model, referred to as the Z-CuboidNet, the input to the CNN will be generated by fusing the Z-scalograms of the multichannel ERPs into a frequency-time-spatial cuboid. In the second multidomain model, referred to as the V-MatrixNet, the CNN input will be formed by fusing the frequency-time vectors of the V-scalograms of the multichannel ERPs into a frequency-time-spatial matrix. CNN cone of influence Deep learning imbalanced time series forecasting LSTM neural time series classification
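The moving-block rebalancing described for the first forecasting approach in record 213 can be illustrated with a short sketch. It rests on several assumptions that are not from the dissertation: a block is labelled EE when the magnitude of its forecast target reaches a chosen `threshold`, and plain random duplication and random subsampling stand in for the modified block bootstrapping and SMOTE variants the abstract mentions. All function names are hypothetical.

```python
import random


def make_blocks(series, width):
    """Pair each window of `width` inputs with the next value as its target."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


def split_blocks(blocks, threshold):
    """Assumed rule: a block is an extreme event (EE) if |target| >= threshold."""
    extreme = [b for b in blocks if abs(b[1]) >= threshold]
    normal = [b for b in blocks if abs(b[1]) < threshold]
    return extreme, normal


def rebalance(extreme, normal, seed=0):
    """Oversample EE blocks and undersample NE blocks to a common size."""
    rng = random.Random(seed)
    if not extreme or not normal:
        return list(extreme), list(normal)
    target = (len(extreme) + len(normal)) // 2
    # Random duplication stands in for the SMOTE-style synthesis.
    extra = [rng.choice(extreme) for _ in range(max(0, target - len(extreme)))]
    # Random subsampling stands in for the block-bootstrap undersampling.
    kept = rng.sample(normal, min(len(normal), target))
    return extreme + extra, kept


# Toy series with a single extreme value, purely illustrative.
series = [0, 1, 0, 1, 9, 0, 1, 0, 1, 0]
blocks = make_blocks(series, width=3)
extreme, normal = split_blocks(blocks, threshold=5)
balanced_extreme, balanced_normal = rebalance(extreme, normal)
print(len(extreme), len(normal), "->", len(balanced_extreme), len(balanced_normal))
```

Keeping inputs and target together in one block means resampling never separates a forecast target from the observations that precede it.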
214	Multi-spectral Fusion for Semantic Segmentation Networks Edwards, Justin 05 1900 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Semantic segmentation is a machine learning task that is seeing increased utilization in multiple fields, from medical imagery to land demarcation and autonomous vehicles. Semantic segmentation performs the pixel-wise classification of images, creating a new, segmented representation of the input that can be useful for detecting various terrain and objects within an image. Recently, convolutional neural networks have been heavily utilized when creating neural networks tackling the semantic segmentation task. This is particularly true in the field of autonomous driving systems. The requirements of automated driver assistance systems (ADAS) drive semantic segmentation models targeted for deployment on ADAS to be lightweight while maintaining accuracy. A commonly used method to increase accuracy in the autonomous vehicle field is to fuse multiple sensory modalities. This research focuses on leveraging the fusion of long wave infrared (LWIR) imagery with visual spectrum imagery to fill in the inherent performance gaps when using visual imagery alone. This comes with a host of benefits, such as increased performance in various lighting conditions and adverse environmental conditions. Utilizing this fusion technique is an effective method of increasing the accuracy of a semantic segmentation model. Being a lightweight architecture is key for successful deployment on ADAS, as these systems often have resource constraints and need to operate in real-time. Multi-Spectral Fusion Network (MFNet) [1] meets these requirements by leveraging a sensory fusion approach, and as such was selected as the baseline architecture for this research. Many improvements were made upon the baseline architecture by leveraging a variety of techniques. 
Such improvements include the proposal of a novel loss function, categorical cross-entropy dice loss, the introduction of squeeze and excitation (SE) blocks, the addition of pyramid pooling, a new fusion technique, and drop input data augmentation. These improvements culminated in the creation of the Fast Thermal Fusion Network (FTFNet). Further improvements were made by introducing depthwise separable convolutional layers, leading to the lightweight FTFNet variants FTFNet Lite 1 & 2. The FTFNet family was trained on the Multi-Spectral Road Scenarios (MSRS) and MIL-Coaxials visual/LWIR datasets. The proposed modifications lead to an improvement over the baseline in mean intersection over union (mIoU) of 2.92% and 2.03% for FTFNet and FTFNet Lite 2, respectively, when trained on the MSRS dataset. Additionally, when trained on the MIL-Coaxials dataset, the FTFNet family showed improvements in mIoU of 8.69%, 4.4%, and 5.0% for FTFNet, FTFNet Lite 1, and FTFNet Lite 2. Neural Networks Semantic Segmentation Sensory Fusion Thermal Imagery Convolutional Neural Networks CNN
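Record 214 attributes the lightweight Lite variants to depthwise separable convolutions. A rough sketch of why that substitution shrinks a layer is to compare weight counts, ignoring biases. The layer sizes below (3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions and are not taken from FTFNet.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(standard, separable, round(standard / separable, 2))
```

The saving grows with the number of output channels, since the k x k spatial filtering is paid once per input channel rather than once per input-output channel pair.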
215	Channel Estimation Optimization in 5G New Radio using Convolutional Neural Networks / Kanalestimeringsoptimering i 5G NR med konvolutionellt neuralt nätverk Adolfsson, David January 2023 (has links) Channel estimation is the process of understanding and analyzing the wireless communication channel's properties. It helps optimize data transmission by providing essential information for adjusting encoding and decoding parameters. This thesis explores using a Convolutional Neural Network (CNN) for channel estimation in the 5G Link Level Simulator, 5G-LLS, developed by Tietoevry. The objectives were to create a Python framework for channel estimation experimentation and to evaluate the CNN's performance compared to the conventional algorithms Least Squares (LS), Minimum Mean Square Error (MMSE) and Linear Minimum Mean Square Error (LMMSE). Two distinct channel model scenarios were investigated in this study. The results from the study suggest that the CNN outperforms LMMSE, LS, and MMSE regarding Mean Squared Error (MSE) for both channel models, with LMMSE in second place. It managed to lower the MSE by 85% compared to the LMMSE for the correlated channel and 78% for the flat fading channel. In terms of the overall system-level performance, as measured by Bit-Error Rate (BER), the CNN only managed to outperform LS and MMSE. The CNN and the LMMSE yielded similar results. This was because the LMMSE's MSE was still good enough to demodulate the symbols for the QPSK modulation scheme correctly. The insights in this thesis work enable Tietoevry to implement more machine learning algorithms and further develop channel estimation in 5G telecommunications and wireless communication networks through experiments in 5G-LLS. Given that the CNN did not increase the performance of the communication system, future studies should test a broader range of channel models and consider more complex modulation schemes. 
Also, studying other and more advanced machine learning techniques than CNN is an avenue for future research. / Kanalestimering är en process i trådlösa kommunikationssystem som handlar om att analysera och förstå det trådlösa mediumets egenskaper. Genom effektiv kanalestimering kan dataöverföringen optimeras genom att anpassa signalen efter den trådlösa kanalen. Detta arbete utforskar användningen av ett konvolutionellt neuralt nätverk (CNN) för kanalestimering i Tietoevrys 5G-datalänkslagersimulator (5G-LLS). Målen är att (1) skapa ett Python-ramverk för kanalestimeringsexperiment samt att (2) utvärdera CNN:s prestanda jämfört med konventionella algoritmerna minsta kvadratmetoden (LS), minimalt medelkvadratsfel (MMSE) och linjärt minimalt medelkvadratsfel (LMMSE). Två olika kanalmodellsituationer undersöks i detta arbete. Resultaten visar att CNN överträffar LMMSE, LS och MMSE i form av medelkvadratisk fel (MSE) för båda kanalmodellerna, med LMMSE på andra plats. CNN:n lyckades minska MSE:n med 85% jämfört med LMMSE för den korrelerade kanalen och med 78% för den snabbt dämpande kanalen. Vad gäller systemnivåprestanda, mätt med hjälp av bitfelsfrekvens (BER), lyckades CNN endast överträffa LS och MMSE. CNN och LMMSE gav liknande resultat. Detta beror på att LMMSE:s MSE fortfarande var tillräckligt låg för att korrekt demodulera symbolerna för QPSK-modulationsschemat. Resultatet från detta examensarbete möjliggör för Tietoevry att implementera fler maskininlärningsalgoritmer och vidareutveckla kanalestimering inom 5G-telekommunikation och trådlösa kommunikationsnätverk genom experiment i 5G-LLS. Med tanke på att CNN inte överträffade samtliga kanalestimeringstekniker bör framtida studier testa ett bredare utbud av kanalmodeller och överväga mer komplexa moduleringsscheman. Framtida arbeten bör även utforska fler och mer avancerade maskininlärningsalgoritmer än CNN. Channel Estimation 5G NR CNN Kanalestimering 5G NR Computer Sciences Datavetenskap (datalogi)
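The LS baseline and the MSE comparison in record 215 can be sketched in a few lines. This is a minimal illustration assuming a per-pilot model y = h * x + n with known unit-magnitude pilot symbols; the channel, pilot, and noise values are made up, and none of the names correspond to the 5G-LLS framework.

```python
def ls_estimate(received, pilots):
    """Least Squares estimate per pilot position: h_hat = y / x."""
    return [y / x for y, x in zip(received, pilots)]


def mse(estimate, truth):
    """Mean squared error between complex channel estimates and the true channel."""
    return sum(abs(e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


def relative_reduction(baseline_mse, new_mse):
    """Fraction by which new_mse is lower than baseline_mse (0.85 means 85% lower)."""
    return 1.0 - new_mse / baseline_mse


# Made-up channel coefficients, pilots, and noise samples for illustration.
channel = [1 + 1j, 0.5 - 0.5j, -0.2 + 0.8j]
pilots = [1 + 0j, -1 + 0j, 1j]
noise = [0.1 + 0j, -0.1j, 0.05 + 0.05j]
received = [h * x + n for h, x, n in zip(channel, pilots, noise)]

h_ls = ls_estimate(received, pilots)
print("LS MSE:", mse(h_ls, channel))
```

LS divides the noise by the pilot along with the signal, so its error equals the noise power at each pilot; estimators such as LMMSE or a trained CNN aim to suppress that noise using channel statistics or learned structure.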
216	Comparison of Discriminative and Generative Image Classifiers Budh, Simon, Grip, William January 2022 (has links) In this report a discriminative and a generative image classifier, used for classification of images with handwritten digits from zero to nine, are compared. The aim of this project was to compare the accuracy of the two classifiers in the absence and presence of perturbations to the images. This report describes the architectures and training of the classifiers using PyTorch. Images were perturbed in four ways for the comparison. The first perturbation was a model-specific attack that perturbed images to maximize the likelihood of misclassification. The other three image perturbations changed pixels in a stochastic fashion. Furthermore, the influence of training with perturbed images on the robustness of the classifier against image perturbations was studied. The conclusions drawn in this report were that the accuracy of the two classifiers on unperturbed images was similar and that the generative classifier was more robust against the model-specific attack. Also, the discriminative classifier was more robust against the stochastic noise and was significantly more robust against image perturbations when trained on perturbed images. / I den här rapporten jämförs en diskriminativ och en generativ bildklassificerare, som används för klassificering av bilder med handskrivna siffror från noll till nio. Syftet med detta projekt var att jämföra träffsäkerheten hos de två klassificerarna med och utan störningar i bilderna. Denna rapport beskriver arkitekturerna och träningen av klassificerarna med hjälp av PyTorch. Bilder förvrängdes på fyra sätt för jämförelsen. Den första bildförvrängningen var en modellspecifik attack som förvrängde bilder för att maximera sannolikheten för felklassificering. De andra tre bildförvrängningarna ändrade pixlar på ett stokastiskt sätt. 
Dessutom studerades inverkan av träning med störda bilder på klassificerarens robusthet mot bildstörningar. Slutsatserna som drogs i denna rapport är att träffsäkerheten hos de två klassificerarna på oförvrängda bilder var likartad och att den generativa klassificeraren var mer robust mot den modellspecifika attacken. Dessutom var den diskriminativa klassificeraren mer robust mot slumpmässiga bildförvrängningar och var betydligt mer robust mot bildstörningar när den tränades på förvrängda bilder. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm Image classification CNN Normalizing flows RealNVP Adversarial examples Elektroteknik och elektronik
217	A LARGE-SCALE UAV AUDIO DATASET AND AUDIO-BASED UAV CLASSIFICATION USING CNN Yaqin Wang (8797037) 17 July 2023 (has links) The growing popularity and increased accessibility of unmanned aerial vehicles (UAVs) have raised concerns about potential threats they may pose. In response, researchers have devoted significant efforts to developing UAV detection and classification systems, utilizing diverse methodologies such as computer vision, radar, radio frequency, and audio-based approaches. However, the availability of publicly accessible UAV audio datasets remains limited. This research addresses that gap by collecting a comprehensive UAV audio dataset and developing a precise and efficient audio-based UAV classification system. This research project is structured into three distinct phases, each serving a unique purpose in data collection and training the proposed UAV classifier. These phases encompass data collection, dataset evaluation, the implementation of a proposed convolutional neural network, training procedures, as well as an in-depth analysis and evaluation of the obtained results. To assess the effectiveness of the model, several evaluation metrics are employed, including training accuracy, loss rate, the confusion matrix, and ROC curves. The findings from this study conclusively demonstrate that the proposed CNN classifier exhibits nearly flawless performance in accurately classifying UAVs across 22 distinct categories. Deep learning Neural networks UAV Classification CNN Audio Classification Deep Learning
218	Ablation Study on Deeplabv3+ for Semantic Segmentation Lei, Bowen 01 September 2023 (has links) (PDF) Semantic segmentation is a fundamental task in computer vision that aims to classify every pixel in an image into different categories. Deep convolutional neural networks (CNNs) have achieved state-of-the-art results in semantic segmentation. Deeplabv3+ is a deep CNN-based model that uses atrous convolution and a decoder network to improve the accuracy of semantic segmentation. In this research, we conduct an ablation study on Deeplabv3+ to analyze the importance of its different components and their impact on the performance of the model, which provides valuable insights for developing more efficient and accurate semantic segmentation models. Our study encompasses a comprehensive examination of Deeplabv3+. We explore its constituent elements, including the backbone network, the Atrous Spatial Pyramid Pooling (ASPP) module, and the decoder network. Our investigation delves into the reasons underlying performance changes resulting from the removal of these architectural components. This analysis provides a deeper understanding of their intrinsic roles in shaping the model’s segmentation efficacy. Notably, we identify that the backbone exerts a substantial impact. Changes to other components yield relatively minor effects, while modifications to the backbone wield a remarkable influence. The Encoder-decoder structure also bears significant weight, playing a pivotal role in the upsampling process. This structure significantly impacts precision, enhancing boundary clarity and positional accuracy. Moreover, we recognize the vital role of feature integration. Features aid in establishing pixel position information, enhancing boundary definition, and positioning accuracy. Furthermore, the ASPP module emerges as a critical factor. ASPP leverages multi-scale information to differentiate complex object boundaries, further enriching the model’s semantic understanding. 
CNN Deeplabv3+ Ablation Study Segmentation Electrical and Computer Engineering
219	Time Series Forecasting on Database Storage Patel, Pranav January 2024 (has links) Time Series Forecasting has become vital in various industries ranging from weather forecasting to business forecasting. There is a need to research database storage solutions for companies in order to optimize resource allocation, enhance decision-making processes and enable predictive data storage maintenance. With the introduction of Artificial Intelligence and its branch Machine Learning, Time Series Forecasting has become more powerful and efficient. This project attempts to validate the possibility of using time series forecasting on database storage data to make business predictions. Currently, the prediction of database storage is an area which is not fully explored, despite the growing necessity of databases. Most of the optimization of databases is left to human touch, which is ultimately slower and more error-prone. As such, this research will investigate the possibilities of time series forecasting in database storage. This project will use Machine Learning and Time Series Forecasting to predict the future trend of database storage to give information on how the trend of the data will change. Examining the pattern of database storage fluctuations will give the respective owners an overview of their storage and, in turn, let them make decisions on optimizing the database to prevent critical problems ahead of time. Three distinct approaches - employing a traditional linear model for forecasting, utilizing a Convolutional Neural Network (CNN) to detect local changes in time series data, and leveraging a Recurrent Neural Network (RNN) to capture long-term temporal dependencies - are implemented to assess which of these techniques is better suited for the provided dataset. Furthermore, two settings (single step and multi step) have been tested in order to measure the change in accuracy from a short prediction horizon to a longer one. 
The research indicates that the models are not yet usable in practice, because the mean absolute error is too large. The main purpose of the project was to establish which of the three different techniques is the best for the particular dataset provided by the company. In general, all approaches (Linear, CNN, RNN) performed better in the single step setting. In the multi step setting, the linear model suffered the greatest drop in accuracy, with CNN and RNN performing slightly better. The findings also indicated that the model with local change detection (CNN) performs better for the provided dataset in both single and multi step settings, as evidenced by its minimal Mean Absolute Error (MAE). This is because the dataset consists of local data and the models are only trained to check for normal changes. If the research had also checked for seasonality or sequential patterns, then it is possible that LSTM may have had a better outcome due to its capability of capturing those dependencies. The accuracy of single step forecasting using CNN is good (MAE = 0.25) but must be further explored and improved. Machine Learning Time Series Forecasting Prediction Neural Networks CNN RNN Database Storage Computer Sciences Datavetenskap (datalogi)
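The single step versus multi step comparison in record 219 comes down to how windows are cut from the series and how MAE is averaged over the forecast horizon. The sketch below assumes a naive persistence forecast (repeat the last observed value) in place of the thesis's Linear, CNN, and RNN models, and the storage values are made up for illustration.

```python
def make_windows(series, input_width, horizon):
    """Split a series into (inputs, labels) pairs for a given forecast horizon."""
    windows = []
    for start in range(len(series) - input_width - horizon + 1):
        inputs = series[start:start + input_width]
        labels = series[start + input_width:start + input_width + horizon]
        windows.append((inputs, labels))
    return windows


def persistence_forecast(inputs, horizon):
    """Stand-in model: repeat the last observed value for every future step."""
    return [inputs[-1]] * horizon


def mean_absolute_error(windows, horizon):
    """MAE averaged over all windows and all steps in the horizon."""
    errors = [abs(pred - actual)
              for inputs, labels in windows
              for pred, actual in zip(persistence_forecast(inputs, horizon), labels)]
    return sum(errors) / len(errors)


# Made-up storage usage values, purely illustrative.
storage = [10, 11, 13, 16, 20, 25, 31, 38]
single_step = mean_absolute_error(make_windows(storage, 3, 1), 1)
multi_step = mean_absolute_error(make_windows(storage, 3, 3), 3)
print(f"single-step MAE={single_step:.2f} multi-step MAE={multi_step:.2f}")
```

On this toy series the multi step MAE is larger than the single step MAE, because errors accumulate over later steps of the horizon; that is the same direction of effect the abstract reports across its models.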
220	Parameter Estimation of LPI Radar in Noisy Environments using Convolutional Neural Networks / Parameteruppskattning av LPI radar i brusiga miljöer med faltningsnätverk Appelgren, Filip January 2021 (has links) Low-probability-of-intercept (LPI) radars are notoriously difficult for electronic support receivers to detect and identify due to their changing radar parameters and low power. Previous work has been done to create autonomous methods that can estimate the parameters of some LPI radar signals, utilizing methods outside of Deep Learning. Designs using the Wigner-Ville Distribution in combination with the Hough and the Radon transform have shown some success. However, these methods lack full autonomous operation, require intermediary steps, and fail to operate at very low Signal-to-Noise ratios (SNR). An alternative method is presented here, utilizing Convolutional Neural Networks, with images created by the Smoothed-Pseudo Wigner-Ville Distribution (SPWVD), to extract parameters. Multiple common LPI modulations are studied: frequency modulated continuous wave (FMCW), Frank codes, and Costas sequences. Five Convolutional Neural Networks (CNNs) of different sizes and layouts are implemented to monitor estimation performance, inference time, and their relationship. Depending on how the parameters are represented, either with continuous values or discrete, they are estimated through different methods, either regression or classification. Performance for the networks’ estimations is presented, along with their inference times and potential maximum throughput of images. The results indicate good performance for the largest networks, across most variables estimated and over a wide range of SNR levels, with decaying performance as network size decreases. The largest network achieves a standard deviation for the estimation errors of, at most, 6%, for the regression variables in the FMCW and the Frank modulations. 
For the parameters estimated through classification, accuracy is at least 56% over all modulations. As network size decreases, so does the inference time. The smallest network achieves a throughput of about 61000 images per second, while the largest achieves 2600. / Low-Probability-of-Intercept (LPI) radar är designad för att vara svåra att upptäcka och identifiera. En LPI radar uppnår detta genom att använda en låg effekt samt ändra något hos radarsignalen över tid, vanligtvis frekvens eller fas. Estimering av parametrarna hos vissa typer av LPI radar har gjorts förut, med andra metoder än djupinlärning. De metoderna har använt sig av Wigner-Ville Distributionen tillsammans med Hough och Radon transformer för att extrahera parametrar. Nackdelarna med dessa är framför allt att de inte fungerar fullständigt i för höga brusnivåer utan blir opålitliga i deras estimeringar. Utöver det kräver de också visst manuellt arbete, t.ex. i form av att sätta tröskelvärden. Här presenteras istället en metod som använder faltningsnätverk tillsammans med bilder som genererats genom Smoothed-Pseudo Wigner-Ville Distributionen, för att estimera parametrarna hos LPI radar. Vanligt förekommande LPI-modulationer studeras, som frequency modulated continuous wave (FMCW), Frank-koder och Costas-sekvenser. Fem faltningsnätverk av olika storlek implementeras, för att kunna studera prestandan, analystiden per bild, och deras förhållande till varandra. Beroende på hur parametrarna representeras, antingen med kontinuerliga värden eller diskreta värden, estimeras de med olika metoder, antingen regression eller klassificering. Prestanda för nätverkens estimeringar presenteras, men också deras analystid och potentiella maximala genomströmning av bilder. Testen för parameterestimering visar på god prestanda, speciellt för de större nätverken som studerats. För det största nätverket är standardavvikelsen på estimeringsfelen som mest 6%, för FMCW- och Frank-modulationerna. 
För alla parametrar som estimeras genom klassificering uppnås som minst 56% precision för det största nätverket. Även i testerna för analystid är nätverksstorlek relevant. När storleken minskar, går antalet beräkningar som behöver göras ned, och bilderna behandlas snabbare. Det minsta nätverket kan analysera ungefär 61000 bilder per sekund, medan det största uppnår ungefär 2600 per sekund. Deep Learning LPI Radar CNN Parameter estimation Djupinlärning LPI Radar Faltningsnätverk Parameterestimering Computer Sciences Datavetenskap (datalogi)

Search results