11

Self-supervised monocular image depth learning and confidence estimation

Chen, L., Tang, W., Wan, Tao Ruan, John, N.W. 17 June 2020 (has links)
No / We present a novel self-supervised framework for monocular image depth learning and confidence estimation. Our framework reduces the amount of ground-truth annotation data required for training Convolutional Neural Networks (CNNs), which is often a challenging problem for the fast deployment of CNNs in many computer vision tasks. Our DepthNet adopts a novel, fully differentiable patch-based cost function built on the Zero-Mean Normalized Cross Correlation (ZNCC), using multi-scale patches as its matching and learning strategy. This approach greatly increases the accuracy and robustness of depth learning. Because ZNCC is a normalized measure of similarity, the patch-based cost naturally yields a 0-to-1 score that can be read as the confidence of the depth estimate; this score is used to self-supervise the training of a parallel network for confidence map learning and estimation. The confidence network therefore also learns in a self-supervised manner and runs in parallel with DepthNet. Evaluations on the KITTI depth prediction benchmark and the Make3D dataset show that our method outperforms state-of-the-art results.
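The ZNCC-based patch cost at the heart of this idea can be illustrated with a small sketch. The following is a minimal NumPy version, assuming grayscale patches and an explicit per-pixel loop; it is not the thesis implementation, only an illustration of how a ZNCC score in [-1, 1] can be rescaled into a 0-to-1 confidence.

```python
import numpy as np

def zncc(patch_a: np.ndarray, patch_b: np.ndarray, eps: float = 1e-6) -> float:
    """Zero-Mean Normalized Cross Correlation between two equally sized patches.

    Returns a value in [-1, 1]; values near 1 indicate a strong match."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps
    return float((a * b).sum() / denom)

def patch_confidence(left: np.ndarray, right_warped: np.ndarray, size: int = 7) -> np.ndarray:
    """Per-pixel confidence map from ZNCC over local patches (borders left at 0)."""
    h, w = left.shape
    half = size // 2
    conf = np.zeros((h, w))
    for y in range(half, h - half):
        for x in range(half, w - half):
            p = left[y - half:y + half + 1, x - half:x + half + 1]
            q = right_warped[y - half:y + half + 1, x - half:x + half + 1]
            conf[y, x] = (zncc(p, q) + 1.0) / 2.0  # map [-1, 1] -> [0, 1]
    return conf
```

In the actual framework the cost would be evaluated densely at multiple patch scales inside the training loss rather than with an explicit Python loop.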
12

Squeeze and Excite Residual Capsule Network for Embedded Edge Devices

Naqvi, Sami 08 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / During recent years, the field of computer vision has evolved rapidly. Convolutional Neural Networks (CNNs) have become the default choice for implementing computer vision tasks. Their popularity rests on how successfully CNNs have performed well-known computer vision tasks such as image annotation and instance segmentation with promising outcomes. However, CNNs have their caveats and need further research to turn them into reliable machine learning algorithms. The disadvantages of CNNs become more evident once one looks at how they break down an input image: they group blobs of pixels to identify objects, which makes them incapable of decomposing the input into sub-parts that would distinguish the orientation and transformation of objects and their parts. The functions in a CNN are competent at learning only the shift-invariant features of the object in an image. These limitations give researchers and developers a reason to keep improving algorithms for computer vision. Several distinct approaches explore this opportunity, each tackling a different set of issues in the convolutional neural network's architecture. The Capsule Network (CapsNet) brings an innovative approach to resolving issues pertaining to affine transformations by sharing transformation matrices between the different levels of capsules, while the Residual Network (ResNet) introduced skip connections, which allow deeper networks to be more powerful and mitigate the vanishing-gradient problem. Motivated by fusing these advantageous ideas of CapsNet and ResNet with the Squeeze-and-Excite (SE) block from the Squeeze-and-Excite Network, this research work presents the SE-Residual Capsule Network (SE-RCN), an efficient neural network model. The proposed model replaces the traditional convolutional layers of CapsNet with skip connections and SE blocks to lower the complexity of CapsNet. The performance of the model is demonstrated on the well-known MNIST and CIFAR-10 datasets, and a substantial reduction in the number of training parameters is observed in comparison to similar neural networks. The proposed SE-RCN uses 6.37 million parameters and reaches 99.71% accuracy on the MNIST dataset, and 10.55 million parameters with 83.86% accuracy on CIFAR-10.
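As a rough illustration of the building blocks named above, here is a minimal PyTorch sketch of a Squeeze-and-Excite block attached to a residual (skip-connected) branch. The layer sizes and reduction ratio are assumptions for illustration, not the SE-RCN architecture itself.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excite: global average pool (squeeze), then a small bottleneck
    MLP produces per-channel weights (excite) that rescale the input."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze: (b, c)
        return x * w.view(b, c, 1, 1)     # excite: channel-wise rescale

class SEResidualBlock(nn.Module):
    """Residual block with an SE block on the main path and a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.se = SEBlock(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.se(self.conv(x)))
```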
13

Deep face recognition using imperfect facial data

Elmahmudi, Ali A.M., Ugail, Hassan 27 April 2019 (has links)
Yes / Today, computer-based face recognition is a mature and reliable technology that is widely used in access control scenarios. As such, face recognition or authentication is predominantly performed using 'perfect' data: full frontal facial images. In reality, however, there are numerous situations where full frontal faces are not available; the imperfect face images that often come from CCTV cameras are a case in point. Hence, computer-based face recognition using partial facial data as probes remains a largely unexplored area of research. Given that humans and computers perform face recognition inherently differently, it is intriguing to understand how a computer favours various parts of the face when presented with the challenge of face recognition. In this work, we explore face recognition using partial facial data. We apply novel experiments to test the performance of machine learning using partial faces and other manipulations of face images, such as rotation and zooming, which we use as training and recognition cues. In particular, we study the rate of recognition subject to various parts of the face, such as the eyes, mouth, nose and cheek. We also study the effect of facial rotation on recognition, as well as the effect of zooming out of the facial images. Our experiments use a state-of-the-art convolutional neural network architecture, with the pre-trained VGG-Face model used to extract features for machine learning. We then use two classifiers, cosine similarity and linear support vector machines, to test the recognition rates. We ran our experiments on two publicly available datasets: the controlled Brazilian FEI dataset and the uncontrolled LFW dataset. Our results show that individual parts of the face, such as the eyes, nose and cheeks, have low recognition rates, though the recognition rate quickly goes up when combinations of facial parts are presented as probes.
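A minimal sketch of the cosine-similarity classification stage is shown below, assuming features have already been extracted (for example with a pre-trained VGG-Face model); the gallery structure and function names are illustrative assumptions, not the thesis code.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors (e.g. deep face descriptors)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def identify(probe: np.ndarray, gallery: dict[str, np.ndarray]) -> str:
    """Return the gallery identity whose enrolled feature is most similar to the probe.

    `probe` could be a feature from a partial face (eyes, nose, cheek, ...),
    while `gallery` maps identity labels to features from full frontal images."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))
```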
14

Deep Learning for Taxonomy Prediction

Ramesh, Shreyas 04 June 2019 (has links)
The last decade has seen great advances in Next-Generation Sequencing technologies, and, as a result, there has been a rise in the number of genomes sequenced each year. In 2017, as many as 10,000 new organisms were sequenced and added to the RefSeq database. Taxonomy prediction is a science involving the hierarchical classification of DNA fragments down to the rank of species. In this research, we introduce Predicting Linked Organisms, or Plinko for short. Plinko is a fully functioning, state-of-the-art predictive system that accurately captures DNA-taxonomy relationships where other state-of-the-art algorithms falter. Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction. In the Plinko strategy, each network takes advantage of different word usage patterns corresponding to different levels of evolutionary divergence. Plinko has the advantages of relatively low storage and GPGPU-parallel training and inference, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction. / Master of Science / Taxonomy prediction is a science involving the hierarchical classification of DNA fragments down to the rank of species. Given the diversity of species on Earth, taxonomy prediction becomes challenging with (i) an increasing number of species (labels) to classify and (ii) a decreasing input (DNA) size. In this research, we introduce Predicting Linked Organisms, or Plinko for short. Plinko is a fully functioning, state-of-the-art predictive system that accurately captures DNA-taxonomy relationships where other state-of-the-art algorithms falter. Three major challenges in taxonomy prediction are (i) large dataset sizes (on the order of 10^9 sequences), (ii) large label spaces (on the order of 10^3 labels), and (iii) low-resolution inputs (100 base pairs or less). Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction for hard-to-classify sequences under the three conditions stated above. Plinko has the advantage of a relatively low storage footprint, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction.
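The compositional, alignment-free idea can be illustrated with a small sketch that turns a DNA fragment into several k-mer frequency "views", one per word length; in a multi-view setup each view would feed its own network branch. The choice of k values here is an assumption for illustration, not the configuration used by Plinko.

```python
import numpy as np
from itertools import product

def kmer_count_view(seq: str, k: int) -> np.ndarray:
    """Normalized k-mer frequency vector (length 4**k) for one DNA fragment."""
    alphabet = "ACGT"
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in index:          # skip k-mers containing N or other symbols
            counts[index[kmer]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def multi_view(seq: str, ks=(3, 4, 5)) -> list[np.ndarray]:
    """One compositional 'view' per word length k."""
    return [kmer_count_view(seq, k) for k in ks]
```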
15

Measuring the Functionality of Amazon Alexa and Google Home Applications

Wang, Jiamin 01 1900 (has links)
A Voice Personal Assistant (VPA) is a software agent that interprets the user's voice commands and responds with appropriate information or actions. Users operate a VPA by voice to complete many tasks, such as reading messages, ordering coffee, sending email, or checking the news. Although this technology brings interesting and useful features, it also poses new privacy and security risks. Current research has focused on proof-of-concept attacks that point out potential ways of launching attacks, e.g., crafting hidden voice commands that trigger malicious actions without the user noticing, or fooling the VPA into invoking the wrong application. However, the lack of a comprehensive understanding of the functionality of skills and their commands prevents us from systematically analyzing the potential threats of these attacks. In this project, we developed convolutional neural networks with active learning, together with a keyword-based approach, to investigate commands according to their capability (information retrieval or action injection) and sensitivity (sensitive or non-sensitive). Through these two levels of analysis, we provide a complete view of VPA skills and their susceptibility to existing attacks. / M.S. / A Voice Personal Assistant (VPA) is a software agent that interprets users' voice commands and responds with appropriate information or actions. The current popular VPAs are Amazon Alexa, Google Home, Apple Siri and Microsoft Cortana. Developers can build and publish third-party applications, called skills on Amazon Alexa and actions on Google Home, on the VPA server. Users simply "talk" to the VPA devices to complete different tasks, like reading messages, ordering coffee, sending email, or checking the news. Although this technology brings interesting and useful features, it also poses new security threats. Recent research has revealed vulnerabilities in VPA ecosystems: users can incorrectly invoke a malicious skill whose name is pronounced similarly to the intended skill, and inaudible voice commands can trigger unintended actions without the user noticing. Existing research has focused on potential ways of launching such attacks. The lack of a comprehensive understanding of the functionality of skills and their commands prevents us from systematically analyzing the potential consequences of these attacks. In this project, we carried out an extensive analysis of third-party applications for Amazon Alexa and Google Home to characterize the attack surface. First, we developed a convolutional neural network with an active learning framework to categorize commands according to their capability, i.e., whether they are information-retrieval or action-injection commands. Second, we employed a keyword-based approach to classify the commands into sensitive and non-sensitive classes. Through these two levels of analysis, we provide a complete view of VPA skills' functionality and their susceptibility to existing attacks.
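A toy sketch of the keyword-based side of this analysis is shown below; the keyword lists and the heuristic for action-injection commands are illustrative assumptions, not the vocabularies or rules used in the project.

```python
# Illustrative keyword lists; the actual vocabularies used in the project are not shown here.
SENSITIVE_KEYWORDS = {"password", "pin", "address", "credit", "card", "email",
                      "phone", "unlock", "pay", "order", "transfer"}
ACTION_VERBS = {"order", "send", "buy", "open", "unlock", "turn", "set", "pay"}

def is_action_injection(command: str) -> bool:
    """Rough capability heuristic: imperatives that change state count as action injection."""
    words = command.lower().split()
    return bool(words) and words[0] in ACTION_VERBS

def is_sensitive(command: str) -> bool:
    """Keyword-based sensitivity check over a lower-cased token set."""
    tokens = set(command.lower().replace(",", " ").split())
    return bool(tokens & SENSITIVE_KEYWORDS)

print(is_sensitive("send my credit card number"), is_action_injection("order a coffee"))
```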
16

Towards Explainable Decision-making Strategies of Deep Convolutional Neural Networks : An exploration into explainable AI and potential applications within cancer detection

Hammarström, Tobias January 2020 (has links)
The influence of Artificial Intelligence (AI) on society is increasing, with applications in highly sensitive and complicated areas. Examples include using deep convolutional neural networks within healthcare for diagnosing cancer. However, the inner workings of such models are often unknown, limiting the much-needed trust in the models. To combat this, Explainable AI (XAI) methods aim to provide explanations of the models' decision-making. Two such methods, Spectral Relevance Analysis (SpRAy) and Testing with Concept Activation Vectors (TCAV), were evaluated on a deep learning model classifying cat and dog images that contained introduced artificial noise. The task was to assess the methods' capability to explain the importance of the introduced noise for the learnt model. The task was constructed as an exploratory step, with the future aim of using the methods on models diagnosing oral cancer. In addition to using the TCAV method as introduced by its authors, this study also utilizes the CAV-sensitivity to introduce and perform a sensitivity magnitude analysis. Both methods proved useful in discerning between the model's two decision-making strategies, based on either the animal or the noise, although greater insight into the intricacies of these strategies is desired. Additionally, the methods provided a deeper understanding of the model's learning, as the model did not seem to properly distinguish between the noise and the animal conceptually. The methods thus accentuated the limitations of the model, thereby increasing our trust in its abilities. In conclusion, the methods show promise for the task of detecting visually distinctive noise in images, which could extend to other distinctive features present in more complex problems. Consequently, more research should be conducted on applying these methods to more complex areas with specialized models and tasks, e.g. oral cancer.
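The CAV-sensitivity and the sensitivity magnitude analysis mentioned above can be sketched as follows, assuming that gradients of the class score with respect to a chosen layer's activations, and a concept activation vector for that layer, are already available; this is a simplified illustration, not the TCAV reference implementation.

```python
import numpy as np

def cav_sensitivity(grad_activations: np.ndarray, cav: np.ndarray) -> np.ndarray:
    """Directional derivative of the class score along the concept activation vector.

    `grad_activations`: (n_examples, n_units) gradients of the class logit with
    respect to a layer's activations; `cav`: vector separating concept examples
    from random examples in that layer (e.g. a linear classifier's weights)."""
    return grad_activations @ (cav / np.linalg.norm(cav))

def tcav_score(sensitivities: np.ndarray) -> float:
    """Fraction of examples whose sensitivity to the concept is positive."""
    return float((sensitivities > 0).mean())

def sensitivity_magnitude(sensitivities: np.ndarray) -> float:
    """Mean absolute sensitivity, a simple magnitude summary of the kind explored here."""
    return float(np.abs(sensitivities).mean())
```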
17

Interiörs påverkan på lägenheters pris och värdering / The effects of interior condition on price and evaluation of real estate

Hemmingsson, Jesper, Häusler Redhe, Adrian January 2021 (has links)
Property valuations have historically been carried out by real estate agents or other experts in the field. With the growing number of online valuation tools, the question arises of how well these tools perform and what can be done to improve them. When a valuation is made with such tools, modern methods start from sales statistics for similar properties, using various forms of metadata such as size, location and year of construction. This study explores the possibility of using interior condition as a variable in apartment valuations by training Convolutional Neural Networks to classify the interior condition of apartments in Stockholm, and by examining the relationship between the interior condition and the error term produced by Booli's valuation algorithm. The classification models were trained on condition data collected from apartment owners together with 200 associated listing images obtained from Booli. The study shows a statistically significant relationship between interior condition and the valuation error: valuations of high-condition apartments tend on average to be 3% too low, while those of low-condition apartments are on average 3% too high. The results indicate that interior condition could be used as a variable to reduce the error of Booli's valuation algorithm, although the experiment in this work did not manage to reduce the error term to any great extent. / Housing evaluations have historically been made in person by real estate agents or other experts. With the growth of online evaluation tools, the question arises of how well they perform and what can be done to improve them. Modern approaches use sales data for similar housing when evaluating a certain house or apartment, with the variables mainly being different forms of metadata such as living area, location and year of construction. This study explores the possibility of using interior condition as a variable in housing evaluations by training Convolutional Neural Networks to classify the condition of kitchens and bathrooms for apartments in Stockholm, Sweden, and by testing the relationship between said conditions and the error of Booli's evaluation algorithm. The classification models were trained on crowd-sourced condition data and 200 advertisement images provided by Booli. The study finds a statistically significant relationship between interior condition and the evaluation error: evaluations tend on average to be 3% too low for high-condition apartments and 3% too high for low-condition apartments. The results indicate that including interior condition as a variable might reduce the error of Booli's evaluation algorithm; however, the experiment in this study did not manage to do so to any sizeable extent.
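A minimal sketch of how the relationship between interior condition and the valuation error term could be examined is shown below; the condition encoding and the toy numbers are assumptions for illustration, not the thesis data or its statistical tests.

```python
import numpy as np

def relative_error(estimate: np.ndarray, sold_price: np.ndarray) -> np.ndarray:
    """Signed relative valuation error: negative means the estimate was too low."""
    return (estimate - sold_price) / sold_price

def mean_error_by_condition(errors: np.ndarray, condition: np.ndarray) -> dict[int, float]:
    """Average error per interior-condition class (e.g. 0 = low, 1 = medium, 2 = high)."""
    return {int(c): float(errors[condition == c].mean()) for c in np.unique(condition)}

# Toy usage with made-up numbers (not the thesis data):
est = np.array([2.9e6, 3.1e6, 4.2e6])
sold = np.array([3.0e6, 3.0e6, 4.0e6])
cond = np.array([2, 1, 0])
print(mean_error_by_condition(relative_error(est, sold), cond))
```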
18

Interpretation of Swedish Sign Language using Convolutional Neural Networks and Transfer Learning

Halvardsson, Gustaf, Peterson, Johanna January 2020 (has links)
The automatic interpretation of signs of a sign language involves image recognition. An appropriate approach for this task is to use deep learning, and in particular, Convolutional Neural Networks. This method typically needs large amounts of data to perform well. Transfer learning could therefore be a feasible approach to achieve high accuracy despite using a small data set. The hypothesis of this thesis is that transfer learning works well for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs, as well as to build a user-friendly web application for this purpose. The final testing accuracy of the model is 85%. Since this accuracy is comparable to those reported in other studies, the project's hypothesis is supported. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step-size factor of 1.2. Transfer learning is used, but not to the extent that the network becomes too specialized in the pre-trained model and its data. The network has been shown to be unbiased on diverse test data sets. Suggestions for future work include integrating dynamic signing data to interpret words and sentences, evaluating the method on another sign language's hand alphabet, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted in sequence. In the long run, this research could benefit deaf people who have access to technology and promote good health, quality education, decent work and reduced inequalities. / The automatic interpretation of signs in a sign language involves image recognition. A suitable approach for this task is deep learning, and more specifically, Convolutional Neural Networks. This method generally needs large amounts of data to perform well, so transfer learning can be a reasonable method for reaching high accuracy despite a small amount of data. The hypothesis of the thesis is to evaluate whether transfer learning works for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs, and to build a user-friendly web application for this purpose. The model classifies 85% of the test instances correctly. Since this accuracy is comparable to those from other studies, it indicates that the project's hypothesis holds. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step factor of 1.2. Transfer learning was used, but not to the level where the network became too specialized in the pre-trained model and its data. The network has proved to be unbiased on the diverse test data set. Suggestions for future work include integrating dynamic sign data to interpret words and sentences, evaluating the method on other sign languages' hand alphabets, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted one after another. In the long run, this study could benefit deaf people who have access to technology, and thereby increase the chances of good health, quality education, decent work and reduced inequalities.
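A hedged Keras sketch of the transfer-learning setup described above (pre-trained InceptionV3, the first five layers frozen, mini-batch SGD with batch size 32) is shown below; the number of output classes, the learning rate and the data pipeline are assumptions for illustration, not the thesis configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26          # assumption: one class per hand-alphabet letter
IMG_SIZE = (299, 299)     # InceptionV3's default input size

base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,), pooling="avg")
for layer in base.layers[:5]:      # freeze the first five layers
    layer.trainable = False

model = models.Sequential([
    base,
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # learning rate is an assumption
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=32, epochs=10, validation_split=0.1)
```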
19

A Deep Learning Approach to Advertisement Detection in Newspapers / Detektion av annonser i Nyhetstidningar med hjälp av djupinlärning

Jonsson, Patrick January 2022 (has links)
Retrieving specific information from newspapers can be a difficult task due to differences in their design, layout, imagery, and typography. Using newspapers from different publishers that are archived at the National Library of Sweden, this thesis aims to train a deep learning model that is able to detect and classify advertisements. Experiments are performed to see how well the models generalize to different publishers, and to a time period that is nearby but outside the time period in which the models were trained. Results from the experiments show that a CNN can detect and classify advertisements to a high degree. Models were found to perform particularly well on data from the same publisher and time period as they were trained on. Performance losses were generally observed when models were tested on other publishers or on another time period than the training data. Further drops in performance were seen when models were tested on a combination of both a different publisher and a different time period. / Retrieving specific information from digitally stored newspapers can be a difficult challenge. This is partly due to the newspapers' varying designs, but also to their use of imagery and typography. In this work, newspapers from different publishers archived at the National Library of Sweden (Kungliga Biblioteket) are used to train machine learning models with the goal of detecting advertisements in newspapers. Experiments are also carried out to investigate how well the trained models generalize to other publishers, and how they generalize to a different time period than the one the models were trained on. The results of the experiments show that a CNN can detect and classify advertisements to a high degree. Models performed best on newspapers from the same time period and publisher they were trained on. Generalization tests showed lower performance when models were tested on other time periods and publishers, particularly when tested on a combination of both a different publisher and a different time period.
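Detection quality in this kind of experiment is commonly scored by the overlap between predicted and annotated advertisement boxes; the following intersection-over-union sketch is a generic illustration of that primitive, not the thesis's evaluation code, and the 0.5 threshold is an assumption.

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A predicted advertisement box counts as correct if it overlaps an annotated box enough:
print(iou((10, 10, 110, 60), (20, 15, 120, 70)) >= 0.5)
```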
20

Age Prediction in Breast Cancer Risk Stratification : Additive Value of Age Prediction on Healthy Mammography Images in Breast Cancer Risk Models

Peterson, Johanna January 2022 (has links)
Breast cancer is the most common cancer type for women worldwide. Early detection is key to improving prognosis and treatment success. A cost-efficient way of finding breast cancer early is mammography screening on a population basis. A major issue with mammography screening is cancers that appear between screening rounds. One method of targeting this issue is breast cancer risk stratification based on healthy mammography scans; however, this method is as of today insufficient. One proposed addition to refine risk stratification is to use artificial-intelligence-guided age prediction. The aim of this study was to investigate to what extent age prediction adds value to breast cancer risk stratification. Convolutional Neural Networks (CNNs) were used to train a model on an age prediction task using healthy mammography scans from the Cohort of Screen-Aged Women. The predicted ages and delta ages, calculated as predicted age minus chronological age, were then added to a logistic regression task together with, and without, the known risk factor mammographic density. The results showed an increase in breast cancer detection for some age groups with the risk model incorporating age prediction. This suggests that age prediction using CNNs might increase breast cancer detection. More studies are needed to confirm these findings. / Breast cancer is the most common cancer type for women globally. Early detection is a key factor in improving prognosis and treatment success. A cost-effective way to find breast cancer early is population-wide mammography screening. One problem with this screening is cancers that arise between screening rounds. One method of addressing this problem is risk stratification, which aims to estimate the risk of developing cancer from healthy mammography images, but this method is currently insufficient. One proposed addition to refine its results is to use artificial-intelligence-guided age prediction. The aim of this study was to investigate to what extent there is an additive value of age prediction in modelling the risk of developing breast cancer. Convolutional Neural Networks (CNNs) were used to train an age prediction model on healthy mammography images from the Cohort of Screen-Aged Women. The predicted ages and the delta ages, calculated as predicted age minus chronological age, were then used as input to a logistic regression task together with, and without, the known risk factor mammographic density. The results showed increased breast cancer detection for some age groups when the risk model included the delta ages. This suggests that age prediction with CNNs may increase breast cancer detection. More studies are needed to confirm these findings.
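A small sketch of the delta-age feature and the logistic-regression risk model described above is shown below, using scikit-learn and made-up numbers; the feature set, values and labels are illustrative assumptions, not the cohort data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def delta_age(predicted_age: np.ndarray, chronological_age: np.ndarray) -> np.ndarray:
    """Delta age: CNN-predicted age minus chronological age."""
    return predicted_age - chronological_age

# Toy cohort with made-up numbers (not the study's data):
chronological = np.array([52.0, 61.0, 47.0, 68.0])
predicted = np.array([55.0, 58.0, 50.0, 69.0])   # ages predicted from healthy mammograms
density = np.array([0.30, 0.12, 0.45, 0.20])     # mammographic density risk factor
y = np.array([1, 0, 1, 0])                       # 1 = later breast cancer diagnosis (illustrative)

# Risk model with the age-prediction features added alongside density:
X = np.column_stack([chronological, predicted, delta_age(predicted, chronological), density])
risk_model = LogisticRegression().fit(X, y)
print(risk_model.predict_proba(X)[:, 1])         # per-woman risk scores
```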
