41

Procedural Pre-Training for Visual Recognition

Anderson, Connor S. 18 June 2024 (has links) (PDF)
Deep learning models can perform many tasks very capably, provided they are trained correctly. Usually, this requires a large amount of data. Pre-training refers to the process of creating a strong initial model by first training it on a large-scale dataset. Such a model can then be adapted to many different tasks while requiring only a comparatively small amount of task-specific training data. Pre-training is the standard approach in most computer vision scenarios, but it is not without drawbacks. Aside from the cost and effort involved in collecting large pre-training datasets, such data may also contain unwanted biases, violations of privacy, inappropriate content, or copyrighted material used without permission. These issues can lead to concerns about the ethical use of models trained on the data. This dissertation explores a different approach to pre-training visual models: using abstract, procedurally generated data. Such data is free from concerns around human bias, privacy, and intellectual property. It also has the potential to scale more easily and to provide precisely controllable sources of supervision that are difficult or impossible to extract from data collected in the wild from sources like the internet. The obvious disadvantage of such data is that it does not model real-world semantics and thus introduces a large domain gap. Surprisingly, however, such pre-training can lead to performance not far below that of models trained in the conventional way. This is shown for different visual recognition tasks, models, and procedural data-generation processes.
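To make the idea of procedural pre-training data concrete, here is a minimal, hypothetical generator sketch. The dissertation's actual generation processes are not described in this abstract, so the shape types, sizes, and the use of generation parameters as free supervision below are illustrative assumptions only.

```python
# Hypothetical sketch: draws random geometric shapes and returns the generation
# parameters themselves as exactly known, bias-free supervision. Not the
# dissertation's generator.
import random
from PIL import Image, ImageDraw

def procedural_sample(size=224, n_shapes=8, seed=None):
    rng = random.Random(seed)
    img = Image.new("RGB", (size, size), tuple(rng.randrange(256) for _ in range(3)))
    draw = ImageDraw.Draw(img)
    labels = []
    for _ in range(n_shapes):
        kind = rng.choice(["ellipse", "rectangle"])
        x0, y0 = rng.randrange(size), rng.randrange(size)
        x1, y1 = x0 + rng.randrange(10, size // 2), y0 + rng.randrange(10, size // 2)
        color = tuple(rng.randrange(256) for _ in range(3))
        getattr(draw, kind)([x0, y0, x1, y1], fill=color)
        labels.append({"kind": kind, "box": (x0, y0, x1, y1), "color": color})
    return img, labels  # image plus perfectly known generation parameters

if __name__ == "__main__":
    image, targets = procedural_sample(seed=0)
    image.save("procedural_000.png")
```

Because every label is derived from the generator's own parameters, such data can be scaled arbitrarily without collecting or annotating real images, which is the property the abstract emphasizes.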
42

An evaluation of image preprocessing for classification of Malaria parasitization using convolutional neural networks / En utvärdering av bildförbehandlingsmetoder för klassificering av malariaparasiter med hjälp av Convolutional Neural Networks

Engelhardt, Erik, Jäger, Simon January 2019 (has links)
In this study, the impact of several image preprocessing methods on Convolutional Neural Networks (CNNs) was evaluated using metrics such as accuracy, precision, recall and F1-score (Hossin et al. 2011). Specifically, the study targets malaria classification using the data set made available by the U.S. National Library of Medicine (Malaria Datasets n.d.). This data set contains images of thin blood smears in which uninfected and parasitized blood cells have been segmented. Three CNN models were proposed for the parasitization classification task. Each model was trained on the original data set and on four preprocessed data sets. The preprocessing methods used to create the four data sets were grayscale conversion, normalization, histogram equalization and contrast limited adaptive histogram equalization (CLAHE). CLAHE preprocessing yielded improvements of 1.46% (model 1) and 0.61% (model 2) in F1-score over the original data set, while one model (model 3) gave inconclusive results. The results show that CNNs can be used for parasitization classification, but the impact of preprocessing is limited.
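For readers who want to reproduce the four preprocessing variants named above, a rough OpenCV sketch follows. It is not the thesis code; the CLAHE clip limit and tile size, and the choice of min-max normalization, are assumed values.

```python
# Illustrative sketch of the four preprocessing variants; parameters are assumptions.
import cv2
import numpy as np

def preprocess(bgr: np.ndarray, method: str) -> np.ndarray:
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    if method == "grayscale":
        return gray
    if method == "normalization":
        # rescale intensities to the full 0-255 range
        return cv2.normalize(bgr, None, 0, 255, cv2.NORM_MINMAX)
    if method == "hist_eq":
        return cv2.equalizeHist(gray)
    if method == "clahe":
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        return clahe.apply(gray)
    raise ValueError(f"unknown method: {method}")
```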
43

Automated image-based recognition and targeted laser transfection techniques for drug development and stem cell research

Yapp, Clarence Han-Wei January 2011 (has links)
Advances in several areas of scientific research are currently hampered by the slow progress in developing a non-viral, high-precision technique capable of safely and efficiently injecting targeted single cells with impermeable molecules. To date, one of the most promising techniques uses a laser to temporarily create a pore in the cell membrane to allow the entry of exogenous molecules. This technique has potentially wide applications. In this thesis, I utilised the precision of laser transfection, also known as optoporation, to deliver two histone demethylase inhibitors (8-hydroxyquinoline and FMF1293) of the JmjC-domain protein JMJD3 into living cells. The enzyme JMJD3 demethylates histone H3 lysine K27, whose methylation state has been shown in previous studies to regulate genes in such a way as to play a key role in the formation of tumours and even the maintenance of stem cell pluripotency. The research here provides proof of principle that optoporation can be employed to quickly screen and test the efficacy of novel drugs by delivering them into cells at very low concentrations while still maintaining inhibition activity. I also used optoporation to deliver relatively large proteins such as bovine serum albumin (BSA), phalloidin and novel synthetic antibodies into living cells without fixatives. This offers the possibility of using reporter systems to monitor living cells over time. Finally, an attempt was made to generate iPS colonies by optoporating plasmid DNA into somatic cells; however, I found that this technique was unable to efficiently transfect and reprogram primary cells. Two automated image-based systems that can be integrated into existing microscopes are presented here. First, an image processing algorithm that can quickly and non-invasively identify stem cell colonies was implemented; when tested, its specificity was excellent (95–98.5%). Second, because optoporation is a manual and time-consuming procedure, an algorithm was developed to automate it by using image processing to locate the positions of cells. To my knowledge, this is the first publication of a system that automates optoporation of human fibroblasts in this way.
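The abstract does not detail the cell-locating algorithm, so the following is only a hedged illustration of one common approach (Otsu thresholding plus contour centroids, OpenCV 4 API) for producing the target coordinates that an automated laser system would visit.

```python
# Hedged illustration only, not the thesis algorithm: Otsu threshold + contour
# centroids to produce (x, y) targets for automated optoporation.
import cv2

def locate_cells(gray, min_area=50):
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    targets = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue  # ignore debris / noise blobs
        m = cv2.moments(c)
        targets.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))  # centroid (x, y)
    return targets
```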
44

Análise colorimétrica de faces humanas: uma abordagem para auxílio ao reconhecimento de imagens / Colorimetric analysis of human faces: an approach to image recognition assistance

Luciana de Sousa Santos 31 July 2013 (has links)
The colorimetric quantification of human facial skin shows a large dispersion of values. This dispersion varies according to the color space (HSV or YCbCr) adopted for the analysis, and the smaller the dispersion, the better suited a color space is to face recognition methods. The objective of this work is to analyze the colorimetric statistical distribution of digitized face images. The analysis indicates whether color coordinates such as hue, saturation and value can assist face recognition techniques, and which of the two color spaces (HSV or YCbCr) is more suitable for face recognition applications. The results are presented in accordance with the fundamentals of information design. The large number of photographic samples available for analysis (530) and the careful balance of lighting, contrast and color temperature are the main distinguishing features of this work.
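A simplified sketch of the comparison described above is given below; it is not the thesis code. It assumes a precomputed skin mask and reads "dispersion" as the per-channel standard deviation of skin pixels in each color space.

```python
# Simplified sketch: per-channel dispersion of skin pixels in HSV vs. YCbCr;
# the color space with the tighter spread would be preferred. Skin mask assumed given.
import cv2
import numpy as np

def channel_dispersion(bgr: np.ndarray, skin_mask: np.ndarray) -> dict:
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    ycc = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)  # OpenCV orders this Y, Cr, Cb
    skin = skin_mask.astype(bool)
    return {
        "HSV": hsv[skin].std(axis=0),    # std of H, S, V over skin pixels
        "YCbCr": ycc[skin].std(axis=0),  # std of Y, Cr, Cb over skin pixels
    }
```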
45

Applications of algebraic geometry to object/image recognition

Abbott, Kevin Toney 02 June 2009 (has links)
In recent years, new approaches to the problem of Automated Target Recognition using techniques of shape theory and algebraic geometry have been explored. The power of this shape theoretic approach is that it allows one to develop tests for object/image matching that do not require knowledge of the object’s position in relation to the sensor nor the internal parameters of the sensor. Furthermore, these methods do not depend on the choice of coordinate systems in which the objects and images are represented. In this dissertation, we will expand on existing shape theoretic techniques and adapt these techniques to new sensor models. In each model, we develop an appropriate notion of shape for our objects and images and define the spaces of such shapes. The goal in each case is to develop tests for matching object and image shapes under an appropriate class of projections. The first tests we develop take the form of systems of polynomial equations (the so-called object/image relations) that check for exact matches of object/image pairs. Later, a more robust approach to matching is obtained by defining metrics on the shape spaces. This allows us in each model to develop a measure of “how close” an object is to being able to produce a given image. We conclude this dissertation by computing a number of examples using these tests for object/image matching.
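As a toy illustration of such coordinate-free matching tests (not the dissertation's actual object/image relations), the cross-ratio of four collinear points is invariant under any 1D projective camera, so equality of cross-ratios is a necessary matching condition that needs no knowledge of the sensor's pose or internal parameters.

```python
# Toy illustration only -- a 1D projective invariant, not the dissertation's
# object/image relations for full 3D-to-2D sensor models.
def cross_ratio(a, b, c, d):
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def could_match(object_pts, image_pts, tol=1e-9):
    """object_pts, image_pts: coordinates of four collinear points each."""
    return abs(cross_ratio(*object_pts) - cross_ratio(*image_pts)) < tol

# Example: image points related to object points by a projective map x -> (2x+1)/(x+3)
obj = [0.0, 1.0, 2.0, 5.0]
img = [(2 * x + 1) / (x + 3) for x in obj]
print(could_match(obj, img))  # True: the invariant agrees
```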
46

Αυτόματη αναγνώριση CAPTCHAs με χρήση τεχνικών ΨΕΕ / Automatic CAPTCHA recognition using digital image processing (ΨΕΕ) techniques

Γκολώνη, Σταυρούλα 06 March 2015 (has links)
It is a fact that in the modern world the Internet offers global communication while creating a global economy. The provision of free services by many websites has led to their systematic abuse for the obvious purpose of profit. CAPTCHAs are deployed as a defence against this malicious new source of income. Their goal is to determine whether a request to a service is made by a human user or by an automated program. Every website that enables users to create their own content or to use its services now needs to include CAPTCHAs. This thesis aims to study and interpret the different kinds of CAPTCHAs that have been created to resist malicious solving attempts, and examines to what extent such resistance is achievable. We first seek to understand exactly what CAPTCHAs are and why their use is necessary. Drawing on specific published studies, we also explore which principles should govern the design of a CAPTCHA. To this end, we review the different approaches and, after presenting them, provide a critical analysis of the methods of each research group.
47

λ-connectedness and its application to image segmentation, recognition and reconstruction

Chen, Li January 2001 (has links)
Seismic layer segmentation, oil-gas boundary surface recognition, and 3D volume data reconstruction are three important tasks in three-dimensional seismic image processing. Geophysical and geological parameters and properties are known to exhibit progressive changes within a layer; however, sudden changes can also occur between two layers. λ-connectedness was proposed to describe such a phenomenon. Based on graph theory, λ-connectedness describes the relationship among pixels in an image. It is proved that λ-connectedness is an equivalence relation; that is, it can be used to partition an image into different classes and hence to perform image segmentation. Using random graph theory and the λ-connectivity of the image, the length of the path in a λ-connected set can be estimated. In addition, normal λ-connected subsets preserve every path that is λ-connected in the subsets. An O(n log n)-time algorithm is designed for normal λ-connected segmentation. The techniques developed are used to find objects in 2D/3D seismic images. Finding the interface between two layers or the boundary surfaces of an oil-gas reservoir is a frequent task; this is equivalent to deciding whether a λ-connected set is an interface or a surface. The problem raised is how to recognize a surface in digital spaces. λ-connectedness is a natural and intuitive way of describing digital surfaces and digital manifolds. Fast algorithms are designed to recognize whether an arbitrary set is a digital surface. Furthermore, the classification theorem of simple surface points is deduced: there are only six classes of simple surface points in 3D digital spaces. Our definition has been proved equivalent to Morgenthaler-Rosenfeld's definition of digital surfaces in direct adjacency. Reconstruction of a surface and of a data volume is important in seismic data processing. Given a set of guiding pixels, the problem of generating a λ-connected surface (a subset of an image) is the inverse problem of λ-connected segmentation. In order to simplify the fitting algorithm, gradual variation, a concept equivalent to λ-connectedness, is used to preserve the continuity of the fitted surface. The key theorem, the necessary and sufficient condition for gradually varied interpolation, has been mathematically proven. A random gradually varied surface fitting is designed, and other theoretical aspects are investigated. These concepts are used to successfully reconstruct 3D volumes of real seismic data. This thesis proposes λ-connectedness and applies it to seismic data processing; it is also used for other problems such as ionogram scaling and object tracking, and has the potential to become a general technique in image processing and computer vision applications. Concepts and knowledge from several areas of mathematics, such as set theory, fuzzy set theory, graph theory, numerical analysis, topology, discrete geometry, computational complexity, and algorithm design and analysis, have been applied in this thesis.
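A hedged sketch of λ-connected segmentation on a 2D grayscale image is shown below. It assumes the common affinity α(x, y) = 1 − |f(x) − f(y)| / H between adjacent pixels (H being the value range), with path connectedness taken as the minimum affinity along a path; the thesis's exact measure may differ. Under that assumption, the λ-connected classes are the connected components of the graph whose edges keep only neighbor pairs with α ≥ λ, so a breadth-first search recovers them.

```python
# Hedged sketch of lambda-connected segmentation; the affinity measure is an
# assumption, not necessarily the thesis's formulation.
from collections import deque
import numpy as np

def lambda_segment(img: np.ndarray, lam: float) -> np.ndarray:
    h, w = img.shape
    H = float(img.max() - img.min()) or 1.0
    labels = -np.ones((h, w), dtype=int)
    current = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            labels[sy, sx] = current
            queue = deque([(sy, sx)])
            while queue:                      # BFS over one lambda-connected component
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1:
                        alpha = 1.0 - abs(float(img[y, x]) - float(img[ny, nx])) / H
                        if alpha >= lam:      # adjacent pixels are lambda-connected
                            labels[ny, nx] = current
                            queue.append((ny, nx))
            current += 1
    return labels
```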
48

Διαχείριση πολιτισμικού περιεχομένου με χρήση φορητών συσκευών σε περιβάλλον Windows phone 7.5 / Cultural content management using mobile devices in a Windows Phone 7.5 environment

Λιβαθινού, Αγγελική 01 October 2012 (has links)
This thesis studies application development technologies for mobile devices, specifically Windows Phone 7.5 (Mango) devices. Particular emphasis was placed on creating multimedia applications with cultural content that are not only functional but also offer a friendly interface, so that they provide a pleasant experience for the end user. With this motivation, we designed and developed Museum Guide, an integrated content management environment intended to guide visitors through a cultural site such as a museum, exploiting the capabilities of a Windows Phone 7.5 device. Using Museum Guide, a museum visitor can view the exhibits along with accompanying material including text, images, audio and video. Visitors can also mark exhibits as favorites and share their experiences with the other members of their group through instant messages; for this, we created an embedded messenger that works over the XMPP protocol using the Matrix library. In addition, Museum Guide lets the user obtain information about a specific exhibit simply by taking a photo of it with the phone's camera. We call this feature on-demand guidance, and we focused in particular on the algorithm that recognizes the user's photo. For this purpose we studied and evaluated several image recognition algorithms and settled on perceptual hashing techniques using the pHash library.
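To illustrate the on-demand recognition idea (the thesis itself used the pHash C++ library on Windows Phone), here is a sketch in Python using the imagehash package, which provides a comparable DCT-based perceptual hash. The exhibit filenames and the distance threshold are assumptions.

```python
# Illustration of the matching idea, not the thesis implementation.
from PIL import Image
import imagehash

# Precompute one perceptual hash per known exhibit photo (paths are hypothetical).
exhibit_hashes = {
    "exhibit_01": imagehash.phash(Image.open("exhibits/01.jpg")),
    "exhibit_02": imagehash.phash(Image.open("exhibits/02.jpg")),
}

def identify(photo_path: str, max_distance: int = 10):
    """Return the closest exhibit, or None if no hash is near enough."""
    query = imagehash.phash(Image.open(photo_path))
    best_id, best_dist = None, max_distance + 1
    for exhibit_id, h in exhibit_hashes.items():
        dist = query - h          # Hamming distance between 64-bit hashes
        if dist < best_dist:
            best_id, best_dist = exhibit_id, dist
    return best_id if best_dist <= max_distance else None
```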
49

Probabilistic incremental learning for image recognition : modelling the density of high-dimensional data

Carvalho, Edigleison Francelino January 2014 (has links)
Nowadays several sensory systems provide data in streams, and these measured observations are frequently high-dimensional, i.e., the number of measured variables is large and the observations arrive in sequence. This is in particular the case for robot vision systems. Unsupervised and supervised learning with such data streams is challenging, because the algorithm should be capable of learning from each observation and then discarding it before considering the next one, whereas many methods require the whole dataset in order to estimate their parameters and are therefore not suitable for online learning. Furthermore, many approaches suffer from the so-called curse of dimensionality (BELLMAN, 1961) and cannot handle high-dimensional input data. To overcome these problems, this work proposes a new probabilistic and incremental neural network model, called Local Projection Incremental Gaussian Mixture Network (LP-IGMN), which is capable of life-long learning with high-dimensional data, i.e., it can learn continuously while considering the stability of the current model's parameters and automatically adjust its topology, taking into account the subspace boundary found by each hidden neuron. The proposed method can find the intrinsic subspace where the data lie, called the principal subspace. Orthogonal to the principal subspace lie the dimensions that are noisy or carry little information, i.e., that have small variance, and these are described by a single estimated parameter. Therefore, LP-IGMN is robust to different sources of data and can cope with a large number of noisy and/or irrelevant variables in the measured data. To evaluate LP-IGMN we conducted several experiments using simulated and real datasets, and we demonstrate several applications of our method in image recognition tasks. The results show that LP-IGMN's performance is competitive with, and usually superior to, other state-of-the-art approaches, and that it can be successfully used in applications that require life-long learning in high-dimensional spaces.
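The per-neuron modeling idea described above, a principal subspace plus a single residual-variance parameter for all orthogonal directions, resembles probabilistic PCA. The following sketch illustrates that density model only; it uses a batch eigendecomposition for clarity and is not the incremental LP-IGMN update rules.

```python
# Hedged sketch of the per-neuron density model (probabilistic-PCA style):
# top-q principal directions kept exactly, one noise variance for the rest.
import numpy as np

def fit_local_projection(X: np.ndarray, q: int):
    """Estimate mean, top-q directions/variances, and residual variance (q < dim)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending order
    order = np.argsort(eigvals)[::-1]
    lam, U = eigvals[order], eigvecs[:, order]
    sigma2 = lam[q:].mean()                  # single parameter for all other directions
    return mu, U[:, :q], lam[:q], sigma2

def log_density(x, mu, U, lam, sigma2):
    d, q = U.shape
    z = x - mu
    t = U.T @ z                              # coordinates in the principal subspace
    logdet = np.sum(np.log(lam)) + (d - q) * np.log(sigma2)
    maha = np.sum(t**2 / lam) + (z @ z - t @ t) / sigma2
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)
```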