201

Analisando a viabilidade de deep learning para reconhecimento de ações em datasets pequenos

Santos Junior, Juarez Monteiro dos 06 March 2018 (has links)
Submitted by PPG Ciência da Computação (ppgcc@pucrs.br) on 2018-05-03T18:10:00Z; approved for entry into archive by Sheila Dias (sheila.dias@pucrs.br) on 2018-05-15T11:13:48Z; made available in DSpace on 2018-05-15T11:30:05Z. No. of bitstreams: 1. JUAREZ_MONTEIRO_DIS.pdf: 4814365 bytes, checksum: 44d808dc5b6459f46854eb7cbd2b78a4 (MD5). Previous issue date: 2018-03-06 / Action recognition is the computer vision task of identifying which action is happening in a given sequence of frames. Traditional approaches rely on handcrafted features and domain-specific algorithms, often resulting in limited accuracy. Substantial advances in deep learning and the availability of larger datasets have enabled techniques that recognize actions directly from raw video sequences, yielding better performance without domain-specific knowledge. However, deep learning algorithms usually require very large labeled datasets for training, and due to their increased capacity they often overfit small datasets, providing lower generalization power. This work explores deep learning in the context of small action recognition datasets. Our goal is to achieve significant performance even when labeled data is not abundant. To do so, we investigate distinct network architectures, data pre-processing, and fusion methods, providing guidelines and good practices for using deep learning on small datasets. / Reconhecimento de ação é a tarefa de visão computacional que identifica qual ação está ocorrendo em dada sequência de frames. Abordagens tradicionais dependem de características extraídas dessas imagens e de algoritmos específicos de domínio, muitas vezes resultando em uma precisão limitada. Os avanços substanciais na aprendizagem profunda e a disponibilidade de conjuntos de dados maiores permitiram que técnicas produzam um bom desempenho sem conhecimento específico do domínio para reconhecer as ações que estão sendo realizadas, tendo como base apenas sequências de vídeo. No entanto, os algoritmos de aprendizagem profunda geralmente requerem conjuntos de dados rotulados muito grandes para o treinamento. Devido à sua maior capacidade, tais algoritmos geralmente sofrem com overfitting em conjuntos de dados pequenos, proporcionando assim um menor poder de generalização. Este trabalho tem como objetivo explorar a aprendizagem profunda no contexto de conjuntos de dados pequenos para reconhecimento de ações. Nosso objetivo é alcançar resultados significativos mesmo nos casos em que os dados rotulados não sejam abundantes. Para isso, investigamos diferentes arquiteturas profundas, diferentes métodos de processamento e diferentes métodos de fusão, fornecendo diretrizes e boas práticas para o aprendizado profundo em conjuntos de dados de tamanho pequeno.
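To make the fusion discussion concrete, here is a minimal sketch of late fusion over per-class scores from two independently trained streams (e.g., RGB frames and optical flow), a common setup in action recognition. The strategies, toy data, and function names are illustrative assumptions, not the dissertation's exact methods.

```python
# Hypothetical illustration: late fusion of per-class softmax scores from
# two independently trained streams (e.g., an RGB stream and a flow stream).
import numpy as np

def late_fusion(scores_a, scores_b, strategy="average", weight=0.5):
    """Combine score matrices of shape (n_samples, n_classes)."""
    if strategy == "average":
        fused = (scores_a + scores_b) / 2.0
    elif strategy == "max":
        fused = np.maximum(scores_a, scores_b)
    elif strategy == "weighted":
        fused = weight * scores_a + (1.0 - weight) * scores_b
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return fused.argmax(axis=1)  # predicted class per sample

# Toy example: 4 samples, 3 action classes.
rng = np.random.default_rng(0)
rgb_scores = rng.dirichlet(np.ones(3), size=4)   # stand-in for RGB-stream softmax
flow_scores = rng.dirichlet(np.ones(3), size=4)  # stand-in for flow-stream softmax
print(late_fusion(rgb_scores, flow_scores, strategy="weighted", weight=0.6))
```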
202

Reconhecimento de imagens de marcas de gado utilizando redes neurais convolucionais e máquinas de vetores de suporte

Santos, Carlos Alexandre Silva dos 26 September 2017 (has links)
Submitted by Marlucy Farias Medeiros (marlucy.farias@unipampa.edu.br) on 2017-10-31T17:44:17Z; approved for entry into archive on 2017-10-31T18:24:21Z; made available in DSpace on 2017-10-31T18:24:21Z. No. of bitstreams: 1. Carlos_Alexandre Silva_dos Santos - 2017.pdf: 27850839 bytes, checksum: c4399fa8396d3b558becbfa67b7dd777 (MD5). Previous issue date: 2017-09-26 / O reconhecimento automático de imagens de marca de gado é uma necessidade para os órgãos governamentais responsáveis por esta atividade. Para auxiliar neste processo, este trabalho propõe uma arquitetura que seja capaz de realizar o reconhecimento automático dessas marcas. Nesse sentido, uma arquitetura foi implementada e experimentos foram realizados com dois métodos: Bag-of-Features e Redes Neurais Convolucionais (CNN). No método Bag-of-Features foi utilizado o algoritmo SURF para extração de pontos de interesse das imagens e, para criação do agrupamento de palavras visuais, foi utilizado o clustering K-means. O método Bag-of-Features apresentou acurácia geral de 86,02% e tempo de processamento de 56,705 segundos para um conjunto de 12 marcas e 540 imagens. No método CNN foi criada uma rede completa com 5 camadas convolucionais e 3 camadas totalmente conectadas. A 1ª camada convolucional teve como entrada imagens transformadas para o formato de cores RGB. Para ativação da CNN foi utilizada a função ReLU, e a técnica de maxpooling para redução. O método CNN apresentou acurácia geral de 93,28% e tempo de processamento de 12,716 segundos para um conjunto de 12 marcas e 540 imagens. O método CNN consiste de seis etapas: a) selecionar o banco de imagens; b) selecionar o modelo de CNN pré-treinado; c) pré-processar as imagens e aplicar a CNN; d) extrair as características das imagens; e) treinar e classificar as imagens utilizando SVM; f) avaliar os resultados da classificação. Os experimentos foram realizados utilizando o conjunto de imagens de marcas de gado de uma prefeitura municipal. Para avaliação do desempenho da arquitetura proposta foram utilizadas as métricas de acurácia geral, recall, precisão, coeficiente Kappa e tempo de processamento. Os resultados obtidos foram satisfatórios: o método CNN apresentou os melhores resultados em comparação ao método Bag-of-Features, sendo 7,26% mais preciso e 43,989 segundos mais rápido. Também foram realizados experimentos com o método CNN em conjuntos de marcas com número maior de amostras, obtendo taxas de acurácia geral de 94,90% para 12 marcas e 840 imagens, e 80,57% para 500 marcas e 22.500 imagens, respectivamente. / The automatic recognition of cattle branding is a necessity for the government agencies responsible for this activity. To improve this process, this work proposes an architecture capable of performing the automatic recognition of these brandings. The proposed software implements two methods, namely Bag-of-Features and CNN. For the Bag-of-Features method, the SURF algorithm was used to extract points of interest from the images, and K-means clustering was used to create the visual word clusters. The Bag-of-Features method presented an overall accuracy of 86.02% and a processing time of 56.705 seconds on a set containing 12 brandings and 540 images. For the CNN method, we created a complete network with five convolutional layers and three fully connected layers. The first convolutional layer took as input images converted to the RGB color format. The ReLU activation function was used, along with the maxpooling technique for reduction. The CNN method presented an overall accuracy of 93.28% and a processing time of 12.716 seconds on a set containing 12 brandings and 540 images. The CNN method comprises six steps: a) selecting the image database; b) selecting the pre-trained CNN model; c) pre-processing the images and applying the CNN; d) extracting the features from the images; e) training and classifying the images using SVM; f) assessing the classification results. The experiments were performed using the cattle branding image set of a city hall. The metrics of overall accuracy, recall, precision, Kappa coefficient, and processing time were used to assess the performance of the proposed architecture. Results were satisfactory: the CNN method showed the best results compared to the Bag-of-Features method, being 7.26% more accurate and 43.989 seconds faster. Experiments were also conducted with the CNN method on sets of brandings with a greater number of samples, which yielded overall accuracy rates of 94.90% for 12 brandings and 840 images, and 80.57% for 500 brandings and 22,500 images, respectively.
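Steps (b) through (e) of the CNN pipeline above can be sketched as follows: extract features with a pre-trained CNN, then train an SVM on them. The ResNet-18 backbone and the random stand-in images are assumptions for illustration; the abstract does not name the pre-trained model used.

```python
# A minimal sketch of the pre-trained-CNN-features + SVM pipeline.
import torch
import torchvision.models as models
from sklearn.svm import SVC

# (b) pre-trained CNN; the final classification layer is removed so the
# network outputs one feature vector per image.
backbone = models.resnet18(weights="IMAGENET1K_V1")
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

# (c)-(d) stand-in batch of RGB images (real code would load and normalize
# the cattle-branding images), then extract features.
images = torch.randn(40, 3, 224, 224)
with torch.no_grad():
    feats = feature_extractor(images).flatten(1).numpy()  # shape (40, 512)

# (e) train and evaluate an SVM classifier on the extracted features.
labels = [i % 4 for i in range(40)]  # toy labels for 4 hypothetical brandings
clf = SVC(kernel="linear").fit(feats[:32], labels[:32])
print("toy accuracy:", clf.score(feats[32:], labels[32:]))
```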
203

Structural priors in deep neural networks

Ioannou, Yani Andrew January 2018 (has links)
Deep learning has in recent years come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, there remains a lack of understanding of both the optimization and the structure of deep networks. The approach advocated by many researchers in the field has been to train monolithic networks with excess complexity and strong regularization, an approach that leaves much to be desired in efficiency. Instead, we propose that carefully designing networks in consideration of our prior knowledge of the task and learned representation can improve the memory and compute efficiency of state-of-the-art networks, and even improve generalization; we propose to denote such design choices as structural priors. We present two such novel structural priors for convolutional neural networks and evaluate them in state-of-the-art image classification CNN architectures. The first method exploits our knowledge of the low-rank nature of most filters learned for natural images by structuring a deep network to learn a collection of mostly small, low-rank filters. The second addresses the filter/channel extents of convolutional filters by learning filters with limited channel extents. The size of these channel-wise basis filters increases with the depth of the model, giving a novel sparse connection structure that resembles a tree root. Both methods are found to improve the generalization of these architectures while also decreasing the size and increasing the efficiency of their training and test-time computation. Finally, we present work towards conditional computation in deep neural networks, moving towards a method of automatically learning structural priors in deep networks. We propose a new discriminative learning model, conditional networks, that jointly exploits the accurate representation-learning capabilities of deep neural networks and the efficient conditional computation of decision trees. Conditional networks yield smaller models and offer test-time flexibility in the trade-off of computation vs. accuracy.
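The first structural prior can be illustrated with a spatial low-rank factorization, replacing a full k x k convolution with a 1 x k followed by a k x 1 convolution. A rough sketch, with layer sizes and rank chosen only to show the parameter savings, not the configurations used in the thesis:

```python
# Low-rank sketch: factor a full 3x3 convolution into 1x3 and 3x1 filters.
import torch.nn as nn

def full_conv(c_in, c_out, k=3):
    return nn.Conv2d(c_in, c_out, kernel_size=k, padding=k // 2)

def low_rank_conv(c_in, c_out, k=3):
    # A 1xk conv followed by a kx1 conv spans the same receptive field
    # with far fewer parameters when c_in and c_out are large.
    mid = c_out // 2  # rank of the factorization (an assumption)
    return nn.Sequential(
        nn.Conv2d(c_in, mid, kernel_size=(1, k), padding=(0, k // 2)),
        nn.Conv2d(mid, c_out, kernel_size=(k, 1), padding=(k // 2, 0)),
    )

def n_params(m):
    return sum(p.numel() for p in m.parameters())

print("full 3x3:", n_params(full_conv(256, 256)))      # ~590k parameters
print("low-rank:", n_params(low_rank_conv(256, 256)))  # ~197k parameters
```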
204

An analysis of hierarchical text classification using word embeddings

Stein, Roger Alan 28 March 2018 (has links)
Submitted by JOSIANE SANTOS DE OLIVEIRA (josianeso) on 2019-03-07T14:41:05Z; made available in DSpace on 2019-03-07T14:41:05Z. No. of bitstreams: 1. Roger Alan Stein_.pdf: 476239 bytes, checksum: a87a32ffe84d0e5d7a882e0db7b03847 (MD5). Previous issue date: 2018-03-28 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the effectiveness of such techniques has not yet been assessed for hierarchical text classification (HTC). This study investigates the application of those models and algorithms to this specific problem by means of experimentation and analysis. Classification models were trained with prominent machine learning algorithm implementations (fastText, XGBoost, and Keras' CNN) and well-known word embedding generation methods (GloVe, word2vec, and fastText) on publicly available data, and were evaluated with measures specifically appropriate for the hierarchical context. FastText achieved an LCAF1 of 0.871 on a single-labeled version of the RCV1 dataset. The analysis of the results indicates that using word embeddings is a very promising approach for HTC. / Modelos eficientes de representação numérica textual (word embeddings) combinados com algoritmos modernos de aprendizado de máquina têm recentemente produzido uma melhoria considerável em tarefas de classificação automática de documentos. Contudo, a efetividade de tais técnicas ainda não foi avaliada com relação à classificação hierárquica de texto. Este estudo investiga a aplicação daqueles modelos e algoritmos neste problema em específico através de experimentação e análise. Modelos de classificação foram treinados usando implementações proeminentes de algoritmos de aprendizado de máquina (fastText, XGBoost e CNN com Keras) e notórios métodos de geração de word embeddings (GloVe, word2vec e fastText) com dados disponíveis publicamente, e avaliados usando métricas especificamente adequadas ao contexto hierárquico. Nesses experimentos, fastText alcançou um LCAF1 de 0,871 usando uma versão da base de dados RCV1 com apenas uma categoria por tupla. A análise dos resultados indica que a utilização de word embeddings é uma abordagem muito promissora para classificação hierárquica de texto.
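As an illustration of the kind of supervised fastText experiment described, the sketch below trains a classifier on a tiny inline corpus with hierarchical categories flattened to leaf labels, as in a single-labeled RCV1 setup. The data and hyperparameters are placeholders and do not reproduce the thesis' configuration.

```python
# A toy supervised fastText run; labels use the __label__ prefix fastText expects.
import fasttext

train_lines = [
    "__label__markets stocks rallied as bond yields fell",
    "__label__markets oil futures climbed on supply concerns",
    "__label__sports the home side won the championship final",
    "__label__sports the striker scored twice in the derby",
]
with open("toy_train.txt", "w") as f:
    f.write("\n".join(train_lines))

# Word bigrams often help short-text classification; 25 epochs is arbitrary here.
model = fasttext.train_supervised(input="toy_train.txt", epoch=25, wordNgrams=2)
print(model.predict("shares surged after the earnings report"))
```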
205

Towards non-conventional face recognition : shadow removal and heterogeneous scenario / Vers la reconnaissance faciale non conventionnelle : suppression des ombres et scénario hétérogène

Zhang, Wuming 17 July 2017 (has links)
Ces dernières années, la biométrie a fait l’objet d’une grande attention en raison du besoin sans cesse croissant d’authentification d’identité, notamment pour sécuriser de plus en plus d’applications enlignes. Parmi divers traits biométriques, le visage offre des avantages compétitifs sur les autres, e.g., les empreintes digitales ou l’iris, car il est naturel, non-intrusif et facilement acceptable par les humains. Aujourd’hui, les techniques conventionnelles de reconnaissance faciale ont atteint une performance quasi-parfaite dans un environnement fortement contraint où la pose, l’éclairage, l’expression faciale et d’autres sources de variation sont sévèrement contrôlées. Cependant, ces approches sont souvent confinées aux domaines d’application limités parce que les environnements d’imagerie non-idéaux sont très fréquents dans les cas pratiques. Pour relever ces défis d’une manière adaptative, cette thèse porte sur le problème de reconnaissance faciale non contrôlée, dans lequel les images faciales présentent plus de variabilités sur les éclairages. Par ailleurs, une autre question essentielle vise à profiter des informations limitées de 3D pour collaborer avec les techniques basées sur 2D dans un système de reconnaissance faciale hétérogène. Pour traiter les diverses conditions d’éclairage, nous construisons explicitement un modèle de réflectance en caractérisant l’interaction entre la surface de la peau, les sources d’éclairage et le capteur de la caméra pour élaborer une explication de la couleur du visage. A partir de ce modèle basé sur la physique, une représentation robuste aux variations d’éclairage, à savoir Chromaticity Invariant Image (CII), est proposée pour la reconstruction des images faciales couleurs réalistes et sans ombre. De plus, ce processus de la suppression de l’ombre en niveaux de couleur peut être combiné avec les techniques existantes sur la normalisation d’éclairage en niveaux de gris pour améliorer davantage la performance de reconnaissance faciale. Les résultats expérimentaux sur les bases de données de test standard, CMU-PIE et FRGC Ver2.0, démontrent la capacité de généralisation et la robustesse de notre approche contre les variations d’éclairage. En outre, nous étudions l’usage efficace et créatif des données 3D pour la reconnaissance faciale hétérogène. Dans un tel scénario asymétrique, un enrôlement combiné est réalisé en 2D et 3D alors que les images de requête pour la reconnaissance sont toujours les images faciales en 2D. A cette fin, deux Réseaux de Neurones Convolutifs (Convolutional Neural Networks, CNN) sont construits. Le premier CNN est formé pour extraire les descripteurs discriminants d’images 2D/3D pour un appariement hétérogène. Le deuxième CNN combine une structure codeur-décodeur, à savoir U-Net, et Conditional Generative Adversarial Network (CGAN), pour reconstruire l’image faciale en profondeur à partir de son homologue dans l’espace 2D. Plus particulièrement, les images reconstruites en profondeur peuvent être également transmise au premier CNN pour la reconnaissance faciale en 3D, apportant un schéma de fusion qui est bénéfique pour la performance en reconnaissance. Notre approche a été évaluée sur la base de données 2D/3D de FRGC. Les expérimentations ont démontré que notre approche permet d’obtenir des résultats comparables à ceux de l’état de l’art et qu’une amélioration significative a pu être obtenue à l’aide du schéma de fusion. 
/ In recent years, biometrics have received substantial attention due to the ever-growing need for automatic individual authentication. Among the various physiological biometric traits, the face offers unmatched advantages over others, such as fingerprints and iris, because it is natural, non-intrusive and easily understandable by humans. Nowadays, conventional face recognition techniques have attained quasi-perfect performance in highly constrained environments wherein poses, illuminations, expressions and other sources of variation are strictly controlled. However, these approaches are always confined to restricted application fields because non-ideal imaging environments are frequently encountered in practical cases. To adaptively address these challenges, this dissertation focuses on the unconstrained face recognition problem, where face images exhibit more variability in illumination. Moreover, another major question is how to leverage limited 3D shape information to work jointly with 2D-based techniques in a heterogeneous face recognition system. To deal with the problem of varying illumination, we explicitly build the underlying reflectance model that characterizes the interactions between skin surface, lighting source and camera sensor, and elaborate the formation of face color. With this physics-based image formation model, an illumination-robust representation, namely the Chromaticity Invariant Image (CII), is proposed, which can subsequently help reconstruct shadow-free and photo-realistic color face images. Because this shadow-removal process is achieved in color space, the approach can be combined with existing gray-scale lighting normalization techniques to further improve face recognition performance. The experimental results on two benchmark databases, CMU-PIE and FRGC Ver2.0, demonstrate the generalization ability and robustness of our approach to lighting variations. We further explore the effective and creative use of 3D data in heterogeneous face recognition. In such a scenario, 3D faces are available only in the gallery set and not in the probe set, as one would encounter in real-world applications. Two Convolutional Neural Networks (CNNs) are constructed for this purpose. The first CNN is trained to extract discriminative features of 2D/3D face images for direct heterogeneous comparison, while the second CNN combines an encoder-decoder structure, namely U-Net, with a Conditional Generative Adversarial Network (CGAN) to reconstruct the depth face image from its 2D counterpart. Specifically, the recovered depth face images can also be fed to the first CNN for 3D face recognition, leading to a fusion scheme that achieves gains in recognition performance. We have evaluated our approach extensively on the challenging FRGC 2D/3D benchmark database. The proposed method compares favorably to the state of the art and shows significant improvement with the fusion scheme.
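The CII derivation itself is not given in the abstract, but the flavor of such an illumination-robust representation can be sketched with generic log-chromaticity coordinates, which cancel intensity and shading by normalizing each channel by the geometric mean. This is a standard construction from the shadow-removal literature and only an assumption about the spirit of the method, not the thesis' exact formulation.

```python
# Generic log-chromaticity sketch of an illumination-robust representation.
import numpy as np

def log_chromaticity(rgb, eps=1e-6):
    """rgb: float array (H, W, 3) in (0, 1]. Returns (H, W, 2) coordinates.

    Dividing by the geometric mean of the channels cancels shading and
    overall intensity, leaving a 2-D chromaticity that varies little under
    illumination changes for approximately Planckian lighting.
    """
    rgb = np.clip(rgb.astype(np.float64), eps, None)
    geo_mean = rgb.prod(axis=2, keepdims=True) ** (1.0 / 3.0)
    log_chrom = np.log(rgb / geo_mean)  # (H, W, 3); channels sum to ~0
    return log_chrom[..., :2]           # only 2 coordinates are independent

demo = np.random.default_rng(1).random((4, 4, 3))
print(log_chromaticity(demo).shape)  # (4, 4, 2)
```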
206

Uma abordagem de redes neurais convolucionais para análise de sentimento multi-lingual

Becker, Willian Eduardo 24 November 2017 (has links)
Submitted by PPG Ciência da Computação (ppgcc@pucrs.br) on 2018-09-03T14:11:33Z; approved for entry into archive by Sheila Dias (sheila.dias@pucrs.br) on 2018-09-04T14:43:25Z; made available in DSpace on 2018-09-04T14:57:29Z. No. of bitstreams: 1. WILLIAN EDUARDO BECKER_DIS.pdf: 2142751 bytes, checksum: e6501a586bb81f7cbad7fa5ef35d32f2 (MD5). Previous issue date: 2017-11-24 / Nowadays, the use of social media has become a daily activity of our society. The huge and uninterrupted flow of information in these spaces opens up the possibility of exploring this data in different ways. Sentiment Analysis (SA) is a task that aims to obtain knowledge about the polarity of a given text, relying on several techniques of Natural Language Processing, with most solutions dealing with only one language at a time. However, approaches that are not restricted to exploring only one language come closer to extracting the full knowledge and possibilities of these data. Recent approaches based on Machine Learning propose to solve SA mainly by using Deep Neural Networks, which have obtained good results in this task. In this work, three Convolutional Neural Network architectures that deal with multilingual Twitter data in four languages are proposed. The first and second proposed models are characterized by requiring substantially fewer learnable parameters than the other considered baselines while being more accurate than several other deep neural architectures. The third proposed model is able to perform multitask classification by identifying both the polarity of given sentences and their language. This model reaches an accuracy of 74.43% for SA and 98.40% for Language Identification on the four-language multilingual dataset. Results confirm that the proposed model is the best choice for both sentiment and language classification, outperforming the considered baselines. / A utilização de redes sociais tornou-se uma atividade cotidiana na sociedade atual. Com o enorme e ininterrupto fluxo de informações geradas nestes espaços, abre-se a possibilidade de explorar estes dados de diversas formas. A Análise de Sentimento (AS) é uma tarefa que visa obter conhecimento sobre a polaridade das mensagens postadas, através de diversas técnicas de Processamento de Linguagem Natural, onde a maioria das soluções lida com somente um idioma de cada vez. Entretanto, abordagens que não se restringem a explorar somente uma língua estão mais próximas de extraírem todo o conhecimento e as possibilidades destes dados. Abordagens recentes baseadas em Aprendizado de Máquina propõem-se a resolver a AS apoiando-se principalmente nas Redes Neurais Profundas (Deep Learning), as quais obtiveram bons resultados nesta tarefa. Neste trabalho são propostas três arquiteturas de Redes Neurais Convolucionais que lidam com dados multi-linguais extraídos do Twitter contendo quatro línguas. Os dois primeiros modelos propostos caracterizam-se pelo fato de possuírem um total de parâmetros muito menor que os demais baselines considerados e, ainda assim, obtêm resultados superiores com uma boa margem de diferença. O último modelo proposto é capaz de realizar uma classificação multitarefa, identificando a polaridade das sentenças e também a língua. Com este último modelo obtém-se uma acurácia de 74.43% para AS e 98.40% para Identificação da Língua em um dataset com quatro línguas, mostrando-se a melhor escolha entre todos os baselines analisados.
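A multitask text CNN with separate polarity and language heads, in the spirit of the third model, can be sketched in Keras as follows. All layer sizes, the vocabulary size, and the head names are assumptions for illustration; the thesis' exact architecture is not reproduced here.

```python
# A two-headed text CNN: one softmax head per task, trained jointly.
from tensorflow import keras
from tensorflow.keras import layers

VOCAB, MAXLEN = 20000, 50
inp = keras.Input(shape=(MAXLEN,), dtype="int32")   # token-id sequences
x = layers.Embedding(VOCAB, 128)(inp)
x = layers.Conv1D(128, 5, activation="relu")(x)     # convolution over word windows
x = layers.GlobalMaxPooling1D()(x)

sentiment = layers.Dense(2, activation="softmax", name="sentiment")(x)
language = layers.Dense(4, activation="softmax", name="language")(x)  # 4 languages

model = keras.Model(inp, [sentiment, language])
model.compile(
    optimizer="adam",
    loss={"sentiment": "sparse_categorical_crossentropy",
          "language": "sparse_categorical_crossentropy"},
)
model.summary()
```

Sharing the convolutional trunk between the two heads is what lets a single model predict polarity and language at once with few extra parameters.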
207

Tactile Sensing and Position Estimation Methods for Increased Proprioception of Soft-Robotic Platforms

Day, Nathan McClain 01 July 2018 (has links)
Soft robots have the potential to transform the way robots interact with their environment. This is due to their low inertia and their inherent ability to interact more safely with the world without damaging themselves or the people around them. However, existing sensing for soft robots has at least partially limited their ability to control interactions with their environment. Tactile sensors could enable soft robots to sense interaction, but most tactile sensors are made from rigid substrates and are not well suited to applications where soft robots deform. In addition, the benefit of being able to cheaply manufacture soft robots may be lost if the tactile sensors that cover them are expensive and their resolution does not scale well for manufacturability. Soft robots not only need to know the interaction forces due to contact with their environment, they also need to know where they are in Cartesian space. Because soft robots lack a rigid structure, traditional methods of joint estimation found in rigid robots cannot be employed on soft robotic platforms, which requires a different approach to soft robot pose estimation. This thesis discusses both tactile force sensing and pose estimation methods for soft robots. A method is developed to make affordable, high-resolution tactile sensor arrays (manufactured in rows and columns) that can be used for sensorizing soft robots and other soft bodies. However, the construction results in a sensor array that exhibits significant cross-talk when two taxels in the same row are compressed. Using the same fabric-based tactile sensor array construction, two different methods for cross-talk compensation are presented. The first uses a mathematical model to calculate the change in resistance of each taxel directly. The second introduces additional simple circuit components that electrically isolate each taxel and relate voltage to force directly. This thesis also discusses various approaches to soft robot pose estimation, along with a method for characterizing sensors using machine learning. Particular emphasis is placed on the effectiveness of parameter-based learning versus parameter-free learning, in order to determine which method of machine learning is more appropriate and accurate for soft robot pose estimation. Various machine learning architectures, such as recursive neural networks and convolutional neural networks, are also tested to demonstrate the most effective architecture for characterizing soft-robot sensors.
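To illustrate why row-column resistive arrays exhibit cross-talk, the toy model below treats the reading of one taxel as the parallel combination of the direct path and three-taxel "sneak paths" through neighboring rows and columns. This generic sneak-path approximation is an assumption for illustration, not the compensation model developed in the thesis.

```python
# Toy sneak-path model of cross-talk in a resistive row-column tactile array.
import numpy as np

def measured_resistance(R, r, c):
    """R: (rows, cols) array of true taxel resistances, in ohms."""
    rows, cols = R.shape
    conductance = 1.0 / R[r, c]  # direct path through the addressed taxel
    for rp in range(rows):
        for cp in range(cols):
            if rp != r and cp != c:
                # Current can sneak through (r, cp) -> (rp, cp) -> (rp, c),
                # three taxels in series, in parallel with the direct path.
                sneak = R[r, cp] + R[rp, cp] + R[rp, c]
                conductance += 1.0 / sneak
    return 1.0 / conductance

true_R = np.full((3, 3), 10e3)   # 10 kOhm unpressed taxels
true_R[0, 0] = 2e3               # one pressed taxel sharing a row/column
print(measured_resistance(true_R, 0, 1))  # reads lower than the true 10 kOhm
```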
208

Motion-Induced Artifact Mitigation and Image Enhancement Strategies for Four-Dimensional Fan-Beam and Cone-Beam Computed Tomography

Riblett, Matthew J 01 January 2018 (has links)
Four-dimensional imaging has become part of the standard of care for diagnosing and treating non-small cell lung cancer. In radiotherapy applications, 4D fan-beam computed tomography (4D-CT) and 4D cone-beam computed tomography (4D-CBCT) are two advanced imaging modalities that afford clinical practitioners knowledge of the underlying kinematics and structural dynamics of diseased tissues and provide insight into the effects of regular organ motion and the nature of tissue deformation over time. While these imaging techniques can facilitate the use of more targeted radiotherapies, issues surrounding image quality and accuracy currently limit their clinical utility. The purpose of this project is to develop methods that retrospectively compensate for anatomical motion in 4D-CBCT and correct motion artifacts present in 4D-CT, improving the image quality of the reconstructed volumes and assisting in the localization of respiration-influenced diseased tissue and mobile structures of interest. In the first half of the project, a series of motion compensation (MoCo) workflow methods incorporating groupwise deformable image registration and projection-warped reconstruction were developed for use with 4D-CBCT imaging. In the latter half of the project, novel motion artifact observation and artifact-weighted groupwise registration-based image correction algorithms were designed and tested. Both deliverable components of this project were evaluated for their ability to enhance image quality when applied to clinical patient datasets, and demonstrated qualitative and quantitative improvements over the current state of the art.
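The core operation in such MoCo workflows, deformable registration between respiratory phases, can be sketched with an off-the-shelf Demons filter. The SimpleITK-based example below, run on synthetic 2D "phases", is a generic illustration only; it is not the registration method used in this project.

```python
# Deformable registration of two synthetic respiratory "phases" with Demons.
import numpy as np
import SimpleITK as sitk

# Two synthetic 2-D phases: a blob that has shifted between phases.
fixed_arr = np.zeros((64, 64), dtype=np.float32)
moving_arr = np.zeros((64, 64), dtype=np.float32)
fixed_arr[20:40, 20:40] = 1.0
moving_arr[24:44, 22:42] = 1.0
fixed = sitk.GetImageFromArray(fixed_arr)
moving = sitk.GetImageFromArray(moving_arr)

demons = sitk.DemonsRegistrationFilter()
demons.SetNumberOfIterations(50)
demons.SetStandardDeviations(1.0)
displacement = demons.Execute(fixed, moving)  # dense deformation field

# Warp the moving phase onto the fixed phase with the estimated field.
transform = sitk.DisplacementFieldTransform(displacement)
warped = sitk.Resample(moving, fixed, transform)
print(sitk.GetArrayFromImage(warped).sum())
```

In a groupwise setting, every phase would be registered toward a common (e.g., mean) reference rather than pairwise, so no single phase biases the result.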
209

DRESS & GO: Deep belief networks and Rule Extraction Supported by Simple Genetic Optimization

Švaralová, Monika January 2018 (has links)
Recent developments in social media and web technologies offer new opportunities to access, analyze and process ever-increasing amounts of fashion-related data. In the appealing context of design and fashion, our main goal is to automatically suggest fashionable outfits based on preferences extracted from real-world data provided either by individual users or gathered from the internet. In our case, the clothing items have the form of 2D images. Especially for visual data processing tasks, recent models of deep neural networks are known to surpass human performance. This fact inspired us to apply the idea of transfer learning to understand the actual variability in clothing items. The principle of transfer learning consists in extracting the internal representations formed in large convolutional networks pre-trained on general datasets, e.g., ImageNet, and visualizing their (similarity) structure. Together with transfer learning, clustering algorithms and image color schemes can be utilized when searching for related outfit items. Viable means of generating new outfits include deep belief networks and genetic algorithms enhanced by a convolutional network that models the outfit fitness. Although fashion-related recommendations remain highly subjective, the results we have achieved...
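The transfer-learning step described above can be sketched by reusing the internal representation of an ImageNet-pre-trained network and clustering clothing images by feature similarity. The backbone choice and the random stand-in data are assumptions for illustration.

```python
# Transfer learning sketch: pre-trained features + K-means clustering.
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

backbone = models.resnet18(weights="IMAGENET1K_V1")
encoder = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

clothing = torch.randn(24, 3, 224, 224)  # stand-in for 2D clothing images
with torch.no_grad():
    embeddings = encoder(clothing).flatten(1).numpy()  # one vector per item

# Group items by similarity in the learned feature space; items in the same
# cluster become candidates for "related outfit item" suggestions.
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(embeddings)
print(clusters)
```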
210

Real-time 3D Semantic Segmentation of Timber Loads with Convolutional Neural Networks

Sällqvist, Jessica January 2018 (has links)
Volume measurements of timber loads are done in conjunction with timber trade. When dealing with goods of major economic value such as these, it is important to achieve an impartial and fair assessment when determining price-based volumes. With the help of Saab's missile targeting technology, CIND AB develops products for digital volume measurement of timber loads. Currently there is a system in operation that automatically reconstructs timber trucks in motion to create measurable images of them. Future iterations of the system are expected to fully automate the scaling by generating a volumetric representation of the timber and calculating its external gross volume. The first challenge towards this development is to separate the timber load from the truck. This thesis aims to evaluate and implement an appropriate method for semantic pixel-wise segmentation of timber loads in real time. Image segmentation is a classic but difficult problem in computer vision. To achieve greater robustness, it is therefore important to carefully study and make use of the conditions given by the existing system. Variations in timber type, truck type and packing together create unique combinations that the system must be able to handle. The system must work around the clock in different weather conditions while maintaining high precision and performance.
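As a starting point for the semantic pixel-wise segmentation task described, the sketch below runs an off-the-shelf fully convolutional network to produce a per-pixel class map. A production system would train a dedicated timber-vs-truck model, so the pre-trained weights and the random stand-in frame are assumptions for illustration only.

```python
# Per-pixel segmentation inference with a pre-trained fully convolutional net.
import torch
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(weights="COCO_WITH_VOC_LABELS_V1").eval()

frame = torch.randn(1, 3, 360, 480)  # stand-in for a normalized camera frame
with torch.no_grad():
    logits = model(frame)["out"]     # shape (1, n_classes, 360, 480)
labels = logits.argmax(dim=1)        # per-pixel class map
print(labels.shape)
```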
