1 |
[en] A DATA-CENTRIC APPROACH TO IMPROVING SEGMENTATION MODELS WITH DEEP LEARNING IN MAMMOGRAPHY IMAGES / [pt] UMA ABORDAGEM CENTRADA EM DADOS PARA O APRIMORAMENTO DE MODELOS DE SEGMENTAÇÃO COM APRENDIZADO PROFUNDO EM IMAGENS DE MAMOGRAFIA
SANTIAGO STIVEN VALLEJO SILVA, 07 December 2023
The semantic segmentation of anatomical structures in mammography images plays a significant role in supporting medical analysis. This task can be approached with a machine learning model, which must be able to identify and accurately delineate the structures of interest, such as the nipple, fibroglandular tissue, pectoral muscle, and fatty tissue. However, segmenting small structures such as the nipple and the pectoral muscle is often challenging. The greatest challenge is recognizing or detecting the pectoral muscle in the craniocaudal (CC) view, because of its variable size, its possible absence, and overlapping fibroglandular tissue. To tackle this challenge, this work proposes a data-centric approach to improve the segmentation model's performance on the mammary papilla and the pectoral muscle, specifically by enhancing the training data and annotations in two stages. The first stage is based on modifications to the annotations. Algorithms were developed to automatically search for unusual annotations based on their shape. The annotations found in this way were then manually reviewed and corrected. The second stage involves downsampling the dataset, reducing the number of image samples in the training set. Cases of false positives and false negatives were analyzed to identify images that provide confusing information, which were subsequently removed from the set. Next, models were trained using the data from each stage, and classification metrics were obtained for the pectoral muscle in the CC view, along with the IoU for each structure in the CC and MLO (mediolateral oblique) views. The training results show a progressive improvement in the identification and segmentation of the pectoral muscle in the CC view and an improvement for the mammary papilla in the MLO view, while maintaining the segmentation metrics for the other structures.
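The IoU (intersection over union) reported per structure can be illustrated with a minimal sketch. This is not the thesis code: the function name, the class ids (0 = background, 1 = pectoral muscle, 2 = nipple) and the tiny masks are assumptions chosen only for illustration.

```python
def iou_per_class(pred, target, num_classes):
    """Return the IoU of each class for two label masks of equal shape.

    pred and target are lists of rows holding integer class ids.
    A class that appears in neither mask gets None instead of a score.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it counts once in both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    return [
        intersection[c] / union[c] if union[c] else None
        for c in range(num_classes)
    ]


# Hypothetical 3x3 masks: 0 = background, 1 = pectoral muscle, 2 = nipple.
pred = [[0, 0, 1], [0, 1, 1], [2, 0, 0]]
target = [[0, 1, 1], [0, 1, 1], [2, 2, 0]]
print(iou_per_class(pred, target, 3))
```

Small structures are penalized heavily by this metric: a single mislabeled pixel halves the IoU of the two-pixel class in the example, which is consistent with the abstract's remark that small structures are hard to segment.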
|
2 |
Novel Approaches for Application of Principal Component Analysis on Dynamic PET Images for Improvement of Image Quality and Clinical Diagnosis
Razifar, Pasha, January 2005
Positron Emission Tomography, PET, can be used for dynamic studies in humans. In such studies a selected part of the body, often the whole brain, is imaged repeatedly after administration of a radiolabelled tracer. Such studies are performed to provide sequences of images reflecting the tracer’s kinetic behaviour, which may be related to physiological, biochemical and functional properties of tissues. This information can be obtained by analyzing the distribution and kinetic behaviour of the administered tracers in different regions, tissues and organs. Each image in the sequence thus contains part of the kinetic information about the administered tracer.
Several factors make analysis of PET images difficult, such as a high noise magnitude and correlation between image elements in conjunction with a high level of non-specific binding to the target and a sometimes small difference in target expression between pathological and healthy regions. It is therefore important to understand how these factors affect the derived quantitative measurements when using different methods such as kinetic modelling and multivariate image analysis.
In this thesis, a new method to explore the properties of the noise in dynamic PET images was introduced and implemented. The method is based on an analysis of the autocorrelation function of the images. This was followed by proposing and implementing three novel approaches for application of Principal Component Analysis, PCA, on dynamic human PET studies. The common underlying idea of these approaches was that the images need to be normalized before application of PCA to ensure that the PCA is signal driven, not noise driven. Different ways to estimate and correct for the noise variance were investigated. Normalizations were carried out Slice-Wise (SW), for the whole volume at once, and in both image domain and sinogram domain respectively. We also investigated the value of masking out and removing the area outside the brain for the analysis.
The results were very encouraging. We could demonstrate that for phantoms as well as for real image data, the applied normalizations allow PCA to reveal the signal much more clearly than what can be seen in the original image data sets. Using our normalizations, PCA can thus be used as a multivariate analysis technique that without any modelling assumptions can separate important kinetic information into different component images. Furthermore, these images contained optimized signal to noise ratio (SNR), low levels of noise and thus showed improved quality and contrast. This should allow more accurate visualization and better precision in the discrimination between pathological and healthy regions. Hopefully this can in turn lead to improved clinical diagnosis.
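The idea of normalizing frames so that PCA is signal driven rather than noise driven can be sketched as follows. The abstract does not say how the noise variance was estimated, so this sketch assumes, purely for illustration, that it is taken from a set of background pixels; `normalize_frames`, `background_idx` and the toy data are hypothetical names and values, and the PCA step itself is not shown.

```python
import statistics


def normalize_frames(frames, background_idx):
    """Scale every frame so that its background noise has unit standard deviation.

    frames is a list of frames, each a flat list of pixel values.
    background_idx lists the pixel positions assumed to contain only noise.
    """
    normalized = []
    for frame in frames:
        noise_std = statistics.pstdev([frame[i] for i in background_idx])
        if noise_std == 0:
            raise ValueError("background has zero variance; cannot normalize")
        normalized.append([value / noise_std for value in frame])
    return normalized


# Toy data: the first frame is ten times noisier than the second one.
frames = [
    [0.0, 2.0, 0.0, 2.0, 10.0, 12.0],
    [0.0, 0.2, 0.0, 0.2, 5.0, 6.0],
]
background_idx = [0, 1, 2, 3]
for frame in normalize_frames(frames, background_idx):
    print(frame)
```

After this scaling, no frame dominates the total variance merely because it is noisier, so a subsequent PCA would be expected to weight frames by their signal content instead.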
|
4 |
Real-time 3D Semantic Segmentation of Timber Loads with Convolutional Neural Networks
Sällqvist, Jessica, January 2018
Volume measurement of timber loads is carried out in conjunction with the timber trade. When dealing with goods of major economic value such as these, it is important to achieve an impartial and fair assessment when determining price-based volumes. With the help of Saab’s missile targeting technology, CIND AB develops products for digital volume measurement of timber loads. Currently there is a system in operation that automatically reconstructs timber trucks in motion to create measurable images of them. Future iterations of the system are expected to fully automate the scaling by generating a volumetric representation of the timber and calculating its external gross volume. The first challenge in this development is to separate the timber load from the truck. This thesis aims to evaluate and implement an appropriate method for semantic pixel-wise segmentation of timber loads in real time. Image segmentation is a classic but difficult problem in computer vision. To achieve greater robustness, it is therefore important to carefully study and make use of the conditions given by the existing system. Variations in timber type, truck type and packing together create unique combinations that the system must be able to handle. The system must work around the clock in different weather conditions while maintaining high precision and performance.
|
5 |
[pt] BUSCA POR ARQUITETURA NEURAL COM INSPIRAÇÃO QUÂNTICA APLICADA A SEGMENTAÇÃO SEMÂNTICA / [en] QUANTUM-INSPIRED NEURAL ARCHITECTURE SEARCH APPLIED TO SEMANTIC SEGMENTATION
GUILHERME BALDO CARLOS, 14 July 2023
Deep neural networks are responsible for great progress in performance
for several perceptual tasks, especially in the fields of computer vision, speech
recognition, and natural language processing. These results produced a paradigm shift in pattern recognition techniques, shifting the demand from feature
extractor design to neural architecture design. However, designing novel deep
neural network architectures is very time-consuming and relies heavily on experts' intuition, knowledge, and a trial-and-error process. In that context, the
idea of automating the architecture design of deep neural networks has gained
popularity, establishing the field of neural architecture search (NAS). To tackle the problem of NAS, authors have proposed several approaches regarding
the search space definition, algorithms for the search strategy, and techniques
to mitigate the resource consumption of those algorithms. Q-NAS (Quantum-inspired Neural Architecture Search) is one proposed approach to address the
NAS problem using a quantum-inspired evolutionary algorithm as the search
strategy. That method has been successfully applied to image classification,
outperforming handcrafted models on the CIFAR-10 and CIFAR-100 datasets
and also on a real-world seismic application. Motivated by this success, we
propose SegQNAS (Quantum-inspired Neural Architecture Search applied to
Semantic Segmentation), which is an adaptation of Q-NAS applied to semantic
segmentation. We carried out several experiments to verify the applicability
of SegQNAS on two datasets from the Medical Segmentation Decathlon challenge. SegQNAS achieved a Dice similarity coefficient of 0.9583 on the
spleen dataset, outperforming traditional architectures such as U-Net and ResU-Net and reaching results comparable to a similar NAS work from the literature, but
with architectures that have far fewer parameters. On the prostate dataset, SegQNAS achieved
a Dice similarity coefficient of 0.6887, also outperforming U-Net, ResU-Net, and
a similar NAS work from the literature.
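For readers unfamiliar with the metric, the Dice similarity coefficient quoted above is defined for a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch for binary masks; the function name and the toy masks are assumptions for illustration and are not taken from the dissertation.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two binary masks (lists of 0/1 rows)."""
    intersection = sum(
        p * t
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks agree perfectly, so define their score as 1.0.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks: 3 overlapping pixels, 3 predicted and 4 reference pixels.
pred_mask = [[0, 1, 1], [0, 1, 0]]
target_mask = [[0, 1, 1], [1, 1, 0]]
print(round(dice_coefficient(pred_mask, target_mask), 4))
```

A score of 1.0 means perfect overlap and 0.0 means no overlap, so the reported 0.9583 on the spleen dataset indicates much closer agreement with the reference masks than the 0.6887 on the prostate dataset.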
|
6 |
Fashion Object Detection and Pixel-Wise Semantic Segmentation: Crowdsourcing framework for image bounding box detection & Pixel-Wise Segmentation
Mallu, Mallu, January 2018
Technology has transformed every aspect of our lives, and the fashion industry is one of them. Many deep learning architectures are taking shape to augment the fashion experience for everyone, and there are numerous possibilities for enhancing fashion technology with deep learning. One key idea is to generate fashion styles and recommendations using artificial intelligence. Another significant task is to gather reliable information on fashion trends, which includes analysing existing fashion-related images and data. When dealing specifically with images, localisation and segmentation are well-known approaches for in-depth study of the pixels, objects and labels present in an image. This master's thesis presents a complete framework for performing localisation and segmentation on fashionista images. The work is part of a research effort on fashion style detection and recommendation. The developed solution makes it possible to localise fashion items in an image by drawing bounding boxes and labelling them. It also provides pixel-wise semantic segmentation functionality, which extracts label-pixel data for fashion items. The collected data can serve as ground truth as well as training data for the targeted deep learning architecture. A study of localisation and segmentation of videos is also presented. The developed system has been evaluated in terms of flexibility, output quality and reliability in comparison with similar platforms. It has proven to be a fully functional solution capable of providing essential localisation and segmentation services while keeping the core architecture simple and extensible.
|
7 |
[pt] ESTRATÉGIAS PARA OTIMIZAR PROCESSOS DE ANOTAÇÃO E GERAÇÃO DE DATASETS DE SEGMENTAÇÃO SEMÂNTICA EM IMAGENS DE MAMOGRAFIA / [en] STRATEGIES TO OPTIMIZE ANNOTATION PROCESSES AND GENERATION OF SEMANTIC SEGMENTATION DATASETS IN MAMMOGRAPHY IMAGES
BRUNO YUSUKE KITABAYASHI, 17 November 2022
With the recent advancement of the use of supervised deep learning in
applications in the field of computer vision, the industry and the academic
community have been showing that one of the main difficulties for the success
of these applications is the lack of datasets with a sufficient amount of
annotated data. In this sense, there is a need to leverage large amounts of
labeled data so that these intelligent models can solve problems relevant to
their context to achieve the desired results. The use of techniques to generate
annotated data more efficiently is being increasingly explored, together with
techniques to support the generation of datasets that serve as inputs for the
training of artificial intelligence models. This work aims to propose strategies
to optimize annotation processes and generation of semantic segmentation
datasets. Among the approaches used in this work, we highlight Interactive
Segmentation and Active Learning. The first one tries to improve the data
annotation process, making it more efficient and effective from the point of
view of the annotator or specialist responsible for labeling the data using a
semantic segmentation model that tries to imitate the annotations made by
the annotator. The second consists of an approach that allows consolidating
a deep learning model using an intelligent criterion, aiming at the selection of
more informative unannotated data for training the model from an acquisition
function that is based on the uncertainty estimation of the network to filter
these data. To apply and validate the results of both techniques, the work
incorporates them in a use case in mammography images for segmentation of
anatomical structures.
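The uncertainty-based selection step of active learning can be sketched as below. The abstract only states that the acquisition function relies on the network's uncertainty estimate; using the mean per-pixel entropy of the predicted class probabilities is an assumption made here for illustration, and all names and probability values are hypothetical.

```python
import math


def mean_pixel_entropy(prob_map):
    """Average Shannon entropy over pixels.

    prob_map is a list with one entry per pixel, each entry being the list of
    predicted class probabilities for that pixel.
    """
    total = 0.0
    for probs in prob_map:
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(prob_map)


def select_for_annotation(unlabeled, budget):
    """Return the ids of the `budget` images the model is least certain about."""
    ranked = sorted(
        unlabeled.items(),
        key=lambda item: mean_pixel_entropy(item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:budget]]


# Hypothetical pool: two pixels per image, two classes per pixel.
pool = {
    "img_a": [[0.98, 0.02], [0.95, 0.05]],  # confident predictions
    "img_b": [[0.50, 0.50], [0.60, 0.40]],  # very uncertain predictions
    "img_c": [[0.90, 0.10], [0.70, 0.30]],  # moderately uncertain
}
print(select_for_annotation(pool, budget=2))
```

Images whose predictions sit close to uniform probabilities rank first, so the annotator's time goes to the samples from which the model is expected to learn the most.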
|
8 |
Digital Signature Technologies for Image Information Assurance / Vaizdo skaitmeninis parašas vaizdinės informacijos apsaugai
Kriukovas, Artūras, 25 January 2011
The dissertation investigates image authentication and tamper localization after general image processing operations: blurring, sharpening, rotation and JPEG compression.
The dissertation consists of an introduction, four chapters, conclusions, and references.
The introduction presents the investigated problem, the importance of the thesis, and the object of research, and describes the purpose and tasks of the work, the research methodology, its scientific novelty, the practical significance of the results, and the defended statements. It ends by listing the author’s publications and conference presentations on the subject of the dissertation and by outlining the structure of the dissertation.
Chapter 1 reviews the literature and analyzes competing methods. It shows the devastating effect of blurring and sharpening on digital image matrices. General pixel-wise tamper localization methods are completely ineffective after such routine processing. Block-wise methods show some resistance to blurring and sharpening, but they cannot localize tampering at a resolution of a single pixel. There is clearly a need for a method that can locate damaged pixels despite general image processing operations such as blurring or sharpening.
Chapter 2 lays the theoretical foundation for the proposed method. It is shown that the phase of the Fourier transform is invariant to blurring and sharpening. However, a further problem arises because a specific pixel cannot be located in the Fourier phase...
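The phase-invariance claim can be checked numerically on a small example. The sketch below is not the dissertation's method: it assumes a symmetric three-tap blur applied with circular boundary conditions to a single image row, and the function names and sample values are hypothetical. For such a kernel the frequency response is real and non-negative, so every Fourier coefficient is only rescaled and its phase is left unchanged wherever the coefficient does not vanish.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_blur(signal):
    """Symmetric blur with weights (0.25, 0.5, 0.25) and wrap-around borders."""
    n = len(signal)
    return [
        0.25 * signal[(t - 1) % n] + 0.5 * signal[t] + 0.25 * signal[(t + 1) % n]
        for t in range(n)
    ]


def max_phase_shift(signal, eps=1e-9):
    """Largest absolute phase change caused by the blur, in radians.

    Coefficients whose magnitude falls below eps before or after blurring are
    skipped, because their phase is numerically undefined.
    """
    original = dft(signal)
    blurred = dft(circular_blur(signal))
    shifts = [
        abs(cmath.phase(b / a))
        for a, b in zip(original, blurred)
        if abs(a) > eps and abs(b) > eps
    ]
    return max(shifts)


row = [3.0, 7.0, 1.0, 9.0, 4.0, 6.0, 2.0, 8.0]  # one hypothetical image row
print(max_phase_shift(row))  # expected to be at floating-point noise level
```

The magnitudes do change, which is why pixel-wise comparisons fail after blurring, while a signature built on the phase would be expected to survive it. Asymmetric kernels do not share this property.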
|
9 |
[pt] APLICAÇÃO DE REDES TOTALMENTE CONVOLUCIONAIS PARA A SEGMENTAÇÃO SEMÂNTICA DE IMAGENS DE DRONES, AÉREAS E ORBITAIS / [en] APPLYING FULLY CONVOLUTIONAL ARCHITECTURES FOR THE SEMANTIC SEGMENTATION OF UAV, AIRBORN, AND SATELLITE REMOTE SENSING IMAGERY
14 December 2020
The increasing availability of remote sensing data has created new opportunities and challenges for monitoring natural and anthropogenic processes on a global scale. In recent years, deep learning techniques have become the state of the art in remote sensing data analysis, mainly due to their ability to automatically learn discriminative attributes from large volumes of data. One of the key problems in image analysis is semantic segmentation, also known as pixel labeling, which involves assigning a class to each image site. The so-called fully convolutional networks are specifically designed for this task. Recent years have witnessed numerous proposals for fully convolutional network architectures that have been adapted for the segmentation of Earth observation data. The present work evaluates five fully convolutional network architectures that represent the state of the art in semantic segmentation of remote sensing images. The assessment considers data from different platforms: unmanned aerial vehicles, airplanes, and satellites. Three applications are addressed: segmentation of tree species, segmentation of roofs, and deforestation. The performance of the networks is evaluated experimentally in terms of accuracy and the associated computational load. The study also assesses the benefits of using Conditional Random Fields (CRF) as a post-processing step to improve the accuracy of segmentation maps.
|
10 |
[pt] APLICAÇÕES DE APRENDIZADO PROFUNDO NO MONITORAMENTO DE CULTURAS: CLASSIFICAÇÃO DE TIPO, SAÚDE E AMADURECIMENTO DE CULTURAS / [en] APPLICATIONS OF DEEP LEARNING FOR CROP MONITORING: CLASSIFICATION OF CROP TYPE, HEALTH AND MATURITY
GABRIEL LINS TENORIO, 18 May 2020
Crop efficiency can be improved by continually monitoring crop condition and making decisions based on its analysis. The data for analysis can be obtained through image sensors, and the monitoring process can be automated using image recognition algorithms of different levels of complexity. Some of the most successful algorithms are supervised deep learning approaches that use a form of Convolutional Neural Network (CNN). In this master's dissertation, we employ supervised deep learning models for classification, regression, object detection, and semantic segmentation in crop monitoring tasks, using image samples obtained at three different levels: satellites, Unmanned Aerial Vehicles (UAVs), and Unmanned Ground Vehicles (UGVs). Both the satellite and UAV levels involve the use of multispectral images. For the first level, we implement a CNN model based on transfer learning to classify vegetative species. We also improve the transfer learning performance with a newly proposed statistical analysis method. Next, for the second level, we implement a multi-task semantic segmentation algorithm to detect sugarcane crops and infer their state (e.g. crop health and age). The algorithm also detects the surrounding vegetation, which is relevant in the search for weeds. At the third level, we implement a Single Shot Multibox Detector algorithm to detect tomato clusters. To evaluate the clusters' state, we use two different approaches: an implementation based on image segmentation and a supervised CNN regressor capable of estimating their maturity. To quantify the tomato clusters in videos at different maturation stages, we employ a Region of Interest implementation and a proposed tracking system that uses temporal information. For all three levels, we present solutions and results that outperform state-of-the-art baselines.
|