Global ETD Search

101	Novel Edge-Preserving Filtering Model Based on the Quadratic Envelope of the l0 Gradient Regularization Vásquez Ortiz, Eduar Aníbal 26 January 2023 (has links) In image processing, l0 gradient regularization (l0-grad) is an inverse problem that penalizes the l0 norm of the reconstructed image's gradient. Current state-of-the-art algorithms for solving this problem are based on the alternating direction method of multipliers (ADMM). However, l0-grad reconstructs images poorly when the noise level is high, producing flat regions with abrupt changes between them that look very distorted. This happens because it prioritizes keeping the main edges at the risk of losing important details when the images are too noisy. Furthermore, since $\|\nabla u\|_0$ is a discontinuous, non-convex regularizer, l0-grad cannot be solved directly by methods such as the accelerated proximal gradient (APG). This thesis presents a novel edge-preserving filtering model (Ql0-grad) that uses a relaxed form of the quadratic envelope of the l0 norm of the gradient. This makes it possible to control the level of detail that may be lost during denoising and deblurring. The Ql0-grad model can be seen as a mixture of the Total Variation and l0-grad models. The results for the denoising and deblurring problems show that the model sharpens major edges while strongly attenuating textures. Compared with the l0-grad model, it reconstructed images with flat, texture-free regions and smooth changes between them, even when the input image was corrupted with a large amount of noise. Furthermore, the average differences between the metrics obtained with Ql0-grad and l0-grad were +0.96 dB SNR (signal-to-noise ratio), +0.96 dB PSNR (peak signal-to-noise ratio) and +0.03 SSIM (structural similarity index measure).
An early version of the model was presented in the paper "Fast gradient-based algorithm for a quadratic envelope relaxation of the l0 gradient regularization", published in the indexed international conference proceedings of the XXIII Symposium on Image, Signal Processing and Artificial Vision. Digital image processing; Signal processing; Algorithms
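To make the contrast between the two regularizers named in this abstract concrete, the following minimal sketch compares the l0 "norm" of the gradient with the Total Variation on a 1D signal. It is only an illustration: the function names, the forward-difference gradient and the two test signals are assumptions, not code from the thesis, and the Ql0-grad relaxation itself is not implemented here.

```python
def forward_diff(u):
    """Forward differences of a 1D signal (a simple discrete gradient)."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def l0_grad(u, tol=1e-12):
    """Count the non-zero gradient entries (the l0 'norm' of the gradient)."""
    return sum(1 for g in forward_diff(u) if abs(g) > tol)


def total_variation(u):
    """Sum of absolute gradient entries (anisotropic Total Variation)."""
    return sum(abs(g) for g in forward_diff(u))


# Hypothetical signals: both rise from 0 to 4, one abruptly and one gradually.
step = [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]
ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0]

# Total Variation cannot tell the two apart, while the l0 penalty
# favours the single sharp edge over the gradual ramp.
print(l0_grad(step), total_variation(step))
print(l0_grad(ramp), total_variation(ramp))
```

Under these assumptions the l0 penalty rewards piecewise-constant results, which is consistent with the flat regions and abrupt changes described above, while Total Variation only measures the total rise.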
102	Generación de imágenes de acciones específicas de una persona utilizando aprendizaje profundo Morales Pariona, Jose Ulises 16 April 2024 (has links) Since the appearance of generative adversarial networks (GANs), much research has addressed how to generate images in various settings, such as image generation, image conversion, video synthesis, text-to-image synthesis, and video frame prediction, focusing mostly on improving high-resolution image generation and on the reconstruction or prediction of data.
The purpose of this work is to apply GANs in another setting: generating images of entities performing an action. Three actions performed by people were considered: glute, abdominal and cardio exercises. First, images were downloaded from YouTube and processed, yielding a sequence of images for each action. The images were then separated into two groups: a single person performing the actions, and different people performing them. Second, the InfoGAN model was selected for image generation, with the Inception Score (abbreviated PI in the original, from the Spanish "Puntuación Inicial") as the performance metric. The maximum scores obtained were 1.28 for the first group and 1.3 for the second. In conclusion, although the maximum score of 3 for this metric was not reached, owing to the quantity and quality of the images, the model does manage to differentiate the three types of exercise, although in some cases the legs, arms and head are rendered incorrectly. Digital image processing; Data processing; Deep learning
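The maximum score of 3 mentioned in this abstract matches the number of classes, which is what the standard Inception Score formula gives. The sketch below assumes that the metric is the usual Inception Score, exp of the mean KL divergence between each per-image class distribution and the marginal class distribution. The function name and the toy probability lists are hypothetical and are not data from the thesis.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probabilities.

    probs: list of rows, each row a probability distribution over classes.
    Returns exp(mean_x KL(p(y|x) || p(y))).
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        for j in range(k):
            if p[j] > 0:  # 0 * log(0) is treated as 0
                kl_sum += p[j] * math.log(p[j] / marginal[j])
    return math.exp(kl_sum / n)


# Hypothetical best case: confident predictions spread evenly over 3 classes.
confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical worst case: every image gets the same uniform prediction.
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 3

print(inception_score(confident))  # approaches the number of classes
print(inception_score(uniform))    # the lower bound of the score
```

With three classes the score therefore ranges from 1 to 3, so the reported values of 1.28 and 1.3 sit near the lower end of that range.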
103	Studies on obstacle detection and path planning for a quadrotor system Valencia Mamani, Dalthon Abel 06 November 2017 (has links) Autonomous systems, for both land and aerial vehicles, have attracted considerable recent research interest. The main limitation of aerial vehicles is the weight they can carry on board: power consumption depends on it, so hardware such as sensors and processors is limited. This thesis develops an application of digital image processing that detects obstacles using only a monocular camera. Several approaches exist, but this work focuses on distance estimation, which is more general and can be combined with other methods in future work. The distance estimation approach applies feature detection algorithms to two consecutive images and matches the features to estimate the obstacle position. The estimate is computed from a mathematical model of the camera and the projections between the two images. Many parameters affect the final results; the best values were found and tested on consecutive images captured every 0.5 m along a straight 5 m path. Fraunhofer position modules are tested with the entire algorithm. Finally, to establish a new obstacle-free path, an optimization problem based on binary integer programming is proposed, adapted to use the results of the distance estimation and obstacle detection. The resulting data are suitable for combination with information from conventional sensors, such as ultrasonic sensors. The mean error obtained is between 1% and 12% at short distances (less than 2.5 m) and larger at longer distances. The complexity of this study lies in using a single camera to capture frontal images and obtain 3D information about the environment. The obstacle detection algorithm is tested off-line, and the path-planning algorithm is proposed using keypoints detected in the background. / Thesis Aircraft--Automatic control; Digital image processing; Detectors
104	Desarrollo y comparación de diversos mapas de probabilidades en 3D del cáncer de próstata a partir de imágenes de histología Díaz Rojas, Kristians Edgardo 04 December 2013 (has links) Understanding the spatial distribution of prostate cancer and how it changes according to prostate-specific antigen (PSA) values, Gleason score, and other clinical parameters may help in understanding the disease and increase the overall success rate of biopsies. This work aims to build 3D spatial distributions of prostate cancer and examine the extent and location of cancer as a function of independent clinical parameters. The border of the gland and the cancerous regions from whole-mount histopathological images are used to reconstruct 3D models showing the localization of the tumor. This process uses color segmentation and interpolation based on mathematical morphological distance. A total of 58 glands are deformed into one prostate atlas using a combination of rigid, affine, and B-spline deformable registration techniques. The spatial distribution is built by counting the number of occurrences at a given position in 3D space across the registered prostate cancers. Finally, a difference between proportions is used to compare different spatial distributions. Results show that prostate cancer has a significant difference (SD) in the right zone of the prostate between populations with PSA greater and less than 5 ng/ml. Age does not have any impact on the spatial distribution of the disease. Positive and negative capsule-penetrated cases show an SD in the right posterior zone. There is an SD in almost all the glands between cases with tumors larger and smaller than 10% of the whole prostate. A larger database is needed to improve the statistical validity of the test. Finally, information from whole-mount histopathological images could provide better insight into prostate cancer. / Thesis Image recognition; Cancer
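The comparison step in this abstract relies on a difference between proportions. A common way to test such a difference is the pooled two-proportion z-test, sketched below. Whether the thesis uses exactly this form is an assumption, and the function name and the example counts are hypothetical, chosen only to show the arithmetic at a single atlas position.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Note: the standard error is zero (division error) if the pooled
    proportion is exactly 0 or 1; callers should skip such positions.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: cancer present at one position in 20 of 30 glands
# in one population and in 8 of 28 glands in the other.
z, p = two_proportion_z(20, 30, 8, 28)
print(z, p)
```

Repeating such a test at every position of the atlas would give a map of where two populations differ, which is one plausible reading of the zone-level differences reported above.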
105	Software for calibrating a digital image processing Lang, Kathrin 30 May 2014 (has links) This work presents a learning tool that provides the necessary parameters for a program controlling robots of type LUKAS at the Faculty of Mechanical Engineering. The robot control program needs various parameters that depend on its environment, such as the light intensity distribution, and camera settings such as exposure time and raw gain. These values have to be transmitted from the learning tool to the robot control software. Chapter one introduces the robots of type LUKAS, which were created for the RoboCup Small Size League, and the camera used for image processing. Chapter two explains the learning process according to Christoph Ußfeller and derives the requirements for this work. Chapter three explains the theoretical basics of image processing that are fundamental to this work. Chapter four describes the developed learning tool, which is used for the learning process and generates the parameters required by the robot control software. Chapter five presents practical tests with two people. The sixth and last chapter summarizes the results. / Thesis Robots--Control systems; Mobile robots; Digital image processing; Calibration
106	Optimal vicinity 2D median filter for fixed-point or floating-point values Chang Fu, Javier 19 June 2024 (has links) The median filter is a non-linear digital technique often used to remove additive white noise and salt-and-pepper noise from images. It replaces each pixel value with the median of the surrounding pixels. Floating-point implementations use comparison-based sorting to find the median.
A naive method for sorting n elements has complexity O(n^2), and the fastest sorting methods have complexity O(n log n) when computing the median of n elements. However, these faster algorithms show strong divergence in their execution. Other implementations use histogram-based algorithms, which perform best for large windows. These histogram-based algorithms can achieve constant-time median filtering, that is, O(1) complexity. This work proposes a fast and highly parallelizable median filter algorithm. It is based on divergence-free sorting with O(n log^2 n) execution and O(n) merges, which allow groups of pixels to be computed in parallel. The method benefits from the redundancy of values in neighboring pixels and finds the optimal processing vicinity that minimizes the average number of operations per pixel. The present work (i) can process either fixed-point or floating-point images, (ii) takes full advantage of the parallelism of multiple architectures, (iii) has been implemented on CPU and GPU, and (iv) achieves a speed-up over state-of-the-art implementations. Digital image processing; Algorithms
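As a point of reference for the definition given in this abstract, the sketch below is the plain sorting-based median filter, in which every pixel is replaced by the median of its square neighborhood. It is the baseline approach, not the parallel algorithm proposed in the thesis. The function name, the `radius` parameter and the border handling (replicating edge pixels) are assumptions made for illustration.

```python
def median_filter_2d(img, radius=1):
    """Naive 2D median filter over a (2*radius+1)^2 window.

    img: list of rows of numbers (integers or floats both work).
    Borders are handled by clamping coordinates to the image (replication).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(img[yy][xx])
            window.sort()  # comparison sort of the whole window
            out[y][x] = window[len(window) // 2]  # odd size: middle = median
    return out


# Hypothetical 3x3 image with one "salt" outlier in the centre.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter_2d(noisy))
```

This version sorts every window from scratch, so neighboring pixels repeat almost all of each other's work. That redundancy is what the abstract says the proposed method exploits.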
107	Clasificación automática de eventos en videos de fútbol utilizando redes convolucionales profundas Laboriano Galindo, Alipio 14 January 2025 (has links) The way new generations consume and experience sports, especially soccer, has created significant opportunities for disseminating sports content on non-traditional platforms and in shorter formats. However, retrieving information with semantic content from sporting events presented in video format is not an easy task and poses several challenges. In videos of soccer matches these include the positions of the recording cameras, the overlapping of events or plays, and the huge number of frames available.
To generate quality summaries that are interesting for fans, this research developed a system based on deep convolutional networks to automatically classify events or plays that occur during a soccer match. For this purpose, a database was built from soccer videos downloaded from SoccerNet; it contains 1,959 video clips of 5 events: goal kicks, corner kicks, fouls committed, indirect free kicks and shots on goal. The experiments used video preprocessing techniques, a custom convolutional architecture, and transfer learning with models such as ResNet50, EfficientNetb0, Vision Transformers and Video Vision Transformers. The best result was obtained with an EfficientNetb0 modified in its first convolutional layer, which achieved 91% accuracy and a precision of 100% for goal kicks, 92% for corner kicks, 90% for fouls committed, 88% for indirect free kicks and 89% for shots on goal. Soccer; Digital image processing; Neural networks (Computing)
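This abstract reports one overall accuracy and a separate precision for each event class. The sketch below shows how both figures are usually derived from a confusion matrix. The function names and the 3-class matrix are hypothetical and do not reproduce the thesis results.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified samples over all samples.

    cm[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def precision_per_class(cm):
    """Per-class precision: true positives over everything predicted as that class."""
    k = len(cm)
    result = []
    for j in range(k):
        predicted_j = sum(cm[i][j] for i in range(k))
        result.append(cm[j][j] / predicted_j if predicted_j else 0.0)
    return result


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
cm = [[8, 1, 1],
      [0, 9, 1],
      [0, 2, 8]]
print(accuracy(cm))
print(precision_per_class(cm))
```

A class can reach 100% precision even when some of its samples are missed, as in the first row of this toy matrix, because precision only counts what was predicted as that class.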
108	Predicción de un tiro penal de fútbol basado en la estimación de postura del jugador Mauricio Salazar, Josue Angel 24 June 2024 (has links) This paper presents an innovative methodology for predicting a penalty kick in football based on pose estimation of the kicker, using two computer vision tools: semantic segmentation in videos and 3D pose estimation, with the TAM and MMPose methods, respectively. For this purpose, a corpus of penalty kick videos was built and deep learning models were trained to predict the region of the goal where the kick will arrive. The results show that the CNN 3D model achieves better accuracy than the other trained models. Furthermore, the influence of different body parts on the prediction task was measured, showing that the legs are the most influential parts. Finally, we implemented a web-based tool to train goalkeepers and footballers in penalty kicks.
This offers potential improvements in penalty kick tactics using computer vision. Computer vision; Digital image processing; Deep learning; Soccer
109	Evaluación de método para la detección automática de puntos de referencia (landmark detection) en imágenes en dos dimensiones de huellas plantares para el diseño de una plantilla ortopédica Donayre Gamboa, Gustavo Miguel 28 August 2024 (has links) This work evaluates the heatmap regression (HR) technique for automatic landmark detection in medical images, specifically in two-dimensional plantar footprint images.
The study is based on heatmap regression with deep learning, a technique that has proven effective in facial landmark detection and human pose estimation. We propose an automatic method for detecting 8 points in digitized plantar footprint images, which serve as the reference for the base design of a two-dimensional orthopedic insole. The aim is to improve the orthopedic insole manufacturing process, which is currently carried out manually, by hand, in most Latin American countries. Automatic detection of these reference points in plantar footprints has the potential to speed up this process and improve the accuracy of the insoles. The results of the study showed a normalized mean absolute error of 0.01017 on the validation set. These evaluations were carried out using a U-Net convolutional network, which consists of an image encoding and compression path that captures context, and a symmetric expansion path that allows accurate localization of points of interest in a reasonable amount of time with current GPU processors. Medical informatics; Digital image processing
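Heatmap regression, as described in this abstract, trains a network to output one heatmap per landmark and then reads the landmark position from the peak of each heatmap. The sketch below shows the two ends of that pipeline without any network: encoding a target point as a Gaussian heatmap and decoding it with an argmax. It also shows one possible normalized mean absolute error. The Gaussian target, the `sigma` value and the normalization by image width and height are assumptions. The thesis may define its metric differently.

```python
import math


def gaussian_heatmap(h, w, cx, cy, sigma=1.5):
    """Target heatmap of size h x w with a Gaussian bump centred at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(w)]
            for y in range(h)]


def decode_landmark(heatmap):
    """Return the (x, y) position of the heatmap maximum."""
    best_x, best_y, best_v = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, v in enumerate(row):
            if v > best_v:
                best_x, best_y, best_v = x, y, v
    return best_x, best_y


def normalized_mae(pred, truth, w, h):
    """Mean absolute coordinate error, each axis divided by the image size."""
    total = 0.0
    for (px, py), (tx, ty) in zip(pred, truth):
        total += abs(px - tx) / w + abs(py - ty) / h
    return total / (2 * len(truth))


# Hypothetical 10x8 image with one landmark at x=6, y=3.
hm = gaussian_heatmap(8, 10, 6, 3)
print(decode_landmark(hm))
print(normalized_mae([(6, 3)], [(5, 3)], 10, 8))
```

In a full pipeline the U-Net would predict one such heatmap for each of the 8 landmarks, and the decoded peaks would be compared with the annotated points.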
110	Elaboración de un sistema para análisis de fallas basado en procesamiento de imágenes capturadas por un boroscopio para inspección de turbinas a gas Ordoñez Rojas, Gerardo Manuel 06 October 2020 (has links) This work uses image processing tools to verify faults in the internal components of a gas turbine. Specifically, the study focuses on one of the main problems in maintaining these turbines: measuring the internal faults that occur in these complex machines. Locally, there are no providers able to apply computer vision tools for organizations that have gas turbines among their main assets, and therefore none can offer the solution these companies demand, with a high standard of quality at low cost. This makes turbine maintenance very expensive, especially in sectors such as military and civil aviation, energy and gas transmission, which are the sectors that use this type of technology the most. From an economic point of view, gas turbines are very expensive assets. Depending on their size and power output, they can cost millions of dollars, and the more continuously their external and internal wear is monitored, whether through direct physical inspections or measured parameters, the better their deterioration can be tracked and premature failures avoided, which in turn reduces long-term maintenance costs. Over the years, various maintenance techniques have been developed for gas turbines, increasing their availability and reliability.
One of the most important techniques has been off-line monitoring of the internal components of these turbines. This technique is borescopy, which consists of a remote visual inspection system that, combined with image processing, provides a powerful tool for detecting and diagnosing internal faults. It is the most reliable technique for verifying the internal physical condition of turbines, since previously they had to be removed and, if the means were lacking, sent to the factory for inspection and repair. For this reason, this work seeks to design a borescopy system for gas turbines that can be used to inspect, record and measure internal damage in gas turbines in the same way as commercial solutions, but at a much lower cost. The experimental part of this work emphasizes the measurement problem, and the proposed solution shows that an average error of between -0.16 and 0.028 mm can be obtained for a 5 mm target. This shows that the technique gives very satisfactory results, given that a commercial device from a leading brand, equipped with measurement technology, achieves an error of between 0.025 and 0.03 mm for a 5.33 mm target. / Thesis Gas turbines; Failure analysis; Digital image processing
