1 |
Facade Segmentation in the Wild
Para, Wamiq Reyaz, 19 August 2019
Façade parsing is a fundamental problem in urban modeling that forms the backbone of a variety of tasks, including procedural modeling, architectural analysis, and urban reconstruction, and it quite often relies on semantic segmentation as the first step. With the shift to deep-learning-based approaches, existing small-scale datasets are the bottleneck for making further progress in façade segmentation and consequently façade parsing. In this thesis, we propose a new façade image dataset for semantic segmentation called PSV-22, which is the largest such dataset. We show that PSV-22 captures the semantics of façades better than existing datasets. Additionally, we propose three architectural modifications to current state-of-the-art deep-learning-based semantic segmentation architectures and show that these modifications improve performance on our dataset and on existing datasets. Our modifications generalize to a large variety of semantic segmentation nets, but they are façade-specific and employ heuristics that arise from the regular, grid-like nature of façades. Furthermore, results show that our proposed architecture modifications improve performance compared to baseline models as well as specialized segmentation approaches on façade datasets, and either come close to or improve on the performance reported on existing datasets. We show that deep models trained on existing data suffer a substantial performance reduction on our data, whereas models trained only on our data actually improve when evaluated on existing datasets. We intend to release the dataset publicly in the future.
|
2 |
Apprentissage Profond pour des Prédictions Structurées Efficaces appliqué à la Classification Dense en Vision par Ordinateur / Efficient Deep Structured Prediction for Dense Labeling Tasks in Computer Vision
Chandra, Siddhartha, 11 May 2018
In this thesis we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRFs) with Convolutional Neural Networks (CNNs). The starting point of this thesis is the observation that, while being of a limited form, G-CRFs allow us to perform exact Maximum-A-Posteriori (MAP) inference efficiently. We prefer exactness and simplicity over generality and advocate G-CRF-based structured prediction in deep learning pipelines.
Our proposed structured prediction methods accommodate (i) exact inference, (ii) both short- and long-term pairwise interactions, (iii) rich CNN-based expressions for the pairwise terms, and (iv) end-to-end training alongside CNNs. We devise novel implementation strategies which allow us to overcome memory and computational challenges when dealing with fully connected graphical models. These methods are illustrated by extensive experimental studies that demonstrate their usefulness: they improve on the state of the art in a variety of computer vision applications.
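The claim that a G-CRF admits exact and efficient MAP inference can be illustrated with a small sketch. It assumes the usual quadratic G-CRF energy E(x) = 1/2 x^T (A + lam*I) x - b^T x, with A a symmetric positive semi-definite pairwise matrix and b the unary scores, so that the MAP estimate is the solution of the linear system (A + lam*I) x = b. The abstract does not state this form, and the names `gcrf_map`, `pairwise`, `unary` and `lam` are hypothetical. Conjugate gradient is used here only as one standard solver for such systems, not necessarily the one used in the thesis.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def gcrf_map(pairwise: Matrix, unary: Vector, lam: float = 1.0,
             tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve (pairwise + lam * I) x = unary by conjugate gradient.

    Assumes `pairwise` is symmetric positive semi-definite and lam > 0,
    so the system matrix is positive definite and the minimizer is unique.
    """
    n = len(unary)
    system = [[pairwise[i][j] + (lam if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    x = [0.0] * n
    r = list(unary)          # residual b - system @ x, with x = 0
    p = list(r)              # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:     # squared residual norm small enough
            break
        ap = mat_vec(system, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy example: three nodes on a chain, pairwise term = graph Laplacian.
chain = [[1.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 1.0]]
scores = gcrf_map(chain, [3.0, 0.0, 1.0], lam=1.0)
```

Because the energy is quadratic, there is no approximation in the inference step itself: the solver either reaches the unique minimizer or stops at the requested tolerance.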
|
3 |
CountNet3D: A 3D Computer Vision Approach to Infer Counts of Occluded Objects with Quantified Uncertainty
Nelson, Stephen W., 30 August 2023
3D scene understanding is an important problem that has experienced great progress in recent years, in large part due to the development of state-of-the-art methods for 3D object detection. However, the performance of 3D object detectors can suffer in scenarios where extreme occlusion of objects is present, or where the number of object classes is large. In this paper, we study the problem of inferring 3D counts from densely packed scenes with heterogeneous objects. This problem has applications to important tasks such as inventory management and automatic crop yield estimation. We propose a novel regression-based method, CountNet3D, that uses mature 2D object detectors for fine-grained classification and localization, and a PointNet backbone for geometric embedding. The network processes fused data from images and point clouds for end-to-end learning of counts. We perform experiments on a novel synthetic dataset for inventory management in retail, which we construct and make publicly available to the community. We have also collected a proprietary dataset of real-world scenes. In addition, we run experiments to quantify the uncertainty of the models and evaluate the confidence of our predictions. Our results show that regression-based 3D counting methods systematically outperform detection-based methods, and reveal that directly learning from raw point clouds greatly assists count estimation under extreme occlusion.
|
4 |
Autenticação biométrica de usuários em sistemas de E-learning baseada em reconhecimento de faces a partir de vídeo [Biometric authentication of users in e-learning systems based on face recognition from video]
Penteado, Bruno Elias, January 2009
Advisor: Aparecido Nilceu Elias / Committee: Agma Juci Machado Traina / Committee: Wilson Massashiro Yonezawa / Abstract: Recent years have seen exponential growth in the offering of Internet-based distance courses, driven by their advantages and features (lower distribution and content-update costs, management of large groups of students, asynchronous and geographically independent learning) as well as by their regulation and governmental support. However, the lack of effective mechanisms to ensure user authentication in this kind of environment has been pointed out as a serious deficiency, both at system logon and during the user's participation in course activities. Currently, password-based authentication still prevails. Nevertheless, studies have been carried out on possible applications of biometrics to Web authentication. With the popularization and resulting lower cost of biometric-enabled hardware, such as webcams, microphones and embedded fingerprint readers, biometrics is coming to be considered a secure and viable form of remote authentication of individuals in Web applications. On that basis, this work proposes a distributed architecture for an e-learning environment, exploiting the properties of a Web system to perform biometric authentication both at system logon and continuously during the course. To analyze this architecture, the performance of techniques for face recognition from video, captured online by a webcam in an Internet environment, is evaluated, simulating the natural interaction of an individual with an e-learning system. To this end, a dedicated video database was created, with 43 individuals browsing and interacting with Web pages. The results show that the methods analyzed, which are well established in the literature, can be successfully applied in this kind of application, with recognition rates of up to 97% under ideal conditions, low execution times, and a small amount of information transmitted between client and server, with template sizes of about 30 KB.
/ Degree: Master's
|
5 |
Autenticação biométrica de usuários em sistemas de E-learning baseada em reconhecimento de faces a partir de vídeo [Biometric authentication of users in e-learning systems based on face recognition from video]
Penteado, Bruno Elias [UNESP], 27 July 2009
Recent years have seen exponential growth in the offering of Internet-based distance courses, driven by their advantages and features (lower distribution and content-update costs, management of large groups of students, asynchronous and geographically independent learning) as well as by their regulation and governmental support. However, the lack of effective mechanisms to ensure user authentication in this kind of environment has been pointed out as a serious deficiency, both at system logon and during the user's participation in course activities. Currently, password-based authentication still prevails. Nevertheless, studies have been carried out on possible applications of biometrics to Web authentication. With the popularization and resulting lower cost of biometric-enabled hardware, such as webcams, microphones and embedded fingerprint readers, biometrics is coming to be considered a secure and viable form of remote authentication of individuals in Web applications. On that basis, this work proposes a distributed architecture for an e-learning environment, exploiting the properties of a Web system to perform biometric authentication both at system logon and continuously during the course. To analyze this architecture, the performance of techniques for face recognition from video, captured online by a webcam in an Internet environment, is evaluated, simulating the natural interaction of an individual with an e-learning system. To this end, a dedicated video database was created, with 43 individuals browsing and interacting with Web pages. The results show that the methods analyzed, which are well established in the literature, can be successfully applied in this kind of application, with recognition rates of up to 97% under ideal conditions, low execution times, and a small amount of information transmitted between client and server, with template sizes of about 30 KB.
|
6 |
Detecting and comparing Kanban boards using Computer Vision / Detektering och jämförelse av Kanbantavlor med hjälp av datorseende
Behnam, Humam, January 2022
This thesis investigates the problem of detecting and tracking sticky notes on Kanban boards using classical computer vision techniques. Currently, there exist some alternatives for digitizing sticky notes, but none keeps track of notes that have already been digitized, so duplicate notes can be created when scanning multiple images of the same Kanban board. Kanban boards are widely used in various industries, and being able to recognize, and possibly in the future even digitize, entire Kanban boards could provide users with extended functionality. The implementation presented in this thesis is able to, given two images, detect the Kanban boards in each image and rectify them. The rectified images are then sent to the Google Cloud Vision API for text detection. Then, the rectified images are used to detect all the sticky notes. The positional information of the notes and columns of the Kanban boards is then used to filter the text detection results to find the text inside each note as well as the header text for each column. Between the two images, the columns are compared and matched, as are notes of the same color. If columns or notes in one image do not have a match in the second image, it is concluded that the boards are different, and the user is informed of why. If all columns and notes in one image have matches in the second image but some notes have moved, the user is informed of which notes have moved and how they have moved. The experiments conducted on the implementation show that it works well, but only under strict requirements, which makes it unsuitable for commercial use. The biggest problem to solve is to make the implementation more general with respect to the Kanban board layout, the shapes and colors of the sticky notes, and their actual content.
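The comparison step described in the abstract (match columns, match notes of the same color, then report either why the boards differ or which notes moved) can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: it assumes note detection and text recognition have already produced, for each image, a mapping from column header to notes, and it matches notes by exact color and text, which is stricter than what a real OCR pipeline would likely need. The names `Note`, `Board` and `compare_boards` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Note:
    color: str  # dominant note color, e.g. "yellow"
    text: str   # text recognized inside the note


# Column header text -> notes detected in that column (assumed input format).
Board = Dict[str, List[Note]]


def compare_boards(first: Board, second: Board) -> Tuple[bool, List[str]]:
    """Return (same_board, messages) for two already-digitized boards.

    Assumption: a note is identified by its exact (color, text) pair, so two
    notes with identical color and text on one board are not distinguished.
    """
    unmatched_columns = sorted(set(first) ^ set(second))
    if unmatched_columns:
        return False, [f"unmatched column: {name}" for name in unmatched_columns]

    # Record which column each note sits in, per image.
    where_first = {note: col for col, notes in first.items() for note in notes}
    where_second = {note: col for col, notes in second.items() for note in notes}

    def sort_key(note: Note) -> Tuple[str, str]:
        return (note.color, note.text)

    unmatched_notes = sorted(set(where_first) ^ set(where_second), key=sort_key)
    if unmatched_notes:
        return False, [f"unmatched note: {n.color} '{n.text}'"
                       for n in unmatched_notes]

    # Same columns and same notes: report only the notes that changed column.
    moves = [
        f"{n.color} '{n.text}' moved from {where_first[n]} to {where_second[n]}"
        for n in sorted(where_first, key=sort_key)
        if where_first[n] != where_second[n]
    ]
    return True, moves


before = {"To do": [Note("yellow", "Write tests")],
          "Done": [Note("pink", "Set up CI")]}
after = {"To do": [],
         "Done": [Note("pink", "Set up CI"), Note("yellow", "Write tests")]}
same, messages = compare_boards(before, after)
```

In this toy case the two boards are treated as the same board, and the single message reports that the yellow note moved from "To do" to "Done".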
|