Global ETD Search

Developing a Neural Network Model for Semantic Segmentation / Utveckling av en neural nätverksmodell för semantisk segmentering

This study details the development of a neural network model for real-time semantic segmentation, specifically for distinguishing sky pixels from other elements in an image. The model is incorporated into a feature of an Augmented Reality application in Unity through Unity Barracuda, a versatile neural network inference library. Although Barracuda offers cross-platform compatibility, it does not support certain layers and operations, so most state-of-the-art models cannot run in it; this study aims to provide a model that does. Because Unity has no framework for model development, the model was developed and trained in an open-source machine learning library. The model is evaluated continuously to optimize the trade-off between prediction accuracy and inference speed. The resulting model classifies every pixel in an image at around 137 frames per second. While its predictions may not match those of the top-performing models in the industry, it meets its objective of real-time classification of sky pixels within Barracuda.

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-336885

Semantic segmentation

Neural networks

Unity Barracuda

PyTorch

Augmented reality

Computer Sciences

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-336885
Date	January 2023
Creators	Westphal, Ronny
Publisher	KTH, Skolan för kemi, bioteknologi och hälsa (CBH)
Source Sets	DiVA Archive at Uppsala University
Language	Swedish
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TRITA-CBH-GRU ; 2023:255
