Facade parsing is a fundamental problem in urban modeling that forms the back- bone of a variety of tasks including procedural modeling, architectural analysis, urban reconstruction and quite often relies on semantic segmentation as the first step. With the shift to deep learning based approaches, existing small-scale datasets are the bot- tleneck for making further progress in fa ̧cade segmentation and consequently fa ̧cade parsing. In this thesis, we propose a new fa ̧cade image dataset for semantic segmenta- tion called PSV-22, which is the largest such dataset. We show that PSV-22 captures semantics of fa ̧cades better than existing datasets. Additionally, we propose three architectural modifications to current state of the art deep-learning based semantic segmentation architectures and show that these modifications improve performance on our dataset and already existing datasets. Our modifications are generalizable to a large variety of semantic segmentation nets, but are fa ̧cade-specific and employ heuris- tics which arise from the regular grid-like nature of fac ̧ades. Furthermore, results show that our proposed architecture modifications improve the performance compared to baseline models as well as specialized segmentation approaches on fa ̧cade datasets and are either close in, or improve performance on existing datasets. We show that deep models trained on existing data have a substantial performance reduction on our data, whereas models trained only on our data actually improve when evaluated on existing datasets. We intend to release the dataset publically in the future.
Identifer | oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/656528 |
Date | 19 August 2019 |
Creators | Para, Wamiq Reyaz |
Contributors | Wonka, Peter, Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Alouini, Mohamed-Slim, Thabet, Ali Kassem |
Source Sets | King Abdullah University of Science and Technology |
Language | English |
Detected Language | English |
Type | Thesis |
Page generated in 0.0059 seconds