Student | Davide Liberato Manna |
Supervisors | Aleksei Tepljakov, Prof. Barbara Caputo |
Keywords | Artificial intelligence, Deep Learning |
Degree | MSc |
Thesis language | English |
Defense date | December 16, 2019 |
Document link | Download Thesis Document |
Road Feature Extraction With Deep Learning Methods
Abstract
Deep Convolutional Neural Networks (CNNs) have recently been studied intensively, largely because of their favourable properties in computer vision, and have been used in numerous related applications. Among these, image classification and semantic segmentation have attracted particular research interest. This work is part of an applied project funded by Reach-U, a company specializing in geographic information systems, location-based solutions and cartography, and is geared towards smart city applications. The thesis focuses on the detection of road features, which involves both semantic segmentation and image classification. To this end, two ad hoc datasets containing specific road features were produced in accordance with the needs of the company. For graphical scene processing, a novel lightweight CNN architecture capable of adequate semantic segmentation is introduced. For image classification, a custom CNN is proposed and its performance is compared with that of the state-of-the-art ResNet network, fine-tuned to solve the same task. Despite its low number of parameters, the proposed segmentation network is expected to perform better at shape detection, even though it yields lower accuracy. The proposed classifier, in contrast, achieves accuracy competitive with its ResNet counterpart. A graphical user interface that uses the resulting CNN as its backend has also been developed. The complete solution serves the end goal of the project, which is the construction of an appropriate 3D model from 2D segmented images.
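The trade-off mentioned above, lower accuracy alongside better shape detection, can be illustrated with a small sketch. The abstract does not state which metrics the thesis uses, so the example assumes two common ones: pixel accuracy and per-class intersection over union (IoU). The function names and the 4x4 masks are hypothetical and serve only to show how the two measures can disagree.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the reference label."""
    total = sum(len(row) for row in target)
    correct = sum(p == t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return correct / total


def class_iou(pred, target, cls):
    """Intersection over union of the pixels labelled `cls` in both masks."""
    inter = union = 0
    for pr, tr in zip(pred, target):
        for p, t in zip(pr, tr):
            in_pred, in_target = p == cls, t == cls
            inter += in_pred and in_target
            union += in_pred or in_target
    return inter / union if union else float("nan")


# Hypothetical reference mask: 0 = background, 1 = road feature.
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Prediction A labels everything as background and misses the feature.
pred_a = [[0] * 4 for _ in range(4)]

# Prediction B covers the whole feature but spills over its border.
pred_b = [
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

for name, pred in (("A", pred_a), ("B", pred_b)):
    acc = pixel_accuracy(pred, target)
    iou = class_iou(pred, target, cls=1)
    print(f"prediction {name}: accuracy={acc:.4f}, IoU(class 1)={iou:.4f}")
```

In this toy case prediction B has the lower pixel accuracy yet the higher IoU for the road-feature class, because pixel accuracy is dominated by the background while IoU only looks at the feature itself. Whether the same effect explains the results reported in the thesis is an assumption, not something the abstract confirms.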