Abstract
We examine two main classes of deep learning methods, patch-based convolutional neural network (CNN) architectures and fully convolutional neural network (FCNN) approaches, for semantic segmentation and object classification of coral reef survey images. Using image data collected from underwater video of marine environments, we compare five common CNN architectures and find that ResNet152 achieves the highest accuracy. For our comparison of FCNN approaches, we test three common architectures and one custom-modified architecture, and find that DeepLab v2 performs best. We expand on these initial approaches by proposing a technique that uses the multi-view image data commonly extracted, yet often discarded, in video and remote-sensing domains. We examine the use of stereoscopic image data for FCNN approaches and multi-view image data for patch-based CNN methods. Our proposed TwinNet architecture is the top-performing FCNN, and among patch-based multi-view approaches, our proposed nViewNet-8 architecture yields the highest accuracy on this task.