Abstract

In recent years, Transformers have proven successful in computer vision (CV) tasks: the Vision Transformer (ViT) competes with CNNs on image classification when pre-trained models are used. Many of these deep learning models are designed by experts, which demands knowledge, time, and labor. Neural Architecture Search (NAS) seeks to automate the design of neural network architectures. In this paper, I propose NSGA-ViT, a multi-objective evolutionary NAS method for designing Transformer-based networks for computer vision tasks. NSGA-ViT uses the multi-objective genetic algorithm NSGA-II to design a ViT network with two objectives: maximizing performance and minimizing network size. Searching a space of self-attention and convolution operations, NSGA-ViT discovers a Transformer architecture that outperforms ViT on CIFAR-10 while containing half as many parameters.
