H.B. Keller Colloquium
Neural network architectures are traditionally hand-designed based on a designer's
knowledge, intuition, and experience. In recent years, data-driven design has been studied as
an alternative, in which a search algorithm is deployed to find an optimal architecture
while the network weights are optimized as well. This leads to a two-level (bilevel) optimization problem.
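In the notation standard in the DARTS literature, with w the network weights, alpha the architecture variables, and L_train, L_val the training and validation losses, the two levels read:

```latex
\min_{\alpha}\; L_{\mathrm{val}}\!\left(w^{*}(\alpha),\, \alpha\right)
\qquad \text{subject to} \qquad
w^{*}(\alpha) \in \operatorname*{arg\,min}_{w}\; L_{\mathrm{train}}(w, \alpha).
```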
I shall review a few methods, especially differentiable architecture search (DARTS), then
introduce a single-level optimization problem as a relaxed approximation, together with the
associated convergent algorithm RARTS. Through architecture/weight variable splitting and Gauss-Seidel iterations,
RARTS outperforms DARTS in both accuracy and search efficiency, as shown
on a solvable model problem and on CIFAR-10 based image-classification search.
The gain over DARTS persists when the searched architecture is transferred to ImageNet (1,000 image classes).
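To make the splitting concrete, here is a minimal sketch of a single-level scheme of this kind on toy quadratic losses; the specific objective, the coupling weight lam, the step size lr, and the update order below are illustrative assumptions, not the exact RARTS formulation.

```python
import torch

# Toy stand-ins for the training and validation losses of a supernet,
# evaluated at weights (w or v) and architecture variables alpha.
def loss_train(w, alpha):
    return ((w - alpha) ** 2).sum()

def loss_val(v, alpha):
    return ((v + alpha - 1.0) ** 2).sum()

w = torch.zeros(4, requires_grad=True)      # weights, fit on the training loss
v = torch.zeros(4, requires_grad=True)      # split copy, fit on the validation loss
alpha = torch.zeros(4, requires_grad=True)  # architecture variables
lam, lr = 1.0, 0.1                          # assumed coupling weight and step size

for step in range(200):
    # Gauss-Seidel pattern: each block uses the latest values of the others.
    # 1) weight step on the training loss plus the coupling penalty
    g_w, = torch.autograd.grad(loss_train(w, alpha) + 0.5 * lam * ((w - v) ** 2).sum(), w)
    with torch.no_grad():
        w -= lr * g_w
    # 2) split-variable step on the validation loss plus the coupling penalty
    g_v, = torch.autograd.grad(loss_val(v, alpha) + 0.5 * lam * ((v - w) ** 2).sum(), v)
    with torch.no_grad():
        v -= lr * g_v
    # 3) architecture step on the combined single-level objective
    g_a, = torch.autograd.grad(loss_train(w, alpha) + loss_val(v, alpha), alpha)
    with torch.no_grad():
        alpha -= lr * g_a

print(w, v, alpha)  # alternating block updates only; no inner argmin is ever solved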
RARTS is further applied to network architecture compression, such as
channel pruning of overparameterized convolutional neural networks.
In experiments on ResNet-18 and MobileNetV2, RARTS achieves considerable
channel sparsity and learns slim deep networks with satisfactory accuracy.
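As a rough illustration of the channel-pruning use case (the per-channel gate parameterization, the L1 penalty, and the threshold tol below are assumptions for exposition, not the exact RARTS procedure), one can attach a trainable scale to each output channel of a convolution and remove channels whose learned scale falls below a tolerance:

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """Conv layer with a trainable per-output-channel gate.

    The gates play the role of architecture variables: an L1 penalty on
    them during training drives some toward zero, marking those channels
    as prunable. (Illustrative only; not the exact RARTS parameterization.)
    """
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate = nn.Parameter(torch.ones(out_ch))

    def forward(self, x):
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

    def prune(self, tol=1e-2):
        """Return a slim Conv2d keeping only channels with |gate| > tol."""
        keep = (self.gate.abs() > tol).nonzero(as_tuple=True)[0]
        slim = nn.Conv2d(self.conv.in_channels, len(keep),
                         self.conv.kernel_size, padding=self.conv.padding)
        with torch.no_grad():
            # Fold the surviving gates into the weights and biases.
            slim.weight.copy_(self.conv.weight[keep] * self.gate[keep].view(-1, 1, 1, 1))
            slim.bias.copy_(self.conv.bias[keep] * self.gate[keep])
        return slim

layer = GatedConv(16, 32)
x = torch.randn(1, 16, 8, 8)
out_full = layer(x)
# ... train with task loss + lam * layer.gate.abs().sum() to sparsify gates ...
slim = layer.prune()
print(out_full.shape, slim(x).shape)
```

Folding the surviving gates into the convolution weights keeps the slim layer's outputs identical to the gated layer's on the retained channels, so accuracy is preserved up to the pruned channels' contribution.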