CLOS network: A parameter-efficient alternative to linear layers in neural networks
Abstract
In this paper, we propose utilizing the CLOS network architecture to replace traditional linear layers in deep learning models, including transformers. The CLOS network, commonly used in networking systems, is adapted to neural networks to reduce the number of parameters while maintaining model performance. Our experiments show that the CLOS network achieves the same accuracy and loss as the conventional linear layer with fewer parameters. However, this efficiency comes at the cost of increased processing time: the CLOS-based layers run 1.5x to 3x slower. Despite this trade-off, the CLOS network can be an effective alternative for parameter reduction in various architectures, including large models like transformers.
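The abstract does not reproduce the implementation details. As a rough illustration only, the sketch below shows one way a Clos-style decomposition can replace a dense linear layer: the input channels are split into groups, each group passes through a small per-group linear "switch" in three stages, and channels are shuffled between stages, analogous to the crossbars and inter-stage wiring of a Clos fabric. The class name `ClosLinear`, the group count, and the exact three-stage block structure are assumptions made for this sketch, not the authors' implementation.

```python
# Hypothetical sketch (not the paper's code): a Clos-style three-stage
# replacement for nn.Linear. Parameters drop from roughly d*d for a dense
# layer to about 3 * g * (d/g)^2 = 3 * d^2 / g for g groups.
import torch
import torch.nn as nn


class ClosLinear(nn.Module):
    """Three-stage block-structured linear layer (illustrative only)."""

    def __init__(self, dim: int, groups: int):
        super().__init__()
        assert dim % groups == 0, "dim must be divisible by groups"
        self.groups = groups
        block = dim // groups
        # One small weight matrix per group per stage, playing the role of
        # the small crossbar switches in each stage of a Clos network.
        self.stages = nn.ModuleList(
            nn.ModuleList(nn.Linear(block, block) for _ in range(groups))
            for _ in range(3)
        )

    def _shuffle(self, x: torch.Tensor) -> torch.Tensor:
        # Channel shuffle: spread each group's outputs across the groups of
        # the next stage, mimicking Clos inter-stage wiring.
        b, g, c = x.shape
        return x.transpose(1, 2).reshape(b, g, c)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, d = x.shape
        x = x.view(b, self.groups, d // self.groups)
        for i, stage in enumerate(self.stages):
            x = torch.stack(
                [blk(x[:, j]) for j, blk in enumerate(stage)], dim=1
            )
            if i < 2:  # shuffle between stages, not after the last one
                x = self._shuffle(x)
        return x.reshape(b, d)


if __name__ == "__main__":
    clos = ClosLinear(dim=512, groups=8)
    dense = nn.Linear(512, 512)
    out = clos(torch.randn(4, 512))
    print(out.shape)  # torch.Size([4, 512])
    # Compare parameter counts: ~100K for the Clos-style layer vs ~263K dense.
    print(sum(p.numel() for p in clos.parameters()),
          sum(p.numel() for p in dense.parameters()))
```

A decomposition of this kind trades a single large matrix multiplication for many small ones plus reshuffles, which is consistent with the 1.5x to 3x slowdown reported in the abstract despite the reduced parameter count.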