
Block Walsh–Hadamard Transform-based Binary Layers in Deep Neural Networks

Published: 18 October 2022

Abstract

Convolution has been the core operation of modern deep neural networks, and it is well known that convolutions can be implemented in the Fourier transform domain. In this article, we propose to use the binary block Walsh–Hadamard transform (WHT) instead of the Fourier transform. We use WHT-based binary layers to replace some of the regular convolution layers in deep neural networks, utilizing both one-dimensional (1D) and 2D binary WHTs. In both the 1D and 2D layers, we compute the binary WHT of the input feature map, denoise the WHT-domain coefficients using a nonlinearity obtained by combining soft-thresholding with the tanh function, and then compute the inverse WHT. We use the 1D-WHT to replace 1 × 1 convolutional layers, while 2D-WHT layers can replace 3 × 3 convolution layers and Squeeze-and-Excite layers. 2D-WHT layers with trainable weights can also be inserted before the Global Average Pooling layers to assist the dense layers. In this way, we can reduce the number of trainable parameters significantly with only a slight decrease in accuracy. In this article, we integrate the WHT layers into MobileNet-V2, MobileNet-V3-Large, and ResNet to reduce the number of parameters significantly with negligible accuracy loss. Moreover, according to our speed tests, the 2D fast WHT (2D-FWHT) layer runs about 24 times as fast as the regular 3 × 3 convolution with 19.51% less RAM usage on an NVIDIA Jetson Nano.
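To make the layer pipeline above concrete, here is a minimal NumPy sketch of a 1D-WHT binary layer: a fast Walsh–Hadamard transform along the channel axis, smooth-thresholding of the transform coefficients, and the inverse transform. The particular smooth-thresholding form, tanh(x) · max(|x| − T, 0) (soft-thresholding with the sign function replaced by tanh), and the fixed threshold T are assumptions for illustration; in the actual layers the thresholds are trainable parameters, and this sketch is not the authors' implementation.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis.
    The length must be a power of two; applying fwht twice yields n * x,
    so the inverse transform is fwht(X) / n."""
    x = np.array(x, dtype=np.float64)
    n = x.shape[-1]
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a = x[..., j].copy()
                b = x[..., j + h].copy()
                x[..., j] = a + b        # butterfly: sum
                x[..., j + h] = a - b    # butterfly: difference
        h *= 2
    return x

def smooth_threshold(x, T):
    """Soft-thresholding with sign(x) replaced by tanh(x) (an assumed form)."""
    return np.tanh(x) * np.maximum(np.abs(x) - T, 0.0)

def wht_layer_1d(feature_map, T=0.1):
    """1D-WHT layer sketch: transform channels, denoise, inverse transform."""
    n = feature_map.shape[-1]
    coeffs = fwht(feature_map)            # forward WHT over the channel axis
    coeffs = smooth_threshold(coeffs, T)  # denoise WHT-domain coefficients
    return fwht(coeffs) / n               # inverse WHT

# Example: a feature map of 8 spatial positions with 16 channels each.
x = np.random.randn(8, 16)
y = wht_layer_1d(x, T=0.5)
print(x.shape, y.shape)  # (8, 16) (8, 16)
```

Because the WHT matrix contains only ±1 entries, the butterfly structure above needs only O(n log n) additions and subtractions and no multiplications, which is what makes such layers attractive on embedded hardware like the Jetson Nano mentioned in the abstract.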



    Published In

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 6
November 2022, 498 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3561948
Editor: Tulika Mitra

    Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 18 October 2022
    Online AM: 26 January 2022
    Accepted: 30 December 2021
    Revised: 25 November 2021
    Received: 15 July 2021
Published in TECS, Volume 21, Issue 6


    Author Tags

    1. Fast Walsh–Hadamard transform
    2. block division
    3. smooth-thresholding
    4. image classification

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • University of Illinois Chicago Discovery Partners Institute Seed Funding Program and NSF

