research-article

Learning based Facial Image Compression with semantic fidelity metric

Published: 21 April 2019

Abstract

Surveillance and security scenarios usually require highly efficient facial image compression schemes for face recognition and identification. However, both traditional general-purpose image codecs and specialized facial image compression schemes refine the codec only heuristically, and separately from the face verification accuracy metric. We propose a Learning based Facial Image Compression (LFIC) framework with a novel Regionally Adaptive Pooling (RAP) module whose parameters can be automatically optimized according to gradient feedback from an integrated hybrid semantic fidelity metric, including a successful exploration of applying a Generative Adversarial Network (GAN) directly as a metric in an image compression scheme. The experimental results verify the framework’s efficiency by demonstrating bit rate savings of 71.41%, 48.28% and 52.67% over JPEG2000, WebP and neural network-based codecs, respectively, under the same face verification accuracy distortion metric. We also compare LFIC with the latest specialized facial image codecs and observe a superior performance gain. Visual experiments offer some interesting insights into how LFIC automatically captures the information in critical areas based on semantic distortion metrics for optimized compression, which is quite different from the heuristic optimization used in traditional image compression algorithms.
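
As a rough illustration of what a "bit rate saving under the same face verification accuracy" figure measures, the sketch below compares two rate-accuracy curves over their shared accuracy range. It is only a simplified stand-in: it uses piecewise-linear interpolation in the log-rate domain, whereas the abstract does not state the exact procedure used for the reported numbers. The function names (`bitrate_saving`, `_interp`) and all rate/accuracy points are hypothetical and are not results from the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the sampled range")


def bitrate_saving(anchor, test, samples=1000):
    """Average bit rate saving (%) of `test` over `anchor` at equal accuracy.

    Each curve is a list of (bits_per_pixel, verification_accuracy) points.
    Accuracy values within one curve are assumed to be distinct.
    """
    def split(points):
        # Sort by accuracy so that accuracy becomes the interpolation axis.
        pts = sorted(points, key=lambda p: p[1])
        return [acc for _, acc in pts], [math.log(rate) for rate, _ in pts]

    a_acc, a_log = split(anchor)
    t_acc, t_log = split(test)

    # Only the accuracy range covered by both codecs can be compared.
    lo = max(a_acc[0], t_acc[0])
    hi = min(a_acc[-1], t_acc[-1])
    if lo >= hi:
        raise ValueError("curves share no accuracy range")

    # Midpoint rule: average the log-rate difference over that range.
    total = 0.0
    for i in range(samples):
        acc = lo + (hi - lo) * (i + 0.5) / samples
        total += _interp(acc, t_acc, t_log) - _interp(acc, a_acc, a_log)

    # A mean log-rate difference d corresponds to a rate ratio exp(d).
    return (1.0 - math.exp(total / samples)) * 100.0


# Hypothetical curves: the test codec needs half the rate at each accuracy.
anchor_curve = [(0.2, 0.90), (0.4, 0.95), (0.8, 0.98)]
test_curve = [(0.1, 0.90), (0.2, 0.95), (0.4, 0.98)]
print(round(bitrate_saving(anchor_curve, test_curve), 2))
```

Averaging in the log-rate domain keeps the result a ratio of rates, so a codec that consistently needs half the bits for the same accuracy comes out at a saving of about 50% regardless of the absolute bit rates involved.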

Published In

Neurocomputing, Volume 338, Issue C, Apr 2019, 442 pages

Publisher

Elsevier Science Publishers B. V., Netherlands

Author Tags

1. End-to-end
2. Semantic metric
3. Facial image compression
4. Adversarial training

Cited By

• (2024) Scalable Extraction Based Semantic Communication for 6G Wireless Networks, IEEE Communications Magazine 62:7 (96-102), doi:10.1109/MCOM.021.2300269. Online publication date: 1-Jul-2024
• (2024) Semantically-Disentangled Progressive Image Compression for Deep Space Communications: Exploring the Ultra-Low Rate Regime, IEEE Journal on Selected Areas in Communications 42:5 (1130-1144), doi:10.1109/JSAC.2024.3369654. Online publication date: 26-Feb-2024
• (2023) Towards Task-Generic Image Compression: A Study of Semantics-Oriented Metrics, IEEE Transactions on Multimedia 25 (721-735), doi:10.1109/TMM.2021.3130754. Online publication date: 1-Jan-2023
• (2022) A Compressive Prior Guided Mask Predictive Coding Approach for Video Analysis, Computer Vision – ACCV 2022 (469-484), doi:10.1007/978-3-031-26316-3_28. Online publication date: 4-Dec-2022
• (2022) Towards an Efficient Facial Image Compression with Neural Networks, Image Analysis and Processing – ICIAP 2022 (512-523), doi:10.1007/978-3-031-06427-2_43. Online publication date: 23-May-2022
• (2021) Task-Driven Semantic Coding via Reinforcement Learning, IEEE Transactions on Image Processing 30 (6307-6320), doi:10.1109/TIP.2021.3091909. Online publication date: 1-Jan-2021
