research-article

Learning based Facial Image Compression with semantic fidelity metric

Authors:

Tianyu He

Volume 338, Issue C

Pages 16 - 25

https://doi.org/10.1016/j.neucom.2019.01.086

Published: 21 April 2019

Abstract

Surveillance and security scenarios usually require highly efficient facial image compression schemes for face recognition and identification. However, both traditional general-purpose image codecs and specialized facial image compression schemes only refine the codec heuristically and separately according to a face verification accuracy metric. We propose a Learning based Facial Image Compression (LFIC) framework with a novel Regionally Adaptive Pooling (RAP) module whose parameters can be optimized automatically according to gradient feedback from an integrated hybrid semantic fidelity metric, including a successful exploration of applying a Generative Adversarial Network (GAN) directly as a metric in an image compression scheme. The experimental results verify the framework’s efficiency by demonstrating bit rate savings of 71.41%, 48.28% and 52.67% over JPEG2000, WebP and neural network-based codecs, respectively, under the same face verification accuracy distortion metric. We also show that LFIC achieves a superior performance gain compared with the latest codecs designed specifically for facial images. Visual experiments also offer some insight into how LFIC automatically captures the information in critical areas based on semantic distortion metrics for optimized compression, which is quite different from the heuristic optimization used in traditional image compression algorithms.
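To make the "bit rate saving under the same face verification accuracy" figure concrete, the sketch below shows one way such a number could be computed from two rate-accuracy curves. It is a simplified, Bjontegaard-style calculation and not necessarily the procedure used in the paper: it interpolates log-bitrate piecewise-linearly over the shared accuracy range (the original Bjontegaard method fits a polynomial instead). The function name `average_bitrate_saving` and all data points are hypothetical.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation of y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the curve's accuracy range")


def average_bitrate_saving(ref_curve, test_curve, samples=101):
    """Average percent bitrate saved by the test codec at equal accuracy.

    Each curve is a list of (bitrate, accuracy) points with distinct
    accuracy values. A positive result means the test codec needs
    fewer bits than the reference codec.
    """
    # Work in the log-rate domain and sort each curve by accuracy.
    ref = sorted((acc, math.log(rate)) for rate, acc in ref_curve)
    test = sorted((acc, math.log(rate)) for rate, acc in test_curve)
    ref_acc, ref_log = zip(*ref)
    test_acc, test_log = zip(*test)

    # Compare only over the accuracy range covered by both codecs.
    lo = max(ref_acc[0], test_acc[0])
    hi = min(ref_acc[-1], test_acc[-1])
    if hi <= lo:
        raise ValueError("curves do not overlap in accuracy")

    total = 0.0
    for i in range(samples):
        # Clamp to hi to guard against floating-point overshoot.
        acc = min(hi, lo + (hi - lo) * i / (samples - 1))
        total += _interp(acc, test_acc, test_log) - _interp(acc, ref_acc, ref_log)

    # The mean log-rate difference gives the average rate ratio.
    return (1.0 - math.exp(total / samples)) * 100.0


# Hypothetical (bitrate in bpp, verification accuracy) points.
reference = [(0.2, 0.90), (0.4, 0.95), (0.8, 0.98)]
candidate = [(0.1, 0.90), (0.2, 0.95), (0.4, 0.98)]
saving = average_bitrate_saving(reference, candidate)
print(f"{saving:.2f}% bitrate saving")  # prints "50.00% bitrate saving"
```

In this toy example the candidate codec reaches every accuracy level at half the reference bitrate, so the average saving is 50%. Real curves rarely share the same accuracy values, which is why the comparison is restricted to the overlapping accuracy range.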

Published In

Neurocomputing Volume 338, Issue C

Apr 2019

442 pages

ISSN:0925-2312

Elsevier B.V.

Publisher

Elsevier Science Publishers B. V.

Netherlands

Cited By

Fu Y, Cheng W, Zhang W, Wang J (2024) Scalable Extraction Based Semantic Communication for 6G Wireless Networks. IEEE Communications Magazine 62:7 (96-102). DOI: 10.1109/MCOM.021.2300269. Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.1109/MCOM.021.2300269
Zhang W, Liu Y, Chen L, Shi J, Hong X, Wang X (2024) Semantically-Disentangled Progressive Image Compression for Deep Space Communications: Exploring the Ultra-Low Rate Regime. IEEE Journal on Selected Areas in Communications 42:5 (1130-1144). DOI: 10.1109/JSAC.2024.3369654. Online publication date: 26-Feb-2024
https://dl.acm.org/doi/10.1109/JSAC.2024.3369654
Gao C, Liu D, Li L, Wu F (2023) Towards Task-Generic Image Compression: A Study of Semantics-Oriented Metrics. IEEE Transactions on Multimedia 25 (721-735). DOI: 10.1109/TMM.2021.3130754. Online publication date: 1-Jan-2023
https://dl.acm.org/doi/10.1109/TMM.2021.3130754
Huang Z, Jia C, Wang S, Ma S (2022) A Compressive Prior Guided Mask Predictive Coding Approach for Video Analysis. Computer Vision – ACCV 2022 (469-484). DOI: 10.1007/978-3-031-26316-3_28. Online publication date: 4-Dec-2022
https://dl.acm.org/doi/10.1007/978-3-031-26316-3_28
Napoli Spatafora M, Ortis A, Battiato S (2022) Towards an Efficient Facial Image Compression with Neural Networks. Image Analysis and Processing – ICIAP 2022 (512-523). DOI: 10.1007/978-3-031-06427-2_43. Online publication date: 23-May-2022
https://dl.acm.org/doi/10.1007/978-3-031-06427-2_43
Li X, Shi J, Chen Z (2021) Task-Driven Semantic Coding via Reinforcement Learning. IEEE Transactions on Image Processing 30 (6307-6320). DOI: 10.1109/TIP.2021.3091909. Online publication date: 1-Jan-2021
https://dl.acm.org/doi/10.1109/TIP.2021.3091909
