2C-Net: integrate image compression and classification via deep neural network

  • Regular Paper
  • Multimedia Systems

Abstract

Supporting intelligent vision tasks directly, without first reconstructing the image, can save substantial computation in the era of big data. With the help of Deep Neural Networks (DNNs), integrating image compression and intelligent vision tasks at the feature-representation level has become a promising approach. However, how to perform the non-linear transformation needed for image compression while simultaneously extracting the image patterns needed for intelligent vision tasks within a shared DNN remains an open problem. This paper studies a versatile framework that explores feature representations common to both image compression and classification. A fully shared, compact latent representation is extracted to support both the compression and the classification tasks. The General Feature Extraction and the Feature-Analytic Classifier are proposed to generate and utilize the shared latent representation, respectively. The whole framework is then jointly optimized over multiple factors (i.e., rate, quality, and accuracy). Extensive experiments validate that the proposals improve the performance of both learning-based image compression and classification. The results show that the proposed method outperforms conventional codecs such as BPG and JPEG2000 in compression efficiency, while achieving acceptable accuracy on different image classification datasets without image reconstruction.
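The abstract states that the framework is optimized jointly over rate, quality, and accuracy, but the preview does not give the objective itself. As a minimal sketch only, one common way to combine three such terms is a weighted sum. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the use of mean squared error as the quality term, softmax cross-entropy as the accuracy term, and the weights `lambda_d` and `lambda_c`.

```python
import math


def rate_bits(likelihoods):
    """Estimated bits needed to code latent symbols, given the
    probability an entropy model assigns to each symbol."""
    return -sum(math.log2(p) for p in likelihoods)


def mse(original, reconstructed):
    """Mean squared error between two equal-length pixel sequences
    (assumed stand-in for the quality term)."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample
    (assumed stand-in for the accuracy term)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(likelihoods, original, reconstructed, logits, label,
               lambda_d=0.01, lambda_c=1.0):
    """Hypothetical weighted sum: rate (bits per pixel) + lambda_d * distortion
    + lambda_c * classification loss. The weights are illustrative only."""
    rate_bpp = rate_bits(likelihoods) / len(original)
    distortion = mse(original, reconstructed)
    cls = cross_entropy(logits, label)
    return rate_bpp + lambda_d * distortion + lambda_c * cls


if __name__ == "__main__":
    loss = joint_loss(
        likelihoods=[0.5, 0.25],
        original=[0.0, 0.0, 0.0, 0.0],
        reconstructed=[1.0, 1.0, 1.0, 1.0],
        logits=[0.0, 0.0],
        label=0,
        lambda_d=0.5,
        lambda_c=2.0,
    )
    print(f"joint loss: {loss:.4f}")
```

In a sketch of this kind, raising `lambda_d` favors reconstruction quality at the expense of bitrate, and raising `lambda_c` favors classification accuracy. How the paper actually balances the three terms cannot be determined from the abstract.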

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grants 62071216, U1936202, and 62231002.

Author information

Corresponding author

Correspondence to Qiu Shen.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, L., Chen, T., Liu, H. et al. 2C-Net: integrate image compression and classification via deep neural network. Multimedia Systems 29, 945–959 (2023). https://doi.org/10.1007/s00530-022-01026-1

  • DOI: https://doi.org/10.1007/s00530-022-01026-1