DOI: 10.1145/2964284.2967201
short-paper

Partial Multi-Modal Sparse Coding via Adaptive Similarity Structure Regularization

Published: 01 October 2016

Abstract

Multi-modal sparse coding plays an important role in many multimedia applications, where data usually come in multiple modalities. Various multi-modal sparse coding approaches have recently been proposed to learn sparse codes of multi-modal data; these approaches assume either that every data item appears in all modalities or that at least one modality contains all data items. In real applications, however, some modalities often suffer from missing information, which results in partial multi-modality data. In this paper, we propose to solve the partial multi-modal sparse coding problem via multi-modal similarity structure regularization. Specifically, we propose a partial multi-modal sparse coding framework termed Adaptive Partial Multi-Modal Similarity Structure Regularization for Sparse Coding (AdaPM2SC), which preserves the similarity structure both within each modality and between different modalities. Experimental results on two real-world datasets demonstrate that AdaPM2SC significantly outperforms state-of-the-art methods in the partial multi-modality scenario.
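The abstract does not give the AdaPM2SC objective, so the following is only a rough sketch of the setting it describes, not the authors' method. It assumes a hypothetical layout in which each modality is an n-by-d array with a boolean mask marking the samples observed in that modality, builds a fixed (non-adaptive) Gaussian similarity graph from within-modality similarities only, and evaluates a graph-Laplacian penalty on a matrix of sparse codes. The function names, the Gaussian kernel, the averaging rule, and the parameter `sigma` are all illustrative assumptions.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Dense Gaussian similarity between the rows of X, with a zero diagonal."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def partial_similarity(modalities, masks, sigma=1.0):
    """Average within-modality similarities over the modalities in which
    both samples of a pair are observed (assumed combination rule).

    Pairs that share no observed modality get similarity 0.
    """
    n = masks[0].shape[0]
    W_sum = np.zeros((n, n))
    counts = np.zeros((n, n))
    for X, mask in zip(modalities, masks):
        idx = np.flatnonzero(mask)          # samples observed in this modality
        block = np.ix_(idx, idx)
        W_sum[block] += gaussian_similarity(X[idx], sigma)
        counts[block] += 1.0
    return np.divide(W_sum, counts, out=np.zeros_like(W_sum), where=counts > 0)

def laplacian_penalty(S, W):
    """tr(S L S^T) with L = D - W, where S holds one code (column) per sample.

    For symmetric W this equals 0.5 * sum_ij W_ij * ||s_i - s_j||^2, so it is
    small when similar samples receive similar codes.
    """
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(S @ L @ S.T))
```

In a full sparse coding objective, a penalty of this kind would typically be added to per-modality reconstruction errors (computed on observed samples only) and an L1 term on the codes; the adaptive and cross-modality parts of the similarity structure mentioned in the abstract are not modeled here.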


    Published In

    MM '16: Proceedings of the 24th ACM international conference on Multimedia
    October 2016
    1542 pages
    ISBN:9781450336031
    DOI:10.1145/2964284

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. adaptive similarity
    2. multi-modal
    3. sparse coding

    Qualifiers

    • Short-paper

    Conference

    MM '16: ACM Multimedia Conference
    October 15 - 19, 2016
    Amsterdam, The Netherlands

    Acceptance Rates

    MM '16 Paper Acceptance Rate 52 of 237 submissions, 22%;
    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Cited By

    • (2023) Self-supervised deep partial adversarial network for micro-video multimodal classification. Information Sciences 630 (356-369). DOI: 10.1016/j.ins.2022.11.111. Online publication date: Jun-2023
    • (2023) A multimodal dual-fusion entity extraction model for large and complex devices. Computer Communications 210 (1-9). DOI: 10.1016/j.comcom.2023.07.026. Online publication date: Oct-2023
    • (2022) Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. Journal of Imaging 8:3 (60). DOI: 10.3390/jimaging8030060. Online publication date: 1-Mar-2022
    • (2022) Deep learning based audio and video cross-modal recommendation. 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (2366-2371). DOI: 10.1109/SMC53654.2022.9945521. Online publication date: 9-Oct-2022
    • (2022) Feature fusion based on joint sparse representations and wavelets for multiview classification. Pattern Analysis and Applications 26:2 (645-653). DOI: 10.1007/s10044-022-01110-2. Online publication date: 22-Sep-2022
    • (2019) Full Representation Data Embedding via Nonoverlapping Historical Features. IEEE Transactions on Cybernetics 49:8 (3006-3019). DOI: 10.1109/TCYB.2018.2837675. Online publication date: Aug-2019
    • (2018) Partial Multi-view Subspace Clustering. Proceedings of the 26th ACM international conference on Multimedia (1794-1801). DOI: 10.1145/3240508.3240679. Online publication date: 15-Oct-2018
    • (2017) Video Question Answering via Gradually Refined Attention over Appearance and Motion. Proceedings of the 25th ACM international conference on Multimedia (1645-1653). DOI: 10.1145/3123266.3123427. Online publication date: 23-Oct-2017
    • (2017) Video Question Answering via Hierarchical Dual-Level Attention Network Learning. Proceedings of the 25th ACM international conference on Multimedia (1050-1058). DOI: 10.1145/3123266.3123364. Online publication date: 23-Oct-2017
