On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval

Authors: Gert R. G. Lanckriet, Roger Levy, Nuno Vasconcelos

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 36, Issue 3

Pages 521 - 535

https://doi.org/10.1109/TPAMI.2013.142

Published: 01 March 2014

Abstract

The problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, for example, using an image to search for texts. A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities. Two hypotheses are then investigated regarding the fundamental attributes of these spaces. The first is that low-level cross-modal correlations should be accounted for. The second is that the space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are then derived from these hypotheses: correlation matching (CM), an unsupervised method that models cross-modal correlations; semantic matching (SM), a supervised technique that relies on semantic representation; and semantic correlation matching (SCM), which combines both. An extensive evaluation of retrieval performance is conducted to test the validity of the hypotheses. All approaches are shown to be successful for text retrieval in response to image queries and vice versa. It is concluded that both hypotheses hold, in a complementary form, although evidence in favor of the abstraction hypothesis is stronger than that for correlation.
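The abstract does not state how CM models cross-modal correlations. As a rough illustration of the idea of mapping two modalities into a shared space and retrieving across them, the sketch below assumes linear canonical correlation analysis (CCA) as the correlation model and cosine similarity for ranking. The function names (`fit_cca`, `rank_by_cosine`, `demo_top1_accuracy`), the regularization constant, and the synthetic "image" and "text" features are all hypothetical and are not taken from the article; SM and SCM are not shown.

```python
import numpy as np


def fit_cca(X, Y, dim, reg=1e-3):
    """Fit regularized linear CCA on paired rows of X and Y (assumed technique)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariances keep the Cholesky factorizations well defined.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Lx = np.linalg.cholesky(Cxx)  # Cxx = Lx @ Lx.T
    Ly = np.linalg.cholesky(Cyy)  # Cyy = Ly @ Ly.T
    # Whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])
    return mx, my, Wx, Wy, s[:dim]


def rank_by_cosine(query, database):
    """Return database row indices sorted from most to least similar."""
    q = query / np.linalg.norm(query)
    d = database / np.linalg.norm(database, axis=1, keepdims=True)
    return np.argsort(-(d @ q))


def demo_top1_accuracy(seed=0):
    """Image-to-text retrieval on synthetic pairs that share a latent cause."""
    rng = np.random.default_rng(seed)
    n_train, n_test, latent = 300, 50, 4
    A = rng.normal(size=(latent, 10))  # latent -> toy "image" features
    B = rng.normal(size=(latent, 8))   # latent -> toy "text" features
    Z = rng.normal(size=(n_train + n_test, latent))
    X = Z @ A + 0.05 * rng.normal(size=(n_train + n_test, 10))
    Y = Z @ B + 0.05 * rng.normal(size=(n_train + n_test, 8))
    mx, my, Wx, Wy, _ = fit_cca(X[:n_train], Y[:n_train], dim=latent)
    img = (X[n_train:] - mx) @ Wx  # held-out image queries in the shared space
    txt = (Y[n_train:] - my) @ Wy  # held-out text database in the shared space
    hits = sum(rank_by_cosine(img[i], txt)[0] == i for i in range(n_test))
    return hits / n_test


if __name__ == "__main__":
    print("top-1 image-to-text accuracy:", demo_top1_accuracy())
```

Swapping the roles of `img` and `txt` in the ranking step gives the text-to-image direction. A supervised variant in the spirit of SM would instead map each modality to class posteriors, which requires labels that this toy setup does not include.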

Recommendations

A new approach to cross-modal multimedia retrieval
MM '10: Proceedings of the 18th ACM international conference on Multimedia

The problem of joint modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned with latent Dirichlet allocation, and images are represented as bags of ...

Cross-modal Retrieval with Correspondence Autoencoder
MM '14: Proceedings of the 22nd ACM international conference on Multimedia

The problem of cross-modal retrieval, e.g., using a text query to search for images and vice-versa, is considered in this paper. A novel model involving correspondence autoencoder (Corr-AE) is proposed here for solving this problem. The model is ...

Correspondence Autoencoders for Cross-Modal Retrieval
Special Issue on Smartphone-Based Interactive Technologies, Systems, and Applications and Special Issue on Extended Best Papers from ACM Multimedia 2014

This article considers the problem of cross-modal retrieval, such as using a text query to search for images and vice-versa. Based on different autoencoders, several novel models are proposed here for solving this problem. These models are constructed ...

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 36, Issue 3

March 2014, 207 pages

ISSN: 0162-8828

Publisher

IEEE Computer Society

United States

Qualifiers

Research-article

Article Metrics

Total Citations: 146
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Oct 2024

Cited By

Wang T, Li F, Zhu L, Li J, Zhang Z, Shen H (2024). Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing Retrieval. ACM Transactions on Information Systems 42:4 (1-27). doi:10.1145/3650205. Online publication date: 2-Mar-2024
https://dl.acm.org/doi/10.1145/3650205

Zhang L, Chen L, Zhou C, Li X, Yang F, Yi Z (2024). Weighted Graph-Structured Semantics Constraint Network for Cross-Modal Retrieval. IEEE Transactions on Multimedia 26 (1551-1564). doi:10.1109/TMM.2023.3282894. Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TMM.2023.3282894

Zhang Y, Ji Z, Wang D, Pang Y, Li X (2024). USER: Unified Semantic Enhancement With Momentum Contrast for Image-Text Retrieval. IEEE Transactions on Image Processing 33 (595-609). doi:10.1109/TIP.2023.3348297. Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TIP.2023.3348297

Yu W, Xing L, Nie F, Li X (2024). Multiview Fuzzy Clustering Based on Anchor Graph. IEEE Transactions on Fuzzy Systems 32:3 (755-766). doi:10.1109/TFUZZ.2023.3306639. Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1109/TFUZZ.2023.3306639

Okamura D, Harakawa R, Iwahashi M (2024). LCNME: Label Correction Using Network Prediction Based on Memorization Effects for Cross-Modal Retrieval With Noisy Labels. IEEE Transactions on Circuits and Systems for Video Technology 34:1 (590-602). doi:10.1109/TCSVT.2023.3286546. Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TCSVT.2023.3286546

Cheng Q, Tan Z, Wen K, Chen C, Gu X (2024). Semantic Pre-Alignment and Ranking Learning With Unified Framework for Cross-Modal Retrieval. IEEE Transactions on Circuits and Systems for Video Technology 34:7 (6503-6516). doi:10.1109/TCSVT.2022.3182549. Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.1109/TCSVT.2022.3182549

Chen R, Tan J, Zhou Y, Yang Z, Nie F, Chen T (2024). Asymmetric low-rank double-level cooperation for scalable discrete cross-modal hashing. Expert Systems with Applications: An International Journal 237:PC. doi:10.1016/j.eswa.2023.121703. Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1016/j.eswa.2023.121703

Yao D, Li Z, Li B, Zhang C, Ma H (2024). Similarity Graph-correlation Reconstruction Network for unsupervised cross-modal hashing. Expert Systems with Applications: An International Journal 237:PB. doi:10.1016/j.eswa.2023.121516. Online publication date: 1-Feb-2024
https://dl.acm.org/doi/10.1016/j.eswa.2023.121516

Deng J, Ou W, Gou J, Song H, Wang A, Xu X (2024). Representation separation adversarial networks for cross-modal retrieval. Wireless Networks 30:5 (3469-3481). doi:10.1007/s11276-020-02382-4. Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.1007/s11276-020-02382-4

Choudhury S, Laina I, Rupprecht C, Vedaldi A (2024). The Curious Layperson: Fine-Grained Image Recognition Without Expert Labels. International Journal of Computer Vision 132:2 (537-554). doi:10.1007/s11263-023-01885-9. Online publication date: 1-Feb-2024
https://dl.acm.org/doi/10.1007/s11263-023-01885-9
