poster

NUS-WIDE: a real-world web image database from National University of Singapore

Authors:

Yantao Zheng

CIVR '09: Proceedings of the ACM International Conference on Image and Video Retrieval

Article No.: 48, Pages 1 - 9

https://doi.org/10.1145/1646396.1646452

Published: 08 July 2009

Abstract

This paper introduces a web image dataset created by NUS's Lab for Media Search. The dataset includes: (1) 269,648 images and their associated tags from Flickr, with a total of 5,018 unique tags; (2) six types of low-level features extracted from these images: a 64-D color histogram, a 144-D color correlogram, a 73-D edge direction histogram, a 128-D wavelet texture, 225-D block-wise color moments extracted over 5x5 fixed grid partitions, and a 500-D bag of words based on SIFT descriptors; and (3) ground truth for 81 concepts that can be used for evaluation. Based on this dataset, we highlight characteristics of web image collections and identify four research issues in web image annotation and retrieval. We also provide baseline results for web image annotation, obtained by learning from the tags with the traditional k-NN algorithm. The benchmark results indicate that it is possible to learn effective models from a sufficiently large image dataset to facilitate general image retrieval.
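As a rough illustration of the k-NN baseline mentioned in the abstract, the sketch below concatenates the six feature types and predicts tags by voting among the nearest training images. It is a minimal sketch, not the authors' implementation: the abstract does not specify the distance measure, the value of k, or the voting scheme, so Euclidean distance, simple majority voting, and the names `FEATURE_DIMS`, `concat_features`, and `knn_annotate` are all assumptions made for illustration. Only the feature dimensionalities come from the abstract.

```python
import math
from collections import Counter

# Feature dimensionalities as listed in the abstract.
# The dictionary keys are illustrative names, not identifiers from the dataset.
FEATURE_DIMS = {
    "color_histogram": 64,
    "color_correlogram": 144,
    "edge_direction_histogram": 73,
    "wavelet_texture": 128,
    "block_color_moments": 225,  # 5x5 grid = 25 blocks, so 225 / 25 = 9 values per block
    "sift_bag_of_words": 500,
}


def concat_features(features):
    """Concatenate per-type vectors in FEATURE_DIMS order, checking each length."""
    vector = []
    for name, dim in FEATURE_DIMS.items():
        part = features[name]
        if len(part) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(part)}")
        vector.extend(part)
    return vector


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_annotate(query, train_vectors, train_tags, k=3, top_n=2):
    """Return the top_n tags that occur most often among the k nearest training images."""
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: euclidean(query, train_vectors[i]),
    )
    votes = Counter()
    for i in order[:k]:
        votes.update(train_tags[i])  # one vote per tag per neighbor
    return [tag for tag, _ in votes.most_common(top_n)]


# Toy 2-D example with made-up vectors and tags (not data from NUS-WIDE).
train_vectors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_tags = [{"sky", "clouds"}, {"sky"}, {"dog"}, {"dog", "grass"}]
print(knn_annotate([0.05, 0.0], train_vectors, train_tags, k=2, top_n=1))  # ['sky']
print(sum(FEATURE_DIMS.values()))  # 1134 dimensions when all six types are concatenated
```

With 269,648 images, an exhaustive scan like this would be slow in practice; an approximate nearest-neighbor index would be a natural substitute, though the abstract does not say which search method was used.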


Index Terms

NUS-WIDE: a real-world web image database from National University of Singapore
1. Information systems
  1. Information retrieval
    1. Document representation
  2. Information systems applications


Published In


CIVR '09: Proceedings of the ACM International Conference on Image and Video Retrieval

July 2009

383 pages

ISBN:9781605584805

DOI:10.1145/1646396

Conference Chairs:
Yiannis Kompatsiaris (CERTH-ITI, Greece)
Stephane Marchand-Maillet (Univ. of Geneva, Switzerland)

Program Chairs:
Yannis Avrithis (NTUA, Greece)
Noel O'Connor (DCU, Ireland)
Daniel Gatica-Perez (Idiap Research Institute, Switzerland)
Tat-Seng Chua (National University of Singapore, Singapore)

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

In-Cooperation

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Poster

Conference

CIVR '09: International Conference on Image and Video Retrieval

July 8 - 10, 2009

Santorini, Fira, Greece


