Transfer Learning across Feature-Rich Heterogeneous Feature Spaces via Feature-Space Remapping (FSR)

Published: 11 March 2015

Abstract

Transfer learning aims to improve performance on a target task by using knowledge previously learned from source tasks. In this paper, we introduce a novel heterogeneous transfer learning technique, Feature-Space Remapping (FSR), which transfers knowledge between domains with different feature spaces. This is accomplished without requiring the typical feature-feature, feature-instance, or instance-instance co-occurrence data. Instead, we relate features in different feature spaces through the construction of meta-features. We show how these techniques can use multiple source datasets to construct an ensemble learner that further improves performance. We apply FSR to an activity recognition problem and a document classification problem. The ensemble technique outperforms all other baselines and even performs better than a classifier trained on a large amount of labeled data in the target domain. These problems are especially difficult because, in addition to the feature spaces being different, the marginal probability distributions and the class labels also differ. This work extends the state of the art in transfer learning by considering large transfer across dramatically different spaces.
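
The abstract does not spell out how meta-features are computed or how the mapping between feature spaces is learned, so the following Python sketch is only an illustration under stated assumptions, not the paper's algorithm. It assumes that each feature is summarized by three statistics (mean, standard deviation, fraction of nonzero values), that each target feature is mapped to the source feature with the nearest meta-feature vector, and that target features sharing a source feature are summed. The function names `meta_features`, `learn_mapping`, and `remap` and the toy data are invented for this example.

```python
import numpy as np

def meta_features(X):
    """Summarize each column (feature) of X by a small vector of statistics.

    ASSUMPTION: mean, standard deviation, and fraction of nonzero values
    are stand-ins; the abstract does not list the actual meta-features.
    """
    return np.column_stack([
        X.mean(axis=0),
        X.std(axis=0),
        (X != 0).mean(axis=0),
    ])

def learn_mapping(X_target, X_source):
    """Map each target feature to the index of the most similar source feature.

    ASSUMPTION: similarity is Euclidean distance between meta-feature vectors.
    """
    mt = meta_features(X_target)            # shape (d_target, 3)
    ms = meta_features(X_source)            # shape (d_source, 3)
    # Pairwise distances between every target and every source feature.
    dists = np.linalg.norm(mt[:, None, :] - ms[None, :, :], axis=2)
    return dists.argmin(axis=1)             # shape (d_target,)

def remap(X_target, mapping, n_source_features):
    """Project target data into the source feature space.

    ASSUMPTION: target features mapped to the same source feature are summed;
    source features with no match stay at zero.
    """
    out = np.zeros((X_target.shape[0], n_source_features))
    for j, k in enumerate(mapping):
        out[:, k] += X_target[:, j]
    return out

# Toy data: three source features and two target features on different scales.
X_source = np.array([[0.0, 10.0, 100.0],
                     [0.0, 12.0, 100.0],
                     [1.0, 14.0, 300.0],
                     [1.0, 16.0, 300.0]])
X_target = np.array([[110.0, 0.0],
                     [ 90.0, 1.0],
                     [310.0, 0.0],
                     [290.0, 1.0]])

mapping = learn_mapping(X_target, X_source)
remapped = remap(X_target, mapping, X_source.shape[1])
```

Once target data are remapped in this way, a classifier trained on the source domain can be applied to them directly. Repeating the procedure with several source datasets and combining the resulting predictions would give an ensemble of the kind the abstract mentions.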

    Information & Contributors

    Information

    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 6, Issue 1
    April 2015
    255 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/2745393

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 March 2015
    Accepted: 01 March 2014
    Revised: 01 February 2014
    Received: 01 September 2013
    Published in TIST Volume 6, Issue 1

    Author Tags

    1. Heterogeneous transfer learning
    2. activity recognition
    3. domain adaptation
    4. text classification

    Qualifiers

    • Research-article
    • Research
    • Refereed
