Research article. DOI: 10.1145/2517312.2517315

Structural detection of Android malware using embedded call graphs

Published: 04 November 2013 Publication History

Abstract

The number of malicious applications targeting the Android system has grown explosively in recent years. While the security community, well aware of this fact, has proposed several methods for detecting Android malware, most of these are based on permission and API usage or on the identification of expert features. Unfortunately, many of these approaches are susceptible to instruction-level obfuscation techniques. Previous research on classic desktop malware has shown that some high-level characteristics of the code, such as function call graphs, can be used to find similarities between samples while being more robust against certain obfuscation strategies. However, identifying similarities in graphs is a non-trivial problem whose complexity hinders the use of these features for malware detection. In this paper, we explore how recent developments in machine learning classification of graphs can be efficiently applied to this problem. We propose a method for malware detection based on efficient embeddings of function call graphs with an explicit feature map inspired by a linear-time graph kernel. In an evaluation with 12,158 malware samples, our method, based purely on structural features, outperforms several related approaches and detects 89% of the malware with few false alarms, while also making it possible to pinpoint malicious code structures within Android applications.
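
To make the idea of embedding a function call graph more concrete, the following is a minimal sketch of one way such an embedding could look. It assumes a neighborhood-hash style relabeling in the spirit of a linear-time graph kernel, followed by a histogram of the resulting hash values. The abstract does not specify these details: the label width `BITS`, the function names `rot1`, `neighborhood_hash`, `embed`, and `kernel`, and the toy graph in the usage note are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

BITS = 15                # assumed label width, chosen only for illustration
MASK = (1 << BITS) - 1   # keeps labels inside BITS bits


def rot1(label: int) -> int:
    """Rotate a BITS-wide integer label left by one bit."""
    return ((label << 1) | (label >> (BITS - 1))) & MASK


def neighborhood_hash(labels: dict, edges: dict) -> dict:
    """One relabeling round over a call graph.

    labels maps function name -> integer label.
    edges maps function name -> iterable of called function names.
    New label = rot1(own label) XOR (XOR of all callee labels).
    """
    new_labels = {}
    for node, label in labels.items():
        acc = 0
        for callee in edges.get(node, ()):
            acc ^= labels[callee]
        new_labels[node] = rot1(label) ^ acc
    return new_labels


def embed(labels: dict, edges: dict, rounds: int = 1) -> Counter:
    """Explicit feature vector: histogram of hash values after `rounds` rounds."""
    for _ in range(rounds):
        labels = neighborhood_hash(labels, edges)
    return Counter(labels.values())


def kernel(hist_a: Counter, hist_b: Counter) -> int:
    """Histogram intersection: sum of the smaller count per shared hash value."""
    return sum(min(count, hist_b[h]) for h, count in hist_a.items())
```

A hypothetical usage: with `labels = {"main": 0b001, "send": 0b010, "read": 0b100}` and `edges = {"main": ["send", "read"]}`, calling `embed(labels, edges)` yields a sparse histogram that a linear classifier could consume directly, and `kernel` compares two such histograms. Each round visits every node and edge once, so the cost of the embedding grows linearly with the size of the graph.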



Published In

AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
November 2013
116 pages
ISBN:9781450324885
DOI:10.1145/2517312
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph kernels
  2. machine learning
  3. malware detection

Qualifiers

  • Research-article

Conference

CCS'13
Acceptance Rates

AISec '13 Paper Acceptance Rate 10 of 17 submissions, 59%;
Overall Acceptance Rate 94 of 231 submissions, 41%
