TY - GEN
T1 - Obfuscated VBA macro detection using machine learning
AU - Kim, Sangwoo
AU - Hong, Seokmyung
AU - Oh, Jaesang
AU - Lee, Heejo
N1 - Funding Information:
The authors would like to express their sincere gratitude to our shepherd, Eric Eide, and the anonymous reviewers for their valuable comments, which improved the quality of the paper. This research was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 2017-0-00184, Self-Learning Cyber Immune Technology Development).
Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/19
Y1 - 2018/7/19
AB - Malware using document files as an attack vector has continued to increase and now constitutes a large portion of phishing attacks. To avoid anti-virus detection, malware writers usually implement obfuscation techniques in their source code. Although obfuscation is related to malicious code detection, little research has been conducted on obfuscation with regard to Visual Basic for Applications (VBA) macros. In this paper, we summarize the obfuscation techniques and propose an obfuscated macro code detection method using five machine learning classifiers. To train these classifiers, our proposed method uses 15 discriminant static features, taking into account the characteristics of VBA macros. We evaluated our approach using a real-world dataset of obfuscated and non-obfuscated VBA macros extracted from Microsoft Office document files. The experimental results demonstrate that our detection approach achieved an F2 score improvement of greater than 23% compared to those of related studies.
KW - Machine learning
KW - Macro malware
KW - Microsoft Office document
KW - Obfuscation
KW - VBA macro
UR - http://www.scopus.com/inward/record.url?scp=85051066242&partnerID=8YFLogxK
U2 - 10.1109/DSN.2018.00057
DO - 10.1109/DSN.2018.00057
M3 - Conference contribution
AN - SCOPUS:85051066242
T3 - Proceedings - 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2018
SP - 490
EP - 501
BT - Proceedings - 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2018
Y2 - 25 June 2018 through 28 June 2018
ER -