Automatic and lightweight grammar generation for fuzz testing

Su Yong Kim, Sungdeok Cha, Doo Hwan Bae

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Blackbox fuzz testing can exercise only a small portion of the code when the program under test rigorously checks the well-formedness of its input values. To overcome this problem, blackbox fuzz testing is performed using a grammar that describes the format of the input values. However, it is almost impossible to construct such a grammar manually if the input specifications are not known. We propose an alternative technique: the automatic generation of fuzzing grammars using API-level concolic testing. API-level concolic testing collects constraints at the library function level rather than the instruction level. While API-level concolic testing may be less accurate than instruction-level concolic testing, it is highly useful for quickly generating fuzzing grammars that enhance code coverage for real-world programs. To verify the feasibility of the proposed concept, we implemented a system, named YMIR, for generating fuzzing grammars for ActiveX controls. The experimental results showed that YMIR was capable of generating fuzzing grammars that raise branch coverage by 15-50% for ActiveX controls that use highly structured input strings. In addition, YMIR discovered two new vulnerabilities that are revealed only when input values are well-formed. Automatic fuzzing grammar generation through API-level concolic testing is not restricted to the testing of ActiveX controls; it should also be applicable to other string-processing programs whose source code is unavailable.
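
The idea of collecting constraints at the library function level and turning them into a fuzzing grammar can be illustrated with a toy sketch. This is not the YMIR implementation, whose internals the abstract does not describe. The `TracedStr` wrapper, the `target` routine, the `cmd://...;key=...` input format, and the two constraint kinds are all hypothetical names chosen for illustration. The sketch also assumes at most one prefix constraint and that satisfying a failed check means prepending or appending the literal it compared against.

```python
import random
import string


class TracedStr:
    """Wraps an input string and logs string-API calls made on it (hypothetical)."""

    def __init__(self, value, log):
        self.value = value
        self.log = log

    def startswith(self, prefix):
        # One constraint is recorded per library call, not per instruction.
        self.log.append(("prefix", prefix))
        return self.value.startswith(prefix)

    def contains(self, token):
        self.log.append(("contains", token))
        return token in self.value


def target(s):
    """Hypothetical program under test; accepts only 'cmd://...;key=...'."""
    if not s.startswith("cmd://"):
        return "rejected"
    if not s.contains(";key="):
        return "rejected"
    return "parsed"


def infer_grammar(seed, max_runs=10):
    """Re-run the target, satisfying the last failed check after each run."""
    grammar, current = [], seed
    for _ in range(max_runs):
        log = []
        result = target(TracedStr(current, log))
        for constraint in log:
            if constraint not in grammar:
                grammar.append(constraint)
        if result == "parsed":
            break
        kind, literal = log[-1]  # the check that caused the rejection
        current = literal + current if kind == "prefix" else current + literal
    return grammar


def generate(grammar, rng, max_fuzz=8):
    """Emit a well-formed input: required literals joined by random filler."""
    def filler():
        return "".join(rng.choice(string.ascii_letters)
                       for _ in range(rng.randint(0, max_fuzz)))

    out = "".join(lit for kind, lit in grammar if kind == "prefix")
    for kind, lit in grammar:
        if kind == "contains":
            out += filler() + lit
    return out + filler()
```

Starting from the seed `"AAAA"`, `infer_grammar` records the prefix check on the first run and the substring check on the second. Every string produced by `generate` then passes both checks, so random mutation is spent on the parts of the input that reach deeper code.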

Original language: English
Pages (from-to): 1-11
Number of pages: 11
Journal: Computers and Security
Volume: 36
DOI: 10.1016/j.cose.2013.02.001
Publication status: Published - 2013 Mar 25

Fingerprint

grammar
Testing
Application programming interfaces (API)
coverage
instruction
Values
vulnerability
Specifications
experiment
Processing
Experiments

Keywords

  • ActiveX control
  • Blackbox fuzz testing
  • Grammar-based fuzzer
  • Hybrid fuzz testing
  • Whitebox fuzz testing

ASJC Scopus subject areas

  • Computer Science (all)
  • Law

Cite this

Automatic and lightweight grammar generation for fuzz testing. / Kim, Su Yong; Cha, Sungdeok; Bae, Doo Hwan.

In: Computers and Security, Vol. 36, 25.03.2013, p. 1-11.

@article{e092fb7d7b204db1bb73cb95dc532cfc,
title = "Automatic and lightweight grammar generation for fuzz testing",
keywords = "ActiveX control, Blackbox fuzz testing, Grammar-based fuzzer, Hybrid fuzz testing, Whitebox fuzz testing",
author = "Kim, {Su Yong} and Sungdeok Cha and Bae, {Doo Hwan}",
year = "2013",
month = "3",
day = "25",
doi = "10.1016/j.cose.2013.02.001",
language = "English",
volume = "36",
pages = "1--11",
journal = "Computers and Security",
issn = "0167-4048",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Automatic and lightweight grammar generation for fuzz testing

AU - Kim, Su Yong

AU - Cha, Sungdeok

AU - Bae, Doo Hwan

PY - 2013/3/25

Y1 - 2013/3/25

KW - ActiveX control

KW - Blackbox fuzz testing

KW - Grammar-based fuzzer

KW - Hybrid fuzz testing

KW - Whitebox fuzz testing

UR - http://www.scopus.com/inward/record.url?scp=84875148672&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875148672&partnerID=8YFLogxK

U2 - 10.1016/j.cose.2013.02.001

DO - 10.1016/j.cose.2013.02.001

M3 - Article

AN - SCOPUS:84875148672

VL - 36

SP - 1

EP - 11

JO - Computers and Security

JF - Computers and Security

SN - 0167-4048

ER -