A machine-learning algorithm with disjunctive model for data-driven program analysis

Minseok Jeon, Sehun Jeong, Sungdeok Cha, Hakjoo Oh

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)


We present a new machine-learning algorithm with disjunctive model for data-driven program analysis. One major challenge in static program analysis is a substantial amount of manual effort required for tuning the analysis performance. Recently, data-driven program analysis has emerged to address this challenge by automatically adjusting the analysis based on data through a learning algorithm. Although this new approach has proven promising for various program analysis tasks, its effectiveness has been limited due to simpleminded learning models and algorithms that are unable to capture sophisticated, in particular disjunctive, program properties. To overcome this shortcoming, this article presents a new disjunctive model for data-driven program analysis as well as a learning algorithm to find the model parameters. Our model uses Boolean formulas over atomic features and therefore is able to express nonlinear combinations of program properties. A key technical challenge is to efficiently determine a set of good Boolean formulas, as brute-force search would simply be impractical. We present a stepwise and greedy algorithm that efficiently learns Boolean formulas. We show the effectiveness and generality of our algorithm with two static analyzers: context-sensitive points-to analysis for Java and flow-sensitive interval analysis for C. Experimental results show that our automated technique significantly improves the performance of the state-of-the-art techniques including ones hand-crafted by human experts.
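To make the idea of a disjunctive model concrete, the following is a minimal sketch and not the algorithm from the article. It assumes a formula in disjunctive normal form over Boolean atomic features, a toy accuracy objective in place of whatever the analyzers actually measure, and hypothetical helper names (`holds`, `accuracy`, `grow_clause`, `learn_dnf`). It only illustrates how a stepwise, greedy search can avoid enumerating every Boolean formula.

```python
from itertools import product
from typing import FrozenSet, List, Sequence, Tuple

Clause = FrozenSet[int]                 # conjunction of atomic feature indices
Formula = List[Clause]                  # disjunction of clauses (DNF)
Example = Tuple[Sequence[bool], bool]   # (feature vector, desired decision)


def holds(formula: Formula, features: Sequence[bool]) -> bool:
    """True if some clause has all of its atomic features set."""
    return any(all(features[i] for i in clause) for clause in formula)


def accuracy(formula: Formula, data: List[Example]) -> float:
    """Toy objective (an assumption): fraction of examples decided correctly."""
    return sum(holds(formula, x) == y for x, y in data) / len(data)


def grow_clause(formula: Formula, data: List[Example],
                n_features: int) -> Tuple[Clause, float]:
    """Greedily build one conjunction, adding a feature while the score rises."""
    clause: Clause = frozenset()
    best = -1.0
    while True:
        options = [clause | {i} for i in range(n_features) if i not in clause]
        if not options:
            return clause, best
        scored = [(accuracy(formula + [c], data), c) for c in options]
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            return clause, best
        clause, best = top, top_score


def learn_dnf(data: List[Example], n_features: int) -> Formula:
    """Stepwise: append one greedily grown clause at a time until no gain."""
    formula: Formula = []
    best = accuracy(formula, data)
    while True:
        clause, score = grow_clause(formula, data, n_features)
        if score <= best:
            return formula
        formula.append(clause)
        best = score


# Toy data: the hidden target is (f0 AND f1) OR f2, which no single
# atomic feature can express on its own.
data: List[Example] = [
    (x, (x[0] and x[1]) or x[2]) for x in product([False, True], repeat=3)
]
formula = learn_dnf(data, n_features=3)
print(formula, accuracy(formula, data))
```

On this toy data the search evaluates only a handful of candidate clauses per step rather than every formula over the three features, and the learned disjunction classifies all eight examples correctly.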

Original language: English
Article number: 13
Journal: ACM Transactions on Programming Languages and Systems
Issue number: 2
Publication status: Published - 2019 Jun


Keywords

  • Context-sensitivity
  • Data-driven program analysis
  • Flow-sensitivity
  • Static analysis

ASJC Scopus subject areas

  • Software

