TY - JOUR
T1 - Improved response modeling based on clustering, under-sampling, and ensemble
AU - Kang, Pilsung
AU - Cho, Sungzoon
AU - MacLachlan, Douglas L.
N1 - Funding Information:
The first author was supported by the research program funded by Seoul National University of Science & Technology (Seoultech) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science, and Technology (MEST) (No. 2011-0021893). The second author was supported by the second stage of the Brain Korea 21 Project in 2011, the National Research Foundation of Korea grants funded by the Korean government (MEST) (Nos. 2011-0030814 and 20110000164), and the Engineering Research Institute of Seoul National University.
PY - 2012/6/15
Y1 - 2012/6/15
AB - The purpose of response modeling for direct marketing is to identify those customers who are likely to purchase a campaigned product, based upon customers' behavioral history and other information available. Contrary to mass marketing strategy, well-developed response models used for targeting specific customers can contribute profits to firms by not only increasing revenues, but also lowering marketing costs. Endemic in customer data used for response modeling is a class imbalance problem: the proportion of respondents is small relative to non-respondents. In this paper, we propose a novel data balancing method based on clustering, under-sampling, and ensemble to deal with the class imbalance problem, and thus improve response models. Using publicly available response modeling data sets, we compared the proposed method with other data balancing methods in terms of prediction accuracy and profitability. To investigate the usability of the proposed algorithm, we also employed various prediction algorithms when building the response models. Based on the response rate and profit analysis, we found that our proposed method (1) improved the response model by increasing response rate as well as reducing performance variation, and (2) increased total profit by significantly boosting revenue.
KW - CRM
KW - Class imbalance
KW - Clustering
KW - Data balancing
KW - Direct marketing
KW - Ensemble
KW - Response modeling
UR - http://www.scopus.com/inward/record.url?scp=84857642451&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2011.12.028
DO - 10.1016/j.eswa.2011.12.028
M3 - Article
AN - SCOPUS:84857642451
SN - 0957-4174
VL - 39
SP - 6738
EP - 6753
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 8
ER -