OBJECTIVES: This study examined gender differences in the under-reporting of hiring discrimination by building a prediction model for workers who responded "not applicable (NA)" to a question about hiring discrimination despite being eligible to answer.

METHODS: Using data from 3,576 wage workers in the seventh wave (2004) of the Korea Labor and Income Panel Study, we trained and tested 9 machine learning algorithms on the "yes" or "no" responses regarding lifetime experience of hiring discrimination. We then applied the best-performing model to estimate the prevalence of hiring discrimination among those who answered "NA." Under-reporting of hiring discrimination was calculated by comparing the prevalence of hiring discrimination between the "yes" or "no" group and the "NA" group.

RESULTS: Based on predictions from the random forest model, 58.8% of the "NA" group were predicted to have experienced hiring discrimination, whereas 19.7% of the "yes" or "no" group reported hiring discrimination. Within the "NA" group, the predicted prevalence of hiring discrimination was 45.3% for men and 84.8% for women.

CONCLUSIONS: This study introduces a methodological strategy that epidemiologic studies can use to address the under-reporting of discrimination by applying machine learning algorithms.
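The comparison described in the METHODS can be illustrated with a short arithmetic sketch using only the percentages quoted above. The abstract does not state whether the comparison was expressed as a difference or a ratio, so both are shown here as assumptions; the function and variable names are hypothetical and are not taken from the study.

```python
def prevalence(labels):
    """Percentage of positive (1) labels in a non-empty list of 0/1 values."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return 100.0 * sum(labels) / len(labels)


def compare_prevalence(reported_pct, predicted_pct):
    """Return (difference in percentage points, ratio) of predicted vs. reported.

    Assumption: the abstract only says the two prevalences were "compared";
    difference and ratio are two plausible ways to express that comparison.
    """
    return predicted_pct - reported_pct, predicted_pct / reported_pct


# Percentages quoted in the abstract.
REPORTED_YES_NO = 19.7     # "yes"/"no" group, self-reported
PREDICTED_NA = 58.8        # "NA" group, predicted by the random forest model
PREDICTED_NA_MEN = 45.3    # "NA" group, men
PREDICTED_NA_WOMEN = 84.8  # "NA" group, women

gap_points, gap_ratio = compare_prevalence(REPORTED_YES_NO, PREDICTED_NA)
gender_gap_points = PREDICTED_NA_WOMEN - PREDICTED_NA_MEN

print(f"NA vs. yes/no: {gap_points:.1f} points, ratio {gap_ratio:.2f}")
print(f"Women vs. men within NA: {gender_gap_points:.1f} points")
```

In practice, `prevalence` would be applied to the model's 0/1 predictions for the "NA" respondents and to the observed 0/1 answers of the "yes" or "no" respondents before the two figures are compared.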
Keywords
- Machine learning
- Social discrimination
- Social epidemiology
ASJC Scopus subject areas
- Public Health, Environmental and Occupational Health