Predicting customer repurchase likelihood in e-commerce using machine learning on discrete behavioral data
DOI: https://doi.org/10.24311/jabes/2025.36.11.05
Abstract
Because product life cycles are short and trends change rapidly, fashion e-commerce requires accurate prediction of the timing and probability of repurchase to make personalized marketing strategies more effective. However, a research gap persists: most studies prioritize long-term repurchase prediction and often overlook short-term purchase likelihood, especially when dealing with discrete behavioral data. This study proposes a model to predict short-term repurchase likelihood, quantified as a probability over a 30-day period, by integrating Recency-Frequency-Monetary (RFM) features with product category diversity and session interaction metrics. Validated using the LightGBM and XGBoost algorithms on an online fashion dataset, the proposed models achieved strong classification performance, with an ROC-AUC score of approximately 0.8830. Furthermore, Recency (R) and Frequency (F) were identified as the most influential predictors of short-term purchasing behavior. These findings extend the literature on purchase timing prediction and enable businesses to identify high-potential customers for more effective remarketing strategies.
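As a rough illustration of the feature and label construction described in the abstract, the following minimal sketch derives RFM values, category diversity, and a 30-day repurchase label from a purchase log. The toy data, the function name `build_dataset`, the field names, and the choice of cutoff date are all assumptions made for illustration; the abstract does not specify the study's actual preprocessing, and session interaction metrics are omitted because their definition is not given.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical toy purchase log: (customer_id, purchase_date, category, amount).
PURCHASES = [
    ("c1", date(2024, 1, 5), "shirts", 30.0),
    ("c1", date(2024, 2, 20), "shoes", 80.0),
    ("c1", date(2024, 3, 10), "shirts", 25.0),
    ("c2", date(2024, 1, 15), "dresses", 60.0),
    ("c2", date(2024, 3, 25), "dresses", 55.0),
    ("c3", date(2024, 2, 1), "bags", 120.0),
]


def build_dataset(purchases, cutoff, horizon_days=30):
    """Return {customer_id: (features, label)}.

    Features use only purchases on or before `cutoff`; the label is 1 if the
    customer buys again within `horizon_days` after the cutoff, else 0.
    """
    horizon_end = cutoff + timedelta(days=horizon_days)
    history = defaultdict(list)
    repurchased = set()
    for cust, day, category, amount in purchases:
        if day <= cutoff:
            history[cust].append((day, category, amount))
        elif day <= horizon_end:
            repurchased.add(cust)

    dataset = {}
    for cust, rows in history.items():
        features = {
            # Recency: days since the most recent purchase before the cutoff.
            "recency_days": (cutoff - max(d for d, _, _ in rows)).days,
            # Frequency: number of purchases before the cutoff.
            "frequency": len(rows),
            # Monetary: total spend before the cutoff.
            "monetary": sum(a for _, _, a in rows),
            # Category diversity: number of distinct categories purchased.
            "category_diversity": len({c for _, c, _ in rows}),
        }
        dataset[cust] = (features, int(cust in repurchased))
    return dataset


dataset = build_dataset(PURCHASES, cutoff=date(2024, 3, 15))
for cust, (features, label) in sorted(dataset.items()):
    print(cust, features, label)
```

Splitting on a cutoff date keeps information from the 30-day label window out of the features. A feature table of this shape could then be passed to a gradient-boosted classifier such as LightGBM or XGBoost and evaluated with ROC-AUC, as the abstract describes.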
License
Copyright (c) 2026 JOURNAL OF ASIAN BUSINESS AND ECONOMIC STUDIES

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



