FRAUD DETECTION IN BANKING TRANSACTIONS WITH THE USE OF ARTIFICIAL INTELLIGENCE AND ANONYMIZED DATA

Keywords:

artificial intelligence, bank fraud, banking transactions, CatBoost, financial transaction analytics, machine learning classification, XGBoost.

Abstract

This paper examines whether AI machine-learning classifiers trained on anonymized bank transaction data can effectively predict fraudulent transactions. The study tests H1, that at least one classifier’s area under the ROC curve (AUC) exceeds 0.50, against H0, that the best classifier’s AUC is ≤ 0.50. Using an anonymized dataset from a U.S.-based commercial bank, we assess an extensive set of classifiers in Orange Data Mining software: tree-based ensembles; probabilistic, distance-based, linear, and margin-based learners; and a neural network. The models were evaluated with stratified 10-fold cross-validation. Multiple models achieved AUC > 0.50, with tree-boosting methods providing the strongest balance between detecting fraud and limiting false alarms. Linear baselines and distance-based methods were weak, while SVM achieved high recall but generated operationally costly false positives. Overall, the results support H1 and are inconsistent with H0. The study offers a transparent, bank-ready benchmark on anonymized, production-plausible features, and the framework is readily replicable for threshold tuning and governance in financial institutions.
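The evaluation criterion described above (fold-wise AUC compared against the 0.50 chance level) can be illustrated with a minimal sketch. It is not the authors' Orange workflow: the function names `roc_auc` and `stratified_folds`, the toy labels, and the synthetic scores are all assumptions made for illustration. For brevity the scores stand in for classifier outputs, whereas real cross-validation would refit the model on the other nine folds before scoring each held-out fold.

```python
import random
from collections import defaultdict

def roc_auc(labels, scores):
    """AUC as the probability that a random fraud case is scored above a
    random legitimate one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Deal each class round-robin so every fold gets its share.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

# Hypothetical toy data: 1 = fraud, 0 = legitimate (10% fraud rate).
labels = [1] * 20 + [0] * 180
rng = random.Random(42)
# Synthetic scores standing in for a classifier's predicted fraud probability.
scores = [rng.gauss(0.7 if y else 0.3, 0.2) for y in labels]

folds = stratified_folds(labels, k=10)
fold_aucs = [roc_auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
mean_auc = sum(fold_aucs) / len(fold_aucs)
print(f"mean AUC over {len(folds)} folds: {mean_auc:.3f}")
print("consistent with H1 (AUC > 0.50)" if mean_auc > 0.5 else "no better than chance")
```

Stratification matters here because fraud is rare: without it, some folds could contain no fraud cases at all, and AUC would be undefined for them.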

JEL: G21, C45, C52, C55, M42.

Author Biographies

Spyridon D. LAMPROPOULOS, University of Patras, Patras, Greece. 

D., PhD, Adjunct Assistant Professor, Department of Tourism Management

Georgios L. THANASAS, University of Patras, Patras, Greece. 

PhD, Associate Professor, Department of Management Science and Technology, 

Georgia N. KONTOGEORGA, University of Paris 1 Panthéon-Sorbonne, Paris, France.

PhD, Auditor, Hellenic Court of Audit, Athens, Greece; Affiliated Researcher

Received: September 18, 2025.

Reviewed: October 27, 2025.

Accepted: December 3, 2025.

Published

31.12.2025

How to Cite

LAMPROPOULOS, Spyridon D., et al. “FRAUD DETECTION IN BANKING TRANSACTIONS WITH THE USE OF ARTIFICIAL INTELLIGENCE AND ANONYMIZED DATA”. Journal of European Economy, vol. 24, no. 4, Dec. 2025, pp. 656-73, https://jeej.wunu.edu.ua/index.php/enjee/article/view/1894.

Section

Development of Financial Relations