Causal Reinforcement Learning Algorithm for Adaptive Business Strategy Optimization Under Market Volatility

Authors

  • Shermatov Abdukodir Obidjon Ugli, Turan International University, Namangan, Uzbekistan
  • Dr. N. Dayanand Lal, Assistant Professor, Department of AI&DS, GITAM School of CSE, GITAM (Deemed to be University), Bengaluru, India
  • Dr. Nidhi Mishra, Assistant Professor, Kalinga University, Naya Raipur, Chhattisgarh, India
  • Dr. Sadasivam V R, Professor, Department of Information Technology, K.S.Rangasamy College of Technology, Tiruchengode, India
  • Saodat Mamayusupova, PhD, Associate Professor, Jizzakh State Pedagogical University, Jizzakh, Uzbekistan
  • Kattakul Kinjaev, Lecturer, Department of Finance and Tourism, Termez University of Economics and Service, Termez, Uzbekistan

Keywords:

network security; anomaly detection; CNN; Bi-LSTM; graph neural network; attention

Abstract

Developing adaptive business strategies in volatile markets is one of the most critical and challenging problems in business management. Traditional approaches to strategy optimization rely on correlation-based predictive models that conflate causal effects with noise in market signals. The resulting strategies are fragile and break down when the underlying market distribution shifts, which is precisely when robustness matters most. Reinforcement learning (RL) algorithms are inherently adaptive but suffer from a similar weakness: agents trained on historical market transitions absorb the confounded associations in their training data and consequently underperform in unseen market conditions. In this work, we propose CRL-ABSO (Causal Reinforcement Learning Algorithm for Adaptive Business Strategy Optimization), an approach designed to disentangle cause-effect relationships in market dynamics and generate intervention-robust business strategies. Specifically, CRL-ABSO builds a dynamic structural causal model (SCM) of the business domain using a novel online causal discovery algorithm based on non-stationary conditional independence testing and scoring. It combines a counterfactual advantage predictor with the conventional reward function in the RL agent's objective and masks actions using do-calculus so that the agent does not exploit spurious correlations. The VAC scheduler gradually introduces more severe market regimes into the agent's training environment. We evaluate CRL-ABSO on four strategy optimization problems (pricing optimization, portfolio rebalancing, procurement, and mergers and acquisitions target selection) using ten years of empirical financial and operational data. CRL-ABSO delivers a 31.4% gain in average cumulative reward over the best-performing non-causal RL baselines while reducing strategy variance in high-volatility regimes by 47.3%. In addition, CRL-ABSO shows significant out-of-distribution generalization in three market shock scenarios: COVID-19 supply disruptions, the energy crises of 2022, and the semiconductor shortages of 2024.
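
To make the action-masking and counterfactual-advantage ideas in the abstract more concrete, the following Python sketch shows one plausible way such components could fit together. It is an illustration under stated assumptions, not the authors' implementation: the function names, the boolean mask input, the per-action interventional estimates `q_do`, the additive blend, and the weight `lam` are all hypothetical, since the abstract does not specify them.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over allowed actions only; masked actions get probability 0.

    `mask[i]` is True when action i is allowed. In this sketch the mask is an
    input; how it would be derived from the SCM is not specified in the abstract.
    """
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def counterfactual_advantage(q_do, action):
    """Outcome under do(A=action) minus the mean outcome under the other actions.

    `q_do[a]` is an assumed estimate of the outcome under intervention do(A=a).
    """
    others = [q for a, q in enumerate(q_do) if a != action]
    return q_do[action] - sum(others) / len(others)


def blended_signal(reward, cf_advantage, lam=0.5):
    """Assumed additive blend of observed reward and counterfactual advantage."""
    return reward + lam * cf_advantage


# Example: three candidate actions, the second one masked out.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
adv = counterfactual_advantage([1.0, 2.0, 6.0], action=2)
signal = blended_signal(reward=1.0, cf_advantage=adv, lam=0.5)
```

In this sketch the masked action can never be sampled, and the learning signal rewards an action only to the extent that its estimated interventional outcome exceeds that of the alternatives.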

Published

2026-04-15

How to Cite

Ugli, S. A. O., Lal, N. D., Mishra, N., Sadasivam, V. R., Mamayusupova, S., & Kinjaev, K. (2026). Causal Reinforcement Learning Algorithm for Adaptive Business Strategy Optimization Under Market Volatility. International Journal of Artificial Intelligence and Machine Learning, 6(1s), 933–943. Retrieved from https://mail.svedbergopen.com/index.php/ijaiml/article/view/163
