Impact Factor (2025): 6.9
DOI Prefix: 10.47001/IRJIET
Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 548-561
International Research Journal of Innovations in Engineering and Technology
OPEN ACCESS | Research Article | Published Date: 27-05-2026
Enterprise logistics and quick-commerce operations generate high-velocity transaction logs that dictate daily picking and replenishment patterns. Optimizing item-storage slotting layouts and subsequent tactical order-picking operations remains a challenging problem due to shifting consumer demand patterns, spatial capacity limits, and the decoupled nature of traditional distribution frameworks. Conventional warehouse systems rely on static heuristic rules that fail to capture dynamic product affinities, resulting in inflated dwell times, cross-congestion, and degraded fulfilment efficiency. To address these limitations, this paper proposes a co-designed Hybrid Mathematical Optimization and Deep Reinforcement Learning (DRL) Framework for unified strategic slotting and tactical pathfinding within localized micro-fulfilment centres (Dark Stores).
The proposed system establishes an analytical data pipeline that ingests historical transaction logs to perform ABC item-velocity stratification and Apriori-based market-basket affinity clustering. These empirical metrics parameterize an Integer Linear Programming (ILP) model that minimizes total expected picking travel distance subject to structural and shelf-capacity constraints. For tactical navigation, a grid-based digital twin of the store is formulated as a Markov Decision Process (MDP) and wrapped in a custom Gymnasium (the successor to OpenAI Gym) environment. A Proximal Policy Optimization (PPO) agent with an Actor-Critic neural network architecture is trained to find collision-free multi-item retrieval trajectories. Experimental results show that the trained hybrid framework converges robustly, achieving a stable 65.5% order-picking completion rate under highly stochastic demand schedules. This represents a significant performance improvement over conventional rule-based baselines while mitigating terminal layout collisions and operational bottlenecks.
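As a rough illustration of the first pipeline stage described above, the following minimal Python sketch stratifies SKUs into ABC classes by cumulative pick share and counts pairwise co-occurrence support, which is only the pair-level counting step of Apriori and not the full algorithm. The function names, the 80%/95% class cut-offs, the 0.3 support threshold, and the toy orders are all assumptions made for illustration; none of them are taken from the paper.

```python
from collections import Counter
from itertools import combinations


def abc_classes(orders, a_cut=0.80, b_cut=0.95):
    """Label each SKU A, B, or C by cumulative share of pick lines.

    The cut-offs are illustrative assumptions, not values from the paper.
    """
    picks = Counter(sku for order in orders for sku in set(order))
    total = sum(picks.values())
    labels, cumulative = {}, 0
    # Rank SKUs from most to least picked; ties are broken by name.
    for sku, count in sorted(picks.items(), key=lambda kv: (-kv[1], kv[0])):
        # Classify by the share accumulated before this SKU,
        # so the fastest mover is always placed in class A.
        share_before = cumulative / total
        if share_before < a_cut:
            labels[sku] = "A"
        elif share_before < b_cut:
            labels[sku] = "B"
        else:
            labels[sku] = "C"
        cumulative += count
    return labels


def pair_support(orders, min_support=0.3):
    """Return SKU pairs whose share of orders meets min_support."""
    counts = Counter()
    for order in orders:
        # Each unordered pair is counted at most once per order.
        counts.update(combinations(sorted(set(order)), 2))
    n = len(orders)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical transaction log: one list of SKUs per order.
orders = [
    ["milk", "bread"],
    ["milk", "bread", "eggs"],
    ["milk", "eggs"],
    ["milk", "bread"],
    ["chips"],
]
labels = abc_classes(orders)
affinities = pair_support(orders)
print(labels)
print(affinities)
```

In a slotting model of the kind the abstract describes, class labels like these could weight the travel-distance objective and the frequent pairs could encourage adjacent placement; how the paper's ILP actually encodes them is not stated here.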
Dark Store Logistics, Integer Linear Programming (ILP), Proximal Policy Optimization (PPO), Gymnasium, Strategic Slotting, Reward Shaping, Deep Reinforcement Learning.
MD. Muneer Uddin Azam, Pruthvi Bijjarappu, K. Madhubabu, Meera Alphy, & M. Aruna. (2026). Inventory Slotting and Order Picking Optimization in Dark Store Operations. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 548-561. Article DOI https://doi.org/10.47001/IRJIET/2026.105075
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.