PlanB&B accepted at AAAI 2026
Our paper Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization has been accepted at AAAI 2026!
This work introduces PlanB&B, the first model-based reinforcement learning (MBRL) agent for exact combinatorial optimization. Building on the principled MDP formulation of BBMDP (NeurIPS 2025), PlanB&B goes one step further by learning an internal model of branch-and-bound dynamics and using Monte Carlo Tree Search (Gumbel Search) to plan improved branching decisions — without any additional LP solves at inference time.
Key results on four standard MILP benchmarks:
- 2–3× improvement over prior RL baselines in both tree size and solving time
- First RL agent to surpass imitation learning trained on strong branching
- Up to 50% tree size reduction with additional search budget at inference
- Discovers genuinely novel branching strategies, actively diverging from expert patterns
Authors: Paul Strang, Zacharie Alès, Côme Bissuel, Olivier Juan, Safia Kedad-Sidhoum, Emmanuel Rachelson
Links: Project page · arXiv · OpenReview · Code