Abstract
Multi-objective reinforcement learning in robotic domains requires balancing complex, non-convex trade-offs between conflicting objectives. While linear scalarization methods provide stability, they are theoretically incapable of recovering solutions in non-convex regions of the Pareto front. Conversely, static non-linear scalarizations (e.g., Tchebycheff) can in principle access these regions but often suffer from severe gradient variance and optimization instability in deep RL. In this work, we propose an Adaptive Smooth Tchebycheff framework that resolves this tension by dynamically modulating the curvature of the optimization landscape. We introduce a novel conflict-driven controller that regulates optimization smoothness based on real-time gradient interference. This allows the agent to anneal toward the exact, non-smooth Tchebycheff scalarization when objectives align, while elastically reverting to stable, smooth approximations when destructive gradient conflicts emerge. We validate our approach on a challenging robotic stealth visual search task, a proxy for monitoring protected or fragile ecosystems, in which an agent must balance search efficacy, exposure/interference minimization, and exploration speed. Extensive ablations confirm that our conflict-aware adaptation enables the robust discovery of Pareto-optimal policies in non-convex regions that are inaccessible to linear baselines and that static non-linear methods reach only unstably.
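As a reading aid, the sketch below shows one plausible instantiation of the mechanism the abstract describes, assuming a standard log-sum-exp relaxation of the Tchebycheff scalarization and a pairwise cosine-similarity measure of gradient conflict. The function names (smooth_tchebycheff, update_mu) and hyperparameters (mu_min, mu_max, rate) are illustrative assumptions, not the paper's actual controller.

```python
import numpy as np

def smooth_tchebycheff(losses, weights, ideal, mu):
    # Log-sum-exp relaxation of the Tchebycheff scalarization
    # max_i w_i * (f_i - z_i*). As mu -> 0 this converges to the exact,
    # non-smooth Tchebycheff value; larger mu flattens the landscape.
    z = weights * (losses - ideal) / mu
    m = z.max()  # shift for numerical stability of the log-sum-exp
    return mu * (m + np.log(np.exp(z - m).sum()))

def update_mu(mu, grads, mu_min=1e-3, mu_max=1.0, rate=0.05):
    # Illustrative conflict-driven controller: estimate gradient
    # interference from the most negative pairwise cosine similarity,
    # then move mu elastically toward a conflict-dependent target:
    # down (sharper, near-exact Tchebycheff) when objectives align,
    # up (smoother, more stable) when destructive conflict appears.
    sims = [
        grads[i] @ grads[j]
        / (np.linalg.norm(grads[i]) * np.linalg.norm(grads[j]) + 1e-12)
        for i in range(len(grads))
        for j in range(i + 1, len(grads))
    ]
    conflict = max(0.0, -min(sims))  # in [0, 1]; positive iff some pair conflicts
    target = mu_min + (mu_max - mu_min) * conflict
    return mu + rate * (target - mu)

# Example: two strongly conflicting objective gradients push mu upward,
# i.e., back toward the smooth, stable regime.
g = [np.array([1.0, 0.0]), np.array([-0.9, 0.4]), np.array([0.1, 1.0])]
mu = update_mu(0.5, g)
```

Under these assumptions, the controller realizes the annealing behavior stated above: sustained alignment drives mu toward mu_min (recovering the exact Tchebycheff objective and its access to non-convex Pareto regions), while detected conflict relaxes mu back toward mu_max for stability.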