Multi-Objective Reinforcement Learning (MORL) is an emerging field that extends the conventional reinforcement learning paradigm by enabling agents to optimise multiple conflicting objectives ...
On the Humanity’s Last Exam benchmark, Deep Research Agent scored 46.4%, outperforming OpenAI’s GPT-5 Pro (38.9%).