I am a research associate and Ph.D. student at the Humanoid Robots Lab headed by Prof. Maren Bennewitz at the University of Bonn, Germany. My research interests cover personalized human-aware robot navigation, reinforcement learning, and human-robot interaction. I hold a Master’s degree in physics from the University of Goettingen, Germany.


Research Spotlight

Learning Adaptive Multi-Objective Robot Navigation with Demonstrations

J. de Heuvel, T. Sethuraman, and M. Bennewitz
Submitted for publication. Arxiv preprint, 2024.

Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches, utilizing demonstrations and user feedback for personalization. However, personal preferences are subject to change and might even be context-dependent. Yet traditional reinforcement learning (RL) approaches with a static reward function often fall short in adapting to these varying user preferences. This paper introduces a framework that combines multi-objective reinforcement learning (MORL) with demonstration-based learning. Our approach allows for dynamic adaptation to changing user preferences without retraining. Through rigorous evaluations, including sim-to-real and robot-to-robot transfers, we demonstrate our framework’s capability to reflect user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuance.

EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation

J. de Heuvel, F. Seiler, and M. Bennewitz
Proceedings of the IEEE International on Human & Robot Interactive Communication (RO-MAN), 2024.

To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.

Spatiotemporal Attention Enhances Lidar-Based Robot Navigation in Dynamic Environments

J. de Heuvel, X. Zeng, W. Shi, T. Sethuraman, and M. Bennewitz
IEEE Research and Automation Letters, 2024.

Foresighted robot navigation in dynamic indoor environments with cost-efficient hardware necessitates the use of a lightweight yet dependable controller. So inferring the scene dynamics from sensor readings without explicit object tracking is a pivotal aspect of foresighted navigation among pedestrians. In this paper, we introduce a spatiotemporal attention pipeline for enhanced navigation based on 2D lidar sensor readings. This pipeline is complemented by a novel lidar-state repre- sentation that emphasizes dynamic obstacles over static ones. Subsequently, the attention mechanism enables selective scene perception across both space and time, resulting in improved overall navigation performance within dynamic scenarios. We thoroughly evaluated the approach in different scenarios and simulators, finding good generalization to unseen environments. The results demonstrate outstanding performance compared to state-of-the-art methods, thereby enabling the seamless deployment of the learned controller on a real robot.

Subgoal-Driven Navigation in Dynamic Environments Using Attention-Based Deep Reinforcement Learning

J. de Heuvel, W. Shi, X. Zeng, M. Bennewitz
Arxiv preprint, 2023. Accepted for publication at the IEEE/RSJ International Conference on Advanced Robotics (ICAR 2023).

Collision-free, goal-directed navigation in environments containing unknown static and dynamic obstacles is still a great challenge, especially when manual tuning of navigation policies or costly motion prediction needs to be avoided. In this paper, we therefore propose a subgoal-driven hierarchical navigation architecture that is trained with deep reinforcement learning and decouples obstacle avoidance and motor control. In particular, we separate the navigation task into the prediction of the next subgoal position for avoiding collisions while moving toward the final target position, and the prediction of the robot’s velocity controls. By relying on 2D lidar, our method learns to avoid obstacles while still achieving goal-directed behavior as well as to generate low-level velocity control commands to reach the subgoals. In our architecture, we apply the attention mechanism on the robot’s 2D lidar readings and compute the importance of lidar scan segments for avoiding collisions. As we show in simulated and real-world experiments with a Turtlebot robot, our proposed method leads to smooth and safe trajectories among humans and significantly outperforms a state-of-the-art approach in terms of success rate.

Learning Depth Vision-Based Personalized Robot Navigation From Dynamic Demonstrations in Virtual Reality

J. de Heuvel, N. Corral, B. Kreis, J. Conradi, A. Driemel, M. Bennewitz
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023.

For the best human-robot interaction experience, the robot’s navigation policy should take into account personal preferences of the user. In this paper, we present a learning framework complemented by a perception pipeline to train a depth vision-based, personalized navigation controller from user demonstrations. Our virtual reality interface enables the demonstration of robot navigation trajectories under motion of the user for dynamic interaction scenarios. The novel perception pipeline enrolls a variational autoencoder in combination with a motion predictor. It compresses the perceived depth images to a latent state representation to enable efficient reasoning of the learning agent about the robot’s dynamic environment. In a detailed analysis and ablation study, we evaluate different configurations of the perception pipeline.
To further quantify the navigation controller’s quality of personalization, we develop and apply a novel metric to measure preference reflection based on the Fréchet Distance. We discuss the robot’s navigation performance in various virtual scenes and demonstrate the first personalized robot navigation controller that solely relies on depth images.

Learning Personalized Human-Aware Robot Navigation Using Virtual Reality Demonstrations from a User Study

J. de Heuvel, N. Corral, L. Bruckschen, and M. Bennewitz
Proceedings of the IEEE International on Human & Robot Interactive Communication (RO-MAN), 2022.

For the most comfortable, human-aware robot navigation, subjective user preferences need to be taken into account. This paper presents a novel reinforcement learning framework to train a personalized navigation controller along with an intuitive virtual reality demonstration interface. The conducted user study provides evidence that our personalized approach significantly outperforms classical approaches with more comfortable human-robot experiences. We achieve these results using only a few demonstration trajectories from non-expert users, who predominantly appreciate the intuitive demonstration setup. As we show in the experiments, the learned controller generalizes well to states not covered in the demonstration data, while still reflecting user preferences during navigation. Finally, we transfer the navigation controller without loss in performance to a real robot.

Characterizing spreading dynamics of subsampled systems with nonstationary external input

J. de Heuvel, J. Wilting, M. Becker, V. Priesemann, and J. Zierenberg
Phys. Rev. E 102, 040301(R), 2020.

Many systems with propagation dynamics, such as spike propagation in neural networks and spreading of infectious diseases, can be approximated by autoregressive models. The estimation of model parameters can be complicated by the experimental limitation that one observes only a fraction of the system (subsampling) and potentially time-dependent parameters, leading to incorrect estimates. We show analytically how to overcome the subsampling bias when estimating the propagation rate for systems with certain nonstationary external input. This approach is readily applicable to trial-based experimental setups and seasonal fluctuations as demonstrated on spike recordings from monkey prefrontal cortex and spreading of norovirus and measles.


Other Publications

Handling Sparse Rewards in Reinforcment Learning Using Model Predictive Control

M. Dawood, N. Dengler, J. de Heuvel, and M. Bennewitz
Proceedings of the IEEE International Conference on Robotics & Automation (ICRA), 2023.

Reactive Correction of Object Placement Errors for Robotic Arrangement Tasks

B. Kreis, R. Menon, B. K. Adinarayan, J. de Heuvel, and M. Bennewitz
Proceedings of the International Conference on Intelligent Autonomous Systems (IAS), 2023.


Talks & Presentations

DFKI Saarland
Learning preference-aligned robot navigation from VR demonstrations using deep reinforcement learning.
Presentation, October 2023.

SEANavBench Workshop – IROS 2023
Personalized Human-Robot Interaction: Learning Depth Vision-Based Navigation with User Preferences.
Poster Presentation, October 2023.

Future of AI SummitRWTH AI Week 2023
Exchange event about AI-related topics with the general public, industry and science community, organized under the auspices of the Alexander von Humboldt Foundation.
Poster Presentation, September 2023.

CVPR 2023 Workshop on 3D Vision and Robotics
Learning Depth Vision-Based Personalized Robot Navigation Using Latent State Representations.
Presentation, June 2023.

Max Planck Institute for Dynamics and Self-Organization | Prof. Priesemann Group Seminar
Learning personalized robot navigation from demonstrations using deep reinforcement learning.
Presentation, November 2022.

SEANavBench Workshop – ICRA 2022
Teaching Personalized Robot Navigation through Virtual Reality Demonstrations: A Learning Framework and User Study.
Poster Presentation, May 2022.


Projects

FOR 2535 – Anticipating Human Behavior
DFG Research Unit
Subproject P7 – Foresighted Robot Navigation Using Predicted Human Behavior

Waves to Weather
DFG Transregional Collaborative Research Center (SFB/TRR165)


Curriculum Vitae

Research
2021 – currentResearch associate at the Humanoid Robots Lab
University of Bonn, Germany
2020 – 2021Research associate in the transregional collaborative research project „Waves To Weather” (SFB/TRR165).
Johannes Gutenberg-University, Mainz, Germany
2018 – 2019Master’s thesis research project on subsampled spreading dynamics
Max-Planck-Institute for Dynamics and Self-Organization, Goettingen, Germany
Education
2019Master of Science in Physics
Georg-August-Universität, Göttingen, Germany
2017 Bachelor of Science in Physics
Georg-August-Universität, Göttingen, Germany

Last update: May 2023


Contact (Work)

Jorge de Heuvel

University of Bonn
Institute for Computer Science VI
Friedrich-Hirzebruch-Allee 8
53115 Bonn
Germany