B.Eng Computer Science, Singapore University of Technology and Design
About Me
Robotics Software Engineer working on embodied AI and intelligent-agent decision-making. Dedicated to building performant intelligent systems that enable robots to perceive, reason, and act in real time within complex physical environments.
Projects
Collaborative Robot Intelligence
Working on making human-robot collaboration more intuitive through AI.
Combined GroundingDINO and SAM2 to create an open-vocabulary video object tracker.
Fused the tracker output with a RealSense depth camera to obtain 3D positions of the tracked objects of interest (see the deprojection sketch after this list).
Incorporated an agentic VLM to reason about the visual scene, converse with a human operator, perform task planning, and oversee the automatic execution of multiple tasks in succession.
Integrated the MoveIt2 library to enable inverse kinematics on a manipulator arm, and wrote a trajectory-generation controller so the arm can execute the planned trajectories output by MoveIt2 (see the trajectory-forwarding sketch after this list).
Integrated the various components (AI and CV models, camera, servos, audio, logic) via ROS2, working with services, actions, executors, custom message interfaces, and launch configurations for multi-launch and multi-node compositions.
Blended synchronous and asynchronous approaches in ROS2 depending on the functional use case (see the async service-call sketch after this list).
Incorporated a PS2 controller to enable teleoperation. Implemented individual and multi-joint control for the 6-DoF arm, alongside control of wheel motion and steering (see the teleop mapping sketch after this list).
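As a sketch of the depth step: each tracked object gives a pixel (u, v), whose depth can be back-projected into a 3D point using the camera intrinsics. This assumes the pyrealsense2 SDK; the hard-coded pixel stands in for the tracker output.

```python
import pyrealsense2 as rs

# Start a RealSense pipeline streaming depth and color frames.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)  # align depth pixels to the color image

try:
    frames = align.process(pipeline.wait_for_frames())
    depth = frames.get_depth_frame()
    intrin = depth.profile.as_video_stream_profile().get_intrinsics()

    # (u, v) would come from the GroundingDINO + SAM2 tracker; hard-coded here.
    u, v = 320, 240
    d = depth.get_distance(u, v)  # depth in metres at that pixel

    # Back-project the pixel into a 3D point in the camera frame.
    x, y, z = rs.rs2_deproject_pixel_to_point(intrin, [u, v], d)
    print(f"object at ({x:.3f}, {y:.3f}, {z:.3f}) m in the camera frame")
finally:
    pipeline.stop()
```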
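A minimal rclpy sketch of the trajectory-forwarding idea: a node that repacks planned waypoints into a trajectory_msgs/JointTrajectory for the arm controller. The topic and joint names are assumptions; the real controller interface depends on the robot.

```python
import rclpy
from rclpy.node import Node
from trajectory_msgs.msg import JointTrajectory, JointTrajectoryPoint
from builtin_interfaces.msg import Duration


class TrajectoryForwarder(Node):
    """Forwards a (MoveIt2-planned) joint trajectory to the arm controller."""

    def __init__(self):
        super().__init__("trajectory_forwarder")
        # Topic name is an assumption; it depends on the controller config.
        self.pub = self.create_publisher(JointTrajectory, "/joint_trajectory", 10)

    def send(self, joint_names, waypoints, dt=0.5):
        msg = JointTrajectory()
        msg.joint_names = joint_names
        for i, positions in enumerate(waypoints):
            pt = JointTrajectoryPoint()
            pt.positions = list(positions)
            t = (i + 1) * dt  # time offset of this waypoint from the start
            pt.time_from_start = Duration(sec=int(t), nanosec=int((t % 1) * 1e9))
            msg.points.append(pt)
        self.pub.publish(msg)


def main():
    rclpy.init()
    node = TrajectoryForwarder()
    # Illustrative 6-DoF waypoints (radians); real ones come from MoveIt2.
    node.send([f"joint_{i}" for i in range(1, 7)],
              [[0.0] * 6, [0.1, 0.2, -0.1, 0.0, 0.3, 0.0]])
    rclpy.spin_once(node, timeout_sec=0.5)  # let the message go out
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```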
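A sketch of the sync/async blending in rclpy, assuming a hypothetical /grasp Trigger service: the asynchronous call returns a future immediately, so the executor keeps servicing other callbacks while the request is in flight, and a multithreaded executor lets longer blocking work coexist with it.

```python
import rclpy
from rclpy.node import Node
from rclpy.callback_groups import ReentrantCallbackGroup
from rclpy.executors import MultiThreadedExecutor
from std_srvs.srv import Trigger


class MixedClient(Node):
    def __init__(self):
        super().__init__("mixed_client")
        group = ReentrantCallbackGroup()
        # Service name "/grasp" is illustrative.
        self.cli = self.create_client(Trigger, "/grasp", callback_group=group)
        self.timer = self.create_timer(1.0, self.tick, callback_group=group)

    def tick(self):
        if not self.cli.service_is_ready():
            return
        # Non-blocking call: the future resolves later without stalling the timer.
        future = self.cli.call_async(Trigger.Request())
        future.add_done_callback(
            lambda f: self.get_logger().info(f"result: {f.result().message}"))


def main():
    rclpy.init()
    node = MixedClient()
    executor = MultiThreadedExecutor()  # lets blocking and async callbacks coexist
    executor.add_node(node)
    try:
        executor.spin()
    finally:
        node.destroy_node()
        rclpy.shutdown()


if __name__ == "__main__":
    main()
```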
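And a sketch of the teleoperation mapping, assuming a standard joy driver publishing sensor_msgs/Joy; the button/axis indices and command topic are illustrative.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Joy
from std_msgs.msg import Float64MultiArray


class TeleopMapper(Node):
    """Maps gamepad axes/buttons to arm-joint velocity commands."""

    def __init__(self):
        super().__init__("teleop_mapper")
        self.sub = self.create_subscription(Joy, "/joy", self.on_joy, 10)
        # Command topic is an assumption; it depends on the robot's controllers.
        self.arm_pub = self.create_publisher(Float64MultiArray, "/arm_vel_cmd", 10)
        self.joint = 0  # currently selected joint for individual control

    def on_joy(self, msg: Joy):
        # Shoulder buttons cycle the selected joint (indices are illustrative).
        if msg.buttons[4]:
            self.joint = (self.joint - 1) % 6
        if msg.buttons[5]:
            self.joint = (self.joint + 1) % 6
        cmd = Float64MultiArray()
        cmd.data = [0.0] * 6
        cmd.data[self.joint] = msg.axes[1]  # left stick drives the selected joint
        self.arm_pub.publish(cmd)


def main():
    rclpy.init()
    rclpy.spin(TeleopMapper())
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```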
Robot Arm
ROS2
MoveIt2
Trajectory Control
Computer Vision
GroundingDINO
SAM2
RealSense
Autonomous Collaborative Robot Arm (VLA-based)
Prototyped a demonstration of how a VLM and a VLA model might work together for greater embodied intelligence and collaboration.
Incorporated a VLM (OpenAI gpt-5-mini) as an agent. In this role, the VLM converses with the operator, interprets commands and queries, creates simpler task plans for the VLA to perform, and visually monitors the VLA's execution to determine task success or failure (see the agent-loop sketch after this list).
Deployed the NVIDIA GR00T N1.5 VLA model to perform autonomous action-policy execution for three tasks.
Evaluated finetuning and inference performance of the VLA model across parameters such as the number of training steps, full vs. LoRA finetuning, action-chunk horizon, and number of denoising steps (see the LoRA sketch after this list).
Integrated the LeRobot SO-101 arm as an inference client of the GR00T policy server.
Performed data collection through a leader-follower setup.
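A minimal sketch of one VLM monitoring step, using the OpenAI Python SDK; the prompt, helper name, and camera-frame handling are illustrative, not the project's exact code.

```python
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def check_task(image_path: str, task: str) -> str:
    """Ask the VLM whether the VLA finished the given task, from one camera frame."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Task: {task}. Did the robot complete it? "
                         "Answer SUCCESS or FAILURE with a one-line reason."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content


print(check_task("frame.jpg", "place the red block in the bin"))
```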
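On the full vs. LoRA finetuning axis: LoRA freezes the base weights and trains small low-rank adapters instead, trading some capacity for far fewer trainable parameters. A generic illustration with Hugging Face peft on a stand-in network (GR00T's own finetuning scripts expose this through their own flags, so this is illustrative only):

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

# Stand-in network; in the project this would be the pretrained VLA backbone.
model = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 512))

# Low-rank adapters on the linear layers: only the small adapter matrices
# are trained, while the base weights stay frozen.
config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                    target_modules=["0", "2"])  # module names in nn.Sequential
lora_model = get_peft_model(model, config)
lora_model.print_trainable_parameters()  # reports the trainable fraction
```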
Robot Arm
VLA
VLM
GR00T N1.5
LeRobot
LoRA Finetuning
Imitation Learning
ICP Odometry
Implementation of the Iterative Closest Point (ICP) algorithm for robot odometry. It estimates how the robot moves over time by aligning consecutive 2D laser scans or 3D point clouds and reading off the relative pose change between them, yielding an odometry trajectory (position and orientation over time) from geometric scan matching alone rather than wheel encoders (a minimal sketch follows).
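A minimal 2D sketch of one common ICP variant (point-to-point correspondences with an SVD-based alignment step); the project implementation may differ in details such as outlier rejection and the initial guess.

```python
import numpy as np
from scipy.spatial import cKDTree


def best_rigid_transform(A, B):
    """Least-squares rotation R and translation t mapping points A onto B (Kabsch/SVD)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cb - R @ ca


def icp(src, dst, iters=30, tol=1e-6):
    """Align scan `src` (N x 2) to scan `dst`; returns the accumulated (R, t)."""
    R_total, t_total = np.eye(2), np.zeros(2)
    tree = cKDTree(dst)
    prev_err = np.inf
    pts = src.copy()
    for _ in range(iters):
        dists, idx = tree.query(pts)              # nearest-neighbour correspondences
        R, t = best_rigid_transform(pts, dst[idx])
        pts = pts @ R.T + t                       # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dists.mean()
        if abs(prev_err - err) < tol:             # converged: error stopped improving
            break
        prev_err = err
    return R_total, t_total                       # relative pose between the two scans
```

Chaining the (R, t) estimated between each consecutive pair of scans gives the odometry trajectory.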
EKF SLAM
Implementation of Extended Kalman Filter (EKF) SLAM. It processes LiDAR point clouds and odometry to jointly estimate the robot pose and build a map of cylindrical landmarks (a minimal sketch of the predict and update steps follows).
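A minimal sketch of the two EKF SLAM steps for a 2D pose plus point landmarks, assuming range-bearing observations of already-initialized landmarks; the full implementation also covers cylinder extraction from the LiDAR scan, data association, and landmark initialization.

```python
import numpy as np


def wrap(a):
    """Wrap an angle to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi


def predict(mu, S, d, dtheta, R_motion):
    """EKF prediction with a simple drive-then-turn odometry model (d, dtheta).

    State mu = [x, y, theta, m1x, m1y, ...], covariance S.
    """
    th = mu[2]
    mu = mu.copy()
    mu[0] += d * np.cos(th)
    mu[1] += d * np.sin(th)
    mu[2] = wrap(th + dtheta)
    G = np.eye(len(mu))
    G[0, 2], G[1, 2] = -d * np.sin(th), d * np.cos(th)  # motion Jacobian (pose part)
    S = G @ S @ G.T
    S[:3, :3] += R_motion                               # motion noise on the pose only
    return mu, S


def update(mu, S, z, j, Q):
    """Range-bearing update z = (r, phi) for landmark j (state slots 3+2j, 4+2j)."""
    k = 3 + 2 * j
    dx, dy = mu[k] - mu[0], mu[k + 1] - mu[1]
    q = dx * dx + dy * dy
    zhat = np.array([np.sqrt(q), wrap(np.arctan2(dy, dx) - mu[2])])
    H = np.zeros((2, len(mu)))                          # measurement Jacobian
    H[0, [0, 1, k, k + 1]] = np.array([-dx, -dy, dx, dy]) / np.sqrt(q)
    H[1, [0, 1, 2, k, k + 1]] = [dy / q, -dx / q, -1.0, -dy / q, dx / q]
    innov = z - zhat
    innov[1] = wrap(innov[1])                           # wrap the bearing residual
    K = S @ H.T @ np.linalg.inv(H @ S @ H.T + Q)        # Kalman gain
    mu = mu + K @ innov
    mu[2] = wrap(mu[2])
    S = (np.eye(len(mu)) - K @ H) @ S
    return mu, S
```

Prediction inflates the pose covariance with motion noise; each re-observation of a mapped cylinder shrinks it again, correcting the drift accumulated from odometry.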