Robust Detection for Autonomous Elevator Boarding using a Mobile Manipulator

Korea University1

Ewha Womans University2

Our framework enables indoor robots to navigate multi-level structures through elevators. The system accurately perceives elevator states, pushes buttons, and executes boarding sequences using only image sensors. We address challenges such as class imbalance and label dependency in real-world datasets using specialized YOLOv7 training techniques.

Abstract

Indoor robots are becoming increasingly prevalent across a range of sectors, but navigating multi-level structures through elevators remains a largely unexplored challenge. Successful operation requires an accurate perception of elevator states. This paper presents a robust robotic system tailored to interact adeptly with elevators by discerning their status, actuating buttons, and boarding seamlessly. Given the inherent issues of class imbalance and limited data, we utilize the YOLOv7 model and adopt specific strategies to counteract the potential decline in object detection performance. Our method effectively confronts the class imbalance and label dependency observed in real-world datasets, offering a promising approach to improving indoor robotic navigation systems.

Framework

Autonomous elevator boarding framework

An overview of the autonomous elevator boarding process. The procedure is divided into two main categories: (a) button-pushing operations and (b) elevator boarding, which encompass tasks such as path planning, object detection, and interaction. Our GAEMI robot uses a comprehensive perception system to recognize elevator states and interact with them seamlessly.
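The two-phase procedure described above can be sketched as a simple state machine. This is an illustrative sketch only; the state names and the `next_state` transition function are our own simplification, not the paper's actual controller:

```python
from enum import Enum, auto

class BoardingState(Enum):
    """Phases of the autonomous elevator boarding procedure (simplified)."""
    GOTO_BUTTON_POSE = auto()   # (a) navigate to the call-button panel
    PUSH_BUTTON = auto()        # (a) actuate the button with the arm
    WAIT_FOR_DOOR = auto()      # (b) watch the door state via detection
    BOARD = auto()              # (b) drive into the elevator car
    DONE = auto()

def next_state(state, door_open, button_pushed):
    """Advance the boarding state machine by one perception update.

    `door_open` and `button_pushed` are assumed to come from the
    detection modules; a real controller would also handle failures
    and timeouts, which are omitted here.
    """
    if state is BoardingState.GOTO_BUTTON_POSE:
        return BoardingState.PUSH_BUTTON
    if state is BoardingState.PUSH_BUTTON:
        return BoardingState.WAIT_FOR_DOOR if button_pushed else state
    if state is BoardingState.WAIT_FOR_DOOR:
        return BoardingState.BOARD if door_open else state
    if state is BoardingState.BOARD:
        return BoardingState.DONE
    return state
```

The loop stays in `WAIT_FOR_DOOR` until the door detector reports an open door, which mirrors the dependency of task (b) on the perception system.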

Robot Platform

GAEMI Robot
GAEMI Robot: A sophisticated mobile manipulator equipped with a 5DoF robotic arm and a ZED camera.
GAEMI Robot
Button Pushing Demo: The robot navigates to the button position and pushes the button.

GAEMI is a sophisticated mobile manipulator equipped with a 5DoF robotic arm and a ZED camera. Its non-holonomic base features a 2D LiDAR sensor for obstacle detection and localization within a mapped environment. Additionally, GAEMI has a forward-facing RGB camera that serves as the primary sensor for our elevator perception system. This setup enables GAEMI to navigate complex indoor environments, detect elevator states, and interact with control panels effectively.
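Since the ZED camera provides depth alongside RGB, a detected button's pixel location can be back-projected to a 3-D target point for the 5DoF arm. A minimal pinhole-model sketch follows; the function name and intrinsic parameters are placeholders, not the actual GAEMI interface:

```python
import numpy as np

def pixel_to_point(u, v, depth, fx, fy, cx, cy):
    """Back-project an image pixel (u, v) with measured depth into a
    3-D point in the camera frame using the pinhole camera model.

    Intrinsics (fx, fy focal lengths; cx, cy principal point) would
    come from the camera calibration; the values in the test below
    are illustrative only.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])
```

A point at the principal point maps onto the optical axis, so only the depth component is nonzero; the arm planner would then transform this camera-frame point into the base frame before reaching.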

Perception System

Label Superset and Status Perception

Category | Parameters
Elevator Door | Opened, Moving, Closed
Current Robot Floor | B6, B5, ..., B1, 1, ..., 63
Current Elevator Floor (Outside/Inside) | B6, B5, ..., B1, 1, ..., 63
Current Elevator Direction (Outside/Inside) | Up, Down, None
Elevator Button (Outside/Inside) | Up, Down, B6, B5, ..., B1, 1, ..., 63
Label Superset: The first four categories are processed by the indicator detection module, while the last category (Elevator Button) is handled by the button detection module using instance segmentation.

The primary objective of our perception system is to ascertain the elevator's status, including door state, current floor, and location. We defined a comprehensive label superset covering all possible scenarios across diverse sites, enabling the robot to make decisions and navigate intricate environments effectively. This is essential for seamless navigation and interaction with elevator systems.
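For illustration, the label superset in the table above could be encoded as follows. The dictionary keys and structure are hypothetical; the actual training label files may be organized differently:

```python
# Floor labels run from basement level B6 up through floor 63,
# as listed in the label superset table.
FLOORS = [f"B{i}" for i in range(6, 0, -1)] + [str(i) for i in range(1, 64)]

LABEL_SUPERSET = {
    # Handled by the indicator detection module:
    "elevator_door": ["Opened", "Moving", "Closed"],
    "current_robot_floor": FLOORS,
    "current_elevator_floor": FLOORS,              # outside/inside variants
    "current_elevator_direction": ["Up", "Down", "None"],
    # Handled by the button detection module (instance segmentation):
    "elevator_button": ["Up", "Down"] + FLOORS,
}
```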

Addressing Class Imbalance & Small Object Detection

Class Imbalance & Small Object Detection
Our augmentation strategy addresses three key challenges in elevator perception through a systematic approach:
  1. Original Dataset: We start with the base dataset of elevator scenes.
  2. Patch Augmentation: By cropping high-resolution sections of original images, we enhance the detection of small indicators like floor numbers and buttons.
  3. Patch-Blur Augmentation: We selectively blur frequent objects (e.g., elevator doors) in the high-resolution images and remove their labels, effectively addressing class imbalance and label dependency issues.
This multi-step approach increases dataset diversity while maintaining detection accuracy for critical elevator components.
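The steps above can be sketched in NumPy. This is a simplified illustration: the function names are ours, the mean-fill stands in for a proper blur, and the real pipeline operates on YOLO-format labels rather than pixel boxes:

```python
import numpy as np

def patch_crop(image, boxes, top, left, size):
    """Step 2 (patch augmentation): crop a high-resolution square patch
    and keep the boxes fully contained in it, shifted to patch coordinates.
    Boxes are [x1, y1, x2, y2] in pixels; partially-clipped boxes are
    dropped here for simplicity."""
    patch = image[top:top + size, left:left + size].copy()
    kept = []
    for x1, y1, x2, y2 in boxes:
        if left <= x1 and x2 <= left + size and top <= y1 and y2 <= top + size:
            kept.append([x1 - left, y1 - top, x2 - left, y2 - top])
    return patch, kept

def blur_and_drop(image, boxes, labels, frequent={"elevator_door"}):
    """Step 3 (patch-blur augmentation): obscure regions belonging to
    over-represented classes and remove their labels, so the detector
    cannot rely on their co-occurrence with rare classes. The mean-fill
    used here is a crude stand-in for a blur kernel."""
    out = image.astype(float).copy()
    kept_boxes, kept_labels = [], []
    for (x1, y1, x2, y2), lab in zip(boxes, labels):
        if lab in frequent:
            region = out[y1:y2, x1:x2]
            out[y1:y2, x1:x2] = region.mean(axis=(0, 1))
        else:
            kept_boxes.append([x1, y1, x2, y2])
            kept_labels.append(lab)
    return out.astype(image.dtype), kept_boxes, kept_labels
```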

Datasets

Indicator Dataset
(a) Indicator dataset: Object detection dataset tailored to capture the basic status of an elevator.
(b) Button dataset: Instance segmentation dataset designed to identify points of interaction.

We developed two complementary datasets for our elevator interaction system. The (a) Indicator dataset is tailored to capture the basic status of an elevator, including door state, current floor indicators, and directional signals. This object detection dataset provides the fundamental situational awareness needed for decision-making.

The (b) Button dataset is an instance segmentation dataset designed to identify precise points of interaction between the robot and the elevator, facilitating successful task execution. This dataset includes pixel-level masks for elevator buttons, enabling the robot to accurately locate and interact with control panels. Together, these datasets provide the comprehensive perception capabilities required for autonomous elevator navigation.
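Assuming the two datasets follow the common YOLO text-label conventions (normalized bounding boxes for detection, normalized polygons for instance segmentation; this is an assumption, not stated by the paper), a single parser can handle both annotation styles:

```python
def parse_yolo_line(line):
    """Parse one YOLO-format annotation line.

    Detection lines carry 5 numbers: class cx cy w h.
    Segmentation lines carry a class id followed by a polygon
    x1 y1 x2 y2 ..., all values normalized to [0, 1].
    (Simplified: a 2-point polygon would be mistaken for a box.)
    """
    vals = line.split()
    cls, nums = int(vals[0]), [float(v) for v in vals[1:]]
    if len(nums) == 4:
        return {"type": "box", "class": cls, "box": nums}
    return {"type": "polygon", "class": cls,
            "points": list(zip(nums[0::2], nums[1::2]))}
```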

Results

COCO-mini Evaluation

Dataset | mAP@0.5 | mAP@0.95
COCO-mini (base) | 0.014 | 0.007
COCO-blur | 0.018 | 0.009
COCO-cutout | 0.012 | 0.006
COCO-mini Evaluation: Comparison of mAP scores on COCO-mini variations.

Indicator Dataset

Method | mAP@0.5 | Status Accuracy
YOLOv7 | 0.730 | 0.813
YOLOv7 + patch | 0.784 | 0.878
YOLOv7 + patch + blur | 0.779 | 0.879
Indicator Dataset Performance: Experimental results on the Indicator dataset.

Real-world Results

Task | Success Rate
GOTO BUTTON POSE | 10/10
BUTTON PUSHING | 9/10
ELEVATOR BOARDING | 3/10
Real-world Experiment Results: Success rates of three tasks executed by the robot over ten trials each.
Occupancy Map
Occupancy Map: This figure illustrates the constructed occupancy map of the 6th floor of Woojung Hall of Informatics at Korea University, which serves as the operational landscape for all our real-world robot experiments.

Our experimental results validate the effectiveness of our approach in addressing label dependency and small-object detection challenges. On COCO-mini, our blur augmentation outperforms both the baseline and the cutout method (mAP@0.5: 0.018 vs. 0.014 and 0.012). On the Indicator dataset, patch augmentation substantially improved performance over the baseline (mAP@0.5: 0.784, Status Accuracy: 0.878), and adding blur augmentation preserved that accuracy (Status Accuracy: 0.879) while further mitigating class imbalance.

In real-world robot operations conducted at Korea University, we evaluated three critical tasks over ten trials each: navigation to the button pose succeeded in 100% of trials, button pushing in 90%, and elevator boarding in 30%. The lower boarding success rate points to areas for future improvement, particularly in handling potential obstructions at the elevator door.

Real-World Demonstration

System Demonstration
Demonstration of our integrated robotic system: A comprehensive illustration of the robot successfully performing tasks within a real-world indoor environment.

Our final demonstration showcases the GAEMI robot navigating and interacting with elevators in the Woojung Hall of Informatics at Korea University. The system integrates all components (perception, planning, and control) to enable autonomous multi-floor navigation: the robot detects elevator status, navigates to the appropriate position, interacts with buttons, and boards and exits elevators. This real-world demonstration validates the practical applicability of our approach for indoor service robots operating in multi-level environments.

BibTeX


        @inproceedings{shin2023robust,
          title={Robust Detection for Autonomous Elevator Boarding using a Mobile Manipulator},
          author={Shin, Seungyoun and Lee, Joonhyung and Noh, Junhyug and Choi, Sungjoon},
          booktitle={Link Springer},
          year={2023}
        }