[seminar] Making Robots See and Manipulate

November 21, 2023

2023 · Robotics Task and Motion Planning · seminar

These are my notes from Professor Beomjoon Kim's (김범준) seminar, Making Robots See and Manipulate.

Continuous motion-level reasoning: feasibility checks are essential, and they are computationally expensive.
- Idea: learn to guide planning \(\rightarrow \red{\text{MCTS+RL}}\)
  - Tree search, plus a value function and a policy to guide the search.
  - The planner decides which object to move, how, and where, based on geometric reasoning.
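
The idea of guiding tree search with a learned policy and value function can be sketched as below. This is a minimal illustration, not the method presented in the seminar: the notes do not say which selection rule was used, so the AlphaZero-style PUCT rule, the `Node` class, the constant `c_puct`, and the action names `move_A` and `move_B` are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy probability of the action leading to this node
    visit_count: int = 0
    value_sum: float = 0.0       # sum of value estimates backed up through this node
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # Exploitation (mean value) plus a policy-weighted exploration bonus.
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + bonus


def select_action(parent: Node, c_puct: float = 1.5):
    # Pick the child action with the highest PUCT score.
    return max(parent.children, key=lambda a: puct_score(parent, parent.children[a], c_puct))


def backup(path, value: float) -> None:
    # Propagate a leaf value estimate (e.g. from a learned value function) up the path.
    for node in path:
        node.visit_count += 1
        node.value_sum += value


# Hypothetical usage: two candidate actions with policy priors 0.7 and 0.3.
root = Node(prior=1.0, visit_count=4)
root.children = {"move_A": Node(prior=0.7), "move_B": Node(prior=0.3)}
print(select_action(root))       # move_A: the higher prior wins while both are unvisited
for _ in range(3):
    backup([root, root.children["move_A"]], -1.0)  # e.g. repeated infeasible outcomes
print(select_action(root))       # move_B: the low value estimate redirects the search
```

The policy prior decides where the search looks first, and the backed-up values steer it away from branches that keep failing, which is what reduces the number of expensive feasibility checks.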

However, working with robots in the real world showed that they completely lack the basic abilities to perceive and manipulate.

Essential components for general-purpose robot research
1. \[\red{\text{Perceive and Manipulate Objects}}\]
2. Solve long-horizon sequential problems
3. Add semantics and common sense

Today’s topic is the first item.

The professor's usual way of building ideas: big problem \(\rightarrow\) split it into smaller problems.
1. Limited action repertoire
2. Representation and perception - How do I represent object states?
3. Big data for robotics - How do we efficiently generate one for robots?

However, much manipulation research is confined to the pick-and-place skill; it is heavily skewed toward prehensile manipulation.

Intuition: Not all objects are graspable.
Previous approaches: solved with physics modeling + planning.
- Limitations:
  - Estimating physical properties from RGB images is extremely difficult.
  - Modeling contact is still an active area of research, and existing models make simplifying assumptions.
  - Planning trajectories takes a significant amount of time.

1. Limited action repertoire: Non-Prehensile Tasks

Manipulation System

Pre and Post-Contact Policy Decomposition for Non-Prehensile Manipulation with Zero-Shot Sim-To-Real Transfer (IROS 2023 paper)

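Objects that are too large or too flat to grasp are pushed to adjust their pose.
This also applies when an object must be moved up onto a wall with a height step.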

Limitations:

Requires a lot of data, which is generated in Isaac Sim.
Exploration is extremely hard for non-prehensile manipulation.
Existing task definitions: the manipulated object is always in close proximity to the manipulator.
A contact-inducing reward can be added, but it must be designed carefully; otherwise it may lead to ineffective contact.
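
To make the last point concrete, here is a minimal sketch of a naive contact-inducing reward. The reward form, the function name `shaped_reward`, and the weight `contact_weight` are hypothetical and chosen only for illustration; they are not taken from the seminar or the paper.

```python
import math


def shaped_reward(ee_pos, obj_pos, goal_pos, contact_weight=0.1):
    # Task term: the closer the object is to its goal, the higher the reward.
    task_term = -math.dist(obj_pos, goal_pos)
    # Naive contact-inducing term: only rewards the end-effector for being near the object.
    contact_term = -contact_weight * math.dist(ee_pos, obj_pos)
    return task_term + contact_term


# Hypothetical setup: the object sits at the origin and must be pushed toward +x.
obj, goal = (0.0, 0.0), (1.0, 0.0)
behind = shaped_reward((-0.1, 0.0), obj, goal)    # contact side that can push toward the goal
in_front = shaped_reward((0.1, 0.0), obj, goal)   # contact side that blocks the push
print(behind == in_front)
```

Both end-effector positions receive the same reward, even though only one of them allows a useful push. A distance-only shaping term of this kind cannot distinguish effective from ineffective contact.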

Approach:

Divided into two stages, (1) a pre-contact phase and (2) a post-contact phase, with two distinct policies.
Pre-contact policy action space: which point on the object to make contact at (RL over the contact point).
Post-contact policy action space: target end-effector pose (time-varying impedance control).
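
The two action spaces can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: it covers translation only, uses diagonal per-axis gains, and the names `choose_contact_point`, `impedance_force`, and `score_fn` are hypothetical. The standard impedance law used here is F = K (x_target - x) - D x_dot, where "time-varying" means the policy may output different K and D at every step.

```python
from typing import Callable, List, Sequence


def choose_contact_point(candidates: Sequence[Sequence[float]],
                         score_fn: Callable[[Sequence[float]], float]):
    # Pre-contact phase: pick the candidate point on the object with the highest score.
    # In an RL setting, score_fn would be a learned critic or policy score (assumption).
    return max(candidates, key=score_fn)


def impedance_force(x: Sequence[float], x_dot: Sequence[float],
                    x_target: Sequence[float],
                    stiffness: Sequence[float],
                    damping: Sequence[float]) -> List[float]:
    # Post-contact phase: per-axis impedance law F = K (x_target - x) - D x_dot.
    return [k * (xt - xi) - d * vi
            for xi, vi, xt, k, d in zip(x, x_dot, x_target, stiffness, damping)]


# Hypothetical usage: the post-contact policy emits a target and gains at each step.
contact = choose_contact_point([(0.0, 0.0), (1.0, 0.0)], score_fn=lambda p: p[0])
print(contact)  # (1.0, 0.0)

steps = [
    # (x_target, stiffness, damping): stiff at first, more compliant later
    ([1.0], [100.0], [10.0]),
    ([1.0], [20.0], [5.0]),
]
x, x_dot = [0.0], [2.0]
for x_target, k, d in steps:
    print(impedance_force(x, x_dot, x_target, k, d))  # [80.0] then [10.0]
```

Splitting the decision this way keeps each action space small: the first policy only answers where to touch, and the second only answers where to move and how stiffly.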

Whole-Body Manipulation

How to learn simultaneous balancing and manipulation

Hierarchical policy decomposition + curriculum learning (previously, these were performed in series).

Lessons learned:

Modularity matters: it enables more efficient learning.
For the manipulator, defining the action spaces separately made exploration more efficient and debugging easier.

2. Representation and perception - How do I represent object states?

The robot's motion itself is too slow, and the performance of the hardware still lags far behind. The aim is to build hardware that is much faster and more dynamic, and designed for learning purposes.

Intuition: How do I represent object states?

Setup: Three cameras.
Estimating 3D spatial occupancy is important: use the encoder of a shape-completion algorithm.
Which signals influence the [high/low]-value representation?
- Contact presence and location are very important.

CORN: Contact-based Object Representation

Patch Transformer
- Estimated shape \(\rightarrow\) RRT + grasping (contact-based)

3. Big data for robotics - How do we efficiently generate one for robots?

Big data in the simulator:
- However, collision (contact) detection in the simulator is too slow for non-convex objects.
- GJK cannot leverage parallel computation.
- Shape encoder \(\rightarrow\) collision predictor; training it requires a lot of 3D assets.
Contribution: Local similarity.
- Contact behaves very similarly wherever the local geometry is similar.

Questions:

Q. What about suction-type grippers? A. They get contaminated frequently.
