$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

Paper links

Preliminaries

VLA (Vision-Language-Action model)

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Cross-embodiment training (data collected from different robots)

Open X-Embodiment: Robotic Learning Datasets and RT-X Models