ViNT: A Foundation Model for Visual Navigation
Notes
High-level block diagram
Summary
Goal
Navigation in previously seen environments.
Exploration and navigation in unseen environments.
Motivation
Robotics point of view
No localization, SLAM (mapping), or calibration is required.
RL point of view
Learning a goal-conditioned policy, where the goal changes from task to task.
Generalization to multiple robots.
Supports different downstream tasks.
Technology used
EfficientNet-B0
Tokenizes image inputs: each camera image is encoded into a feature vector (token).
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks