This repository contains code for the results in: Probably Approximately Correct Vision-Based Planning using Motion Primitives
- Quadrotor navigating an obstacle field using a depth map from an onboard RGB-D camera
- Quadruped (Minitaur, Ghost Robotics) traversing rough terrain using proprioceptive and exteroceptive (depth map from onboard RGB-D camera) feedback
The results are demonstrated in this video on a few test environments.
- PyBullet
- PyTorch
- Tensorboard
- Relevant parameters for each example are provided in a config JSON file located in the `configs` folder.
- Before training, ensure that the parameters in the config file that describe your hardware reflect your system specs.
- The environments for each training example are drawn from a distribution, so they are generated by varying the random seed. This lets us index the environments by their random seeds. In the config file, `num_trials` is the number of environments to train on and `start_seed` is the starting index of the environments. We train on environments with seeds from `start_seed` up to, but not including, `start_seed + num_trials`.
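The seed indexing described above can be sketched in a few lines of Python. This is only an illustration: the helper `environment_seeds` is hypothetical and not part of the repository, and the inline JSON stands in for a real config file, which contains more keys than the two shown here.

```python
import json


def environment_seeds(config):
    """Return the seeds that index the training environments."""
    start = config["start_seed"]
    count = config["num_trials"]
    # Half-open range: `count` consecutive seeds beginning at `start`.
    return list(range(start, start + count))


# Hypothetical config contents (assumption); real files live in the configs folder.
config = json.loads('{"start_seed": 100, "num_trials": 480}')
seeds = environment_seeds(config)
print(seeds[0], seeds[-1], len(seeds))  # 100 579 480
```

With these values the seeds match the quadrotor training range quoted in the note on training the prior (seeds 100-579).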
- Train a Prior using Evolutionary Strategies:
- Quadrotor:
python --config_file configs/config_quadrotor.json
- Minitaur:
python --config_file configs/config_minitaur.json
Note: Training the prior is computationally demanding. The quadrotor was trained on 480 environments (seeds 100-579) on an AWS g3.16xlarge instance with 60 CPU workers and 4 GPUs, while the Minitaur was trained on 10 environments (seeds 0-9) with 10 CPU workers and 1 GPU (Titan XP, 12 GB). For your convenience, we have shared the priors trained for the results in the paper, so this step can be skipped. Running with the relevant config file will automatically load the corresponding weights from the `Weights` folder.
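For readers unfamiliar with evolutionary strategies, the toy sketch below shows the general idea of the prior-training step: perturb the parameters with Gaussian noise, evaluate the average cost over seed-indexed environments, and move the mean against the estimated gradient. It is not the repository's implementation. The functions `make_env`, `cost`, and `train_prior_mean` are hypothetical, the quadratic cost stands in for a simulated rollout, and only the mean of the distribution is updated while the standard deviation stays fixed.

```python
import random


def make_env(seed, dim):
    """Toy 'environment': a target vector determined by the seed (assumption)."""
    rng = random.Random(seed)
    return [rng.uniform(0.9, 1.1) for _ in range(dim)]


def cost(theta, env):
    """Stand-in for a rollout cost: squared distance to the target."""
    return sum((t - g) ** 2 for t, g in zip(theta, env))


def mean_cost(theta, envs):
    """Average cost of one parameter vector over all training environments."""
    return sum(cost(theta, env) for env in envs) / len(envs)


def train_prior_mean(envs, dim, sigma=0.1, lr=0.05, population=20,
                     iterations=200, rng_seed=0):
    """Update the mean of a Gaussian over parameters with antithetic sampling."""
    rng = random.Random(rng_seed)
    mu = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        for _ in range(population):
            eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            plus = [m + sigma * e for m, e in zip(mu, eps)]
            minus = [m - sigma * e for m, e in zip(mu, eps)]
            # Antithetic finite-difference estimate of the cost gradient.
            diff = mean_cost(plus, envs) - mean_cost(minus, envs)
            for i in range(dim):
                grad[i] += diff * eps[i] / (2.0 * sigma * population)
        # Gradient descent step on the mean.
        mu = [m - lr * g for m, g in zip(mu, grad)]
    return mu


envs = [make_env(seed, dim=3) for seed in range(10)]  # seeds 0-9
mu = train_prior_mean(envs, dim=3)
print(mean_cost([0.0] * 3, envs), mean_cost(mu, envs))
```

In the actual examples each cost evaluation is a simulation rollout, which is why the training step needs many CPU workers.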
- Draw `m` policies i.i.d. from the prior trained above and compute the cost for each policy on `N` new environments:
- Quadrotor:
python --config_file configs/quadrotor.json --start_seed 580 --num_envs N --num_policies m
- Minitaur:
python --config_file configs/config_minitaur.json --start_seed 10 --num_envs N --num_policies m
Note: We have shared the computed cost matrices, with 4000 environments and 50 policies for the quadrotor and 2000 environments and 50 policies for the Minitaur; see `Weights/C_quadrotor.npy` and `Weights/C_minitaur.npy`.
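As a rough illustration of how such a cost matrix is used, the sketch below computes the empirical cost of a distribution over policies from a small hand-written matrix. The orientation (rows as environments, columns as policies) is an assumption, as is the helper name `expected_costs`; check the shape of the shared `.npy` files (readable with `numpy.load`) before relying on it.

```python
def expected_costs(cost_matrix, p):
    """Per-environment expected cost when a policy is sampled from p."""
    return [sum(c * w for c, w in zip(row, p)) for row in cost_matrix]


# Toy matrix: 3 environments (rows) x 2 policies (columns), assumed layout.
C = [[0.0, 1.0],
     [0.5, 0.5],
     [1.0, 0.0]]
p_uniform = [0.5, 0.5]  # uniform distribution over the two policies

per_env = expected_costs(C, p_uniform)
empirical_cost = sum(per_env) / len(per_env)
print(empirical_cost)  # 0.5
```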
- Perform PAC-Bayes optimization with the parametric REP in the paper on `N_pac` (<= `N`) environments and `m_pac` (<= `m`) policies using the costs computed above:
- Quadrotor:
python --config_file configs/config_quadrotor.json --num_envs N_pac --num_policies m_pac
- Minitaur:
python --config_file configs/config_minitaur.json --num_envs N_pac --num_policies m_pac
Additionally, we provide the matrices `Weights/C_quadrotor_emp_test.npy` and `Weights/C_minitaur_emp_test.npy`, computed on 5000 environments (seeds 5000-9999) and 50 policies each, to empirically estimate the true cost of the posterior.
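To give a sense of what the PAC-Bayes step trades off, the sketch below evaluates one commonly used form of a PAC-Bayes upper bound for a distribution over a finite set of policies: the empirical cost plus a regularizer that grows with the KL divergence from the prior and shrinks with the number of environments. This is an assumption for illustration. The exact bound and the REP formulation used in the paper may differ, costs are assumed to lie in [0, 1], and the function names are hypothetical.

```python
import math


def kl_divergence(p, p0):
    """KL(p || p0) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, p0) if pi > 0.0)


def pac_bayes_bound(empirical_cost, p, p0, num_envs, delta=0.01):
    """Upper bound on the expected cost that holds with probability >= 1 - delta."""
    regularizer = (kl_divergence(p, p0)
                   + math.log(2.0 * math.sqrt(num_envs) / delta)) / (2.0 * num_envs)
    return empirical_cost + math.sqrt(regularizer)


prior = [0.25] * 4                 # uniform prior over 4 sampled policies
posterior = [0.7, 0.1, 0.1, 0.1]   # hypothetical optimized posterior
print(pac_bayes_bound(0.2, posterior, prior, num_envs=4000))
```

Concentrating the posterior on low-cost policies lowers the empirical cost but raises the KL term, and the optimization looks for the distribution that minimizes the resulting bound. The held-out cost matrices mentioned above can then be used to check how conservative the bound is.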
To run the trained posterior on environments with seeds from `N0` to `N0+N`:
- Quadrotor:
python --config_file configs/config_quadrotor.json --start_seed N0 --num_envs N
- Minitaur:
python --config_file configs/config_minitaur.json --start_seed N0 --num_envs N