All You Can Creep

All you can creep.

Presentations

14.12.2020 Presentation KickOff

Final Presentation

Links

Examples

Installation

ML Agents

Gym Wrapper

Environment Exec

Work Distribution

All coding was done in pair programming, with at least two people working at the same time. For this, we used Visual Studio Code with Live Share, so that everyone could participate and write simultaneously. Because at least two people (sometimes three or even all four) were coding together, everyone has a basic understanding of A2C and PPO. As requested, we divided the tasks among us so that each topic has a designated expert. The division is shown in the following table:

| Topic | Name | Info |
| --- | --- | --- |
| A2C | | |
| Split- & Multihead NN | Sofie | |
| Activation | Balthasar | Sigmoid, Softplus, Softmax, TanH, ReLU |
| Min-Max-Clamping | Balthasar | |
| Loss & Entropy | Balthasar | |
| Advantages | Sofie | A2C, TD, 3-Step, Reinforce |
| Return | Sofie | |
| A2C vs A3C | Sofie | |
| ----- | ----- | ----- |
| PPO | | |
| Actor & Critic NN | Lukas | |
| Memory, Buffer, Batches | Lukas | |
| Hyperparameter | Denny | |
| Reward | Denny | |
| log_prob & prob_ratio | Denny | |
| weighted_probs & clipping | Lukas | |
| ----- | ----- | ----- |
| Slurm | Denny | Slurm Runner |
| Parameter Search | Sofie | Grid Search, Evolutionary Algorithm |
| Environments + Unity | Lukas | |
| MLflow | Balthasar | Measures, Artifacts |
| Save and Load Models | Balthasar | |
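
For readers unfamiliar with the PPO rows above, here is a minimal sketch of how `log_prob`, `prob_ratio`, `weighted_probs` and clipping usually fit together in the standard PPO clipped objective. It is not taken from this repository: the function name `clipped_surrogate`, the parameter `clip_eps` and its default of 0.2 are assumptions chosen for illustration, and the project's own code may differ.

```python
import math


def clipped_surrogate(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for one action (hypothetical helper).

    Returns the value to be maximized; a loss would be its negative.
    """
    # prob_ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities
    prob_ratio = math.exp(new_log_prob - old_log_prob)
    # Unclipped term: the ratio weighted by the advantage
    weighted_probs = prob_ratio * advantage
    # Clipped term: the ratio restricted to [1 - clip_eps, 1 + clip_eps]
    clipped_ratio = max(1.0 - clip_eps, min(prob_ratio, 1.0 + clip_eps))
    weighted_clipped_probs = clipped_ratio * advantage
    # PPO keeps the more pessimistic (smaller) of the two terms
    return min(weighted_probs, weighted_clipped_probs)
```

With a positive advantage, a ratio above `1 + clip_eps` brings no extra gain, which limits how far a single update can move the policy away from the one that collected the data.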