Adds an initial StableBaselines3 RL environment as an example #2667
Conversation
This PR adds support for the Reset API to the test fixture. As `TestFixture` is one of the main ways one can get access to the ECM in python when trying to write some scripts for Deep Reinforcement Learning I realized that without `Reset` supported in the `TestFixture` API, end users would have a very hard time using our python APIs (which are actually quite nice). For reference I'm hacking a demo template here: https://github.com/arjo129/gz_deep_rl_experiments/tree/ionic Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
This allows us to reset simulations without having to call into gz-transport making the code more readable from an external API. Depends on #2647 Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
A lot of things are not working. In particular, when `ResetAll` is called, `EnableVelocityChecks` does not trigger the physics system to populate the velocity components. This is a blocker for the current example. Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
So the above code should be able to train an RL model even on a potato. Currently I've got the algorithm to successfully balance a cart pole. There are some open issues, however, that will block this from being merged. Primarily, my main concern is that I've hacked together an API for running the GUI client.
* Adds support for Reset in test fixture This PR adds support for the Reset API to the test fixture. As `TestFixture` is one of the main ways one can get access to the ECM in python when trying to write some scripts for Deep Reinforcement Learning I realized that without `Reset` supported in the `TestFixture` API, end users would have a very hard time using our python APIs (which are actually quite nice). For reference I'm hacking a demo template here: #2667 --------- Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
python/src/gz/sim/Gui.cc
Outdated
void defineGuiClient(pybind11::module &_module)
{
  _module.def("run_gui", [](){
This is currently a blocker: I'm not sure what the best way forward is.
- Gazebo GUI does not like running in the same process as the server. It core dumps when I try to run it from within a `std::thread`.
- Forking works on Linux, but Windows has no concept of `fork()`. To start a new process, you must call an executable via `CreateProcess` instead of simply `fork()`ing.
- An alternative would be to create a Jupyter widget out of `gz-web`, but that would require us to work on the websocket server and add some Python integration with `gz-launch`.
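As a sketch of the cross-platform option: Python's standard-library `subprocess` module wraps `fork()`+`exec()` on POSIX and `CreateProcess` on Windows, so launching the GUI as a separate executable sidesteps the `fork()` problem entirely. The command below is a placeholder, not the actual Gazebo GUI invocation.

```python
import subprocess
import sys

def launch_gui(argv):
    # subprocess.Popen uses fork()+exec() on POSIX and CreateProcess on
    # Windows, hiding the platform difference discussed above.
    return subprocess.Popen(argv)

if __name__ == "__main__":
    # Placeholder command; a real version would launch the GUI executable.
    proc = launch_gui([sys.executable, "-c", "print('gui placeholder')"])
    proc.wait()
```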
> Forking works on Linux, but Windows has no concept of `fork()`. To start a new process, you must call an executable via `CreateProcess` instead of simply `fork()`ing.

We had similar problems on Gazebo Classic. In my experience, to abstract away the problem of creating processes on Windows, it is more convenient to use libraries like https://github.com/DaanDeMeyer/reproc or https://gitlab.com/eidheim/tiny-process-library .
By the way, I guess we will have a similar problem when implementing a gz-sim standalone executable for gazebosim/gz-tools#7 .
So last night I had a discussion with @azeey. I think the proposed method is to use the runGuiMain.cc executable on both platforms. I haven't tried it yet.
An update: Python's `os.fork()` works via Cygwin on Windows and may be a viable option for us.
> An update: Python's `os.fork()` works via Cygwin on Windows and may be a viable option for us.
I am not sure about that. The official Python binary for Windows from python.org, the Python installed by uv, and the Python installed by conda-forge are all MSVC-based. To use a Cygwin-powered Python, you need to install Python via Cygwin, and that also prevents you from installing most existing Windows wheels, which assume the MSVC calling convention and runtime libraries (see pypa/cibuildwheel#329).
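A stdlib alternative to relying on `os.fork()` is `multiprocessing` with the `"spawn"` start method: `"spawn"` starts a fresh interpreter process, is the only start method available on Windows, and avoids the crashes `fork()` can cause in a process that has already initialized GUI or threading state. The `gui_worker` function here is a hypothetical stand-in for the code that would run the GUI client.

```python
import multiprocessing as mp

def gui_worker(name):
    # Hypothetical placeholder for the code that would run the GUI client.
    print(f"gui process {name} running")

if __name__ == "__main__":
    # "spawn" starts a fresh interpreter rather than forking, so it behaves
    # the same on Linux, macOS, and MSVC-based Windows Python builds.
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=gui_worker, args=("demo",))
    p.start()
    p.join()
```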
Yeah, I'm going ahead and using the binary method for now.
Signed-off-by: Ubuntu <arjoc@intrinsic.ai>
…ample' into arjo/examples/rl_example
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
This PR provides an incredibly basic example of how to use Gazebo with StableBaselines3 for RL. The example is the classic cart pole, which is commonly used as a "getting started" task in reinforcement learning. The Python script trains a simple model to balance the cart pole, and we use the GUI to visualize it.
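For readers unfamiliar with the setup, StableBaselines3 drives an environment through a `reset()`/`step()` loop. The toy class below is a hedged, dependency-free stand-in for that contract, not the PR's actual code: in the real example, `reset()` would call the Gazebo `TestFixture`'s `Reset` and `step()` would advance the simulation and read the pole state from the ECM.

```python
import random

class ToyCartPoleEnv:
    """Minimal stand-in with the reset()/step() shape RL libraries expect."""

    def reset(self):
        # Return an initial observation (a fake 4-dim cart-pole state with a
        # small random pole angle at index 2).
        self.state = [0.0, 0.0, random.uniform(-0.05, 0.05), 0.0]
        return self.state

    def step(self, action):
        # Apply an action (0 = push left, 1 = push right) and return
        # (observation, reward, done, info), mirroring the classic API.
        self.state[2] += 0.01 if action == 1 else -0.01
        done = abs(self.state[2]) > 0.2
        reward = 1.0 if not done else 0.0
        return self.state, reward, done, {}

if __name__ == "__main__":
    # A random policy, standing in for the trained model's predictions.
    env = ToyCartPoleEnv()
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(random.choice([0, 1]))
        total += reward
```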