Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds an initial StableBaselines3 RL environment as an example #2667

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

arjo129
Copy link
Contributor

@arjo129 arjo129 commented Nov 7, 2024

This PR provides an incredibly basic example of how to use gazebo with StableBaselines3 for RL. This example is that of the classic cartpole which is commonly used as a "getting started" task in reinforcement learning. The python script trains a simple model using python to balance a cart pole. We use the gui to visualize it.

This PR adds support for the Reset API to the test fixture. As `TestFixture`
is one of the main ways one can get access to the ECM in python
when trying to write some scripts for Deep Reinforcement Learning I
realized that without `Reset` supported in the `TestFixture` API, end
users would have a very hard time using our python APIs (which are
actually quite nice). For reference I'm hacking a demo template here:

https://github.com/arjo129/gz_deep_rl_experiments/tree/ionic

Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
This allows us to reset simulations without having to call into
gz-transport making the code more readable from an external API.
Depends on #2647

Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
A lot of things are not working. Particularly when `ResetAll` is called,
the EnableVelocityChecks does not trigger the phyics system to populate
the velocity components. This is a blocker for the current example.

Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
@arjo129
Copy link
Contributor Author

arjo129 commented Nov 12, 2024

So the above code should be able to train a RL model even on a potato. Currently I've got the algorithm to successfully balance a cart pole. There are some open issues however that will block this from being merged. Primarily, my main concern is that I've hacked together an API for running the gui client.

RL_with_gazebo_simple_example

arjo129 added a commit that referenced this pull request Dec 20, 2024
* Adds support for Reset in test fixture

This PR adds support for the Reset API to the test fixture. As `TestFixture`
is one of the main ways one can get access to the ECM in python
when trying to write some scripts for Deep Reinforcement Learning I
realized that without `Reset` supported in the `TestFixture` API, end
users would have a very hard time using our python APIs (which are
actually quite nice). For reference I'm hacking a demo template here:

#2667
---------

Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
void defineGuiClient(pybind11::module &_module)
{

_module.def("run_gui", [](){
Copy link
Contributor Author

@arjo129 arjo129 Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently a blocker: I'm not sure what the best way forward is.

  • Gazebo GUI does not like running in the same process as the server. It core dumps when I try to run it from within a std::thread.
  • Forking works on linux, but windows has no concept of fork(). It seems to start a new process, you must call an executable via CreateProcess instead of simply fork()ing.
  • An alternative would be to create a jupyter widget out of gz-web but that will need us to work on the websocket server and have some python integration with gz-launch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forking works on linux, but windows has no concept of fork(). It seems to start a new process, you must call an executable via CreateProcess instead of simply fork()ing.

We had similar problems on Gazebo Classic, in my experience to abstract away the problem of creating processes on Windows it is more convenient with libraries like https://github.com/DaanDeMeyer/reproc or https://gitlab.com/eidheim/tiny-process-library .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I guess we will have a similar problem when implementing gz-sim standalone executable for gazebosim/gz-tools#7 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So last night I had a discussion with @azeey. I think the proposed method is to use the runGuiMain.cc executable on both platforms. I haven't tried it yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An update is python's os.fork() works via Cygwin on windows and may be a viable option for us.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An update is python's os.fork() works via Cygwin on windows and may be a viable option for us.

I am not sure about that. The official Python binary on Windows from python.org, the python installed by uv and the python installed by conda-forge all use msvc-based python. To use the cygwin powered python, you need to install python via cygwin, and that also prevents you to install most existing Windows wheels, that assume the use msvc-calling convention or runtime libraries (see pypa/cibuildwheel#329).

Copy link
Contributor Author

@arjo129 arjo129 Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm going ahead and using the binary method for now.

Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Ubuntu <arjoc@intrinsic.ai>
Signed-off-by: Ubuntu <arjoc@intrinsic.ai>
Signed-off-by: Ubuntu <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Base automatically changed from arjo/feat/server_reset_public_api to main February 10, 2025 06:44
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
@arjo129 arjo129 marked this pull request as ready for review March 5, 2025 03:39
@arjo129 arjo129 requested a review from mjcarroll as a code owner March 5, 2025 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

2 participants