Recurrent crashes in goal_planner: segmentation fault linked to freespace_planning_algorithms::AstarSearch deallocation #5154
Comments
@NorahXiong |
I tried many times, but the crash never happened, and I found no clue in the related code. Is there any special step not mentioned in the reproduction steps? |
@NorahXiong
goal_planner_node_die.webm |
@NorahXiong issue_5154-2023-10-02_22.37.17_mini.mp4 |
This pull request has been automatically marked as stale because it has not had recent activity. |
@NorahXiong |
@VRichardJP |
It looks like a concurrency issue, as the program seems to crash at different places:
Did you try to run the program with valgrind? e.g. with |
@VRichardJP |
I am not sure I totally understand how the modules are running, but the Is my understanding correct?
Then, what happens to the callback created by But here, the callback is running in a different queue, so you may have a thread running inside For instance, what happens if you put a sleep right before the I guess it will crash right away. |
@VRichardJP
Yes, your understanding is correct. The manager deletes modules or moves them with std::move, depending on the situation. If FreespacePullOut is running in a separate thread at that time, its data could be overwritten, which would cause a crash. So I think possible solutions include blocking the manager's clear while FreespacePullOut's callback is running, or running FreespacePullOut as a separate instance (building a server).
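A minimal sketch of the first option (blocking the clear while the callback runs). ToyPlanner, ToyModule, and runLockedClearDemo are hypothetical names made up for illustration and are not the Autoware classes; the sketch only assumes that the planner owns a std::unordered_map and that the manager and the callback run on different threads.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for the A* planner; not the Autoware class.
struct ToyPlanner {
  std::unordered_map<int, int> nodes;

  void plan() {
    for (int i = 0; i < 1000; ++i) nodes[i] = i;
    assert(nodes.size() == 1000);
    // Pretend planning takes a while, so a concurrent clear() has to wait.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    nodes.clear();
  }
};

// Hypothetical module: a single mutex serializes planning and clearing.
class ToyModule {
public:
  // Called from the planning thread (the "timer callback").
  bool planOnce() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!planner_) return false;  // already cleared by the manager
    planner_->plan();
    return true;
  }

  // Called from the manager thread; waits while planOnce() holds the mutex.
  void clear() {
    std::lock_guard<std::mutex> lock(mutex_);
    planner_.reset();
  }

private:
  std::mutex mutex_;
  std::unique_ptr<ToyPlanner> planner_ = std::make_unique<ToyPlanner>();
};

// Runs one planning call concurrently with a clear.
// Returns true if planning is refused once the module has been cleared.
bool runLockedClearDemo() {
  ToyModule module;
  std::thread worker([&module] { module.planOnce(); });
  module.clear();  // runs either before or after plan(), never during it
  worker.join();
  return !module.planOnce();
}
```

With this layout the planner can never be destroyed in the middle of plan(); the cost is that clear() may block for as long as one planning call takes.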
I would like to confirm this as well. Since it dies during the planFreespacePath process, do you mean putting the sleep before that process? Is the intention to create a time delay so that clearing is more likely to occur during the planFreespacePath process? Specifically, should I reproduce it by doing the following?
|
@kosuke55
    if (isStuck() && is_new_costmap) {
      std::this_thread::sleep_for(std::chrono::seconds(10));
      planFreespacePath();
    }
In particular, if you reset the goal in that 10 s window, the freespace object (or the things it refers to) is likely to be destroyed or moved, and I guess you will get some sort of segmentation fault. |
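The effect of that delay can be illustrated safely, without triggering undefined behavior, by letting the delayed callback hold a std::weak_ptr and check whether the object is still alive after the sleep. This is only a sketch: ToyFreespacePlanner, delayedPlan, and plannerSurvivesReset are hypothetical names, and the real module does not necessarily hold its planner in a std::shared_ptr.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for the freespace planner object.
struct ToyFreespacePlanner {
  int plan() const { return 42; }
};

// Simulates the delayed callback: sleep first, then try to use the planner.
// Returns true if the planner was still alive after the delay.
bool delayedPlan(std::weak_ptr<ToyFreespacePlanner> weak,
                 std::chrono::milliseconds delay) {
  std::this_thread::sleep_for(delay);
  if (auto planner = weak.lock()) {
    planner->plan();
    return true;
  }
  return false;  // with a raw pointer this would be a use-after-free
}

// Destroys the planner while the callback is still sleeping and reports
// whether the callback found it alive afterwards.
bool plannerSurvivesReset() {
  auto planner = std::make_shared<ToyFreespacePlanner>();
  std::weak_ptr<ToyFreespacePlanner> weak = planner;
  bool alive = true;
  std::thread callback([&alive, weak] {
    alive = delayedPlan(weak, std::chrono::milliseconds(200));
  });
  planner.reset();  // the "manager" destroys the planner inside the sleep window
  callback.join();
  return alive;
}
```

In practice the reset lands inside the 200 ms window, so the callback finds the planner gone. With a plain reference or raw pointer instead of the weak_ptr check, the same timing would touch freed memory, which matches the kind of crash described above.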
Have you found the cause of the segmentation fault? I'm sorry, I will have to try it again later if you still need it. |
@NorahXiong |
@kosuke55 @kyoichi-sugahara |
@NorahXiong |
@NorahXiong
And the goal is needed to be put in the |
As @VRichardJP indicated, adding a sleep easily produces a crash.
freespace_pull_over-2024-02-05_23.40.27.mp4 |
#6322 may fix the issue; we will test more. |
@kosuke55
Information that may help you confirm the cause:
Env Info:
test_of_issue_5154_new.mp4 |
Checklist
Description
While executing the goal_planner, the program crashes due to a segmentation fault. Based on the stack trace, the issue seems to arise when an std::unordered_map, holding values of type freespace_planning_algorithms::AstarNode, is being deallocated.
goal_planner_issue_5154.webm
Expected behavior
The intended behavior is for the nodes to remain alive, and for freespace_planning_algorithms::AstarNode to successfully generate a path to the goal and reach it without issues.
Actual behavior
The crash is not reproducible every time a path is generated with freespace_planning_algorithms, but after several repetitions the node eventually crashes.
Here is the stack trace:
Steps to reproduce
Please use attached lanelet map
virtual_G_dev_road_shoulder.zip
a. Set ego vehicle initial pose.
b. Set various patterns of goal poses.
Possible causes
The mutex in the goal_planner, which runs multi-threaded, may not be operating correctly.
Additional context
This issue results in an occasional crash of the node, affecting the reliability of the path planning process, and thereby requires prompt attention and resolution.