Commit 71f3381

Fix docker auto restart issue (#21426)
#### Why I did it

If a critical process crashes or is killed, the bmp docker container is not auto-restarted.

##### Work item tracking
- Microsoft ADO **(number only)**: 30807821

#### How I did it

`/usr/bin/supervisor-proc-exit-listener` is in charge of monitoring critical processes and publishing events, so it should be auto-restarted in any case. Otherwise, problems can occur if `supervisor-proc-exit-listener` itself crashes. In test cases such as `docker exec bmp kill -SIGKILL -1`, critical-process handling can also fail because of a race condition: the outcome depends on whether `supervisor-proc-exit-listener` is the last process to be killed. This change therefore switches the listener's `autorestart` setting from `false` to `unexpected`.

When SIGKILL is sent to all processes in a container, the order in which the processes actually terminate is not fixed. It depends on scheduling and resource availability within the container:

- Scheduling: processes in a container are scheduled by the operating system or container runtime. Which process runs first depends on process priorities, resource availability, and the scheduling algorithm, and this affects the order of termination.
- Resource availability: containers share CPU, memory, and disk I/O. When SIGKILL is sent to all processes, those resources may be constrained, and resource contention can change the order in which processes terminate. If resources are heavily utilized, some processes might be terminated ahead of others.

As a result, if `supervisor-proc-exit-listener` is killed before the critical process, the container auto-restart is not triggered as expected.

#### How to verify it

![image](https://github.com/user-attachments/assets/1ca1c2ed-7718-4132-8195-34c9fee380fe)

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305
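The race described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: the names `exit_is_reported`, `"listener"`, and `"critical"` are hypothetical, supervisord is assumed to survive the kill, and a listener configured with `autorestart=unexpected` is assumed to be back in time to receive the exit event. It does not reproduce the real SONiC listener or Supervisor internals.

```python
from itertools import permutations


def exit_is_reported(kill_order, listener_autorestart):
    """Toy model: is the critical process exit seen by a live listener?

    kill_order: sequence containing "listener" and "critical" (hypothetical names).
    listener_autorestart: "false" or "unexpected", as in supervisord.conf.
    """
    listener_alive = True
    for victim in kill_order:
        if victim == "listener":
            # Assumption: with autorestart=unexpected the listener is respawned
            # before the critical process exit has to be handled.
            listener_alive = (listener_autorestart == "unexpected")
        elif victim == "critical":
            # Only a live listener can report the exit and trigger the restart.
            return listener_alive
    return False


# Enumerate both kill orders under both policies.
for order in permutations(("listener", "critical")):
    for policy in ("false", "unexpected"):
        print(order, policy, exit_is_reported(order, policy))
```

In this model the only failing combination is the listener dying first while configured with `autorestart=false`, which matches the failure mode the commit message describes.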
1 parent 3ea3c23 commit 71f3381

4 files changed: +4 -4 lines changed

dockers/docker-sonic-bmp/supervisord.conf

+1 -1
@@ -16,7 +16,7 @@ buffer_size=1024
 command=/usr/bin/supervisor-proc-exit-listener --container-name bmp
 events=PROCESS_STATE_EXITED,PROCESS_STATE_RUNNING
 autostart=true
-autorestart=false
+autorestart=unexpected
 buffer_size=1024

 [program:rsyslogd]
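The one-line change above moves the listener from `autorestart=false` to `autorestart=unexpected`. The following is a simplified model of how the three documented `autorestart` values differ, not Supervisor's actual code. The function name `should_restart` is hypothetical, and treating a signal-terminated process (no exit code) as an unexpected exit is an assumption of this sketch.

```python
from typing import Optional, Sequence


def should_restart(
    autorestart: str,
    exit_code: Optional[int],
    exitcodes: Sequence[int] = (0,),
) -> bool:
    """Simplified model of the supervisord autorestart policy.

    exit_code is None when the process was terminated by a signal such as
    SIGKILL; this sketch assumes such an exit counts as unexpected.
    exitcodes mirrors the `exitcodes` setting (assumed default: 0).
    """
    if autorestart == "true":
        return True   # always restart
    if autorestart == "false":
        return False  # never restart
    if autorestart == "unexpected":
        # Restart only when the exit is not one of the expected exit codes.
        return exit_code is None or exit_code not in exitcodes
    raise ValueError(f"unknown autorestart value: {autorestart!r}")


# A listener killed by SIGKILL under the old and the new setting.
print(should_restart("false", None))
print(should_restart("unexpected", None))
# A clean exit with code 0 is not restarted under "unexpected".
print(should_restart("unexpected", 0))
```

Choosing `unexpected` rather than `true` keeps a deliberate clean exit from being restarted while still covering crashes and kills.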

dockers/docker-sonic-gnmi/supervisord.conf

+1 -1
@@ -16,7 +16,7 @@ buffer_size=1024
 command=/usr/bin/supervisor-proc-exit-listener --container-name gnmi
 events=PROCESS_STATE_EXITED,PROCESS_STATE_RUNNING
 autostart=true
-autorestart=false
+autorestart=unexpected
 buffer_size=1024

 [program:rsyslogd]

dockers/docker-sonic-restapi/supervisord.conf

+1 -1
@@ -16,7 +16,7 @@ buffer_size=1024
 command=/usr/bin/supervisor-proc-exit-listener --container-name restapi
 events=PROCESS_STATE_EXITED,PROCESS_STATE_RUNNING
 autostart=true
-autorestart=false
+autorestart=unexpected
 buffer_size=1024

 [program:rsyslogd]

dockers/docker-sonic-telemetry/supervisord.conf

+1 -1
@@ -16,7 +16,7 @@ buffer_size=1024
 command=/usr/bin/supervisor-proc-exit-listener --container-name telemetry
 events=PROCESS_STATE_EXITED,PROCESS_STATE_RUNNING
 autostart=true
-autorestart=false
+autorestart=unexpected
 buffer_size=1024

 [program:rsyslogd]
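Each of the four files above configures an event listener subscribed to `PROCESS_STATE_EXITED` and `PROCESS_STATE_RUNNING`. For context, here is a minimal sketch of a listener speaking Supervisor's event listener protocol (READY, header line, payload, RESULT). It is not the real `supervisor-proc-exit-listener`; the helpers `parse_tokens`, `run_listener`, and `demo`, and the process name `critical_proc`, are hypothetical.

```python
import io
import sys


def parse_tokens(line):
    """Parse space-separated 'key:value' tokens into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def run_listener(on_exit, stdin=sys.stdin, stdout=sys.stdout):
    """Handle events until stdin reaches EOF.

    on_exit(process_name, expected) is called for each PROCESS_STATE_EXITED event.
    """
    while True:
        # Tell supervisord that this listener can accept an event.
        stdout.write("READY\n")
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:
            return
        header = parse_tokens(header_line)
        # The header's 'len' field gives the payload size.
        payload = stdin.read(int(header["len"]))
        if header["eventname"] == "PROCESS_STATE_EXITED":
            fields = parse_tokens(payload)
            on_exit(fields["processname"], fields["expected"] == "1")
        # Acknowledge the event: "RESULT <length>\n" followed by the body "OK".
        stdout.write("RESULT 2\nOK")
        stdout.flush()


def demo():
    """Feed one simulated unexpected exit through the listener."""
    payload = ("processname:critical_proc groupname:critical_proc "
               "from_state:RUNNING expected:0 pid:42")
    header = ("ver:3.0 server:supervisor serial:1 pool:listener poolserial:1 "
              f"eventname:PROCESS_STATE_EXITED len:{len(payload)}\n")
    seen = []
    out = io.StringIO()
    run_listener(lambda name, expected: seen.append((name, expected)),
                 stdin=io.StringIO(header + payload), stdout=out)
    return seen, out.getvalue()


print(demo())
```

If a listener like this is killed and not restarted, nothing remains to react to exit events, which is the gap the `autorestart=unexpected` change is meant to close.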
