[SONIC-SWSS][PORT] inconsistent behavior between combine and separate port configuration deployment #21959
Comments
@liuh-80 Can you please help investigate this issue? @qiluo-msft for visibility.
@yuazhe, can you share the following information so that I can reproduce this issue?
I tried the following steps multiple times with the latest 202411 image on a KVM testbed, but could not reproduce the issue:
Also, the issue does not seem to be related to this change, because I cannot reproduce it after reverting the change either: sonic-net/sonic-swss#3304
@qiluo-msft your reproduction steps are correct. When the issue happens, it looks like this:
I was using a 2411 image with SKU ACS-MSN4600C, but I think this problem is general and not platform specific.
As I understand it, the root cause of this issue is not that the CONFIG_DB write operations are "merged"; it is that there is no retry when autoneg fails.
Issue can be easily reproduced on Mellanox 4600 hardware:
2025-03-13.06:25:49.281583|PORT_TABLE:Ethernet4|SET|alias:etp2|description:ARISTA01T2:Ethernet2|fec:rs|index:2|lanes:8,9,10,11|pfc_asym:off|speed:1000|subport:0|tpid:0x8100|interface_type:CR|adv_speeds:all|adv_interface_types:CR,CR2,CR4|mtu:9100|admin_status:up
$ show interface status
After reverting PR sonic-net/sonic-swss#3304, the issue still happens:
2025-03-13.06:47:40.946313|PORT_TABLE:Ethernet8|SET|alias:etp3|description:etp3|fec:rs|index:3|lanes:16,17,18,19|pfc_asym:off|speed:1000|subport:0|tpid:0x8100|interface_type:CR|adv_speeds:1000|adv_interface_types:CR4|mtu:9100|admin_status:up
admin@bjw2-can-4600c-3:~$ show interface status
The flow below produces an invalid auto-negotiation configuration; in this case the port sometimes never comes up again.
swss.rec shows that the enable-autoneg command can be deployed in two possible ways.
This is because https://github.com/sonic-net/sonic-swss/blob/4eb74f0082f0f8c4537fe58621ac902c870d217c/cfgmgr/portmgr.cpp#L206
can run either before or together with https://github.com/sonic-net/sonic-swss/blob/4eb74f0082f0f8c4537fe58621ac902c870d217c/cfgmgr/portmgr.cpp#L230
and the ordering cannot be controlled.
and
The first one always keeps the port up, but the second one always leaves the port down, because autoneg fails and processing simply continues without any fallback mechanism:
https://github.com/sonic-net/sonic-swss/blob/4eb74f0082f0f8c4537fe58621ac902c870d217c/orchagent/portsorch.cpp#L4085-L4100
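To illustrate the difference between "continue on failure" and "keep the task for retry", here is a toy Python model. It does not reproduce the real orchagent logic: the task names, the `run_tasks` and `make_flaky_apply` helpers, and the assumption that a second autoneg attempt would succeed are all hypothetical and only serve to show the two behaviours side by side.

```python
def run_tasks(tasks, apply, retry, max_rounds=3):
    """Apply tasks in order and return (applied, failed).

    With retry=False a failed task is dropped immediately (processing
    just continues); with retry=True it is kept and attempted again in
    the next round, for at most max_rounds rounds.
    """
    applied, dropped = [], []
    pending = list(tasks)
    for _ in range(max_rounds):
        if not pending:
            break
        next_round = []
        for task in pending:
            if apply(task):
                applied.append(task)
            elif retry:
                next_round.append(task)  # keep for a later attempt
            else:
                dropped.append(task)     # no fallback: silently skipped
        pending = next_round
    return applied, dropped + pending


def make_flaky_apply(fail_first_n):
    """Hypothetical backend: 'autoneg' fails its first N attempts."""
    attempts = {}

    def apply(task):
        attempts[task] = attempts.get(task, 0) + 1
        if task == "autoneg":
            return attempts[task] > fail_first_n
        return True

    return apply


tasks = ["speed", "autoneg", "admin_status"]
no_retry = run_tasks(tasks, make_flaky_apply(1), retry=False)
with_retry = run_tasks(tasks, make_flaky_apply(1), retry=True)
print("without retry:", no_retry)
print("with retry:   ", with_retry)
```

In this model the no-retry run ends with `autoneg` never applied even though the other settings went through, which matches the symptom described here. Whether a retry would actually succeed on real hardware is a separate question that this sketch does not answer.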