[Mellanox] support thermal sensor which has discrete index #237

Closed
wants to merge 9 commits into from

Conversation

Junchao-Mellanox
Owner

@Junchao-Mellanox Junchao-Mellanox commented Jan 8, 2025

Why I did it

Most thermal sensors have contiguous indexes, for example: module1_temp_input, module2_temp_input. However, some thermal sensors have discrete (non-contiguous) indexes. For example, some platforms expose only sodimm2_temp_input and have no sodimm1_temp_input.

This PR adds support for thermal sensors that have discrete indexes.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Allow sensors with discrete indexes and create a thermal object for each sensor that is present.
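
The idea can be sketched as follows. This is only an illustration, not the code in this PR: the function names, the `<prefix><index>_temp_input` naming pattern, and the directory argument are assumptions made for the example. Instead of iterating from 1 to a sensor count, it lists the sensor files that actually exist and derives the indexes from their names.

```python
import os
import re


def discover_sensor_indexes(thermal_dir, prefix):
    """Return the sorted indexes N for which '<prefix><N>_temp_input' exists.

    Hypothetical helper: the directory and the file naming pattern are
    assumptions for illustration only.
    """
    pattern = re.compile(r'^' + re.escape(prefix) + r'(\d+)_temp_input$')
    indexes = []
    for name in os.listdir(thermal_dir):
        match = pattern.match(name)
        if match:
            indexes.append(int(match.group(1)))
    return sorted(indexes)


def create_thermal_names(thermal_dir, prefix):
    """Build one (hypothetical) thermal name per sensor file that is present."""
    return ['{}{}'.format(prefix, index)
            for index in discover_sensor_indexes(thermal_dir, prefix)]
```

With a directory that contains only sodimm2_temp_input, `create_thermal_names(thermal_dir, 'sodimm')` yields a single entry for index 2 and no entry for the missing index 1.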

How to verify it

manual test
unit test

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…n high CPU utilization scenario (sonic-net#21316)

Why I did it
Fix sonic-net#21314
Lengthen the timeout for requests between snmpd and the SNMP AgentX.

In SONiC SNMP AgentX, the MIB updaters and the AgentX client share the same asyncio coroutine event loop.
While the MIB updaters are updating the SNMP values, the AgentX client cannot respond to snmpd requests.

The default for snmpd requests is 1s (timeout) * 5 (retries).

When CPU utilization is high, the MIB updaters are slow and a 1s timeout is not enough, even with 5 retries.
Hence the values are updated to 5s (timeout) * 4 (retries), a time window of 20s, so that SNMP requests can still be handled even at 100% CPU utilization.
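
The arithmetic can be checked with a short sketch. It follows the calculation used in this commit message (timeout multiplied by retries); the helper name `agentx_window` is hypothetical and not part of snmpd or SONiC.

```python
def agentx_window(timeout_s, retries):
    """Time window as computed in the commit message: timeout * retries."""
    return timeout_s * retries


# Default values quoted above: 1s timeout, 5 retries.
old_window = agentx_window(1, 5)
# Updated values: 5s timeout, 4 retries.
new_window = agentx_window(5, 4)

print("old window: {}s, new window: {}s".format(old_window, new_window))
```

The window grows from 5s to 20s even though the retry count drops by one.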

Work item tracking
Microsoft ADO 30112399:

How I did it
Update the default values (see https://linux.die.net/man/5/snmpd.conf):

agentXTimeout 1 (default value) -> 5
agentXRetries 5 (default value) -> 4

How to verify it
Tested on a Cisco chassis with test_snmp_cpu.py, which drives CPU utilization to 100% and checks whether SNMP requests still work.
@Junchao-Mellanox Junchao-Mellanox force-pushed the master-thermal-fix branch 3 times, most recently from 64d7b3f to fd20d94 Compare January 8, 2025 06:33
opcoder0 and others added 3 commits January 8, 2025 17:38

Why I did it
Adding pytest-stress to sonic-mgmt image will help in running stress tests.

How I did it
pip install pytest-stress plugin

How to verify it
Manually tested the image with DUT

Tested branch (Please provide the tested image version)
Not applicable.

Description for the changelog
Adding pytest-stress to sonic-mgmt image will help in running stress tests.

Link to config_db schema for YANG module changes
Not applicable

Exclude pie ports from buffer and qos config

Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com>

Disable the vstest job because the sonictest agent pool is failing, in order to unblock PRs.
arlakshm and others added 4 commits January 9, 2025 09:34

…onic-net#21299)

In SAI 11.x, the SAI integrity counters are not completely supported. So, to detect packet drops caused by packet corruption or credit watchdog timeouts, we enable these interrupts and set the verbose level to error. This generates syslogs that can be used to detect such drops.

* add sai_postinit_cmd.soc with interrupt IDs for alerting

* update config.bcm with the path to sai_postinit_cmd.soc

* update config_bcm for j2 card
---------

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <arlakshm@microsoft.com>

…sonic-net#21245)

Issue to be fixed: Currently, the operational status of the mgmt interface is missing or incorrect on multi-asic devices.
Root cause: The operational status of the mgmt interface is updated by portsyncd in the swss docker. On a multi-asic platform, the swss service is started only in an asic namespace context. Because portsyncd runs in a specific network namespace context, it is not aware of the mgmt interface in the host namespace of the multi-asic platform. Therefore portsyncd has no way to find the operational status of the mgmt interface and update it in STATE_DB MGMT_PORT_TABLE.
Use case: The SNMP interface MIB periodically reads MGMT_PORT_TABLE in STATE_DB to retrieve the oper status of the mgmt interface. On a multi-asic platform, this currently returns the oper status of the 'eth0' interface, a virtual interface inside the asic namespace that is created as part of the database docker and is not the actual management interface.

---------

Signed-off-by: Suvarna Meenakshi <sumeenak@microsoft.com>

…utomatically (sonic-net#21369)

#### Why I did it
src/sonic-host-services
```
* 744c673 - (HEAD -> master, origin/master, origin/HEAD) Fix no info log in syslog for caclmgrd (#200) (10 minutes ago) [Zhaohui Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog

…omatically (sonic-net#21359)

#### Why I did it
src/sonic-swss-common
```
* 12c428e - (HEAD -> master, origin/master, origin/HEAD) [schema] add SRv6 config db tables (sonic-net#962) (21 hours ago) [Yakiv Huryk]
```
#### How I did it
#### How to verify it
#### Description for the changelog
@Junchao-Mellanox
Owner Author

ci 308 passed

9 participants