Merge pull request #22 from SeisSol/davschneller/half
Cleanup, Green Sparsity, RISC-V V, and AVX10 Support (Version 0.3.0)
davschneller authored Jan 27, 2025
2 parents a21845e + abf5407 commit ac78a76
Showing 66 changed files with 3,840 additions and 1,761 deletions.
168 changes: 31 additions & 137 deletions .github/workflows/codegen.yml
@@ -38,140 +38,39 @@ jobs:
echo "Success!"
pspamm-codegen-avx:
name: pspamm-codegen-avx
pspamm-codegen:
name: pspamm-codegen
runs-on: ubuntu-24.04
needs: install-pspamm
steps:
- name: apt-get
run: |
set -euo pipefail
sudo apt-get update
sudo apt-get install g++ qemu-user-static
- name: setup-python
uses: actions/setup-python@v4
with:
python-version: '3.11'

- name: checkout-pspamm
uses: actions/checkout@v4

- name: pip-pspamm
run: |
pip install .
- name: pspamm-tests-generate
run: |
cd tests/
python unit_tests_hsw.py
- name: pspamm-tests-compile
run: |
cd tests/
g++ -static -mavx512f build/hsw_testsuite.cpp -o build/hsw-test
- name: pspamm-tests-run
run: |
cd tests/
qemu-x86_64-static -cpu Haswell build/hsw-test
pspamm-codegen-avx512-no-run:
name: pspamm-codegen-avx512-no-run
runs-on: ubuntu-24.04
needs: install-pspamm
steps:
- name: apt-get
run: |
set -euo pipefail
sudo apt-get update
sudo apt-get install g++ qemu-user-static
- name: setup-python
uses: actions/setup-python@v4
with:
python-version: '3.11'

- name: checkout-pspamm
uses: actions/checkout@v4

- name: pip-pspamm
run: |
pip install .
- name: pspamm-tests-generate
run: |
cd tests/
python unit_tests_knl.py
- name: pspamm-tests-compile
run: |
cd tests/
g++ -static -mavx512f build/knl_testsuite.cpp -o build/knl-test
# disabled, since the qemu shipped with Ubuntu 24.04 doesn't support AVX512F (yet)
# - name: pspamm-tests-run
# run: |
# cd tests/
# qemu-x86_64-static -cpu Skylake-Server build/knl-test

pspamm-codegen-aarch64:
name: pspamm-codegen-aarch64
runs-on: ubuntu-24.04
needs: install-pspamm
steps:
- name: apt-get
run: |
set -euo pipefail
sudo apt-get update
sudo apt-get install g++-aarch64-linux-gnu qemu-user-static
- name: setup-python
uses: actions/setup-python@v4
with:
python-version: '3.11'

- name: checkout-pspamm
uses: actions/checkout@v4

- name: pip-pspamm
run: |
pip install .
- name: pspamm-tests-generate
run: |
cd tests/
python unit_tests_arm.py
- name: pspamm-tests-compile
run: |
cd tests/
aarch64-linux-gnu-g++ -static -march=armv8.2-a build/arm_testsuite.cpp -o build/arm-test
- name: pspamm-tests-run
run: |
cd tests/
qemu-aarch64-static -cpu max build/arm-test
pspamm-codegen-armsve:
name: pspamm-codegen-armsve
runs-on: ubuntu-24.04
needs: install-pspamm
# include vector lengths for SVE manually (for now)
# include all vector lengths manually for now
# not supported:
# * RVV >= 1024
# * ARM/SVE which is not a power of 2
strategy:
fail-fast: false
matrix:
vectorlen:
- 128
- 256
- 512
- 1024
- 2048
arch:
- hsw128
- hsw256
- knl128
- knl256
- knl512
- arm128
- arm_sve128
- arm_sve256
- arm_sve512
- arm_sve1024
- arm_sve2048
- rvv128
- rvv256
- rvv512
- rvv1024
steps:
- name: apt-get
run: |
set -euo pipefail
sudo apt-get update
sudo apt-get install g++-aarch64-linux-gnu qemu-user-static
sudo apt-get install g++-aarch64-linux-gnu g++-riscv64-linux-gnu qemu-user-static
- name: setup-python
uses: actions/setup-python@v4
@@ -185,17 +84,12 @@ jobs:
run: |
pip install .
- name: pspamm-tests-generate
- name: pspamm-tests
run: |
cd tests/
python unit_tests_arm_sve.py ${{matrix.vectorlen}}
- name: pspamm-tests-compile
run: |
cd tests/
aarch64-linux-gnu-g++ -static -march=armv8.2-a+sve -msve-vector-bits=${{matrix.vectorlen}} build/arm_sve${{matrix.vectorlen}}_testsuite.cpp -o build/arm_sve${{matrix.vectorlen}}-test
- name: pspamm-tests-run
run: |
cd tests/
qemu-aarch64-static -cpu max,sve${{matrix.vectorlen}}=on,sve-default-vector-length=-1 build/arm_sve${{matrix.vectorlen}}-test
LOCALARCH=${{matrix.arch}}
if [[ ${LOCALARCH:0:3} == "knl" ]]; then
./runlocal.sh ${{matrix.arch}} norun
else
./runlocal.sh ${{matrix.arch}}
fi
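
The consolidated `pspamm-tests` step sends every matrix entry through `runlocal.sh` and appends `norun` for the `knl*` targets, presumably for the same reason the old AVX-512 job had its run step commented out. A minimal Python sketch of that dispatch follows. The function name `runlocal_command` is hypothetical; only the script name, the `norun` argument, and the three-character prefix test are taken from the workflow.

```python
from typing import List


def runlocal_command(arch: str) -> List[str]:
    """Mirror the shell test `${LOCALARCH:0:3} == "knl"` from the workflow."""
    command = ["./runlocal.sh", arch]
    # Only the first three characters are compared, exactly as in the shell step.
    if arch[:3] == "knl":
        command.append("norun")
    return command


if __name__ == "__main__":
    for arch in ["hsw256", "knl512", "arm_sve2048", "rvv1024"]:
        print(" ".join(runlocal_command(arch)))
```

Keeping the decision in one place means a new architecture only needs a new matrix entry, not a new job.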
32 changes: 21 additions & 11 deletions README.md
@@ -1,22 +1,32 @@
# Code Generator for Sparse Matrix Multiplication
Generates inline-Assembly for sparse Matrix Multiplication.
# PSpaMM

Currently Intel Xeon Phi 'Knights Landing' (AVX512), Haswell/Zen2 (AVX2), and ARM Cortex-A53 (ARMv8) are supported.
A code generator for small matrix multiplications.

Currently supported:

* x86_64: AVX2, AVX512/AVX10.1
* ARM/AARCH64: NEON, SVE (128,256,512,1024,2048 bit)
* RISC-V: V (128,256,512,1024,2048,4096,8192 bit)

## Installation

PspaMM is a Python package. I.e. you may do
PSpaMM is a Python package. That is, after cloning, you may install it via pip.

```
pip install .
```
Alternatively, you can install it directly by running

to install it.
```bash

## Usage
pip install git+https://github.com/SeisSol/PSpaMM.git

```
pspamm-generator M N K LDA LDB LDC ALPHA BETA --arch {arm,arm_sve{128,256,512,1024,2048},knl,hsw} \
--mtx_filename MTX_FILE_PATH --output_funcname FUNCTION_NAME --output_filename OUTPUT_NAME

## Usage

```bash

pspamm-generator M N K LDA LDB LDC ALPHA BETA \
--arch {arm,arm_sve{128..2048},knl{128..512},hsw{128..256},rvv{128..8192}} \
--amtx_filename MTX_FILE_PATH --bmtx_filename MTX_FILE_PATH \
--output_funcname FUNCTION_NAME --output_filename OUTPUT_NAME

```
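
The `--arch` values in the new usage line are written in brace shorthand. The sketch below expands that shorthand and assembles an argument list for `pspamm-generator`. It assumes the widths are the powers of two between the stated bounds, which matches the widths the README lists for SVE and RVV and the `hsw*`/`knl*` entries in the CI matrix. The helper names `arch_names` and `generator_argv` are hypothetical, and treating the two `.mtx` flags as optional is also an assumption; the flag names themselves are taken from the usage line.

```python
from typing import Dict, List, Optional, Tuple

# (min, max) vector width in bits per family, read off the usage line.
ARCH_WIDTHS: Dict[str, Tuple[int, int]] = {
    "arm_sve": (128, 2048),
    "knl": (128, 512),
    "hsw": (128, 256),
    "rvv": (128, 8192),
}


def arch_names() -> List[str]:
    """Expand the brace shorthand, assuming power-of-two widths."""
    names = ["arm"]  # listed without a width suffix in the usage line
    for family, (low, high) in ARCH_WIDTHS.items():
        width = low
        while width <= high:
            names.append(f"{family}{width}")
            width *= 2
    return names


def generator_argv(m: int, n: int, k: int, lda: int, ldb: int, ldc: int,
                   alpha: float, beta: float, arch: str,
                   funcname: str, filename: str,
                   amtx: Optional[str] = None,
                   bmtx: Optional[str] = None) -> List[str]:
    """Build (but do not run) a pspamm-generator command line."""
    if arch not in arch_names():
        raise ValueError(f"unknown arch: {arch}")
    argv = ["pspamm-generator"]
    argv += [str(v) for v in (m, n, k, lda, ldb, ldc, alpha, beta)]
    argv += ["--arch", arch]
    if amtx is not None:  # assumption: the flag may be omitted
        argv += ["--amtx_filename", amtx]
    if bmtx is not None:  # assumption: the flag may be omitted
        argv += ["--bmtx_filename", bmtx]
    argv += ["--output_funcname", funcname, "--output_filename", filename]
    return argv
```

The resulting list can be handed to `subprocess.run` once the package is installed; the sketch stops short of that so it has no dependency on PSpaMM itself.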
2 changes: 1 addition & 1 deletion pspamm/VERSION
@@ -1 +1 @@
0.2.3
0.3.0
12 changes: 0 additions & 12 deletions pspamm/architecture.py
@@ -8,17 +8,5 @@ def init():
generator = None
operands = None



#https://stackoverflow.com/questions/452969/does-python-have-an-equivalent-to-java-class-forname

def get_class( kls ):
return import_module(kls)
parts = kls.split('.')
module = ".".join(parts[:-1])
m = __import__( module )
for comp in parts[1:]:
m = getattr(m, comp)
return m


57 changes: 6 additions & 51 deletions pspamm/codegen/analysis.py
@@ -3,57 +3,12 @@

from typing import List, Set

class Analyzer(Visitor):

class Analyzer:
def __init__(self, starting_regs: List[Register] = None):
self.clobbered_registers = set(starting_regs)
self.stack = []

def visitFma(self, stmt: FmaStmt):
self.clobbered_registers.add(stmt.add_dest)

def visitMul(self, stmt: FmaStmt):
self.clobbered_registers.add(stmt.dest)
self.clobbered_registers.add(stmt.mult_src)

def visitBcst(self, stmt: FmaStmt):
self.clobbered_registers.add(stmt.dest)

def visitAdd(self, stmt: AddStmt):
self.clobbered_registers.add(stmt.dest)

def visitLabel(self, stmt: LabelStmt):
pass

def visitJump(self, stmt: JumpStmt):
pass

def visitMov(self, stmt: MovStmt):
if isinstance(stmt.dest, Register):
self.clobbered_registers.add(stmt.dest)

def visitLea(self, stmt: MovStmt):
self.clobbered_registers.add(stmt.dest)

def visitStore(self, stmt: MovStmt):
if isinstance(stmt.dest, Register):
self.clobbered_registers.add(stmt.dest)

def visitLoad(self, stmt: MovStmt):
if isinstance(stmt.dest, Register):
self.clobbered_registers.add(stmt.dest)

def visitPrefetch(self, stmt: PrefetchStmt):
self.clobbered_registers.add(stmt.dest.base)

def visitCmp(self, stmt: CmpStmt):
pass

def visitBlock(self, block: Block):
self.stack.append(block)
for stmt in block.contents:
stmt.accept(self)
self.stack.pop()


self.input_operands = set()

def collect(self, block: Block):
for instr in block.flatten():
self.clobbered_registers |= instr.regs()
self.input_operands |= instr.args()
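
The rewritten `Analyzer` drops the per-statement visitor methods in favour of one pass over `block.flatten()`, taking the union of each instruction's `regs()` and `args()`. The following is a self-contained sketch of that pattern, not the PSpaMM implementation: `Instr` and `Block` are hypothetical stand-ins (the real statement classes are not shown in this hunk), and registers are modelled as plain strings. One deliberate deviation: the sketch uses `set(starting_regs or [])`, because `set(None)` raises a `TypeError` when the default argument is used.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Optional, Set, Union


@dataclass(frozen=True)
class Instr:
    """Hypothetical stand-in for a statement; registers are plain strings."""
    written: FrozenSet[str] = frozenset()
    operands: FrozenSet[str] = frozenset()

    def regs(self) -> Set[str]:
        return set(self.written)

    def args(self) -> Set[str]:
        return set(self.operands)


@dataclass
class Block:
    """Hypothetical stand-in for a block that may nest other blocks."""
    contents: List[Union["Block", Instr]] = field(default_factory=list)

    def flatten(self) -> List[Instr]:
        flat: List[Instr] = []
        for item in self.contents:
            if isinstance(item, Block):
                flat.extend(item.flatten())  # recurse into nested blocks
            else:
                flat.append(item)
        return flat


class Analyzer:
    def __init__(self, starting_regs: Optional[Iterable[str]] = None):
        # `or []` guards against the None default (an addition in this sketch).
        self.clobbered_registers: Set[str] = set(starting_regs or [])
        self.input_operands: Set[str] = set()

    def collect(self, block: Block) -> None:
        for instr in block.flatten():
            self.clobbered_registers |= instr.regs()
            self.input_operands |= instr.args()
```

With this shape, each instruction type reports its own registers and operands, so adding a new instruction no longer requires a matching `visit*` method on the analyzer.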