Handle ndarray matmul broadcasting #1679
Conversation
- Use strides to map linear batch indices from the output back to the input arrays.
```diff
- let lhs_array = lhs.array.into_shape((batch_size_lhs, m, k)).unwrap();
- let rhs_array = rhs.array.into_shape((batch_size_rhs, k, n)).unwrap();
+ let lhs_array = NdArray::<E>::float_reshape(lhs, Shape::new([num_l_batches, m, k])).array;
+ let rhs_array = NdArray::<E>::float_reshape(rhs, Shape::new([num_r_batches, k, n])).array;
```
I originally tried this with just `into_shape`, but that fails in some cases with an error about an incompatible layout. The original code was also calling `float_reshape` (below), so I've re-used that here.
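For context, here is a minimal sketch (my illustration, not code from the PR) of the kind of layout that makes `ndarray`'s `into_shape` fail: it only reinterprets memory for C- or F-contiguous arrays, so arrays with permuted axes need a copying reshape like `float_reshape` instead.

```rust
use ndarray::Array3;

fn main() {
    // A C-contiguous [2, 3, 4] array.
    let a = Array3::<f32>::zeros((2, 3, 4));

    // Permuting the axes swaps strides without moving data, leaving an
    // array that is neither C- nor F-contiguous.
    let p = a.permuted_axes([1, 0, 2]); // shape [3, 2, 4]

    // `into_shape` cannot reinterpret this layout in place, so it returns
    // an incompatible-layout ShapeError.
    assert!(p.into_shape((6, 4)).is_err());
}
```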
```rust
#[derive(Debug, PartialEq)]
struct Strides {
    strides: Vec<usize>,
}
```
The `Vec` here is a little frustrating. I gather that with nightly, it is possible to write `[usize; {D - 2}]` or similar. However, I've taken care to allocate these with a fixed capacity, so hopefully they're not too much slower than a sized array.
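As a rough illustration of that point (hypothetical code, not the PR's), `Vec::with_capacity` makes the single allocation up front, so pushing the `D - 2` batch strides never reallocates:

```rust
// Hypothetical sketch: compute row-major strides over the D - 2 batch
// dimensions of a rank-D shape (assumes D >= 2, as matmul requires),
// with one up-front allocation.
fn batch_strides<const D: usize>(shape: &[usize; D]) -> Vec<usize> {
    let mut strides = Vec::with_capacity(D - 2);
    let mut stride = 1;
    // Walk the batch dimensions from innermost to outermost, accumulating
    // the product of the dimension sizes seen so far.
    for dim in shape[..D - 2].iter().rev() {
        strides.push(stride);
        stride *= dim;
    }
    strides.reverse();
    strides
}
```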
Thanks for your contribution! 🎉
At first glance the implementation looks good to me. I'll request a review from @nathanielsimard since he implemented the previous version.
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1679      +/-   ##
==========================================
+ Coverage   86.51%   86.54%   +0.02%
==========================================
  Files         696      696
  Lines       81498    81653     +155
==========================================
+ Hits        70506    70664     +158
+ Misses      10992    10989       -3
```

☔ View full report in Codecov by Sentry.
It looks like we have two approvals. @nathanielsimard, when you have a chance, could you also review it?
No performance loss was detected for the matmul with somewhat large shapes: https://burn.dev/benchmarks/community-benchmarks/?version1=21a2c6553c7d4df0b3de830112888b39fab6a9d0&versionLabel1=Version+1&version2=-&versionLabel2=Version+2&backend=ndarray&device=All&name=All&os=Pop%21_OS+22.4.0+%28jammy%29+%5B64-bit%5D&sysHardware=Any&user=SharpPrecision&search=true
@louisfd, I would like your review on this before merging.
Thanks for fixing this bug!
I spotted two typos, but overall it looks good to me.
* Handle ndarray matmul broadcasting
  - Use strides to map linear batch indices from the output back to the input arrays.
* Fix typos
Pull Request Template

Checklist

- Confirmed that the `run-checks all` script has been executed.

Related Issues/PRs

Fixes #1646. Related to #1499.
Changes

In `matmul`, we have arrays of shape `[l_0, l_1, ..., l_N, m, k] @ [r_0, r_1, ..., r_N, k, n]`. The dimensions `l_i` and `r_i` may be broadcast using the following (fairly standard) rules:

- If `l_i == r_i`, then the output dimension is that common size (no broadcasting).
- If either `l_i` or `r_i` is 1, then the output dimension is `max(l_i, r_i)` and the array with the dimension equal to 1 is broadcast along axis `i`.

(The innermost two dimensions of the output are always `[m, n]`, following standard matrix multiplication rules.)

When performing the stacked matrix multiplies within the batched `matmul`, it is necessary to look up which matrices should be multiplied. In this PR, that is done with a standard stride lookup technique for broadcasting, where the stride for a broadcast dimension is zero.
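A minimal sketch of that technique (my illustration, not the PR's exact code): each input's batch strides are computed against its own shape, with broadcast dimensions given stride zero, so a linear batch index in the output can be mapped back to a linear batch index in each input.

```rust
// Sketch of the stride lookup: map a linear batch index in the output
// back to a linear batch index in one input. `in_strides` are the input's
// linear strides over its batch dims, with 0 for broadcast dimensions.
fn input_batch_index(out_index: usize, out_shape: &[usize], in_strides: &[usize]) -> usize {
    let mut remaining = out_index;
    let mut in_index = 0;
    // Decompose the linear index into per-dimension coordinates, starting
    // from the innermost batch dimension.
    for (dim, stride) in out_shape.iter().zip(in_strides).rev() {
        in_index += (remaining % dim) * stride;
        remaining /= dim;
    }
    in_index
}

fn main() {
    // Batch shapes [2, 1] (lhs) and [1, 2] (rhs) broadcast to [2, 2].
    let out_shape = [2, 2];
    let lhs_strides = [1, 0]; // axis 1 is broadcast in lhs -> stride 0
    let rhs_strides = [0, 1]; // axis 0 is broadcast in rhs -> stride 0
    // Output batch 3 has coordinates [1, 1] -> lhs batch 1, rhs batch 1.
    assert_eq!(input_batch_index(3, &out_shape, &lhs_strides), 1);
    assert_eq!(input_batch_index(3, &out_shape, &rhs_strides), 1);
}
```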
Testing

Added a test for the `[2, 1, 2, 2] @ [1, 2, 2, 2] -> [2, 2, 2, 2]` case, which was previously returning incorrect dimensions.
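For reference, a small sketch (my own, not the PR's test) of the broadcast rule applied to the tested shapes:

```rust
// Sketch: broadcast two batch shapes element-wise per the rules above.
fn broadcast_batch(lhs: &[usize], rhs: &[usize]) -> Option<Vec<usize>> {
    lhs.iter()
        .zip(rhs)
        .map(|(&l, &r)| match (l, r) {
            _ if l == r => Some(l), // equal sizes: no broadcasting
            (1, _) => Some(r),      // lhs broadcast along this axis
            (_, 1) => Some(l),      // rhs broadcast along this axis
            _ => None,              // incompatible batch dimensions
        })
        .collect()
}

fn main() {
    // [2, 1, 2, 2] @ [1, 2, 2, 2]: batch dims [2, 1] and [1, 2] broadcast
    // to [2, 2]; appending the matrix dims [m, n] = [2, 2] gives the
    // expected output shape [2, 2, 2, 2].
    assert_eq!(broadcast_batch(&[2, 1], &[1, 2]), Some(vec![2, 2]));
}
```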