Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNDB-12980: Clone the vectors issued by the postingsMap in CompactionGraph #1586

Merged
merged 2 commits into from
Feb 25, 2025

Conversation

marianotepper
Copy link

Addresses CNDB-12980

The postingsMap in CompactionGraph reuses the VectorFloat object when reading from storage. We need to copy this vector before storing it in another data structure if we want it to hold the right information.

…he same object gets reused by the underlying implementation.
@marianotepper marianotepper requested a review from jkni February 19, 2025 18:49
Copy link

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits

@marianotepper marianotepper self-assigned this Feb 19, 2025
Copy link

@jkni jkni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fix looks correct to me. Test failures posted by Butler are acceptable; that's a known flaky test in an unrelated area of the codebase. The other failures flagged by Butler as known failures are indeed known failures.

I think we should add some test coverage that exposes the bug, as we had no tests that surfaced the issue. I did a quick sketch of enhancing the existing PQ refinement test to test recall, confirming that it failed without the patch. This might be sufficient. WDYT?

diff --git a/test/unit/org/apache/cassandra/index/sai/cql/VectorCompactionTest.java b/test/unit/org/apache/cassandra/index/sai/cql/VectorCompactionTest.java
index f6634b86ee..9047bb8b1b 100644
--- a/test/unit/org/apache/cassandra/index/sai/cql/VectorCompactionTest.java
+++ b/test/unit/org/apache/cassandra/index/sai/cql/VectorCompactionTest.java
@@ -19,9 +19,13 @@
 package org.apache.cassandra.index.sai.cql;
 
 import java.util.ArrayList;
+import java.util.stream.Collectors;
 
 import org.junit.Test;
 
+import io.github.jbellis.jvector.vector.VectorSimilarityFunction;
+import org.apache.cassandra.db.marshal.FloatType;
+import org.apache.cassandra.index.sai.disk.v3.V3OnDiskFormat;
 import org.apache.cassandra.index.sai.disk.v5.V5VectorPostingsWriter;
 import org.apache.cassandra.index.sai.disk.vector.CompactionGraph;
 
@@ -59,13 +63,16 @@ public class VectorCompactionTest extends VectorTester.Versioned
         createTable();
         disableCompaction();
 
+        var vectors = new ArrayList<float[]>();
         // 3 sstables
         for (int j = 0; j < 3; j++)
         {
             for (int i = 0; i <= MIN_PQ_ROWS; i++)
             {
                 var pk = j * MIN_PQ_ROWS + i;
-                execute("INSERT INTO %s (pk, v) VALUES (?, ?)", pk, vector(pk, pk + 1));
+                var v = create2DVector();
+                vectors.add(v);
+                execute("INSERT INTO %s (pk, v) VALUES (?, ?)", pk, vector(v));
             }
             flush();
         }
@@ -73,8 +80,21 @@ public class VectorCompactionTest extends VectorTester.Versioned
         CompactionGraph.PQ_TRAINING_SIZE = 2 * MIN_PQ_ROWS;
         compact();
 
-        // Confirm we can query the data
-        assertRowCount(execute("SELECT * FROM %s ORDER BY v ANN OF [1,2] LIMIT 1"), 1);
+        // Confirm we can query the data with reasonable recall
+        double recall = 0;
+        int ITERS = 10;
+        for (int i = 0; i < ITERS; i++)
+        {
+            var q = create2DVector();
+            var result = execute("SELECT pk, v FROM %s ORDER BY v ANN OF ? LIMIT 20", vector(q));
+            var ann = result.stream().map(row -> {
+                var vList = row.getVector("v", FloatType.instance, 2);
+                return new float[]{ vList.get(0), vList.get(1) };
+            }).collect(Collectors.toList());
+            recall += computeRecall(vectors, q, ann, VectorSimilarityFunction.COSINE);
+        }
+        recall /= ITERS;
+        assert recall >= 0.9 : recall;
     }
 
     @Test
@@ -179,4 +199,9 @@ public class VectorCompactionTest extends VectorTester.Versioned
             }
         }
     }
+
+    private static float[] create2DVector() {
+        var R = getRandom();
+        return new float[] { R.nextFloatBetween(-100, 100), R.nextFloatBetween(-100, 100) };
+    }
 }

@marianotepper
Copy link
Author

The test looks right to me. It may help discover other issues in the future.

@jkni jkni self-assigned this Feb 21, 2025
@jkni
Copy link

jkni commented Feb 21, 2025

Thanks! Pushed the test to this branch.

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-1586 rejected by Butler


1 new test failure(s) in 2 builds
See build details here


Found 1 new test failures

Test Explanation Branch history Upstream history
o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... regression 🔴🔴 🔵🔵🔵🔵🔵🔵🔵

Found 7 known test failures

@jkni jkni self-requested a review February 22, 2025 03:54
@marianotepper marianotepper merged commit 054fbeb into main Feb 25, 2025
464 of 474 checks passed
@marianotepper marianotepper deleted the cndb-12980 branch February 25, 2025 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants