get-convex · Nov 6, 2024 · 473e136 · 473e136
1 parent f32eec8
commit 473e136
Showing 5 changed files with 156 additions and 112 deletions.
diff --git a/README.md b/README.md
@@ -13,20 +13,36 @@ Suppose you have a leaderboard of game scores. These are some operations
 that the Aggregate component makes easy and efficient:
 
 1. Count the total number of scores: `aggregate.count(ctx)`
-2. Count the number of scores greater than 65: `aggregate.count(ctx, { lower: { key: 65, inclusive: false } })`
+2. Count the number of scores greater than 65: `aggregate.count(ctx, { bounds: { lower: { key: 65, inclusive: false } } })`
 3. Find the p95 score: `aggregate.at(ctx, Math.floor(aggregate.count(ctx) * 0.95))`
 4. Find the overall average score: `aggregate.sum(ctx) / aggregate.count(ctx)`
 5. Find the ranking for a score of 65 in the leaderboard: `aggregate.indexOf(ctx, 65)`
 6. Find the average score for an individual user. You can define another aggregate
-   partitioned by user and aggregate within each:
+   grouped by user and aggregate within each:
 
 ```ts
-// aggregateScoreByUser is the leaderboard scores partitioned by username.
+// aggregateScoreByUser is the leaderboard scores grouped by username.
 const bounds = { prefix: [username] };
-const highScoreForUser = aggregateScoreByUser.max(ctx, bounds);
+const highScoreForUser = await aggregateScoreByUser.max(ctx, { bounds });
 const avgScoreForUser =
-  aggregateScoreByUser.sum(ctx, bounds) /
-  aggregateScoreByUser.count(ctx, bounds);
+  await aggregateScoreByUser.sum(ctx, { bounds }) /
+  await aggregateScoreByUser.count(ctx, { bounds });
+// It still enables adding or averaging all scores across all usernames.
+const globalAverageScore = await aggregateScoreByUser.sum(ctx) /
+  await aggregateScoreByUser.count(ctx);
+```
+
+7. Alternatively, you can define a third aggregate with separate namespaces,
+   and do the same query. This method increases throughput because one user's
+   data won't interfere with other users' data. However, you lose the ability
+   to aggregate over all users.
+
+```ts
+const forUser = { namespace: username };
+const highScoreForUser = await aggregateScoreByUser.max(ctx, forUser);
+const avgScoreForUser =
+  await aggregateScoreByUser.sum(ctx, forUser) /
+  await aggregateScoreByUser.count(ctx, forUser);
 ```
 
 The Aggregate component provides `O(log(n))`-time lookups, instead of the `O(n)`
@@ -51,10 +67,9 @@ The keys may be arbitrary Convex values, so you can choose to sort your data by:
 4. Nothing, use `key=null` for everything if you just want
    [a total count, such as for random access](#total-count-and-randomization).
 
-### Partitioning
+### Grouping
 
-You can use sorting to partition your data set, enabling namspacing,
-multitenancy, sharding, and more.
+You can use sorting to group your data set.
 
 If you want to keep track of multiple games with scores for each user,
 use a tuple of `[game, username, score]` as the key.
@@ -76,6 +91,52 @@ would need to aggregate with key `[game, score]`.
 To support different sorting and partitioning keys, you can define multiple
 instances. See [below](#defining-multiple-aggregates) for details.
 
+If you separate your data via the `sortKey` and `prefix` bounds, you can look at
+your data from any altitude. You can do a global `count` to see how many total
+data points there are, or you can zero in on an individual group of the data.
+
+However, there's a tradeoff: nearby data points can interfere with each other
+in the internal data structure, reducing throughput. See
+[below](#read-dependencies-and-writes) for more details. To avoid interference,
+you can use Namespaces.
+
+### Namespacing
+
+If your data is separated into distinct partitions, and you don't need to
+aggregate between partitions, then you can put each partition into its own
+namespace. Each namespace gets its own internal data structure.
+
+If your app has multiple games, it's not useful to aggregate scores across
+different games. The scoring system for chess isn't related to the scoring
+system for football. So we can namespace our scores based on the game.
+
+Whenever we aggregate scores, we *must* specify the namespace.
+On the other hand, the internal aggregation data structure can keep the scores
+separate and keep throughput high.
+
+Here's how you would create the aggregate we just described:
+
+```ts
+const leaderboardByGame = new TableAggregate<{
+  namespace: Id<"games">,
+  key: number,
+  dataModel: DataModel,
+  tableName: "scores",
+}>(components.leaderboardByGame, {
+  namespace: (doc) => doc.gameId,
+  sortKey: (doc) => doc.score,
+});
+```
+
+And whenever you use this aggregate, you specify the namespace.
+
+```ts
+const footballHighScore = await leaderboardByGame.max(ctx, { namespace: footballId });
+```
+
+See an example of a namespaced aggregate in
+[example/convex/photos.ts](./example/convex/photos.ts).
+
 ### More examples
 
 The Aggregate component can efficiently calculate all of these:
@@ -149,9 +210,15 @@ import { DataModel } from "./_generated/dataModel";
 import { mutation as rawMutation } from "./_generated/server";
 import { TableAggregate } from "@convex-dev/aggregate";
 
-const aggregate = new TableAggregate<number, DataModel, "mytable">(
+const aggregate = new TableAggregate<{
+  namespace: undefined,
+  key: number,
+  dataModel: DataModel,
+  tableName: "mytable",
+}>(
   components.aggregate,
   {
+    namespace: (doc) => undefined,  // disable namespacing.
     sortKey: (doc) => doc._creationTime, // Allows querying across time ranges.
     sumValue: (doc) => doc.value, // The value to be used in `.sum` calculations.
   }
@@ -167,12 +234,14 @@ here's how you might define `aggregateByGame`, as an aggregate on the "scores"
 table:
 
 ```ts
-const aggregateByGame = new TableAggregate<
-  [Id<"games">, string, number],
-  DataModel,
-  "leaderboard"
->(components.aggregateByGame, {
-  sortKey: (doc) => [doc.gameId, doc.username, doc.score],
+const aggregateByGame = new TableAggregate<{
+  namespace: Id<"games">,
+  key: [string, number],
+  dataModel: DataModel,
+  tableName: "leaderboard"
+}>(components.aggregateByGame, {
+  namespace: (doc) => doc.gameId,
+  sortKey: (doc) => [doc.username, doc.score],
 });
 ```
 
@@ -234,75 +303,27 @@ To run the examples:
 4. The dashboard should open and you can run functions like
    `leaderboard:addScore` and `leaderboard:userAverageScore`.
 
-### Namespaces
-
-When you have independent data sets, use `namespaces` for greater throughput.
-A namespace is a segment of your data points, like all users within a team,
-or all metrics related to a user.
-It behaves similarly to using a prefix on `sortKey`, but more efficiently.
-By dividing your data into namespaces, you can more read data more efficiently,
-since your queries will never be invalidated due to writes in other namespaces.
-Writes between namespaces will never conflict, reducing chances of write contention
-resulting in slowdowns and OCC failure.
-The limitation is that you cannot calculate aggregates across namespaces.
-If you need to aggregate across top-level segments, use `sortKey` with a prefix.
-
-For example, suppose you have a bunch of leaderboard scores for several games,
-and the scores for each game are independent. You can use the game id as a
-namespace. Then each game gets its own data structure in the aggregate
-component, preventing reads and writes for different games to conflict with each other.
-
-```ts
-const aggregateByGame = new NamespacedTableAggregate<
-  [string, number],
-  DataModel,
-  "leaderboard",
-  Id<"games">
->(components.aggregateByGame, {
-  sortKey: (doc) => [doc.username, doc.score],
-  namespace: (doc) => doc.gameId,
-});
-```
-
-Now when you need to aggregate within a game, you call `.get` to narrow down the
-computation to a single game.
-
-```ts
-const countTimesGamePlayed = await aggregateByGame.get(gameId).count();
-```
-
-There are namespaced classes for each kind of Aggregate you may want to build:
-`NamespacedTableAggregate`, `NamespacedRandomize`, and
-`NamespacedDirectAggregate`.
-
 ### Total Count and Randomization
 
 If you don't need the ordering, partitioning, or summing behavior of
-`TableAggregate`, there's a simpler interface you can use: `Randomize`.
+`TableAggregate`, you can set `namespace: undefined` and `sortKey: null`.
 
 ```ts
-import { components } from "./_generated/api";
-import { DataModel } from "./_generated/dataModel";
-import { mutation as rawMutation } from "./_generated/server";
-import { Randomize } from "@convex-dev/aggregate";
-import { customMutation } from "convex-helpers/server/customFunctions";
-// This is like TableAggregate but there's no key or sumValue.
-const randomize = new Randomize<DataModel, "mytable">(components.aggregate);
-
-// In a mutation, insert into the component when you insert into your table.
-const id = await ctx.db.insert("mytable", data);
-await randomize.insert(ctx, id);
-
-// As before, delete from the component when you delete from your table
-await ctx.db.delete(id);
-await randomize.delete(ctx, id);
-
-// in a query, get the total document count.
-const totalCount = await randomize.count(ctx);
-// get a random document's id.
-const randomId = await randomize.random(ctx);
+const randomize = new TableAggregate<{
+  namespace: undefined,
+  key: null,
+  dataModel: DataModel,
+  tableName: "mytable",
+}>(components.aggregate, {
+  namespace: (doc) => undefined,
+  sortKey: (doc) => null,
+});
 ```
 
+Without sorting, all documents are ordered by their `_id`, which is generally
+random. That means you can look up the document at any index to find one at
+random, or shuffle the whole table.
+
 See more examples in [`example/convex/shuffle.ts`](example/convex/shuffle.ts),
 including a paginated random shuffle of some music.
 
@@ -313,27 +334,36 @@ Convex supports infinite-scroll pagination which is
 to worry about items going missing from your list. But sometimes you want to
 display separate pages of results on separate pages of your app.
 
-For this example, imagine you have a table of photos
+For this example, imagine you have a table of photos, each belonging to an album.
 
 ```ts
 // convex/schema.ts
 defineSchema({
   photos: defineTable({
+    album: v.string(),
     url: v.string(),
-  }),
+  }).index("by_album_creation_time", ["album"]),
 });
 ```
 
-And an aggregate defined with key as `_creationTime`.
+And an aggregate defined with key as `_creationTime` and namespace as `album`.
 
 ```ts
 // convex/convex.config.ts
 app.use(aggregate, { name: "photos" });
 
 // convex/photos.ts
-const photos = new TableAggregate<number, DataModel, "photos">(
+const photos = new TableAggregate<{
+  namespace: string,  // album name
+  key: number,  // creation time
+  dataModel: DataModel,
+  tableName: "photos",
+}>(
   components.photos,
-  { sortKey: (doc) => doc._creationTime }
+  {
+    namespace: (doc) => doc.album,
+    sortKey: (doc) => doc._creationTime,
+  }
 );
 ```
 
@@ -342,15 +372,15 @@ map from offset to an index key.
 
 In this example, if `offset` is 100 and `numItems` is 10, we get the hundredth
 `_creationTime` (in ascending order) and starting there we get the next ten
-documents.
+documents. In this way we can paginate through the whole photo album.
 
 ```ts
 export const pageOfPhotos = query({
-  args: { offset: v.number(), numItems: v.number() },
-  handler: async (ctx, { offset, numItems }) => {
-    const { key } = await photos.at(ctx, offset);
+  args: { offset: v.number(), numItems: v.number(), album: v.string() },
+  handler: async (ctx, { offset, numItems, album }) => {
+    const { key } = await photos.at(ctx, offset, { namespace: album });
     return await ctx.db.query("photos")
-      .withIndex("by_creation_time", q=>q.gte("_creationTime", key))
+      .withIndex("by_album_creation_time", q=>q.eq("album", album).gte("_creationTime", key))
       .take(numItems);
   },
 });
@@ -369,19 +399,21 @@ insert, delete, and replace operations yourself.
 import { components } from "./_generated/api";
 import { DataModel } from "./_generated/dataModel";
 import { DirectAggregate } from "@convex-dev/aggregate";
-// The first generic parameter (number in this case) is the key.
-// The second generic parameter (string in this case) should be unique to
-// be a tie-breaker in case two data points have the same key.
-const aggregate = new DirectAggregate<number, string>(components.aggregate);
+// Note the `id` should be unique to be a tie-breaker in case two data points
+// have the same key.
+const aggregate = new DirectAggregate<{
+  key: number,
+  id: string,
+}>(components.aggregate);
 
 // within a mutation, add values to be aggregated
-await aggregate.insert(ctx, key, id);
+await aggregate.insert(ctx, { key, id });
 // if you want to use `.sum` to aggregate sums of values, insert with a sumValue
-await aggregate.insert(ctx, key, id, sumValue);
+await aggregate.insert(ctx, { key, id, sumValue });
 // or delete values that were previously added
-await aggregate.delete(ctx, key, id);
+await aggregate.delete(ctx, { key, id });
 // or update values
-await aggregate.replace(ctx, oldKey, newKey, id);
+await aggregate.replace(ctx, { key: oldKey, id }, { key: newKey });
 ```
 
 See [`example/convex/stats.ts`](example/convex/stats.ts) for an example.
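
For readers of this diff, the tradeoff between the README's `prefix` grouping and the new `namespace` option can be modeled in plain TypeScript. The sketch below is a hypothetical in-memory model, not the component's implementation or API: `GroupedModel` and `NamespacedModel` are invented names, and it uses linear scans where the component provides `O(log(n))` lookups.

```ts
// Hypothetical in-memory model; none of these names exist in @convex-dev/aggregate.

// Grouping: one shared structure keyed by [username, score].
class GroupedModel {
  private entries: Array<[string, number]> = [];

  insert(username: string, score: number): void {
    this.entries.push([username, score]);
  }

  // With no prefix, aggregate over everything; with a prefix, over one group.
  private select(prefix?: [string]): Array<[string, number]> {
    if (prefix === undefined) return this.entries;
    return this.entries.filter(([username]) => username === prefix[0]);
  }

  count(prefix?: [string]): number {
    return this.select(prefix).length;
  }

  sum(prefix?: [string]): number {
    return this.select(prefix).reduce((total, [, score]) => total + score, 0);
  }
}

// Namespacing: one independent structure per namespace.
// A write to one namespace never touches another, but there is no global view.
class NamespacedModel {
  private byNamespace = new Map<string, number[]>();

  insert(namespace: string, score: number): void {
    const scores = this.byNamespace.get(namespace) ?? [];
    scores.push(score);
    this.byNamespace.set(namespace, scores);
  }

  // The namespace argument is required: every read targets exactly one structure.
  count(namespace: string): number {
    return (this.byNamespace.get(namespace) ?? []).length;
  }

  sum(namespace: string): number {
    return (this.byNamespace.get(namespace) ?? []).reduce((a, b) => a + b, 0);
  }
}

const grouped = new GroupedModel();
const namespaced = new NamespacedModel();
for (const [user, score] of [["alice", 10], ["alice", 30], ["bob", 50]] as Array<[string, number]>) {
  grouped.insert(user, score);
  namespaced.insert(user, score);
}

const aliceAvgGrouped = grouped.sum(["alice"]) / grouped.count(["alice"]);
const globalAvgGrouped = grouped.sum() / grouped.count();
const aliceAvgNamespaced = namespaced.sum("alice") / namespaced.count("alice");
```

In this model both approaches give the same per-user average, but only `GroupedModel` can answer the global question, which mirrors the limitation the README states for namespaces.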

diff --git a/example/convex/photos.ts b/example/convex/photos.ts
@@ -84,8 +84,8 @@ export const pageOfPhotos = query({
     const { key: firstPhotoCreationTime } = await photos.at(ctx, offset, { namespace: album });
     const photoDocs = await ctx.db
       .query("photos")
-      .withIndex("by_creation_time", (q) =>
-        q.gte("_creationTime", firstPhotoCreationTime)
+      .withIndex("by_album_creation_time", (q) =>
+        q.eq("album", album).gte("_creationTime", firstPhotoCreationTime)
       )
       .take(numItems);
     return photoDocs.map((doc) => doc.url);
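
The reason for the added `eq("album", album)` can be illustrated with a small in-memory model of offset-based pagination. This is a sketch under stated assumptions: `Photo`, `atOffset`, and `pageOfPhotosModel` are hypothetical stand-ins for the `photos.at` call and the indexed query, and sorting an array replaces both the aggregate and the `by_album_creation_time` index.

```ts
// Hypothetical model of the query in this hunk; not the Convex API.
type Photo = { album: string; url: string; creationTime: number };

// Stand-in for `photos.at(ctx, offset, { namespace: album })`:
// returns the creation time at `offset` within one album, in ascending order.
function atOffset(photos: Photo[], album: string, offset: number): number {
  const times = photos
    .filter((p) => p.album === album)
    .map((p) => p.creationTime)
    .sort((a, b) => a - b);
  if (offset < 0 || offset >= times.length) {
    throw new RangeError(`offset ${offset} is out of range for album ${album}`);
  }
  return times[offset];
}

// Stand-in for the indexed query: restrict to the album, start at the key
// found above, and take the next `numItems` photos.
function pageOfPhotosModel(
  photos: Photo[],
  album: string,
  offset: number,
  numItems: number,
): string[] {
  const firstCreationTime = atOffset(photos, album, offset);
  return photos
    .filter((p) => p.album === album && p.creationTime >= firstCreationTime)
    .sort((a, b) => a.creationTime - b.creationTime)
    .slice(0, numItems)
    .map((p) => p.url);
}

const samplePhotos: Photo[] = [
  { album: "beach", url: "b1", creationTime: 1 },
  { album: "city", url: "c1", creationTime: 2 },
  { album: "beach", url: "b2", creationTime: 3 },
  { album: "city", url: "c2", creationTime: 4 },
  { album: "beach", url: "b3", creationTime: 5 },
];

const beachPage = pageOfPhotosModel(samplePhotos, "beach", 1, 2);
```

If the album filter were dropped from the range scan in this model, the page starting at the second beach photo would also pick up `c2`, a photo from another album whose creation time falls inside the range. That is the kind of mix-up the switch from `by_creation_time` to `by_album_creation_time` appears intended to prevent.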

diff --git a/example/convex/schema.ts b/example/convex/schema.ts
@@ -12,5 +12,5 @@ export default defineSchema({
   photos: defineTable({
     album: v.string(),
     url: v.string(),
-  }),
+  }).index("by_album_creation_time", ["album"]),
 });
diff --git a/example/convex/stats.ts b/example/convex/stats.ts
@@ -7,7 +7,11 @@ import { v } from "convex/values";
 import { DirectAggregate } from "@convex-dev/aggregate";
 import { components } from "./_generated/api";
 
-const stats = new DirectAggregate<number, string>(components.stats);
+const stats = new DirectAggregate<{
+  namespace: undefined,
+  key: number,
+  id: string,
+}>(components.stats);
 
 export const reportLatency = mutation({
   args: {