Merge pull request #4 from GaiaNet-AI/refactor

Update docs
GaiaNet-AI · May 19, 2024 · dc1f3cb
2 parents 3ea84fe + 7f732ce
commit dc1f3cb
Showing 2 changed files with 53 additions and 5 deletions.
diff --git a/docs/creator-guide/knowledge/markdown.md b/docs/creator-guide/knowledge/markdown.md
@@ -16,10 +16,10 @@ Install the WasmEdge Runtime, the cross-platform LLM runtime.
 curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugins wasi_nn-ggml
 ```
 
-Download a chat model and an embedding model.
+Download an embedding model.
 
 ```
-curl -LO https://huggingface.co/gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/resolve/main/nomic-embed-text-v1.5-f16.gguf
+curl -LO https://huggingface.co/gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/resolve/main/nomic-embed-text-v1.5.f16.gguf
 ```
 
 The embedding model is a special kind of LLM that turns sentences into vectors. The vectors can then be stored in a vector database and searched later. When the sentences are from a body of text that represents a knowledge domain, that vector database becomes our RAG knowledge base. 
@@ -68,16 +68,35 @@ curl -LO https://github.com/GaiaNet-AI/embedding-tools/raw/main/markdown_embed/m
 
 It chunks the document based on markdown sections. You can check out the [Rust source code](https://github.com/GaiaNet-AI/embedding-tools/tree/main/markdown_embed) here and modify it if you need to use a different chunking strategy.
 
-Next, you can run the program by passing a collection name, vector dimension, and the source document. You can pass in the desired markdown heading level for chunking using the `--heading_level` option. Make sure that Qdrant is running on your local machine. The model is preloaded under the name embedding. The wasm app then uses the embedding model to create the 768-dimension vectors from `paris.md` and saves them into the default collection.
+Next, run the program by passing a collection name, the vector dimension, and the source document. Use the `--heading_level` option to set the markdown heading level for chunking. The `--ctx_size` option should match the embedding model's context window size, which in this case is 8192 tokens, allowing it to process long sections of text. Make sure that Qdrant is running on your local machine. The model is preloaded under the name `embedding`. The wasm app then uses the embedding model to create 768-dimension vectors from `paris.md` and saves them into the `default` collection.
 
 ```
 curl -LO https://huggingface.co/datasets/gaianet/paris/raw/main/paris.md
 
 wasmedge --dir .:. \
-  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5-f16.gguf \
-  markdown_embed.wasm embedding default 768 paris.md --heading_level 1
+  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
+  markdown_embed.wasm embedding default 768 paris.md --heading_level 1 --ctx_size 8192
 ```
 
+## More options
+
+You can pass the following options to the program.
+
+* Use `-c` or `--ctx_size` to specify the context size of the input. This defaults to 512.
+* Use `-l` or `--heading_level` to specify the markdown heading level at which the document is chunked, with one vector per chunk. This defaults to 1.
+* Use `-m` or `--maximum_context_length` to specify a maximum context length. Each text segment that exceeds this length is truncated, with a warning.
+* Use `-s` or `--start_vector_id` to specify the starting vector ID. This lets you run the app multiple times, on multiple documents, against the same vector collection.
+
+Example: repeat the example above, but append the London guide to an existing collection, starting from vector ID 42.
+
+```
+wasmedge --dir .:. \
+  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
+  markdown_embed.wasm embedding default 768 london.md -c 8192 -l 1 -s 42
+```
+
+## Create a vector snapshot
+
 You can create a snapshot of the collection, which can be shared and loaded into a different Qdrant database. You can find the snapshot file in the `qdrant_snapshots` directory.
 
 ```

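To make the new `markdown.md` options concrete, here is a minimal Python sketch of how `--heading_level`, `--maximum_context_length`, and `--start_vector_id` could interact. It is not the `markdown_embed` source (that is Rust and lives in the linked repository). The function name `chunk_markdown`, the `Chunk` type, and the character-based truncation are assumptions made for illustration. The real tool presumably measures length in tokens, and this sketch does not skip `#` lines inside code fences.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    vector_id: int   # ID the chunk's vector would be stored under
    text: str        # chunk text, possibly truncated
    truncated: bool  # True if the text was cut to max_len


def chunk_markdown(doc: str, heading_level: int = 1,
                   start_vector_id: int = 0,
                   max_len: Optional[int] = None) -> List[Chunk]:
    """Split doc at headings of exactly heading_level and number the chunks.

    Hypothetical helper: max_len counts characters here, as a stand-in
    for the token-based limit an embedding model actually enforces.
    """
    # A level-N heading is a line starting with exactly N '#' and a space.
    heading = re.compile(r"^#{%d} " % heading_level)

    sections: List[List[str]] = []
    for line in doc.splitlines():
        # Start a new section at each matching heading (and at the first line).
        if heading.match(line) or not sections:
            sections.append([])
        sections[-1].append(line)

    chunks: List[Chunk] = []
    next_id = start_vector_id
    for lines in sections:
        text = "\n".join(lines).strip()
        if not text:
            continue
        truncated = max_len is not None and len(text) > max_len
        if truncated:
            text = text[:max_len]
        chunks.append(Chunk(next_id, text, truncated))
        next_id += 1
    return chunks
```

Under these assumptions, calling `chunk_markdown(doc, heading_level=1, start_vector_id=42)` on a second document numbers its chunks 42, 43, and so on, which is why a later run can append to a collection without overwriting the vectors from an earlier run.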
diff --git a/docs/creator-guide/knowledge/text.md b/docs/creator-guide/knowledge/text.md
@@ -82,6 +82,35 @@ wasmedge --dir .:. \
   paragraph_embed.wasm embedding default 384 paris_chunks.txt
 ```
 
+## More options
+
+You can also pass the following options to the program.
+
+* Use `-m` or `--maximum_context_length` to specify a maximum context length. Each text segment that exceeds this length is truncated, with a warning.
+* Use `-s` or `--start_vector_id` to specify the starting vector ID. This lets you run the app multiple times, on multiple documents, against the same vector collection.
+* Use `-c` or `--ctx_size` to specify the context size of the input. This defaults to 512.
+
+Example: use the `nomic-embed-text-v1.5.f16` model, which has a context length of 8192 tokens and a vector size of 768, to create embeddings for long paragraphs of text. Note that your `default` vector collection must be created with 768 dimensions.
+
+```
+curl -LO https://huggingface.co/gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/resolve/main/nomic-embed-text-v1.5.f16.gguf
+
+wasmedge --dir .:. \
+  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
+  paragraph_embed.wasm embedding default 768 paris.txt -c 8192
+```
+
+Example: repeat the example above, but append the London guide to an existing collection, starting from vector ID 42.
+
+```
+wasmedge --dir .:. \
+  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
+  paragraph_embed.wasm embedding default 768 london.txt -c 8192 -s 42
+```
+
+
+## Create a vector snapshot
+
 You can create a snapshot of the collection, which can be shared and loaded into a different Qdrant database. You can find the snapshot file in the `qdrant_snapshots` directory.
 
 ```
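
The append examples in this diff hard-code `-s 42`. One way to find that number, sketched below in Python, is to ask Qdrant how many points the collection already holds through its REST count endpoint. Several things here are assumptions rather than something the diff states: that Qdrant listens on its default REST port at `http://localhost:6333`, that earlier runs assigned contiguous IDs starting from 0 (so the next free ID equals the point count), and the helper names `fetch_point_count` and `next_start_id`, which are hypothetical.

```python
import json
import urllib.request


def next_start_id(count_response: dict) -> int:
    """Return the next free vector ID, assuming IDs 0..n-1 are already used."""
    return int(count_response["result"]["count"])


def fetch_point_count(collection: str = "default",
                      base_url: str = "http://localhost:6333") -> dict:
    """Ask Qdrant for an exact count of the points in a collection."""
    req = urllib.request.Request(
        f"{base_url}/collections/{collection}/points/count",
        data=json.dumps({"exact": True}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (requires a running Qdrant instance):
#   start_id = next_start_id(fetch_point_count("default"))
#   then pass the value to the embed program as `-s <start_id>`
```

If the collection was populated with non-contiguous IDs, the count is not a safe starting point, and a value above the highest existing ID should be chosen instead.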