Commit e4daf84

sgup432, kolchfa-aws, and natebower authored
Add cache plugin and tiered cache documentation (#6708)
* Adding documentation for cache plugin and tiered cache
  Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>
* Adding tiered cache github link and fixing typos
  Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>
* Refactor tiered cache doc
  Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>
* Doc review
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Review comments
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add plugin instructions
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Renamed topic to caching
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <nbower@amazon.com>
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _search-plugins/caching/index.md
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _search-plugins/caching/index.md
  Co-authored-by: Nathan Bower <nbower@amazon.com>
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
1 parent 9ef3118 commit e4daf84

2 files changed

+114
-0
lines changed

_search-plugins/caching/index.md

+32
@@ -0,0 +1,32 @@
---
layout: default
title: Caching
parent: Improving search performance
has_children: true
nav_order: 100
---

# Caching

OpenSearch relies heavily on several on-heap cache types to accelerate data retrieval, which significantly improves search latency. However, cache size is limited by the amount of memory available on a node. If the data that could be cached is larger than the cache, the size limit causes frequent cache evictions and misses. Each eviction or miss forces OpenSearch to process the query again, which degrades performance and increases resource consumption.

Prior to version 2.13, OpenSearch supported the following on-heap cache types:

- **Request cache**: Caches the local results on each shard. This allows frequently used (and potentially resource-heavy) search requests to return results almost instantly.
- **Query cache**: The shard-level query cache caches common data from similar queries. The query cache is more granular than the request cache and can cache data that is reused in different queries.
- **Field data cache**: The field data cache contains field data and global ordinals, which are both used to support aggregations on certain field types.

## Additional cache stores
**Introduced 2.13**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/10024).
{: .warning}

In addition to the existing custom OpenSearch on-heap cache stores, cache plugins provide the following cache stores:

- **Disk cache**: This cache stores the precomputed result of a query on disk. You can use a disk cache to cache much larger datasets, provided that the disk latencies are acceptable.
- **Tiered cache**: This is a multi-level cache in which each tier has its own characteristics and performance levels. For example, a tiered cache can contain on-heap and disk tiers. By combining different tiers, you can achieve a balance between cache performance and size. To learn more, see [Tiered cache]({{site.url}}{{site.baseurl}}/search-plugins/caching/tiered-cache/).

In OpenSearch 2.13, the request cache is integrated with cache plugins. You can use a tiered or disk cache as a request-level cache.
{: .note}
+82
@@ -0,0 +1,82 @@
---
layout: default
title: Tiered cache
parent: Caching
grand_parent: Improving search performance
nav_order: 10
---

# Tiered cache

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/10024).
{: .warning}

A tiered cache is a multi-level cache in which each tier has its own characteristics and performance levels. By combining different tiers, you can achieve a balance between cache performance and size.

## Types of tiered caches

OpenSearch 2.13 provides an implementation of a _tiered spillover cache_. This implementation spills items evicted from the upper tier to the lower tier. The upper tier, such as the on-heap tier, is smaller but offers lower latency. The lower tier, such as a disk cache, is larger but slower than the upper tier. OpenSearch 2.13 offers on-heap and disk tiers.

## Enabling a tiered cache

To enable a tiered cache, configure the following setting:

```yaml
opensearch.experimental.feature.pluggable.caching.enabled: true
```
{% include copy.html %}

For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).

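If the node runs in Docker, the same flag can usually be passed to the JVM as a system property instead of being set in `opensearch.yml`. The following Docker Compose fragment is a minimal sketch under that assumption. The service name `opensearch-node1` and the image tag are illustrative and do not come from this page:

```yaml
# Hypothetical docker-compose.yml fragment (service name and image tag are assumptions).
services:
  opensearch-node1:
    image: opensearchproject/opensearch:2.13.0
    environment:
      # Passes the experimental feature flag to the JVM as a system property.
      - "OPENSEARCH_JAVA_OPTS=-Dopensearch.experimental.feature.pluggable.caching.enabled=true"
```

Whichever way the flag is set, it only enables the feature. The cache store and tier settings must still be configured separately.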
## Installing required plugins

A tiered cache provides a way to plug in any disk or on-heap tier implementation. Install the plugins that you intend to use in the tiered cache. As of OpenSearch 2.13, the available cache plugin is the `cache-ehcache` plugin. This plugin provides a disk cache implementation to use as the disk tier within a tiered cache.

A tiered cache will fail to initialize if the `cache-ehcache` plugin is not installed or if the disk cache properties are not set.
{: .warning}

## Tiered cache settings

In OpenSearch 2.13, a request cache can use a tiered cache. To begin, configure the following settings in the `opensearch.yml` file.

### Cache store name

Set the cache store name to `tiered_spillover` to use the OpenSearch-provided tiered spillover cache implementation:

```yaml
indices.request.cache.store.name: tiered_spillover
```
{% include copy.html %}

### Setting on-heap and disk store tiers

The `opensearch_onheap` store is the built-in on-heap cache available in OpenSearch. The `ehcache_disk` store is the disk cache implementation from [Ehcache](https://www.ehcache.org/) and requires the `cache-ehcache` plugin to be installed:

```yaml
indices.request.cache.tiered_spillover.onheap.store.name: opensearch_onheap
indices.request.cache.tiered_spillover.disk.store.name: ehcache_disk
```
{% include copy.html %}

For more information about installing non-bundled plugins, see [Additional plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/#additional-plugins).

### Configuring on-heap and disk stores

The following table lists the cache store settings for the `opensearch_onheap` store.

Setting | Default | Description
:--- | :--- | :---
`indices.request.cache.opensearch_onheap.size` | 1% of the heap | Defines the size of the on-heap cache. Optional.
`indices.request.cache.opensearch_onheap.expire` | `MAX_VALUE` (disabled) | Specifies a time-to-live (TTL) for the cached results. Optional.

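As an illustration of the table above, the following `opensearch.yml` fragment overrides both optional on-heap settings. The values `2%` and `1h` are assumptions chosen for the example, not recommendations, so confirm the accepted value formats for your version:

```yaml
# Illustrative values only (assumptions, not recommended defaults).
# On-heap tier size, expressed as a share of the heap.
indices.request.cache.opensearch_onheap.size: 2%
# Time-to-live (TTL) for cached results in the on-heap tier.
indices.request.cache.opensearch_onheap.expire: 1h
```

If both settings are omitted, the defaults from the table apply.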
The following table lists the disk cache store settings for the `ehcache_disk` store.

Setting | Default | Description
:--- | :--- | :---
`indices.request.cache.ehcache_disk.max_size_in_bytes` | `1073741824` (1 GB) | Defines the size of the disk cache. Optional.
`indices.request.cache.ehcache_disk.storage.path` | `""` | Defines the storage path for the disk cache. Required.
`indices.request.cache.ehcache_disk.expire_after_access` | `MAX_VALUE` (disabled) | Specifies a time-to-live (TTL) for the cached results. Optional.
`indices.request.cache.ehcache_disk.alias` | `ehcacheDiskCache#INDICES_REQUEST_CACHE` (the value shown is for the request cache) | Specifies an alias for the disk cache. Optional.
`indices.request.cache.ehcache_disk.segments` | `16` | Defines the number of segments into which the disk cache is separated. Used for concurrency. Optional.
`indices.request.cache.ehcache_disk.concurrency` | `1` | Defines the number of distinct write queues created for the disk store, where a group of segments shares a write queue. Optional.

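Putting the settings on this page together, a minimal `opensearch.yml` for a request cache backed by the tiered spillover cache might look like the following sketch. The storage path is a hypothetical placeholder, the commented-out overrides use illustrative values, and the `cache-ehcache` plugin is assumed to be installed already:

```yaml
# Enable the experimental pluggable caching feature.
opensearch.experimental.feature.pluggable.caching.enabled: true

# Use the tiered spillover cache as the request cache store.
indices.request.cache.store.name: tiered_spillover

# Upper (on-heap) and lower (disk) tiers.
indices.request.cache.tiered_spillover.onheap.store.name: opensearch_onheap
indices.request.cache.tiered_spillover.disk.store.name: ehcache_disk

# Required: where the disk tier stores its data (hypothetical path).
indices.request.cache.ehcache_disk.storage.path: /var/lib/opensearch/request-cache

# Optional overrides (illustrative values, not recommendations).
# indices.request.cache.ehcache_disk.max_size_in_bytes: 2147483648   # 2 x the 1073741824 default
# indices.request.cache.ehcache_disk.expire_after_access: 1h
```

Only the storage path is marked as required in the table above. The remaining disk settings fall back to their defaults when omitted.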
