
[BUG] composite_terms-keyword has increased latency with Lucene 10 index #17388

expani opened this issue Feb 18, 2025 · 6 comments
Labels: bug, Search:Performance
expani commented Feb 18, 2025

Describe the bug

The latency of the composite_terms-keyword operation in Big5 has increased by 10-15% with the upgrade to Lucene 10. When an index created on OS 2.19 is searched by an OS 3.0 server, the latency is better. So we have narrowed the cause down to either something wrong with the index format, or something we are misconfiguring (or failing to configure) with the Lucene 10 upgrade.

In the table below, OS 3.0 * OS 2.19 indicates that the OpenSearch server was running 3.0 whereas the index it searched was created on OS 2.19. This was done to eliminate any suspicion of the bug arising from an indexing change in Lucene.

| Metric Name | Operation/Query name | OS 2.19 | OS 3.0 | OS 3.0 * OS 2.19 | Unit |
|---|---|---|---|---|---|
| 50th percentile latency | composite_terms-keyword | 385.519 | 437.748 | 363.538 | ms |
| 90th percentile latency | composite_terms-keyword | 393.628 | 441.061 | 372.952 | ms |
| 99th percentile latency | composite_terms-keyword | 425.658 | 451.375 | 382.752 | ms |
| 100th percentile latency | composite_terms-keyword | 447.978 | 457.927 | 383.355 | ms |
| 50th percentile service time | composite_terms-keyword | 384.585 | 436.741 | 362.469 | ms |
| 90th percentile service time | composite_terms-keyword | 392.626 | 440.274 | 372.171 | ms |
| 99th percentile service time | composite_terms-keyword | 424.494 | 450.533 | 381.558 | ms |
| 100th percentile service time | composite_terms-keyword | 447.026 | 456.523 | 382.67 | ms |

Related component

Search:Performance

To Reproduce

Run the composite_terms-keyword operation of the Big5 workload against OS 3.0 and compare with OS 2.19; an example invocation is shown below.
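
A minimal OpenSearch Benchmark invocation for just this task (the target host and client options below are placeholders from our setup; adapt as needed):

```sh
# Run only the composite_terms-keyword task from the Big5 workload
# against a local OpenSearch node.
opensearch-benchmark execute-test \
  --target-hosts http://127.0.0.1:9200 \
  --workload big5 \
  --client-options timeout:120 \
  --kill-running-processes \
  --include-tasks composite_terms-keyword
```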

Expected behavior

composite_terms-keyword should perform the same with an OS 3.0 index as it does with an OS 2.19 index.

Additional Details

Meta

mgodwan commented Feb 27, 2025

This may have the same cause as #17404, where we suspect the merge policy/segment topology to be a probable cause of the regression.

dhwanilpatel commented
I can see a different number of segments on 2.19 vs 3.0, which might be the suspect causing the regression. In a Big5 run, the 2.19 version had 18 segments while 3.0 had 22 segments (a quick way to check segment counts is sketched after the table below).

| Metric Name | Operation | OS 2.19 | OS 3.0 |
|---|---|---|---|
| 50th percentile latency | composite_terms-keyword | 367.625 | 407.282 |
| 90th percentile latency | composite_terms-keyword | 373.459 | 411.84 |
| 99th percentile latency | composite_terms-keyword | 376.171 | 417.157 |
| 100th percentile latency | composite_terms-keyword | 381.592 | 423.62 |
| Number of Segments | | 18 | 22 |
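
For reference, a quick way to count segments with the _cat API (the big5 index name is an assumption based on the workload defaults):

```sh
# Each line of _cat/segments describes one segment of one shard,
# so the line count is the total segment count for the index.
curl -s "http://127.0.0.1:9200/_cat/segments/big5" | wc -l
```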

After performing a force merge down to 10 segments, I see equal (p50) or improved (p90/p99) latencies; a sketch of the force-merge call follows the table below.

| Metric Name | Operation | OS 2.19 | OS 3.0 |
|---|---|---|---|
| 50th percentile latency | composite_terms-keyword | 354.385 | 351.592 |
| 90th percentile latency | composite_terms-keyword | 400.532 | 356.63 |
| 99th percentile latency | composite_terms-keyword | 424.343 | 367.392 |
| 100th percentile latency | composite_terms-keyword | 435.697 | 404.564 |
| Number of Segments | | 10 | 10 |
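
A sketch of the force merge used here, via the standard _forcemerge API (index name and host are assumptions):

```sh
# Force merge the big5 index down to at most 10 segments per shard.
# This call can take a long time on an index of ~24 GB.
curl -X POST "http://127.0.0.1:9200/big5/_forcemerge?max_num_segments=10"
```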

Will trigger multiple runs and update the results.

dhwanilpatel commented Mar 10, 2025

I have triggered multiple runs of the Big5 workload on both OpenSearch 3.0 and OpenSearch 2.19. Below are the results for the composite_terms-keyword operation on both versions.

Setup

Instance Type: c5.2xlarge
Allocated Heap: 1GB
Number of Nodes: 1
Shard Count: 1
Replica Count: 1

OpenSearch 3.0

| Metric Name | Run1 | Run2 | Run3 | Run4 |
|---|---|---|---|---|
| 50th percentile latency | 366.051 | 379.044 | 370.327 | 322.434 |
| 90th percentile latency | 368.973 | 383.112 | 373.968 | 325.9 |
| 99th percentile latency | 371.303 | 387.362 | 379.496 | 328.61 |
| 100th percentile latency | 372.353 | 387.975 | 389.81 | 332.489 |
| 50th percentile service time | 364.688 | 377.597 | 369.078 | 320.752 |
| 90th percentile service time | 367.698 | 381.847 | 372.858 | 324.44 |
| 99th percentile service time | 369.605 | 386.058 | 378.149 | 327.236 |
| 100th percentile service time | 371.574 | 386.517 | 388.734 | 331.249 |
| Number of Segments | 16 | 18 | 15 | 12 |

OpenSearch 2.19

| Metric Name | Run1 | Run2 | Run3 | Run4 |
|---|---|---|---|---|
| 50th percentile latency | 353.721 | 323.328 | 336.394 | 357.056 |
| 90th percentile latency | 358.518 | 326.532 | 340.155 | 388.582 |
| 99th percentile latency | 362.467 | 329.777 | 342.727 | 415.51 |
| 100th percentile latency | 364.568 | 330.178 | 342.888 | 424.402 |
| 50th percentile service time | 352.316 | 321.542 | 335.087 | 355.832 |
| 90th percentile service time | 357.086 | 324.94 | 339.169 | 386.827 |
| 99th percentile service time | 360.725 | 328.378 | 341.498 | 414.117 |
| 100th percentile service time | 363.381 | 328.608 | 341.628 | 423.306 |
| Number of Segments | 16 | 13 | 15 | 17 |

We can see that the query latency of composite_terms-keyword depends on the segment count. In the workload, the composite_terms-keyword query targets 10 hours of data (https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/a593f0ce7099550c2ccaa65ef8d45447877e36e5/big5/operations/default.json#L767-L779).

With the log_byte_size merge policy, the number of segments covering this date range can impact the results, and the segment count can differ across runs.
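
For context, the merge policy is an index-level setting; a quick way to confirm what an index is actually using (index name is an assumption, and the value falls back to the engine default when not set explicitly):

```sh
# Show the index.merge.policy setting, including the default value
# if it was never set explicitly on the index.
curl "http://127.0.0.1:9200/big5/_settings/index.merge.policy?include_defaults=true&pretty"
```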

We can see that results for the same version vary with segment count. Based on these results, I don't think there is a regression in 3.0; rather, the random nature of the workload's segment topology is causing this.

@mgodwan / @backslasht Please provide your thoughts on this, in case I am missing something.

mgodwan commented Mar 10, 2025

Thanks @dhwanilpatel. These are helpful insights.

@expani Could you confirm whether the test results you reported in the issue followed a force merge before running searches, to ensure a consistent benchmark setup for search?

expani commented Mar 10, 2025

Thanks for the thorough analysis @dhwanilpatel

I had a similar observation that OS 3.0 created more segments than OS 2.19 in the initial run.

> Could you confirm whether the test results you reported in the issue followed a force merge before running searches, to ensure a consistent benchmark setup for search?

I had force merged the OS 3.0 index to 19 segments (the same count OS 2.19 had without a force merge) and posted the initial numbers. Also, we had set bulk_indexing_clients to 1 to ensure there is no variance from concurrent indexing (see the sketch below).
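
For reference, a minimal sketch of how that knob is passed to OpenSearch Benchmark (the host is a placeholder; bulk_indexing_clients is a Big5 workload parameter):

```sh
# Ingest with a single bulk client so segment topology is not
# affected by indexing concurrency.
opensearch-benchmark execute-test \
  --target-hosts http://127.0.0.1:9200 \
  --workload big5 \
  --workload-params bulk_indexing_clients:1
```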

OS 3.0 and OS 2.19.0 both had 19 segments

Luckily, I still had those indices present on my r5.xlarge instance.
So, I ran the benchmark 3 more times for OS 3.0 with the OS 3.0 index and for OS 2.19.0 with the OS 2.19.0 index:

```sh
for i in `seq 1 3`; do opensearch-benchmark execute-test --target-hosts http://127.0.0.1:9200 --workload big5 --client-options timeout:120 --kill-running-processes --include-tasks composite_terms-keyword; done
```

The numbers below are from the 3rd run (OSB's warmup takes care of JIT C2 compiler optimisations, but the numbers sometimes vary, so I am recording the 3rd run, by which point all optimisations should have completed).

OS 3.0

|                                                     Store size |                         |     24.0489 |     GB |
|                                                  Translog size |                         | 5.12227e-08 |     GB |
|                                         Heap used for segments |                         |           0 |     MB |
|                                       Heap used for doc values |                         |           0 |     MB |
|                                            Heap used for terms |                         |           0 |     MB |
|                                            Heap used for norms |                         |           0 |     MB |
|                                           Heap used for points |                         |           0 |     MB |
|                                    Heap used for stored fields |                         |           0 |     MB |
|                                                  Segment count |                         |          19 |        |
|                                                 Min Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                Mean Throughput | composite_terms-keyword |           2 |  ops/s |
|                                              Median Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                 Max Throughput | composite_terms-keyword |           2 |  ops/s |
|                                        50th percentile latency | composite_terms-keyword |     334.184 |     ms |
|                                        90th percentile latency | composite_terms-keyword |      341.49 |     ms |
|                                        99th percentile latency | composite_terms-keyword |     364.008 |     ms |
|                                       100th percentile latency | composite_terms-keyword |     364.176 |     ms |
|                                   50th percentile service time | composite_terms-keyword |     333.226 |     ms |
|                                   90th percentile service time | composite_terms-keyword |     340.509 |     ms |
|                                   99th percentile service time | composite_terms-keyword |     362.583 |     ms |
|                                  100th percentile service time | composite_terms-keyword |     363.015 |     ms |
|                                                     error rate | composite_terms-keyword |           0 |      % |

OS 2.19.0

|                                                     Store size |                         |     24.0039 |     GB |
|                                                  Translog size |                         | 5.12227e-08 |     GB |
|                                         Heap used for segments |                         |           0 |     MB |
|                                       Heap used for doc values |                         |           0 |     MB |
|                                            Heap used for terms |                         |           0 |     MB |
|                                            Heap used for norms |                         |           0 |     MB |
|                                           Heap used for points |                         |           0 |     MB |
|                                    Heap used for stored fields |                         |           0 |     MB |
|                                                  Segment count |                         |          19 |        |
|                                                 Min Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                Mean Throughput | composite_terms-keyword |           2 |  ops/s |
|                                              Median Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                 Max Throughput | composite_terms-keyword |           2 |  ops/s |
|                                        50th percentile latency | composite_terms-keyword |     327.405 |     ms |
|                                        90th percentile latency | composite_terms-keyword |     336.342 |     ms |
|                                        99th percentile latency | composite_terms-keyword |     342.603 |     ms |
|                                       100th percentile latency | composite_terms-keyword |     343.608 |     ms |
|                                   50th percentile service time | composite_terms-keyword |     325.864 |     ms |
|                                   90th percentile service time | composite_terms-keyword |     334.355 |     ms |
|                                   99th percentile service time | composite_terms-keyword |     341.713 |     ms |
|                                  100th percentile service time | composite_terms-keyword |     342.731 |     ms |
|                                                     error rate | composite_terms-keyword |           0 |      % |

OS 2.19 seems to perform better than OS 3.0

OS 3.0 with OS 2.19.0 index is better

|                                                     Store size |                         |     24.0039 |     GB |
|                                                  Translog size |                         | 5.12227e-08 |     GB |
|                                         Heap used for segments |                         |           0 |     MB |
|                                       Heap used for doc values |                         |           0 |     MB |
|                                            Heap used for terms |                         |           0 |     MB |
|                                            Heap used for norms |                         |           0 |     MB |
|                                           Heap used for points |                         |           0 |     MB |
|                                    Heap used for stored fields |                         |           0 |     MB |
|                                                  Segment count |                         |          19 |        |
|                                                 Min Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                Mean Throughput | composite_terms-keyword |           2 |  ops/s |
|                                              Median Throughput | composite_terms-keyword |           2 |  ops/s |
|                                                 Max Throughput | composite_terms-keyword |           2 |  ops/s |
|                                        50th percentile latency | composite_terms-keyword |     325.905 |     ms |
|                                        90th percentile latency | composite_terms-keyword |     333.594 |     ms |
|                                        99th percentile latency | composite_terms-keyword |      342.35 |     ms |
|                                       100th percentile latency | composite_terms-keyword |     347.512 |     ms |
|                                   50th percentile service time | composite_terms-keyword |      324.58 |     ms |
|                                   90th percentile service time | composite_terms-keyword |     332.423 |     ms |
|                                   99th percentile service time | composite_terms-keyword |     341.288 |     ms |
|                                  100th percentile service time | composite_terms-keyword |     345.907 |     ms |
|                                                     error rate | composite_terms-keyword |           0 |      % |

When we use the OS 2.19.0 index with OS 3.0, it performs better, which is the issue I had seen earlier.

The setup details are captured in the meta issue; I have also updated the segment count there. Thanks for bringing it up.

dhwanilpatel commented
We have observed some skewness in segment sizes when force merging to 5/10 segments on both versions; some segments become as huge as 23 GB, which might skew the results of the perf runs (per-segment sizes can be inspected as shown below).
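
A sketch of inspecting per-segment sizes with the _cat API to spot this skew (index name is an assumption):

```sh
# List segments of the big5 index, largest first, to spot size skew.
curl "http://127.0.0.1:9200/_cat/segments/big5?v&h=segment,docs.count,size&s=size:desc"
```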

We have triggered a force merge to 1 segment and compared the results of 2.19/3.0. 3.0 seems to perform better with a single segment compared to 2.19: 3.0 has latency around 240 ms while 2.19 has latency around 255 ms.

Below are the results of the runs with one segment for composite_terms-keyword.

OpenSearch 3.0

| Metric Name | Run1 | Run2 | Run3 | Run4 | Run5 |
|---|---|---|---|---|---|
| 50th percentile latency | 242.761 | 241.351 | 240.904 | 239.761 | 239.28 |
| 90th percentile latency | 260.406 | 257.365 | 260.409 | 253.7 | 252.469 |
| 99th percentile latency | 272.212 | 272.393 | 263.501 | 270.534 | 270.86 |
| 100th percentile latency | 277.806 | 273.662 | 273.142 | 271.934 | 270.962 |
| 50th percentile service time | 241.76 | 240.005 | 239.269 | 238.51 | 237.977 |
| 90th percentile service time | 259.013 | 255.846 | 258.993 | 252.852 | 251.001 |
| 99th percentile service time | 270.599 | 270.486 | 262.097 | 269.388 | 269.682 |
| 100th percentile service time | 276.174 | 272.351 | 271.382 | 270.676 | 269.775 |

OpenSearch 2.19

| Metric Name | Run1 | Run2 | Run3 | Run4 | Run5 |
|---|---|---|---|---|---|
| 50th percentile latency | 260.963 | 260.199 | 254.521 | 254.469 | 254.111 |
| 90th percentile latency | 276.657 | 278.762 | 270.078 | 268.891 | 269.231 |
| 99th percentile latency | 283.79 | 291.543 | 285.353 | 285.605 | 274.279 |
| 100th percentile latency | 288.058 | 292.433 | 286.576 | 286.533 | 285.54 |
| 50th percentile service time | 259.512 | 259.018 | 253.24 | 253.107 | 253.009 |
| 90th percentile service time | 275.836 | 277.27 | 268.195 | 267.496 | 268.0 |
| 99th percentile service time | 282.864 | 290.598 | 284.072 | 284.209 | 272.754 |
| 100th percentile service time | 286.442 | 290.666 | 285.099 | 285.475 | 283.859 |

cc: @backslasht / @mgodwan / @expani
