[FEA] Add cuvs-lucene and elastic backends to cuvs_bench #1856

Open

Labels

feature request

Description

afourniernv opened on Feb 26, 2026

Is your feature request related to a problem? Please describe.

cuvs_bench has a pluggable backend system, but it currently supports only C++ executables (cpp_gbench). Users need to benchmark two other runtimes that both use cuVS under the hood: (1) cuvs-lucene (the Lucene connector for cuVS CAGRA) and (2) Elasticsearch GPU (cuVS-backed HNSW over HTTP). There is no unified way to benchmark these alongside the existing C++ algorithms.

Describe the solution you'd like

Add two pluggable backends to cuvs_bench:

CuvsLuceneBackend: Runs the cuvs-lucene JAR in a subprocess (java -jar). The JAR performs build and search; cuvs_bench parses its output into BuildResult/SearchResult. Configured via jar_path (or CUVS_LUCENE_JAR), dataset, and dataset_path. Supports investigating concurrent searches on cuvs-lucene.
ElasticBackend: An HTTP client that talks to an Elasticsearch GPU instance. It creates the index, bulk-indexes vectors, runs kNN search, and computes recall. Configured via host, port, and HNSW params (m, ef_construction, num_candidates). Reuses logic from cuvs-bencher.

Both implement BenchmarkBackend, have ConfigLoaders, return BuildResult/SearchResult, and plug into the existing orchestrator. Registration keys: cuvs_lucene and elastic.
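
To make the subprocess pattern concrete, here is a minimal, self-contained sketch of what the cuvs_lucene backend could look like. It is not the cuvs_bench API: the BuildResult fields, the BACKENDS registry, the register_backend decorator, the command_prefix argument, the CLI flags (--mode, --dataset, --dataset-path), and the assumption that the JAR prints one JSON report as its last stdout line are all hypothetical stand-ins chosen for illustration. Treating CUVS_LUCENE_JAR as an environment variable is also an assumption.

```python
import json
import os
import subprocess
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BuildResult:
    """Hypothetical stand-in; the real cuvs_bench BuildResult may differ."""
    index_name: str
    build_time_s: float
    success: bool


# Hypothetical registry keyed by backend name (e.g. "cuvs_lucene", "elastic").
BACKENDS: Dict[str, type] = {}


def register_backend(key: str):
    """Class decorator that records a backend class under a registration key."""
    def decorator(cls):
        BACKENDS[key] = cls
        return cls
    return decorator


@register_backend("cuvs_lucene")
class CuvsLuceneBackend:
    def __init__(self, config: dict, command_prefix: Optional[List[str]] = None):
        # Assumption: CUVS_LUCENE_JAR is an environment variable fallback.
        self.jar_path = config.get("jar_path") or os.environ.get("CUVS_LUCENE_JAR")
        if not self.jar_path:
            raise ValueError("set jar_path in the config or CUVS_LUCENE_JAR")
        self.dataset = config["dataset"]
        self.dataset_path = config["dataset_path"]
        # command_prefix lets tests substitute a fake executable for java.
        self._prefix = command_prefix or ["java", "-jar", self.jar_path]

    def _command(self, mode: str) -> List[str]:
        # Assumption: these flags are illustrative, not the JAR's real CLI.
        return [*self._prefix, "--mode", mode,
                "--dataset", self.dataset,
                "--dataset-path", self.dataset_path]

    def build(self) -> BuildResult:
        proc = subprocess.run(self._command("build"),
                              capture_output=True, text=True, check=False)
        if proc.returncode != 0:
            return BuildResult(self.dataset, 0.0, False)
        # Assumption: the last stdout line is a JSON report.
        report = json.loads(proc.stdout.strip().splitlines()[-1])
        return BuildResult(report["index_name"],
                           float(report["build_time_s"]), True)
```

A search method would follow the same shape: run the JAR in search mode, parse its report, and return a SearchResult. Injecting the command prefix keeps the parsing logic testable on machines without Java or a GPU.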

Describe alternatives you've considered

HTTP for cuvs-lucene: Rejected. cuvs-lucene is a JAR, not a server; a subprocess matches the CppGoogleBenchmarkBackend pattern.
JAR for Elasticsearch: Rejected. Elasticsearch runs as a service; benchmarking requires an HTTP client, not a JAR.
Separate tools (e.g. cuvs-bencher): Guidance is to contribute backends to cuvs_bench instead of maintaining a separate SDK.

Additional context

PRD: CuVs-smoke/docs/benchmarking/CUVS-LUCENE-BACKEND-PRD.md
cuvs_bench structure: backends/base.py (BenchmarkBackend ABC), backends/cpp_gbench.py (subprocess pattern), orchestrator/config_loaders.py (ConfigLoader ABC). Base class already supports network backends (requires_network, initialize(), cleanup(), _check_network_available()).
cuvs-bencher: ElasticsearchBackend.run_benchmark() in packages/cuvs_bencher_elastic/ provides reusable ES HTTP benchmarking logic.
CAGRA params for cuvs-lucene: graph_degree, intermediate_graph_degree, itopk, search_width (see config/algos/cuvs_cagra.yaml, bug_issue_93_reproducer.cu).
Elasticsearch HNSW params: m, ef_construction, num_candidates (see ES-GPU-API-REFERENCE.md).
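
As a rough illustration of where the Elasticsearch HNSW params land, the sketch below builds the JSON bodies for index creation and kNN search using the standard Elasticsearch dense_vector mapping and top-level knn search option, and computes recall against ground truth. The function names, the field name "vector", and the default similarity are hypothetical; the GPU build may need different index settings, in which case ES-GPU-API-REFERENCE.md is authoritative. No HTTP calls are made here, so the payloads can be checked without a running cluster.

```python
from typing import Dict, List, Sequence


def index_mapping(dims: int, m: int, ef_construction: int,
                  field: str = "vector",
                  similarity: str = "l2_norm") -> Dict:
    """Body for index creation: m and ef_construction are build-time params."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                    "index_options": {
                        "type": "hnsw",
                        "m": m,
                        "ef_construction": ef_construction,
                    },
                }
            }
        }
    }


def knn_query(query_vector: Sequence[float], k: int, num_candidates: int,
              field: str = "vector") -> Dict:
    """Body for a kNN search: num_candidates is a search-time param."""
    if num_candidates < k:
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "size": k,
        "_source": False,  # only document IDs are needed for recall
    }


def recall_at_k(retrieved: List[List[int]],
                ground_truth: List[List[int]], k: int) -> float:
    """Fraction of true top-k neighbors found, averaged over all queries."""
    if not ground_truth or k <= 0:
        raise ValueError("need at least one query and k > 0")
    hits = sum(len(set(r[:k]) & set(g[:k]))
               for r, g in zip(retrieved, ground_truth))
    return hits / (k * len(ground_truth))
```

The split mirrors the parameter roles: m and ef_construction are fixed when the index is created, while num_candidates can be swept per search run to trade recall for throughput.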

Metadata

Assignees: afourniernv
Labels: feature request
Type: No type
Projects: Vector Search, ML, & Data Mining Release Board (Status: In Progress)
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
