[FEA] Add cuvs-lucene and elastic backends to cuvs_bench #1856

Open

Description

Is your feature request related to a problem? Please describe.

cuvs_bench has a pluggable backend system but currently supports only C++ executables (cpp_gbench). Users need to benchmark (1) cuvs-lucene (the Lucene connector for cuVS CAGRA) and (2) Elasticsearch GPU (cuVS-backed HNSW over HTTP), two different runtimes that both use cuVS under the hood. There is no unified way to benchmark these alongside the existing C++ algorithms.

Describe the solution you'd like

Add two pluggable backends to cuvs_bench:

  1. CuvsLuceneBackend -- Runs the cuvs-lucene JAR as a subprocess (java -jar, followed by the configured JAR path). The JAR performs build and search; cuvs_bench parses its output into BuildResult/SearchResult. Config via jar_path (or CUVS_LUCENE_JAR), dataset, dataset_path. Supports investigating concurrent searches on cuvs-lucene.

  2. ElasticBackend -- HTTP client for an Elasticsearch GPU instance. It creates an index, bulk-indexes vectors, runs kNN search, and computes recall. Config via host, port, and HNSW params (m, ef_construction, num_candidates). Reuses logic from cuvs-bencher.
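
To make the ElasticBackend steps concrete, here is a minimal sketch of the request bodies it would likely send, plus the recall computation. It follows the stock Elasticsearch dense_vector mapping, _bulk NDJSON format, and top-level knn search option; whether the GPU build accepts exactly these index_options is an assumption, as are the field name "vector", the l2_norm similarity, and all helper names. The bodies would be sent with any HTTP client (PUT /<index>, POST /_bulk, POST /<index>/_search).

```python
import json


def index_mapping(dims, m, ef_construction):
    """Body for index creation with an HNSW-indexed dense_vector field."""
    return {"mappings": {"properties": {"vector": {
        "type": "dense_vector",
        "dims": dims,
        "index": True,
        "similarity": "l2_norm",  # assumption: L2 distance
        "index_options": {"type": "hnsw", "m": m,
                          "ef_construction": ef_construction},
    }}}}


def bulk_payload(index, vectors):
    """NDJSON body for _bulk: one action line and one document line per vector."""
    lines = []
    for i, vec in enumerate(vectors):
        lines.append(json.dumps({"index": {"_index": index, "_id": str(i)}}))
        lines.append(json.dumps({"vector": list(vec)}))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline


def knn_query(query_vector, k, num_candidates):
    """Body for an approximate kNN search returning only document ids."""
    return {
        "knn": {"field": "vector", "query_vector": list(query_vector),
                "k": k, "num_candidates": num_candidates},
        "size": k,
        "_source": False,
    }


def recall_at_k(found_ids, truth_ids, k):
    """Mean fraction of the true top-k neighbours found, over all queries."""
    hits = sum(len(set(f[:k]) & set(t[:k]))
               for f, t in zip(found_ids, truth_ids))
    return hits / (k * len(truth_ids))
```

Keeping body construction separate from the HTTP calls lets the request shapes be unit-tested without a running Elasticsearch instance.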

Both implement BenchmarkBackend, have ConfigLoaders, return BuildResult/SearchResult, and plug into the existing orchestrator. Registration keys: cuvs_lucene and elastic.
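
As a rough illustration of how a backend could plug in, the sketch below uses stand-in classes rather than the real cuvs_bench types. The fields on BuildResult/SearchResult, the BACKENDS registry, the register decorator, treating CUVS_LUCENE_JAR as an environment variable, and the "key=value" output format parsed from the JAR are all assumptions made for illustration; the actual base class in backends/base.py and the JAR's real output may differ.

```python
import os
import subprocess
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BuildResult:  # stand-in; real fields may differ
    build_time_s: float


@dataclass
class SearchResult:  # stand-in; real fields may differ
    recall: float
    qps: float


class BenchmarkBackend(ABC):
    """Stand-in for the ABC in backends/base.py (assumed shape)."""
    requires_network = False

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def build(self): ...

    @abstractmethod
    def search(self): ...


BACKENDS = {}  # hypothetical registry: key -> backend class


def register(key):
    def wrap(cls):
        BACKENDS[key] = cls
        return cls
    return wrap


def parse_metrics(stdout):
    """Parse assumed 'name=value' lines; ignore anything else."""
    metrics = {}
    for line in stdout.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            try:
                metrics[key.strip()] = float(value)
            except ValueError:
                pass
    return metrics


@register("cuvs_lucene")
class CuvsLuceneBackend(BenchmarkBackend):
    _metrics = None

    def jar_path(self):
        path = self.config.get("jar_path") or os.environ.get("CUVS_LUCENE_JAR")
        if not path:
            raise ValueError("set jar_path or CUVS_LUCENE_JAR")
        return path

    def command(self):
        return ["java", "-jar", self.jar_path()]

    def _run_once(self):
        # The JAR performs build and search, so run it once and cache.
        if self._metrics is None:
            proc = subprocess.run(self.command(), capture_output=True,
                                  text=True, check=True)
            self._metrics = parse_metrics(proc.stdout)
        return self._metrics

    def build(self):
        return BuildResult(build_time_s=self._run_once()["build_time_s"])

    def search(self):
        m = self._run_once()
        return SearchResult(recall=m["recall"], qps=m["qps"])
```

An ElasticBackend would register the same way under the key elastic, with requires_network set to True.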

Describe alternatives you've considered

  • HTTP for cuvs-lucene: Rejected--cuvs-lucene is a JAR, not a server; subprocess matches the CppGoogleBenchmarkBackend pattern.
  • JAR for Elasticsearch: Rejected--Elasticsearch runs as a service; benchmarking requires an HTTP client, not a JAR.
  • Separate tools (e.g. cuvs-bencher): Rejected--guidance is to contribute backends to cuvs_bench instead of maintaining a separate SDK.

Additional context

  • PRD: CuVs-smoke/docs/benchmarking/CUVS-LUCENE-BACKEND-PRD.md
  • cuvs_bench structure: backends/base.py (BenchmarkBackend ABC), backends/cpp_gbench.py (subprocess pattern), orchestrator/config_loaders.py (ConfigLoader ABC). Base class already supports network backends (requires_network, initialize(), cleanup(), _check_network_available()).
  • cuvs-bencher: ElasticsearchBackend.run_benchmark() in packages/cuvs_bencher_elastic/ provides reusable ES HTTP benchmarking logic.
  • CAGRA params for cuvs-lucene: graph_degree, intermediate_graph_degree, itopk, search_width (see config/algos/cuvs_cagra.yaml, bug_issue_93_reproducer.cu).
  • Elasticsearch HNSW params: m, ef_construction, num_candidates (see ES-GPU-API-REFERENCE.md).
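
The network hooks named above (requires_network, initialize(), cleanup(), _check_network_available()) suggest a lifecycle along these lines. This is only a sketch: the class name, constructor arguments, and the TCP-connect reachability check are assumptions, and the real base class may implement the check differently. Port 9200 is the Elasticsearch default.

```python
import socket


class NetworkBackendSketch:
    """Hypothetical illustration of a network-backed backend lifecycle."""
    requires_network = True

    def __init__(self, host="localhost", port=9200, timeout_s=2.0):
        self.host = host
        self.port = port
        self.timeout_s = timeout_s
        self.ready = False

    def _check_network_available(self):
        # Assumption: "available" means a TCP connection can be opened.
        try:
            with socket.create_connection((self.host, self.port),
                                          timeout=self.timeout_s):
                return True
        except OSError:
            return False

    def initialize(self):
        # Fail fast before any build or search work is attempted.
        if self.requires_network and not self._check_network_available():
            raise ConnectionError(
                f"cannot reach {self.host}:{self.port}")
        self.ready = True

    def cleanup(self):
        self.ready = False
```

Checking reachability in initialize() means an unreachable Elasticsearch instance is reported once, up front, instead of surfacing as a failure in every benchmark run.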

Status

In Progress
