llm-d incubation

Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework

Popular repositories

  1. llm-d-infra (Public)

    llm-d Helm charts and deployment examples

    Go Template 50 55

  2. llm-d-modelservice (Public)

    Helm charts for deploying models with llm-d

    Go Template 29 53

  3. llm-d-fast-model-actuation (Public)

    Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping

    Go 9 12

  4. batch-gateway (Public)

    The batch gateway is an llm-d implementation of the OpenAI batch inference API

    Go 7 12

  5. secure-inference (Public)

    Go 3 3

  6. ig-wva (Public)

    Workload Variant Autoscaler is a service that computes the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook 2 2

Repositories

Showing 9 of 9 repositories
