Welcome to llm-d: a Kubernetes-native high-performance distributed LLM inference framework
llm-d provides the fastest time-to-value and competitive performance per dollar for distributed LLM inference. Built on vLLM, Kubernetes, and Inference Gateway, llm-d offers modular solutions for distributed inference with features like KV-cache aware routing and disaggregated serving.
Quick Start Guide
New to llm-d? Here's how to get started:
- Join our Slack - Get your invite and visit llm-d.slack.com
- Explore our code - GitHub Organization
- Join a meeting - Add calendar
- Pick your area - Browse Special Interest Groups.
Key Resources
- Documentation: llm-d.ai
- Architecture: Architecture docs
- Project Details: PROJECT.md
- Releases: GitHub Releases
- Upcoming Events: Meetups, talks, and conferences
Communication Channels
- Slack: llm-d Workspace - Daily conversations and Q&A
- GitHub: llm-d Organization - Code, issues, and discussions
- Google Group: llm-d-contributors - Architecture diagrams and updates
- Google Drive: Public Documentation - Meeting recordings and project docs
Regular Meetings
All meetings are open to the public!
- Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
- SIG Meetings: Various times throughout the week - See SIG details for schedules
Join to participate, ask questions, or just listen and learn!
Special Interest Groups (SIGs)
Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:
- Inference Scheduler - Intelligent request routing and load balancing
- Benchmarking - Performance testing and optimization
- PD-Disaggregation - Prefill/decode separation patterns
- KV-Disaggregation - KV caching and distributed storage
- Installation - Kubernetes integration and deployment
- Autoscaling - Traffic-aware autoscaling and resource management
- Observability - Monitoring, logging, and metrics
How to Contribute
Getting Involved
- Upcoming Events - Meetups, talks, and conferences
- Contributing Guidelines - Complete guide to contributing code, docs, and ideas
- Special Interest Groups (SIGs) - Join focused teams working on specific areas
- Code of Conduct - Our community standards and values
Contributing Code
- Read Guidelines: Review our Code of Conduct and contribution process
- Sign Commits: All commits require DCO sign-off (git commit -s)
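The -s flag appends a "Signed-off-by" trailer to each commit message, which is how DCO sign-off is recorded. A minimal sketch in a throwaway repository (the name and email are illustrative placeholders):

```shell
# Set up a scratch repository so the demo is self-contained
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo "hello" > demo.txt
git add demo.txt

# -s appends "Signed-off-by: Jane Developer <jane@example.com>"
# to the commit message, certifying the DCO
git commit -q -s -m "Add demo file"

# Inspect the full commit message to see the trailer
git log -1 --format=%B
```

If you forget the sign-off on your most recent commit, `git commit --amend -s --no-edit` adds it without changing the message.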
Ways to Contribute
- Bug fixes and small features - Submit PRs directly to component repos
- New features with APIs - Require project proposals
- Documentation - Help improve guides and examples
- Testing & Benchmarking - Contribute to our test coverage
- Experimental features - Start in llm-d-incubation org
Security & Safety
- Security Policy - How to report vulnerabilities and security issues
- Security Announcements - Join the llm-d-security-announce group for emails about security and major API announcements.
Connect With Us
Follow llm-d across social platforms for updates, discussions, and community highlights:
- LinkedIn: @llm-d
- X (Twitter): @_llm_d_
- Bluesky: @llm-d.ai
- Reddit: r/llm_d
- YouTube: @llm-d-project
Need Help?
Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.
- Slack: Join our Slack workspace and mention @community-team for a quick response
- GitHub Issues: Open an issue for bug reports, feature requests, or general questions
- Mailing List: llm-d-contributors for broader community discussions
License: Apache 2.0