Hello devs, I'm Vishvesh (NerdyVisky)
About Me
I'm a Machine Learning Engineer + Researcher currently pursuing my M.S. in Computer Science at NYU Courant (GPA: 4.0).
My work sits at the intersection of:
- LLM reasoning, retrieval & attention mechanisms
- Healthcare ML & safety for deployed clinical AI
- Document intelligence, VLMs, synthetic data generation
- Production ML systems & model monitoring
Current Roles
Graduate Research Assistant -- NYU CILVR Lab
Working with Prof. Eunsol Choi on multilingual LLM retrieval:
- Improving in-context fact retrieval across 5 languages
- Modifying attention mechanisms in LLaMa-3.2-8B, Qwen-2.5-7B, Phi-3.5
- Achieved 15% retrieval gains with a 30% smaller KV cache
Machine Learning Engineer -- NYU Langone Health (BioDASH)
Building production-grade safety systems for ML models powering clinical workflows across 23 hospitals.
- Designed drift detection pipelines (K-S, PSI, DeLong)
- Real-time monitoring with Prometheus + Grafana
- Extensive work with HIPAA-compliant datasets (EPIC COSMOS, OMOP CDM, Caboodle, Clarity)
- Co-authored NIH & PCORI grant proposals
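For readers unfamiliar with the drift statistics named above, here is a minimal sketch of the Population Stability Index (PSI) for a single numeric feature. It is an illustration only, not the production pipeline: the function name `population_stability_index`, the decile binning, and the `eps` floor are all assumptions made for this example.

```python
import math
from bisect import bisect_right


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample (hypothetical helper)."""
    ref = sorted(expected)
    # Interior cut points taken at quantiles of the reference sample.
    cuts = [ref[min(len(ref) - 1, (len(ref) * i) // bins)] for i in range(1, bins)]

    def fractions(values):
        # Share of values falling in each bin, floored at eps to avoid log(0).
        counts = [0] * bins
        for v in values:
            counts[bisect_right(cuts, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    exp_frac = fractions(expected)
    act_frac = fractions(actual)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected).
    return sum((a - e) * math.log(a / e) for a, e in zip(act_frac, exp_frac))
```

Identical samples give a PSI of 0, and the value grows as the live distribution moves away from the reference. Alert thresholds are a per-deployment choice and are not specified here.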
ML Researcher -- CVIT, IIIT Hyderabad
Published at ICDAR 2025 (Oral, Top 2%).
- Generated 18k synthetic slides using a novel LLM pipeline
- Boosted VLM performance by 13% mAP and 10% Recall@K
- HuggingFace model reached 500+ downloads
Publication
ICDAR 2025 (Oral, Top 2%)
AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
Maniyar, Trivedi et al.
Project: https://synslidegen.github.io
DOI: https://doi.org/10.1007/978-3-032-04614-7_11
Featured Projects
Adaptive & Warp-Cooperative GPU Hash Table
Code: https://github.com/NerdyVisky/adaptive-gpu-hashtable
- Built a high-performance adaptive GPU hash table in C++/CUDA using cooperative groups and elected-lane atomics
- Achieved 21x faster inserts and 20x faster lookups than naive GPU hashing at scale
- Implemented epoch-based dynamic resizing + compaction for non-blocking concurrency
- Sustains stable throughput on 100M+ operations, even at 0.99 load factor
Attention-Aware DPO for Multi-Image VQA
- Designed Attention-Aware DPO improving multi-image VQA accuracy by 8.5%
- Applied AdaptVis at inference for a 10% boost over the base model
- Built LLM-as-a-judge with Gemini-2.5-Pro
Code: https://github.com/harsh-sutariya/AA-DPO
Website: https://nerdyvisky.github.io/projects/AttnDPO/
Open-Source Contribution: Retrieval Heads (ICLR 2025 spotlight)
- Refactored full codebase for faster, leaner execution
- Built vectorized dataloaders, added flash-attention, integrated vLLM
- Reduced inference time from 2 hrs to 30 mins (4x faster)
Code: https://github.com/NerdyVisky/multilingual-retrieval-translation-heads
Tech Stack
Languages
Python * C/C++ * R * SQL (Postgres, MySQL) * JavaScript * TypeScript * Bash/Zsh
Frameworks & Libraries
PyTorch * TensorFlow * HuggingFace * LangChain
NumPy * Pandas * scikit-learn * Matplotlib
DevOps & Tools
AWS * GCP * Azure * Databricks
Docker * Git * Redis * MongoDB
Prometheus * Grafana
Thanks for stopping by! Feel free to reach out if you're working on LLMs, retrieval, ML safety, or healthcare AI.
[Nov 2025] - I am looking for full-time SDE/MLE and Applied Science roles based in the US, starting Summer 2026. I am a US Permanent Resident (Green Card holder) and require no visa sponsorship. If you're hiring and like my work, feel free to reach me by email: vishvesh106@gmail.com