LLM infrastructure, generative AI, production ML systems, and 176+ open-source repositories on GitHub.
LLMs, diffusion models, efficient inference, and production ML systems
A comprehensive, chapter-by-chapter guide to LLMs — from probability basics to scaling laws, with hands-on fine-tuning code.
PyQt5 desktop GUI for fine-tuning, evaluating, and deploying LLMs using torchtune — no command-line required.
Comprehensive guides for working with Large Language Models — architectures, training, and deployment strategies.
Curated collection of 150+ research papers on KV Cache Management, KV Cache Compression, and LLM Compression for efficient inference.
Implementation of LoRA: Low-Rank Adaptation of Large Language Models — parameter-efficient fine-tuning from scratch.
Diffusion-style denoising for text generation — iteratively refining noisy sequences into coherent text, an alternative to autoregressive LLMs.
Refining Gated Linear Attention: an efficient alternative to softmax attention for scalable sequence modeling.
Three in-depth surveys covering efficient transformer architectures, attention variants, and optimization techniques.
End-to-end object detection system with a PyQt5 desktop app: LoRA/QLoRA fine-tuning, knowledge distillation, ONNX export, and INT8 quantization. Models range from 0.49M to 2.44M parameters and run at 100+ FPS.
Research paper on data drift in production ML — taxonomy (covariate/concept/label shift), mathematical formulations (KL, PSI, Wasserstein), monitoring architectures, and 200+ curated papers.
Comprehensive guide to ML System Design — covering LLM serving, training pipelines, scaling, and real-world architecture patterns.
Remove objects from photos, along with their shadows and reflections, using generative inpainting (end-to-end diffusion-based restoration).
170+ stars. A well-organized collection of Data Structures and Algorithms material, covering fundamentals to advanced topics for coding interviews.